Sync from SUSE:SLFO:Main xen revision bb833c336a571aebfd87e356e7e38454

2025-03-20 12:41:05 +01:00
parent 5be7b338bb
commit b8bb876dbd
39 changed files with 217 additions and 5174 deletions


@@ -32,7 +32,6 @@ and Tools" installs all the packages below:
virt-install (Optional, to install VMs)
virt-manager (Optional, to manage VMs graphically)
virt-viewer (Optional, to view VMs outside virt-manager)
vm-install (Optional, to install VMs with xl only)
You then need to reboot your machine. Instead of booting a normal Linux
kernel, you will boot the Xen hypervisor and a slightly changed Linux kernel.
@@ -174,35 +173,6 @@ details. The installation of an OS within the VM can be automated if the OS
supports it.
Creating a VM with vm-install
-----------------------------
The vm-install program is also provided to create VMs. Like virt-install,
this optional program handles creating both the VM's libvirt XML definition
and disk(s). It also creates a legacy configuration file for use with 'xl'.
It can help install any operating system, not just SUSE.
From the command line, run "vm-install". If the DISPLAY environment variable
is set and the supporting packages (python-gtk) are installed, a graphical
wizard will start. Otherwise, a text wizard will start. If vm-install is
started with the '--use-xl' flag, it will not require libvirt nor attempt
to communicate with libvirt when creating a VM and instead will only use the
'xl' toolstack to start VM installations.
Once you have the VM configured, click "OK". The wizard will now create a
configuration file for the VM, and create a disk image. The disk image will
exist in /var/lib/xen/images, and a corresponding configuration file will exist
in /etc/xen/vm. The operating system's installation program will then run
within the VM.
When the VM shuts down (because the installation -- or at least the first
stage of it -- is done), the wizard finalizes the VM's configuration and
restarts the VM.
The creation of VMs can be automated; read the vm-install man page for more
details. The installation of an OS within the VM can be automated if the OS
supports it.
Creating a VM Manually
----------------------
If you create a VM manually (as opposed to using virt-install, which is the
@@ -231,11 +201,11 @@ Managing Virtual Machines
-------------------------
VMs can be managed from the command line using 'virsh' or from virt-manager.
VMs created by virt-install or vm-install (without vm-install's --use-xl flag)
will automatically be defined in libvirt. VMs defined in libvirt may be managed
by virt-manager or from the command line using the 'virsh' command. However,
if you copy a VM from another machine and manually create a VM XML configuration
file, you will need to import it into libvirt with a command like:
VMs created by virt-install will automatically be defined in libvirt. VMs
defined in libvirt may be managed by virt-manager or from the command line
using the 'virsh' command. However, if you copy a VM from another machine
and manually create a VM XML configuration file, you will need to import it
into libvirt with a command like:
virsh define <path to>/my-vm.xml
This imports the configuration into libvirt (and therefore virt-manager becomes
aware of it, also).


@@ -11,44 +11,42 @@ host console or also by inserting a sleep before each ip command
and executing it manually at the command line. This seems to be
an artifact of using 'set -e' everywhere.
Index: xen-4.15.0-testing/tools/hotplug/Linux/xen-network-common.sh
===================================================================
--- xen-4.15.0-testing.orig/tools/hotplug/Linux/xen-network-common.sh
+++ xen-4.15.0-testing/tools/hotplug/Linux/xen-network-common.sh
@@ -90,7 +90,7 @@ _setup_bridge_port() {
---
tools/hotplug/Linux/xen-network-common.sh | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/tools/hotplug/Linux/xen-network-common.sh
+++ b/tools/hotplug/Linux/xen-network-common.sh
@@ -84,7 +84,7 @@
local virtual="$2"
# take interface down ...
- ip link set dev ${dev} down
+ (ip link set dev ${dev} down || true)
+ ip link set dev ${dev} down || true
if [ $virtual -ne 0 ] ; then
# Initialise a dummy MAC address. We choose the numerically
@@ -101,7 +101,7 @@ _setup_bridge_port() {
@@ -95,7 +95,7 @@
fi
# ... and configure it
- ip address flush dev ${dev}
+ (ip address flush dev ${dev} || true)
+ ip address flush dev ${dev} || true
}
setup_physical_bridge_port() {
@@ -136,15 +136,15 @@ add_to_bridge () {
@@ -123,12 +123,12 @@
# Don't add $dev to $bridge if it's already on the bridge.
if [ ! -e "/sys/class/net/${bridge}/brif/${dev}" ]; then
log debug "adding $dev to bridge $bridge"
if which brctl >&/dev/null; then
- brctl addif ${bridge} ${dev}
+ (brctl addif ${bridge} ${dev} || true)
else
- ip link set ${dev} master ${bridge}
+ (ip link set ${dev} master ${bridge} || true)
fi
+ ip link set ${dev} master ${bridge} || true
else
log debug "$dev already on bridge $bridge"
fi
- ip link set dev ${dev} up
+ (ip link set dev ${dev} up || true)
+ ip link set dev ${dev} up || true
}
remove_from_bridge () {


@@ -1,64 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Wed, 9 Dec 2020 16:40:00 +0100
Subject: libxc sr bitmap long
tools: add API to work with several bits at once
Introduce a new API to test whether a fixed number of bits is clear or set,
and to clear or set them all at once.
The caller has to make sure the input bit number is a multiple of BITS_PER_LONG.
This API avoids the loop over each bit in a known range just to see
if all of them are either clear or set.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
v02:
- change return type from int to bool (jgross)
---
tools/libs/ctrl/xc_bitops.h | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
--- a/tools/libs/ctrl/xc_bitops.h
+++ b/tools/libs/ctrl/xc_bitops.h
@@ -3,6 +3,7 @@
/* bitmap operations for single threaded access */
+#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
@@ -81,4 +82,31 @@ static inline void bitmap_or(void *_dst,
dst[i] |= other[i];
}
+static inline bool test_bit_long_set(unsigned long nr_base, const void *_addr)
+{
+ const unsigned long *addr = _addr;
+ unsigned long val = addr[nr_base / BITS_PER_LONG];
+
+ return val == ~0;
+}
+
+static inline bool test_bit_long_clear(unsigned long nr_base, const void *_addr)
+{
+ const unsigned long *addr = _addr;
+ unsigned long val = addr[nr_base / BITS_PER_LONG];
+
+ return val == 0;
+}
+
+static inline void clear_bit_long(unsigned long nr_base, void *_addr)
+{
+ unsigned long *addr = _addr;
+ addr[nr_base / BITS_PER_LONG] = 0;
+}
+
+static inline void set_bit_long(unsigned long nr_base, void *_addr)
+{
+ unsigned long *addr = _addr;
+ addr[nr_base / BITS_PER_LONG] = ~0;
+}
#endif /* XC_BITOPS_H */
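
A minimal usage sketch of the new whole-long helpers (self-contained for
illustration: BITS_PER_LONG and the helper bodies are restated here, while the
real definitions live in tools/libs/ctrl/xc_bitops.h):

    #include <limits.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

    /* Set a whole long's worth of bits at once; nr_base must be a
     * multiple of BITS_PER_LONG, as noted in the patch description. */
    static void set_bit_long(unsigned long nr_base, void *_addr)
    {
        unsigned long *addr = _addr;
        addr[nr_base / BITS_PER_LONG] = ~0UL;
    }

    /* Test whether a whole long's worth of bits is set. */
    static bool test_bit_long_set(unsigned long nr_base, const void *_addr)
    {
        const unsigned long *addr = _addr;
        return addr[nr_base / BITS_PER_LONG] == ~0UL;
    }

    int main(void)
    {
        unsigned long bitmap[4] = { 0 };

        set_bit_long(BITS_PER_LONG, bitmap);      /* bits 64..127 on LP64 */
        printf("word 1 fully set: %d\n", test_bit_long_set(BITS_PER_LONG, bitmap));
        printf("word 0 fully set: %d\n", test_bit_long_set(0, bitmap));
        return 0;
    }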


@@ -1,144 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Thu, 7 Jan 2021 15:58:30 +0100
Subject: libxc sr LIBXL_HAVE_DOMAIN_SUSPEND_PROPS
tools: adjust libxl_domain_suspend to receive a struct props
Upcoming changes will pass more knobs down to xc_domain_save.
Adjust the libxl_domain_suspend API to allow easy adding of additional knobs.
No change in behavior intended.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
---
tools/include/libxl.h | 26 +++++++++++++++++++++++---
tools/libs/light/libxl_domain.c | 7 ++++---
tools/xl/xl_migrate.c | 9 ++++++---
tools/xl/xl_saverestore.c | 3 ++-
4 files changed, 35 insertions(+), 10 deletions(-)
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1855,13 +1855,28 @@ static inline int libxl_retrieve_domain_
libxl_retrieve_domain_configuration_0x041200
#endif
-int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
- int flags, /* LIBXL_SUSPEND_* */
- const libxl_asyncop_how *ao_how)
- LIBXL_EXTERNAL_CALLERS_ONLY;
+/*
+ * LIBXL_HAVE_DOMAIN_SUSPEND_PROPS indicates that the
+ * libxl_domain_suspend_props() function takes a props struct.
+ */
+#define LIBXL_HAVE_DOMAIN_SUSPEND_PROPS 1
+
+typedef struct {
+ uint32_t flags; /* LIBXL_SUSPEND_* */
+} libxl_domain_suspend_suse_properties;
#define LIBXL_SUSPEND_DEBUG 1
#define LIBXL_SUSPEND_LIVE 2
+#define LIBXL_HAVE_DOMAIN_SUSPEND_SUSE
+int libxl_domain_suspend_suse(libxl_ctx *ctx, uint32_t domid, int fd,
+ const libxl_domain_suspend_suse_properties *props, /* optional */
+ const libxl_asyncop_how *ao_how)
+ LIBXL_EXTERNAL_CALLERS_ONLY;
+
+int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
+ const libxl_asyncop_how *ao_how)
+ LIBXL_EXTERNAL_CALLERS_ONLY;
+
/*
* Only suspend domain, do not save its state to file, do not destroy it.
* Suspended domain can be resumed with libxl_domain_resume()
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -502,7 +502,8 @@ static void domain_suspend_cb(libxl__egc
}
-int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
+static int do_libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
+ const libxl_domain_suspend_suse_properties *props,
const libxl_asyncop_how *ao_how)
{
AO_CREATE(ctx, domid, ao_how);
@@ -523,8 +524,8 @@ int libxl_domain_suspend(libxl_ctx *ctx,
dss->domid = domid;
dss->fd = fd;
dss->type = type;
- dss->live = flags & LIBXL_SUSPEND_LIVE;
- dss->debug = flags & LIBXL_SUSPEND_DEBUG;
+ dss->live = props->flags & LIBXL_SUSPEND_LIVE;
+ dss->debug = props->flags & LIBXL_SUSPEND_DEBUG;
dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
rc = libxl__fd_flags_modify_save(gc, dss->fd,
@@ -539,6 +540,21 @@ int libxl_domain_suspend(libxl_ctx *ctx,
return AO_CREATE_FAIL(rc);
}
+int libxl_domain_suspend_suse(libxl_ctx *ctx, uint32_t domid, int fd,
+ const libxl_domain_suspend_suse_properties *props,
+ const libxl_asyncop_how *ao_how)
+{
+ return do_libxl_domain_suspend(ctx, domid, fd, props, ao_how);
+}
+
+int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
+ const libxl_asyncop_how *ao_how)
+{
+ libxl_domain_suspend_suse_properties props = { .flags = flags };
+
+ return do_libxl_domain_suspend(ctx, domid, fd, &props, ao_how);
+}
+
static void domain_suspend_empty_cb(libxl__egc *egc,
libxl__domain_suspend_state *dss, int rc)
{
--- a/tools/xl/xl_migrate.c
+++ b/tools/xl/xl_migrate.c
@@ -186,7 +186,10 @@ static void migrate_domain(uint32_t domi
char *away_domname;
char rc_buf;
uint8_t *config_data;
- int config_len, flags = LIBXL_SUSPEND_LIVE;
+ int config_len;
+ libxl_domain_suspend_suse_properties props = {
+ .flags = LIBXL_SUSPEND_LIVE,
+ };
save_domain_core_begin(domid, preserve_domid, override_config_file,
&config_data, &config_len);
@@ -205,8 +208,8 @@ static void migrate_domain(uint32_t domi
xtl_stdiostream_adjust_flags(logger, XTL_STDIOSTREAM_HIDE_PROGRESS, 0);
if (debug)
- flags |= LIBXL_SUSPEND_DEBUG;
- rc = libxl_domain_suspend(ctx, domid, send_fd, flags, NULL);
+ props.flags |= LIBXL_SUSPEND_DEBUG;
+ rc = libxl_domain_suspend_suse(ctx, domid, send_fd, &props, NULL);
if (rc) {
fprintf(stderr, "migration sender: libxl_domain_suspend failed"
" (rc=%d)\n", rc);
--- a/tools/xl/xl_saverestore.c
+++ b/tools/xl/xl_saverestore.c
@@ -130,6 +130,7 @@ static int save_domain(uint32_t domid, i
int fd;
uint8_t *config_data;
int config_len;
+ libxl_domain_suspend_suse_properties props = {};
save_domain_core_begin(domid, preserve_domid, override_config_file,
&config_data, &config_len);
@@ -146,7 +147,7 @@ static int save_domain(uint32_t domid, i
save_domain_core_writeconfig(fd, filename, config_data, config_len);
- int rc = libxl_domain_suspend(ctx, domid, fd, 0, NULL);
+ int rc = libxl_domain_suspend_suse(ctx, domid, fd, &props, NULL);
close(fd);
if (rc < 0) {
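
For external consumers, a hedged sketch of how the new entry point might be
called while remaining buildable against an unpatched libxl (the guard macro
is the one introduced above; context setup and error handling are elided):

    #include <libxl.h>

    static int save_vm(libxl_ctx *ctx, uint32_t domid, int fd)
    {
    #ifdef LIBXL_HAVE_DOMAIN_SUSPEND_PROPS
        /* Patched libxl: pass knobs via the props struct */
        libxl_domain_suspend_suse_properties props = {
            .flags = LIBXL_SUSPEND_LIVE,
        };

        return libxl_domain_suspend_suse(ctx, domid, fd, &props, NULL);
    #else
        /* Plain libxl: legacy flags-based call */
        return libxl_domain_suspend(ctx, domid, fd, LIBXL_SUSPEND_LIVE, NULL);
    #endif
    }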


@@ -1,238 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Thu, 7 Jan 2021 20:25:28 +0100
Subject: libxc sr abort_if_busy
tools: add --abort_if_busy to libxl_domain_suspend
Provide a knob to the host admin to abort the live migration of a
running domU if the downtime during final transit will be too long
for the workload within domU.
Adjust error reporting. Add ERROR_MIGRATION_ABORTED to allow callers of
libxl_domain_suspend to distinguish between errors and the requested
constraint.
Adjust precopy_policy to simplify reporting of remaining dirty pages.
The loop in send_memory_live populates ->dirty_count in a different
place than ->iteration. Let it proceed one more time to provide the
desired information before leaving the loop.
This patch adjusts xl(1) and the libxl API.
External users check LIBXL_HAVE_DOMAIN_SUSPEND_PROPS for the availability
of the new .abort_if_busy property.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
docs/man/xl.1.pod.in | 8 +++++++
tools/include/libxl.h | 1 +
tools/libs/light/libxl_dom_save.c | 7 ++++++-
tools/libs/light/libxl_domain.c | 1 +
tools/libs/light/libxl_internal.h | 2 ++
tools/libs/light/libxl_stream_write.c | 9 +++++++-
tools/libs/light/libxl_types.idl | 1 +
tools/xl/xl_cmdtable.c | 6 +++++-
tools/xl/xl_migrate.c | 30 ++++++++++++++++++++-------
9 files changed, 55 insertions(+), 10 deletions(-)
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -513,6 +513,14 @@ low, the guest is suspended and the domU
This allows the host admin to control for how long the domU will likely
be suspended during transit.
+=item B<--abort_if_busy>
+
+Abort migration instead of doing final suspend/move/resume if the
+guest produced more than I<min_remaining> dirty pages during the number
+of I<max_iters> iterations.
+This avoids long periods of time where the guest is suspended, which
+may confuse the workload within domU.
+
=back
=item B<remus> [I<OPTIONS>] I<domain-id> I<host>
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1868,6 +1868,7 @@ typedef struct {
} libxl_domain_suspend_suse_properties;
#define LIBXL_SUSPEND_DEBUG 1
#define LIBXL_SUSPEND_LIVE 2
+#define LIBXL_SUSPEND_ABORT_IF_BUSY 4
#define LIBXL_HAVE_DOMAIN_SUSPEND_SUSE
int libxl_domain_suspend_suse(libxl_ctx *ctx, uint32_t domid, int fd,
--- a/tools/libs/light/libxl_dom_save.c
+++ b/tools/libs/light/libxl_dom_save.c
@@ -383,11 +383,16 @@ static int libxl__domain_save_precopy_po
stats.iteration, stats.dirty_count, stats.total_written);
if (stats.dirty_count >= 0 && stats.dirty_count < dss->min_remaining)
goto stop_copy;
- if (stats.iteration >= dss->max_iters)
+ if (stats.dirty_count >= 0 && stats.iteration >= dss->max_iters)
goto stop_copy;
return XGS_POLICY_CONTINUE_PRECOPY;
stop_copy:
+ if (dss->abort_if_busy)
+ {
+ dss->remaining_dirty_pages = stats.dirty_count;
+ return XGS_POLICY_ABORT;
+ }
return XGS_POLICY_STOP_AND_COPY;
}
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -526,6 +526,7 @@ static int do_libxl_domain_suspend(libxl
dss->type = type;
dss->max_iters = props->max_iters ?: LIBXL_XGS_POLICY_MAX_ITERATIONS;
dss->min_remaining = props->min_remaining ?: LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT;
+ dss->abort_if_busy = props->flags & LIBXL_SUSPEND_ABORT_IF_BUSY;
dss->live = props->flags & LIBXL_SUSPEND_LIVE;
dss->debug = props->flags & LIBXL_SUSPEND_DEBUG;
dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -3651,9 +3651,11 @@ struct libxl__domain_save_state {
libxl_domain_type type;
int live;
int debug;
+ int abort_if_busy;
int checkpointed_stream;
uint32_t max_iters;
uint32_t min_remaining;
+ long remaining_dirty_pages;
const libxl_domain_remus_info *remus;
/* private */
int rc;
--- a/tools/libs/light/libxl_stream_write.c
+++ b/tools/libs/light/libxl_stream_write.c
@@ -344,11 +344,18 @@ void libxl__xc_domain_save_done(libxl__e
goto err;
if (retval) {
+ if (dss->remaining_dirty_pages) {
+ LOGD(NOTICE, dss->domid, "saving domain: aborted,"
+ " %ld remaining dirty pages.", dss->remaining_dirty_pages);
+ } else {
LOGEVD(ERROR, errnoval, dss->domid, "saving domain: %s",
dss->dsps.guest_responded ?
"domain responded to suspend request" :
"domain did not respond to suspend request");
- if (!dss->dsps.guest_responded)
+ }
+ if (dss->remaining_dirty_pages)
+ rc = ERROR_MIGRATION_ABORTED;
+ else if(!dss->dsps.guest_responded)
rc = ERROR_GUEST_TIMEDOUT;
else if (dss->rc)
rc = dss->rc;
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -76,6 +76,7 @@ libxl_error = Enumeration("error", [
(-30, "QMP_DEVICE_NOT_ACTIVE"), # a device has failed to be become active
(-31, "QMP_DEVICE_NOT_FOUND"), # the requested device has not been found
(-32, "QEMU_API"), # QEMU's replies don't contains expected members
+ (-33, "MIGRATION_ABORTED"),
], value_namespace = "")
libxl_domain_type = Enumeration("domain_type", [
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -177,7 +177,11 @@ const struct cmd_spec cmd_table[] = {
"-p Do not unpause domain after migrating it.\n"
"-D Preserve the domain id\n"
"--max_iters N Number of copy iterations before final stop+move\n"
- "--min_remaining N Number of remaining dirty pages before final stop+move"
+ "--min_remaining N Number of remaining dirty pages before final stop+move\n"
+ "--abort_if_busy Abort migration instead of doing final stop+move,\n"
+ " if the number of dirty pages is higher than <min_remaining>\n"
+ " after <max_iters> iterations. Otherwise the amount of memory\n"
+ " to be transfered would exceed maximum allowed domU downtime."
},
{ "restore",
&main_restore, 0, 1,
--- a/tools/xl/xl_migrate.c
+++ b/tools/xl/xl_migrate.c
@@ -177,7 +177,7 @@ static void migrate_do_preamble(int send
}
static void migrate_domain(uint32_t domid, int preserve_domid,
- const char *rune, int debug,
+ const char *rune, int debug, int abort_if_busy,
uint32_t max_iters,
uint32_t min_remaining,
const char *override_config_file)
@@ -213,14 +213,20 @@ static void migrate_domain(uint32_t domi
if (debug)
props.flags |= LIBXL_SUSPEND_DEBUG;
+ if (abort_if_busy)
+ props.flags |= LIBXL_SUSPEND_ABORT_IF_BUSY;
rc = libxl_domain_suspend_suse(ctx, domid, send_fd, &props, NULL);
if (rc) {
fprintf(stderr, "migration sender: libxl_domain_suspend failed"
" (rc=%d)\n", rc);
- if (rc == ERROR_GUEST_TIMEDOUT)
- goto failed_suspend;
- else
- goto failed_resume;
+ switch (rc) {
+ case ERROR_GUEST_TIMEDOUT:
+ goto failed_suspend;
+ case ERROR_MIGRATION_ABORTED:
+ goto failed_busy;
+ default:
+ goto failed_resume;
+ }
}
//fprintf(stderr, "migration sender: Transfer complete.\n");
@@ -302,6 +308,12 @@ static void migrate_domain(uint32_t domi
fprintf(stderr, "Migration failed, failed to suspend at sender.\n");
exit(EXIT_FAILURE);
+ failed_busy:
+ close(send_fd);
+ migration_child_report(recv_fd);
+ fprintf(stderr, "Migration aborted as requested, domain is too busy.\n");
+ exit(EXIT_FAILURE);
+
failed_resume:
close(send_fd);
migration_child_report(recv_fd);
@@ -545,13 +557,14 @@ int main_migrate(int argc, char **argv)
char *rune = NULL;
char *host;
int opt, daemonize = 1, monitor = 1, debug = 0, pause_after_migration = 0;
- int preserve_domid = 0;
+ int preserve_domid = 0, abort_if_busy = 0;
uint32_t max_iters = 0;
uint32_t min_remaining = 0;
static struct option opts[] = {
{"debug", 0, 0, 0x100},
{"max_iters", 1, 0, 0x101},
{"min_remaining", 1, 0, 0x102},
+ {"abort_if_busy", 0, 0, 0x103},
{"live", 0, 0, 0x200},
COMMON_LONG_OPTS
};
@@ -585,6 +598,9 @@ int main_migrate(int argc, char **argv)
case 0x102: /* --min_remaining */
min_remaining = atoi(optarg);
break;
+ case 0x103: /* --abort_if_busy */
+ abort_if_busy = 1;
+ break;
case 0x200: /* --live */
/* ignored for compatibility with xm */
break;
@@ -619,7 +635,7 @@ int main_migrate(int argc, char **argv)
pause_after_migration ? " -p" : "");
}
- migrate_domain(domid, preserve_domid, rune, debug,
+ migrate_domain(domid, preserve_domid, rune, debug, abort_if_busy,
max_iters, min_remaining, config_filename);
return EXIT_SUCCESS;
}
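
A sketch of a caller opting in to the abort semantics and distinguishing the
new error code (setup elided; the flag and error constant are the ones added
by this patch):

    #include <libxl.h>
    #include <stdio.h>

    static int migrate_send(libxl_ctx *ctx, uint32_t domid, int send_fd)
    {
        libxl_domain_suspend_suse_properties props = {
            .flags = LIBXL_SUSPEND_LIVE | LIBXL_SUSPEND_ABORT_IF_BUSY,
        };
        int rc = libxl_domain_suspend_suse(ctx, domid, send_fd, &props, NULL);

        if (rc == ERROR_MIGRATION_ABORTED) {
            /* Final stop+copy was refused: the domU kept producing too
             * many dirty pages and is still running on the sender. */
            fprintf(stderr, "migration aborted, domain too busy\n");
        }
        return rc;
    }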


@@ -1,148 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Sat, 9 Jan 2021 11:32:17 +0100
Subject: libxc sr max_iters
tools: add --max_iters to libxl_domain_suspend
Migrating a large, and potentially busy, domU will take more
time than necessary due to an excessive number of copying iterations.
Allow the host admin to control the number of iterations which
copy cumulated domU dirty pages to the target host.
The default remains 5, which means one initial iteration to copy the
entire domU memory, and up to 4 additional iterations to copy dirty
memory from the still running domU. After the given number of iterations
the domU is suspended, remaining dirty memory is copied and the domU is
finally moved to the target host.
This patch adjusts xl(1) and the libxl API.
External users check LIBXL_HAVE_DOMAIN_SUSPEND_PROPS for the availability
of the new .max_iters property.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
docs/man/xl.1.pod.in | 4 ++++
tools/include/libxl.h | 1 +
tools/libs/light/libxl_dom_save.c | 2 +-
tools/libs/light/libxl_domain.c | 1 +
tools/libs/light/libxl_internal.h | 1 +
tools/xl/xl_cmdtable.c | 3 ++-
tools/xl/xl_migrate.c | 10 +++++++++-
7 files changed, 19 insertions(+), 3 deletions(-)
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -501,6 +501,10 @@ such that it will be identical on the de
configuration is overridden using the B<-C> option. Note that it is not
possible to use this option for a 'localhost' migration.
+=item B<--max_iters> I<iterations>
+
+Number of copy iterations before final suspend+move (default: 5)
+
=back
=item B<remus> [I<OPTIONS>] I<domain-id> I<host>
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1863,6 +1863,7 @@ static inline int libxl_retrieve_domain_
typedef struct {
uint32_t flags; /* LIBXL_SUSPEND_* */
+ uint32_t max_iters;
} libxl_domain_suspend_suse_properties;
#define LIBXL_SUSPEND_DEBUG 1
#define LIBXL_SUSPEND_LIVE 2
--- a/tools/libs/light/libxl_dom_save.c
+++ b/tools/libs/light/libxl_dom_save.c
@@ -383,7 +383,7 @@ static int libxl__domain_save_precopy_po
stats.iteration, stats.dirty_count, stats.total_written);
if (stats.dirty_count >= 0 && stats.dirty_count < LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT)
goto stop_copy;
- if (stats.iteration >= LIBXL_XGS_POLICY_MAX_ITERATIONS)
+ if (stats.iteration >= dss->max_iters)
goto stop_copy;
return XGS_POLICY_CONTINUE_PRECOPY;
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -524,6 +524,7 @@ static int do_libxl_domain_suspend(libxl
dss->domid = domid;
dss->fd = fd;
dss->type = type;
+ dss->max_iters = props->max_iters ?: LIBXL_XGS_POLICY_MAX_ITERATIONS;
dss->live = props->flags & LIBXL_SUSPEND_LIVE;
dss->debug = props->flags & LIBXL_SUSPEND_DEBUG;
dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -3652,6 +3652,7 @@ struct libxl__domain_save_state {
int live;
int debug;
int checkpointed_stream;
+ uint32_t max_iters;
const libxl_domain_remus_info *remus;
/* private */
int rc;
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -175,7 +175,8 @@ const struct cmd_spec cmd_table[] = {
" of the domain.\n"
"--debug Enable verification mode.\n"
"-p Do not unpause domain after migrating it.\n"
- "-D Preserve the domain id"
+ "-D Preserve the domain id\n"
+ "--max_iters N Number of copy iterations before final stop+move"
},
{ "restore",
&main_restore, 0, 1,
--- a/tools/xl/xl_migrate.c
+++ b/tools/xl/xl_migrate.c
@@ -178,6 +178,7 @@ static void migrate_do_preamble(int send
static void migrate_domain(uint32_t domid, int preserve_domid,
const char *rune, int debug,
+ uint32_t max_iters,
const char *override_config_file)
{
pid_t child = -1;
@@ -189,6 +190,7 @@ static void migrate_domain(uint32_t domi
int config_len;
libxl_domain_suspend_suse_properties props = {
.flags = LIBXL_SUSPEND_LIVE,
+ .max_iters = max_iters,
};
save_domain_core_begin(domid, preserve_domid, override_config_file,
@@ -542,8 +544,10 @@ int main_migrate(int argc, char **argv)
char *host;
int opt, daemonize = 1, monitor = 1, debug = 0, pause_after_migration = 0;
int preserve_domid = 0;
+ uint32_t max_iters = 0;
static struct option opts[] = {
{"debug", 0, 0, 0x100},
+ {"max_iters", 1, 0, 0x101},
{"live", 0, 0, 0x200},
COMMON_LONG_OPTS
};
@@ -571,6 +575,9 @@ int main_migrate(int argc, char **argv)
case 0x100: /* --debug */
debug = 1;
break;
+ case 0x101: /* --max_iters */
+ max_iters = atoi(optarg);
+ break;
case 0x200: /* --live */
/* ignored for compatibility with xm */
break;
@@ -605,7 +612,8 @@ int main_migrate(int argc, char **argv)
pause_after_migration ? " -p" : "");
}
- migrate_domain(domid, preserve_domid, rune, debug, config_filename);
+ migrate_domain(domid, preserve_domid, rune, debug,
+ max_iters, config_filename);
return EXIT_SUCCESS;
}


@@ -1,173 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Thu, 7 Jan 2021 19:39:28 +0100
Subject: libxc sr min_remaining
tools: add --min_remaining to libxl_domain_suspend
The decision to stop+move a domU to the new host must be based on two factors:
- the available network bandwidth for the migration stream
- the maximum time a workload within a domU can be safely suspended
Both values define how many dirty pages a workload may produce prior to the
final stop+move.
The default value of 50 pages is much too low with today's network bandwidths.
On an idle 1 Gbit/s link, these 200 KiB will be transferred within ~2ms.
Give the admin a knob to adjust the point when the final stop+move will
be done, so he can base this decision on his own needs.
This patch adjusts xl(1) and the libxl API.
External users check LIBXL_HAVE_DOMAIN_SUSPEND_PROPS for the availability
of the new .min_remaining property.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
docs/man/xl.1.pod.in | 8 ++++++++
tools/include/libxl.h | 1 +
tools/libs/light/libxl_dom_save.c | 2 +-
tools/libs/light/libxl_domain.c | 1 +
tools/libs/light/libxl_internal.h | 1 +
tools/xl/xl_cmdtable.c | 23 ++++++++++++-----------
tools/xl/xl_migrate.c | 9 ++++++++-
7 files changed, 32 insertions(+), 13 deletions(-)
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -505,6 +505,14 @@ possible to use this option for a 'local
Number of copy iterations before final suspend+move (default: 5)
+=item B<--min_remaining> I<pages>
+
+Number of remaining dirty pages. If the number of dirty pages drops that
+low, the guest is suspended and the domU will finally be moved to I<host>.
+
+This allows the host admin to control for how long the domU will likely
+be suspended during transit.
+
=back
=item B<remus> [I<OPTIONS>] I<domain-id> I<host>
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1864,6 +1864,7 @@ static inline int libxl_retrieve_domain_
typedef struct {
uint32_t flags; /* LIBXL_SUSPEND_* */
uint32_t max_iters;
+ uint32_t min_remaining;
} libxl_domain_suspend_suse_properties;
#define LIBXL_SUSPEND_DEBUG 1
#define LIBXL_SUSPEND_LIVE 2
--- a/tools/libs/light/libxl_dom_save.c
+++ b/tools/libs/light/libxl_dom_save.c
@@ -381,7 +381,7 @@ static int libxl__domain_save_precopy_po
LOGD(DEBUG, shs->domid, "iteration %u dirty_count %ld total_written %lu",
stats.iteration, stats.dirty_count, stats.total_written);
- if (stats.dirty_count >= 0 && stats.dirty_count < LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT)
+ if (stats.dirty_count >= 0 && stats.dirty_count < dss->min_remaining)
goto stop_copy;
if (stats.iteration >= dss->max_iters)
goto stop_copy;
--- a/tools/libs/light/libxl_domain.c
+++ b/tools/libs/light/libxl_domain.c
@@ -525,6 +525,7 @@ static int do_libxl_domain_suspend(libxl
dss->fd = fd;
dss->type = type;
dss->max_iters = props->max_iters ?: LIBXL_XGS_POLICY_MAX_ITERATIONS;
+ dss->min_remaining = props->min_remaining ?: LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT;
dss->live = props->flags & LIBXL_SUSPEND_LIVE;
dss->debug = props->flags & LIBXL_SUSPEND_DEBUG;
dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -3653,6 +3653,7 @@ struct libxl__domain_save_state {
int debug;
int checkpointed_stream;
uint32_t max_iters;
+ uint32_t min_remaining;
const libxl_domain_remus_info *remus;
/* private */
int rc;
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -166,17 +166,18 @@ const struct cmd_spec cmd_table[] = {
&main_migrate, 0, 1,
"Migrate a domain to another host",
"[options] <Domain> <host>",
- "-h Print this help.\n"
- "-C <config> Send <config> instead of config file from creation.\n"
- "-s <sshcommand> Use <sshcommand> instead of ssh. String will be passed\n"
- " to sh. If empty, run <host> instead of ssh <host> xl\n"
- " migrate-receive [-d -e]\n"
- "-e Do not wait in the background (on <host>) for the death\n"
- " of the domain.\n"
- "--debug Enable verification mode.\n"
- "-p Do not unpause domain after migrating it.\n"
- "-D Preserve the domain id\n"
- "--max_iters N Number of copy iterations before final stop+move"
+ "-h Print this help.\n"
+ "-C <config> Send <config> instead of config file from creation.\n"
+ "-s <sshcommand> Use <sshcommand> instead of ssh. String will be passed\n"
+ " to sh. If empty, run <host> instead of ssh <host> xl\n"
+ " migrate-receive [-d -e]\n"
+ "-e Do not wait in the background (on <host>) for the death\n"
+ " of the domain.\n"
+ "--debug Enable verification mode.\n"
+ "-p Do not unpause domain after migrating it.\n"
+ "-D Preserve the domain id\n"
+ "--max_iters N Number of copy iterations before final stop+move\n"
+ "--min_remaining N Number of remaining dirty pages before final stop+move"
},
{ "restore",
&main_restore, 0, 1,
--- a/tools/xl/xl_migrate.c
+++ b/tools/xl/xl_migrate.c
@@ -179,6 +179,7 @@ static void migrate_do_preamble(int send
static void migrate_domain(uint32_t domid, int preserve_domid,
const char *rune, int debug,
uint32_t max_iters,
+ uint32_t min_remaining,
const char *override_config_file)
{
pid_t child = -1;
@@ -191,6 +192,7 @@ static void migrate_domain(uint32_t domi
libxl_domain_suspend_suse_properties props = {
.flags = LIBXL_SUSPEND_LIVE,
.max_iters = max_iters,
+ .min_remaining = min_remaining,
};
save_domain_core_begin(domid, preserve_domid, override_config_file,
@@ -545,9 +547,11 @@ int main_migrate(int argc, char **argv)
int opt, daemonize = 1, monitor = 1, debug = 0, pause_after_migration = 0;
int preserve_domid = 0;
uint32_t max_iters = 0;
+ uint32_t min_remaining = 0;
static struct option opts[] = {
{"debug", 0, 0, 0x100},
{"max_iters", 1, 0, 0x101},
+ {"min_remaining", 1, 0, 0x102},
{"live", 0, 0, 0x200},
COMMON_LONG_OPTS
};
@@ -578,6 +582,9 @@ int main_migrate(int argc, char **argv)
case 0x101: /* --max_iters */
max_iters = atoi(optarg);
break;
+ case 0x102: /* --min_remaining */
+ min_remaining = atoi(optarg);
+ break;
case 0x200: /* --live */
/* ignored for compatibility with xm */
break;
@@ -613,7 +620,7 @@ int main_migrate(int argc, char **argv)
}
migrate_domain(domid, preserve_domid, rune, debug,
- max_iters, config_filename);
+ max_iters, min_remaining, config_filename);
return EXIT_SUCCESS;
}
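
A sketch combining both tuning knobs (values are illustrative, not
recommendations; a zero field falls back to the built-in default, as
do_libxl_domain_suspend above shows with the ?: operator):

    #include <libxl.h>

    static int migrate_tuned(libxl_ctx *ctx, uint32_t domid, int send_fd)
    {
        libxl_domain_suspend_suse_properties props = {
            .flags = LIBXL_SUSPEND_LIVE,
            .max_iters = 3,        /* at most 3 copy passes before suspend */
            .min_remaining = 4096, /* suspend once < 4096 dirty pages remain */
        };

        return libxl_domain_suspend_suse(ctx, domid, send_fd, &props, NULL);
    }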


@@ -1,24 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Mon, 4 Jan 2021 20:58:42 +0200
Subject: libxc sr number of iterations
Reduce default value of --max_iters from 5 to 1.
The workload within domU will continue to produce dirty pages.
It is unreasonable to expect the workload to slow down during migration.
Now there is one initial copy of all memory, one instead of five
iterations for dirty memory, and a final copy iteration prior to the move.
---
tools/libs/light/libxl_internal.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -125,7 +125,7 @@
#define DOMID_XS_PATH "domid"
#define PVSHIM_BASENAME "xen-shim"
#define PVSHIM_CMDLINE "pv-shim console=xen,pv"
-#define LIBXL_XGS_POLICY_MAX_ITERATIONS 5
+#define LIBXL_XGS_POLICY_MAX_ITERATIONS 1
#define LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT 50
#define DIV_ROUNDUP(n, d) (((n) + (d) - 1) / (d))


@@ -1,90 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 8 Jan 2021 18:19:49 +0100
Subject: libxc sr precopy_policy
tools: add callback to libxl for precopy_policy and precopy_stats
This duplicates simple_precopy_policy. To recap its purpose:
- do up to 5 iterations of copying dirty domU memory to target,
including the initial copying of all domU memory, excluding
the final copying while the domU is suspended
- do fewer iterations in case the domU dirtied less than 50 pages
Take the opportunity to also move xen_pfn_t into qw().
Signed-off-by: Olaf Hering <olaf@aepfle.de>
v02:
- use plain struct precopy_stats instead of inventing
a new precopy_stats_t (anthony)
---
tools/libs/light/libxl_dom_save.c | 19 +++++++++++++++++++
tools/libs/light/libxl_internal.h | 2 ++
tools/libs/light/libxl_save_msgs_gen.pl | 3 ++-
3 files changed, 23 insertions(+), 1 deletion(-)
--- a/tools/libs/light/libxl_dom_save.c
+++ b/tools/libs/light/libxl_dom_save.c
@@ -373,6 +373,24 @@ int libxl__save_emulator_xenstore_data(l
return rc;
}
+static int libxl__domain_save_precopy_policy(struct precopy_stats stats, void *user)
+{
+ libxl__save_helper_state *shs = user;
+ libxl__domain_save_state *dss = shs->caller_state;
+ STATE_AO_GC(dss->ao);
+
+ LOGD(DEBUG, shs->domid, "iteration %u dirty_count %ld total_written %lu",
+ stats.iteration, stats.dirty_count, stats.total_written);
+ if (stats.dirty_count >= 0 && stats.dirty_count < LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT)
+ goto stop_copy;
+ if (stats.iteration >= LIBXL_XGS_POLICY_MAX_ITERATIONS)
+ goto stop_copy;
+ return XGS_POLICY_CONTINUE_PRECOPY;
+
+stop_copy:
+ return XGS_POLICY_STOP_AND_COPY;
+}
+
/*----- main code for saving, in order of execution -----*/
void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
@@ -430,6 +448,7 @@ void libxl__domain_save(libxl__egc *egc,
callbacks->suspend = libxl__domain_suspend_callback;
callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
+ callbacks->precopy_policy = libxl__domain_save_precopy_policy;
dss->sws.ao = dss->ao;
dss->sws.dss = dss;
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -125,6 +125,8 @@
#define DOMID_XS_PATH "domid"
#define PVSHIM_BASENAME "xen-shim"
#define PVSHIM_CMDLINE "pv-shim console=xen,pv"
+#define LIBXL_XGS_POLICY_MAX_ITERATIONS 5
+#define LIBXL_XGS_POLICY_TARGET_DIRTY_COUNT 50
#define DIV_ROUNDUP(n, d) (((n) + (d) - 1) / (d))
--- a/tools/libs/light/libxl_save_msgs_gen.pl
+++ b/tools/libs/light/libxl_save_msgs_gen.pl
@@ -23,6 +23,7 @@ our @msgs = (
STRING doing_what),
'unsigned long', 'done',
'unsigned long', 'total'] ],
+ [ 'scxW', "precopy_policy", ['struct precopy_stats', 'stats'] ],
[ 'srcxA', "suspend", [] ],
[ 'srcxA', "postcopy", [] ],
[ 'srcxA', "checkpoint", [] ],
@@ -142,7 +143,7 @@ static void bytes_put(unsigned char *con
END
-foreach my $simpletype (qw(int uint16_t uint32_t unsigned), 'unsigned long', 'xen_pfn_t') {
+foreach my $simpletype (qw(int uint16_t uint32_t unsigned xen_pfn_t), 'struct precopy_stats', 'unsigned long') {
my $typeid = typeid($simpletype);
$out_body{'callout'} .= <<END;
static int ${typeid}_get(const unsigned char **msg,


@@ -1,103 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Wed, 28 Oct 2020 12:07:36 +0100
Subject: libxc sr readv_exact
tools: add readv_exact to libxenctrl
Read a batch of iovec's.
Short reads are the common case, finish the trailing iov with read_exact.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
v2:
- add comment to short-read handling
---
tools/libs/ctrl/xc_private.c | 57 +++++++++++++++++++++++++++++++++++-
tools/libs/ctrl/xc_private.h | 1 +
2 files changed, 57 insertions(+), 1 deletion(-)
--- a/tools/libs/ctrl/xc_private.c
+++ b/tools/libs/ctrl/xc_private.c
@@ -633,8 +633,23 @@ int write_exact(int fd, const void *data
#if defined(__MINIOS__)
/*
- * MiniOS's libc doesn't know about writev(). Implement it as multiple write()s.
+ * MiniOS's libc doesn't know about readv/writev().
+ * Implement it as multiple read/write()s.
*/
+int readv_exact(int fd, const struct iovec *iov, int iovcnt)
+{
+ int rc, i;
+
+ for ( i = 0; i < iovcnt; ++i )
+ {
+ rc = read_exact(fd, iov[i].iov_base, iov[i].iov_len);
+ if ( rc )
+ return rc;
+ }
+
+ return 0;
+}
+
int writev_exact(int fd, const struct iovec *iov, int iovcnt)
{
int rc, i;
@@ -649,6 +664,46 @@ int writev_exact(int fd, const struct io
return 0;
}
#else
+int readv_exact(int fd, const struct iovec *iov, int iovcnt)
+{
+ int rc = 0, idx = 0;
+ ssize_t len;
+
+ while ( idx < iovcnt )
+ {
+ len = readv(fd, &iov[idx], min(iovcnt - idx, IOV_MAX));
+ if ( len == -1 && errno == EINTR )
+ continue;
+ if ( len <= 0 )
+ {
+ rc = -1;
+ goto out;
+ }
+
+ /* Finish a potential short read in the last iov */
+ while ( len > 0 && idx < iovcnt )
+ {
+ if ( len >= iov[idx].iov_len )
+ {
+ len -= iov[idx].iov_len;
+ }
+ else
+ {
+ void *p = iov[idx].iov_base + len;
+ size_t l = iov[idx].iov_len - len;
+
+ rc = read_exact(fd, p, l);
+ if ( rc )
+ goto out;
+ len = 0;
+ }
+ idx++;
+ }
+ }
+out:
+ return rc;
+}
+
int writev_exact(int fd, const struct iovec *iov, int iovcnt)
{
struct iovec *local_iov = NULL;
--- a/tools/libs/ctrl/xc_private.h
+++ b/tools/libs/ctrl/xc_private.h
@@ -382,6 +382,7 @@ int xc_flush_mmu_updates(xc_interface *x
/* Return 0 on success; -1 on error setting errno. */
int read_exact(int fd, void *data, size_t size); /* EOF => -1, errno=0 */
+int readv_exact(int fd, const struct iovec *iov, int iovcnt);
int write_exact(int fd, const void *data, size_t size);
int writev_exact(int fd, const struct iovec *iov, int iovcnt);
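
A hedged usage sketch: gather a fixed header plus payload with one call (the
record struct and fd are illustrative; the prototype is the one added to
xc_private.h above):

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/uio.h>

    int readv_exact(int fd, const struct iovec *iov, int iovcnt);

    struct rec_hdr { uint32_t type, length; };   /* illustrative record */

    static int read_record(int fd, struct rec_hdr *h, void *body, size_t body_len)
    {
        struct iovec iov[] = {
            { .iov_base = h,    .iov_len = sizeof(*h) },
            { .iov_base = body, .iov_len = body_len },
        };

        /* Returns 0 only once every byte of every iov has arrived; a
         * trailing short read is completed internally via read_exact. */
        return readv_exact(fd, iov, 2);
    }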


@@ -1,435 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Tue, 27 Oct 2020 19:21:50 +0100
Subject: libxc sr restore handle_buffered_page_data
tools: restore: split handle_page_data
handle_page_data must be able to read directly into mapped guest memory.
This will avoid unnecessary memcpy calls for data that can be consumed verbatim.
Split the various steps of record processing:
- move processing to handle_buffered_page_data
- adjust xenforeignmemory_map to set errno in case of failure
- adjust verify mode to set errno in case of failure
This change is preparation for future changes in handle_page_data;
no change in behavior is intended.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 4 +
tools/libs/guest/xg_sr_restore.c | 320 ++++++++++++++++++++-----------
2 files changed, 207 insertions(+), 117 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -262,6 +262,10 @@ struct xc_sr_context
int *map_errs;
xen_pfn_t *pp_pfns;
xen_pfn_t *pp_mfns;
+ void **guest_data;
+
+ void *guest_mapping;
+ uint32_t nr_mapped_pages;
int send_back_fd;
unsigned long p2m_size;
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -183,121 +183,18 @@ int populate_pfns(struct xc_sr_context *
return rc;
}
-/*
- * Given a list of pfns, their types, and a block of page data from the
- * stream, populate and record their types, map the relevant subset and copy
- * the data into the guest.
- */
-static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
- xen_pfn_t *pfns, uint32_t *types, void *page_data)
+static int handle_static_data_end_v2(struct xc_sr_context *ctx)
{
- xc_interface *xch = ctx->xch;
- int rc;
- void *mapping = NULL, *guest_page = NULL;
- unsigned int i, /* i indexes the pfns from the record. */
- j, /* j indexes the subset of pfns we decide to map. */
- nr_pages = 0;
-
- rc = populate_pfns(ctx, count, pfns, types);
- if ( rc )
- {
- ERROR("Failed to populate pfns for batch of %u pages", count);
- goto err;
- }
-
- for ( i = 0; i < count; ++i )
- {
- ctx->restore.ops.set_page_type(ctx, pfns[i], types[i]);
-
- if ( page_type_has_stream_data(types[i]) )
- ctx->restore.mfns[nr_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, pfns[i]);
- }
-
- /* Nothing to do? */
- if ( nr_pages == 0 )
- goto done;
-
- mapping = guest_page = xenforeignmemory_map(
- xch->fmem, ctx->domid, PROT_READ | PROT_WRITE,
- nr_pages, ctx->restore.mfns, ctx->restore.map_errs);
- if ( !mapping )
- {
- rc = -1;
- PERROR("Unable to map %u mfns for %u pages of data",
- nr_pages, count);
- goto err;
- }
-
- for ( i = 0, j = 0; i < count; ++i )
- {
- if ( !page_type_has_stream_data(types[i]) )
- continue;
-
- if ( ctx->restore.map_errs[j] )
- {
- rc = -1;
- ERROR("Mapping pfn %#"PRIpfn" (mfn %#"PRIpfn", type %#"PRIx32") failed with %d",
- pfns[i], ctx->restore.mfns[j], types[i], ctx->restore.map_errs[j]);
- goto err;
- }
-
- /* Undo page normalisation done by the saver. */
- rc = ctx->restore.ops.localise_page(ctx, types[i], page_data);
- if ( rc )
- {
- ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")",
- pfns[i], types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
- goto err;
- }
-
- if ( ctx->restore.verify )
- {
- /* Verify mode - compare incoming data to what we already have. */
- if ( memcmp(guest_page, page_data, PAGE_SIZE) )
- ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")",
- pfns[i], types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
- }
- else
- {
- /* Regular mode - copy incoming data into place. */
- memcpy(guest_page, page_data, PAGE_SIZE);
- }
-
- ++j;
- guest_page += PAGE_SIZE;
- page_data += PAGE_SIZE;
- }
-
- done:
- rc = 0;
-
- err:
- if ( mapping )
- xenforeignmemory_unmap(xch->fmem, mapping, nr_pages);
-
- return rc;
-}
+ int rc = 0;
-/*
- * Validate a PAGE_DATA record from the stream, and pass the results to
- * process_page_data() to actually perform the legwork.
- */
-static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
-{
+#if defined(__i386__) || defined(__x86_64__)
xc_interface *xch = ctx->xch;
- struct xc_sr_rec_page_data_header *pages = rec->data;
- unsigned int i, pages_of_data = 0;
- int rc = -1;
-
- xen_pfn_t pfn;
- uint32_t type;
-
/*
* v2 compatibility only exists for x86 streams. This is a bit of a
* bodge, but it is less bad than duplicating handle_page_data() between
* different architectures.
*/
-#if defined(__i386__) || defined(__x86_64__)
+
/* v2 compat. Infer the position of STATIC_DATA_END. */
if ( ctx->restore.format_version < 3 && !ctx->restore.seen_static_data_end )
{
@@ -315,12 +212,26 @@ static int handle_page_data(struct xc_sr
ERROR("No STATIC_DATA_END seen");
goto err;
}
+
+ rc = 0;
+err:
#endif
- if ( rec->length < sizeof(*pages) )
+ return rc;
+}
+
+static bool verify_rec_page_hdr(struct xc_sr_context *ctx, uint32_t rec_length,
+ struct xc_sr_rec_page_data_header *pages)
+{
+ xc_interface *xch = ctx->xch;
+ bool ret = false;
+
+ errno = EINVAL;
+
+ if ( rec_length < sizeof(*pages) )
{
ERROR("PAGE_DATA record truncated: length %u, min %zu",
- rec->length, sizeof(*pages));
+ rec_length, sizeof(*pages));
goto err;
}
@@ -330,13 +241,28 @@ static int handle_page_data(struct xc_sr
goto err;
}
- if ( rec->length < sizeof(*pages) + (pages->count * sizeof(uint64_t)) )
+ if ( rec_length < sizeof(*pages) + (pages->count * sizeof(uint64_t)) )
{
ERROR("PAGE_DATA record (length %u) too short to contain %u"
- " pfns worth of information", rec->length, pages->count);
+ " pfns worth of information", rec_length, pages->count);
goto err;
}
+ ret = true;
+
+err:
+ return ret;
+}
+
+static bool verify_rec_page_pfns(struct xc_sr_context *ctx, uint32_t rec_length,
+ struct xc_sr_rec_page_data_header *pages)
+{
+ xc_interface *xch = ctx->xch;
+ uint32_t i, pages_of_data = 0;
+ xen_pfn_t pfn;
+ uint32_t type;
+ bool ret = false;
+
for ( i = 0; i < pages->count; ++i )
{
pfn = pages->pfn[i] & PAGE_DATA_PFN_MASK;
@@ -363,19 +289,177 @@ static int handle_page_data(struct xc_sr
ctx->restore.types[i] = type;
}
- if ( rec->length != (sizeof(*pages) +
+ if ( rec_length != (sizeof(*pages) +
(sizeof(uint64_t) * pages->count) +
(PAGE_SIZE * pages_of_data)) )
{
ERROR("PAGE_DATA record wrong size: length %u, expected "
- "%zu + %zu + %lu", rec->length, sizeof(*pages),
+ "%zu + %zu + %lu", rec_length, sizeof(*pages),
(sizeof(uint64_t) * pages->count), (PAGE_SIZE * pages_of_data));
goto err;
}
- rc = process_page_data(ctx, pages->count, ctx->restore.pfns,
- ctx->restore.types, &pages->pfn[pages->count]);
+ ret = true;
+
+err:
+ return ret;
+}
+
+/*
+ * Populate pfns, if required
+ * Fill guest_data with either mapped address or NULL
+ * The caller must unmap guest_mapping
+ */
+static int map_guest_pages(struct xc_sr_context *ctx,
+ struct xc_sr_rec_page_data_header *pages)
+{
+ xc_interface *xch = ctx->xch;
+ uint32_t i, p;
+ int rc;
+
+ rc = populate_pfns(ctx, pages->count, ctx->restore.pfns, ctx->restore.types);
+ if ( rc )
+ {
+ ERROR("Failed to populate pfns for batch of %u pages", pages->count);
+ goto err;
+ }
+
+ ctx->restore.nr_mapped_pages = 0;
+
+ for ( i = 0; i < pages->count; i++ )
+ {
+ ctx->restore.ops.set_page_type(ctx, ctx->restore.pfns[i], ctx->restore.types[i]);
+
+ if ( page_type_has_stream_data(ctx->restore.types[i]) == false )
+ {
+ ctx->restore.guest_data[i] = NULL;
+ continue;
+ }
+
+ ctx->restore.mfns[ctx->restore.nr_mapped_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, ctx->restore.pfns[i]);
+ }
+
+ /* Nothing to do? */
+ if ( ctx->restore.nr_mapped_pages == 0 )
+ goto done;
+
+ ctx->restore.guest_mapping = xenforeignmemory_map(xch->fmem, ctx->domid,
+ PROT_READ | PROT_WRITE, ctx->restore.nr_mapped_pages,
+ ctx->restore.mfns, ctx->restore.map_errs);
+ if ( !ctx->restore.guest_mapping )
+ {
+ rc = -1;
+ PERROR("Unable to map %u mfns for %u pages of data",
+ ctx->restore.nr_mapped_pages, pages->count);
+ goto err;
+ }
+
+ /* Verify mapping, and assign address to pfn data */
+ for ( i = 0, p = 0; i < pages->count; i++ )
+ {
+ if ( !page_type_has_stream_data(ctx->restore.types[i]) )
+ continue;
+
+ if ( ctx->restore.map_errs[p] == 0 )
+ {
+ ctx->restore.guest_data[i] = ctx->restore.guest_mapping + (p * PAGE_SIZE);
+ p++;
+ continue;
+ }
+
+ errno = ctx->restore.map_errs[p];
+ rc = -1;
+ PERROR("Mapping pfn %#"PRIpfn" (mfn %#"PRIpfn", type %#"PRIx32") failed",
+ ctx->restore.pfns[i], ctx->restore.mfns[p], ctx->restore.types[i]);
+ goto err;
+ }
+
+done:
+ rc = 0;
+
+err:
+ return rc;
+}
+
+/*
+ * Handle PAGE_DATA record from an existing buffer
+ * Given a list of pfns, their types, and a block of page data from the
+ * stream, populate and record their types, map the relevant subset and copy
+ * the data into the guest.
+ */
+static int handle_buffered_page_data(struct xc_sr_context *ctx,
+ struct xc_sr_record *rec)
+{
+ xc_interface *xch = ctx->xch;
+ struct xc_sr_rec_page_data_header *pages = rec->data;
+ void *p;
+ uint32_t i;
+ int rc = -1, idx;
+
+ rc = handle_static_data_end_v2(ctx);
+ if ( rc )
+ goto err;
+
+ /* First read and verify the header */
+ if ( !verify_rec_page_hdr(ctx, rec->length, pages) )
+ {
+ rc = -1;
+ goto err;
+ }
+
+ /* Then read and verify the pfn numbers */
+ if ( !verify_rec_page_pfns(ctx, rec->length, pages) )
+ {
+ rc = -1;
+ goto err;
+ }
+
+ /* Map the target pfn */
+ rc = map_guest_pages(ctx, pages);
+ if ( rc )
+ goto err;
+
+ for ( i = 0, idx = 0; i < pages->count; i++ )
+ {
+ if ( !ctx->restore.guest_data[i] )
+ continue;
+
+ p = &pages->pfn[pages->count] + (idx * PAGE_SIZE);
+ rc = ctx->restore.ops.localise_page(ctx, ctx->restore.types[i], p);
+ if ( rc )
+ {
+ ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")",
+ ctx->restore.pfns[i], ctx->restore.types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+ goto err;
+
+ }
+
+ if ( ctx->restore.verify )
+ {
+ if ( memcmp(ctx->restore.guest_data[i], p, PAGE_SIZE) )
+ {
+ errno = EIO;
+ ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")",
+ ctx->restore.pfns[i], ctx->restore.types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+ goto err;
+ }
+ }
+ else
+ {
+ memcpy(ctx->restore.guest_data[i], p, PAGE_SIZE);
+ }
+
+ idx++;
+ }
+
+ rc = 0;
+
err:
+ if ( ctx->restore.guest_mapping )
+ {
+ xenforeignmemory_unmap(xch->fmem, ctx->restore.guest_mapping, ctx->restore.nr_mapped_pages);
+ ctx->restore.guest_mapping = NULL;
+ }
return rc;
}
@@ -623,7 +707,7 @@ static int process_buffered_record(struc
break;
case REC_TYPE_PAGE_DATA:
- rc = handle_page_data(ctx, rec);
+ rc = handle_buffered_page_data(ctx, rec);
break;
case REC_TYPE_VERIFY:
@@ -703,9 +787,10 @@ static int setup(struct xc_sr_context *c
ctx->restore.map_errs = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.map_errs));
ctx->restore.pp_pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pp_pfns));
ctx->restore.pp_mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pp_mfns));
+ ctx->restore.guest_data = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.guest_data));
if ( !ctx->restore.pfns || !ctx->restore.types || !ctx->restore.mfns ||
!ctx->restore.map_errs || !ctx->restore.pp_pfns ||
- !ctx->restore.pp_mfns )
+ !ctx->restore.pp_mfns || !ctx->restore.guest_data )
{
ERROR("Unable to allocate memory");
rc = -1;
@@ -742,6 +827,7 @@ static void cleanup(struct xc_sr_context
free(ctx->restore.buffered_records);
free(ctx->restore.populated_pfns);
+ free(ctx->restore.guest_data);
free(ctx->restore.pp_mfns);
free(ctx->restore.pp_pfns);
free(ctx->restore.map_errs);


@@ -1,230 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Thu, 29 Oct 2020 16:13:10 +0100
Subject: libxc sr restore handle_incoming_page_data
tools: restore: write data directly into guest
Read the incoming migration stream directly into guest memory.
This avoids the memory allocation and copying, and the resulting
performance penalty.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 3 +
tools/libs/guest/xg_sr_restore.c | 155 ++++++++++++++++++++++++++++++-
2 files changed, 153 insertions(+), 5 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -263,6 +263,8 @@ struct xc_sr_context
xen_pfn_t *pp_pfns;
xen_pfn_t *pp_mfns;
void **guest_data;
+ struct iovec *iov;
+ struct xc_sr_rec_page_data_header *pages;
void *guest_mapping;
uint32_t nr_mapped_pages;
@@ -311,6 +313,7 @@ struct xc_sr_context
/* Sender has invoked verify mode on the stream. */
bool verify;
+ void *verify_buf;
} restore;
};
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -382,6 +382,129 @@ err:
}
/*
+ * Handle PAGE_DATA record from the stream.
+ * Given a list of pfns, their types, and a block of page data from the
+ * stream, populate and record their types, map the relevant subset and copy
+ * the data into the guest.
+ */
+static int handle_incoming_page_data(struct xc_sr_context *ctx,
+ struct xc_sr_rhdr *rhdr)
+{
+ xc_interface *xch = ctx->xch;
+ struct xc_sr_rec_page_data_header *pages = ctx->restore.pages;
+ uint64_t *pfn_nums = &pages->pfn[0];
+ uint32_t i;
+ int rc, iov_idx;
+
+ rc = handle_static_data_end_v2(ctx);
+ if ( rc )
+ goto err;
+
+ /* First read and verify the header */
+ rc = read_exact(ctx->fd, pages, sizeof(*pages));
+ if ( rc )
+ {
+ PERROR("Could not read rec_pfn header");
+ goto err;
+ }
+
+ if ( !verify_rec_page_hdr(ctx, rhdr->length, pages) )
+ {
+ rc = -1;
+ goto err;
+ }
+
+ /* Then read and verify the incoming pfn numbers */
+ rc = read_exact(ctx->fd, pfn_nums, sizeof(*pfn_nums) * pages->count);
+ if ( rc )
+ {
+ PERROR("Could not read rec_pfn data");
+ goto err;
+ }
+
+ if ( !verify_rec_page_pfns(ctx, rhdr->length, pages) )
+ {
+ rc = -1;
+ goto err;
+ }
+
+ /* Finally read and verify the incoming pfn data */
+ rc = map_guest_pages(ctx, pages);
+ if ( rc )
+ goto err;
+
+ /* Prepare read buffers, either guest or throw-away memory */
+ for ( i = 0, iov_idx = 0; i < pages->count; i++ )
+ {
+ struct iovec *iov;
+
+ if ( !ctx->restore.guest_data[i] )
+ continue;
+
+ iov = &ctx->restore.iov[iov_idx];
+ iov->iov_len = PAGE_SIZE;
+ if ( ctx->restore.verify )
+ iov->iov_base = ctx->restore.verify_buf + (i * PAGE_SIZE);
+ else
+ iov->iov_base = ctx->restore.guest_data[i];
+ iov_idx++;
+ }
+
+ if ( !iov_idx )
+ goto done;
+
+ rc = readv_exact(ctx->fd, ctx->restore.iov, iov_idx);
+ if ( rc )
+ {
+ PERROR("read of %d pages failed", iov_idx);
+ goto err;
+ }
+
+ /* Post-processing of pfn data */
+ for ( i = 0, iov_idx = 0; i < pages->count; i++ )
+ {
+ void *addr;
+
+ if ( !ctx->restore.guest_data[i] )
+ continue;
+
+ addr = ctx->restore.iov[iov_idx].iov_base;
+ rc = ctx->restore.ops.localise_page(ctx, ctx->restore.types[i], addr);
+ if ( rc )
+ {
+ ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")",
+ ctx->restore.pfns[i],
+ ctx->restore.types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+ goto err;
+
+ }
+
+ if ( ctx->restore.verify )
+ {
+ if ( memcmp(ctx->restore.guest_data[i], addr, PAGE_SIZE) )
+ {
+ ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")",
+ ctx->restore.pfns[i],
+ ctx->restore.types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+ }
+ }
+
+ iov_idx++;
+ }
+
+done:
+ rc = 0;
+
+err:
+ if ( ctx->restore.guest_mapping )
+ {
+ xenforeignmemory_unmap(xch->fmem, ctx->restore.guest_mapping, ctx->restore.nr_mapped_pages);
+ ctx->restore.guest_mapping = NULL;
+ }
+ return rc;
+}
+
+/*
* Handle PAGE_DATA record from an existing buffer
* Given a list of pfns, their types, and a block of page data from the
* stream, populate and record their types, map the relevant subset and copy
@@ -713,6 +836,15 @@ static int process_buffered_record(struc
case REC_TYPE_VERIFY:
DPRINTF("Verify mode enabled");
ctx->restore.verify = true;
+ if ( !ctx->restore.verify_buf )
+ {
+ ctx->restore.verify_buf = malloc(MAX_BATCH_SIZE * PAGE_SIZE);
+ if ( !ctx->restore.verify_buf )
+ {
+ PERROR("Unable to allocate verify_buf");
+ rc = -1;
+ }
+ }
break;
case REC_TYPE_CHECKPOINT:
@@ -739,11 +871,19 @@ static int process_incoming_record_heade
struct xc_sr_record rec;
int rc;
- rc = read_record_data(ctx, ctx->fd, rhdr, &rec);
- if ( rc )
- return rc;
+ switch ( rhdr->type )
+ {
+ case REC_TYPE_PAGE_DATA:
+ rc = handle_incoming_page_data(ctx, rhdr);
+ break;
+ default:
+ rc = read_record_data(ctx, ctx->fd, rhdr, &rec);
+ if ( rc == 0 )
+ rc = process_buffered_record(ctx, &rec);
+ break;
+ }
- return process_buffered_record(ctx, &rec);
+ return rc;
}
@@ -788,9 +928,12 @@ static int setup(struct xc_sr_context *c
ctx->restore.pp_pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pp_pfns));
ctx->restore.pp_mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pp_mfns));
ctx->restore.guest_data = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.guest_data));
+ ctx->restore.iov = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.iov));
+ ctx->restore.pages = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pages->pfn) + sizeof(*ctx->restore.pages));
if ( !ctx->restore.pfns || !ctx->restore.types || !ctx->restore.mfns ||
!ctx->restore.map_errs || !ctx->restore.pp_pfns ||
- !ctx->restore.pp_mfns || !ctx->restore.guest_data )
+ !ctx->restore.pp_mfns || !ctx->restore.guest_data ||
+ !ctx->restore.iov || !ctx->restore.pages )
{
ERROR("Unable to allocate memory");
rc = -1;
@@ -827,6 +970,8 @@ static void cleanup(struct xc_sr_context
free(ctx->restore.buffered_records);
free(ctx->restore.populated_pfns);
+ free(ctx->restore.pages);
+ free(ctx->restore.iov);
free(ctx->restore.guest_data);
free(ctx->restore.pp_mfns);
free(ctx->restore.pp_pfns);


@@ -1,701 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Mon, 7 Aug 2017 12:58:02 +0000
Subject: libxc sr restore hvm legacy superpage
tools: use superpages during restore of HVM guest
bsc#1035231 - migration of HVM domU does not use superpages on destination dom0
bsc#1055695 - XEN: 11SP4 and 12SP3 HVM guests can not be restored
During creation of an HVM domU, meminit_hvm() tries to map superpages.
After save/restore or migration this mapping is lost; everything is
allocated in single pages. This causes a performance degradation after
migration.
Add the necessary code to preallocate a superpage for an incoming chunk of
pfns. In case a pfn was not populated on the sending side, it must be
freed on the receiving side to avoid over-allocation.
The existing code for x86_pv is moved unmodified into its own file.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_dom_x86.c | 5 -
tools/libs/guest/xg_private.h | 5 +
tools/libs/guest/xg_sr_common.h | 28 +-
tools/libs/guest/xg_sr_restore.c | 60 +---
tools/libs/guest/xg_sr_restore_x86_hvm.c | 381 ++++++++++++++++++++++-
tools/libs/guest/xg_sr_restore_x86_pv.c | 61 +++-
6 files changed, 467 insertions(+), 73 deletions(-)
--- a/tools/libs/guest/xg_dom_x86.c
+++ b/tools/libs/guest/xg_dom_x86.c
@@ -44,11 +44,6 @@
#define SUPERPAGE_BATCH_SIZE 512
-#define SUPERPAGE_2MB_SHIFT 9
-#define SUPERPAGE_2MB_NR_PFNS (1UL << SUPERPAGE_2MB_SHIFT)
-#define SUPERPAGE_1GB_SHIFT 18
-#define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
-
#define X86_CR0_PE 0x01
#define X86_CR0_ET 0x10
--- a/tools/libs/guest/xg_private.h
+++ b/tools/libs/guest/xg_private.h
@@ -180,4 +180,9 @@ struct xc_cpu_policy {
};
#endif /* x86 */
+#define SUPERPAGE_2MB_SHIFT 9
+#define SUPERPAGE_2MB_NR_PFNS (1UL << SUPERPAGE_2MB_SHIFT)
+#define SUPERPAGE_1GB_SHIFT 18
+#define SUPERPAGE_1GB_NR_PFNS (1UL << SUPERPAGE_1GB_SHIFT)
+
#endif /* XG_PRIVATE_H */
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -208,6 +208,16 @@ struct xc_sr_restore_ops
int (*setup)(struct xc_sr_context *ctx);
/**
+ * Populate PFNs
+ *
+ * Given a set of pfns, obtain memory from Xen to fill the physmap for the
+ * unpopulated subset.
+ */
+ int (*populate_pfns)(struct xc_sr_context *ctx, unsigned count,
+ const xen_pfn_t *original_pfns, const uint32_t *types);
+
+
+ /**
* Process an individual record from the stream. The caller shall take
* care of processing common records (e.g. END, PAGE_DATA).
*
@@ -338,6 +348,8 @@ struct xc_sr_context
int send_back_fd;
unsigned long p2m_size;
+ unsigned long max_pages;
+ unsigned long tot_pages;
xc_hypercall_buffer_t dirty_bitmap_hbuf;
/* From Image Header. */
@@ -471,6 +483,14 @@ struct xc_sr_context
{
/* HVM context blob. */
struct xc_sr_blob context;
+
+ /* Bitmap of currently allocated PFNs during restore. */
+ struct sr_bitmap attempted_1g;
+ struct sr_bitmap attempted_2m;
+ struct sr_bitmap allocated_pfns;
+ xen_pfn_t prev_populated_pfn;
+ xen_pfn_t iteration_tracker_pfn;
+ unsigned long iteration;
} restore;
};
} hvm;
@@ -535,14 +555,6 @@ int read_record_header(struct xc_sr_cont
int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr,
struct xc_sr_record *rec);
-/*
- * This would ideally be private in restore.c, but is needed by
- * x86_pv_localise_page() if we receive pagetables frames ahead of the
- * contents of the frames they point at.
- */
-int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
- const xen_pfn_t *original_pfns, const uint32_t *types);
-
/* Handle a STATIC_DATA_END record. */
int handle_static_data_end(struct xc_sr_context *ctx);
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -71,60 +71,6 @@ static int read_headers(struct xc_sr_con
return 0;
}
-/*
- * Given a set of pfns, obtain memory from Xen to fill the physmap for the
- * unpopulated subset. If types is NULL, no page type checking is performed
- * and all unpopulated pfns are populated.
- */
-int populate_pfns(struct xc_sr_context *ctx, unsigned int count,
- const xen_pfn_t *original_pfns, const uint32_t *types)
-{
- xc_interface *xch = ctx->xch;
- unsigned int i, nr_pfns = 0;
- int rc = -1;
-
- for ( i = 0; i < count; ++i )
- {
- if ( (!types || page_type_to_populate(types[i])) &&
- !pfn_is_populated(ctx, original_pfns[i]) )
- {
- rc = pfn_set_populated(ctx, original_pfns[i]);
- if ( rc )
- goto err;
- ctx->restore.pp_pfns[nr_pfns] = ctx->restore.pp_mfns[nr_pfns] = original_pfns[i];
- ++nr_pfns;
- }
- }
-
- if ( nr_pfns )
- {
- rc = xc_domain_populate_physmap_exact(
- xch, ctx->domid, nr_pfns, 0, 0, ctx->restore.pp_mfns);
- if ( rc )
- {
- PERROR("Failed to populate physmap");
- goto err;
- }
-
- for ( i = 0; i < nr_pfns; ++i )
- {
- if ( ctx->restore.pp_mfns[i] == INVALID_MFN )
- {
- ERROR("Populate physmap failed for pfn %u", i);
- rc = -1;
- goto err;
- }
-
- ctx->restore.ops.set_gfn(ctx, ctx->restore.pp_pfns[i], ctx->restore.pp_mfns[i]);
- }
- }
-
- rc = 0;
-
- err:
- return rc;
-}
-
static int handle_static_data_end_v2(struct xc_sr_context *ctx)
{
int rc = 0;
@@ -259,7 +205,8 @@ static int map_guest_pages(struct xc_sr_
uint32_t i, p;
int rc;
- rc = populate_pfns(ctx, pages->count, ctx->restore.pfns, ctx->restore.types);
+ rc = ctx->restore.ops.populate_pfns(ctx, pages->count, ctx->restore.pfns,
+ ctx->restore.types);
if ( rc )
{
ERROR("Failed to populate pfns for batch of %u pages", pages->count);
@@ -1074,6 +1021,9 @@ int xc_domain_restore(xc_interface *xch,
return -1;
}
+ /* See xc_domain_getinfo */
+ ctx.restore.max_pages = ctx.dominfo.max_pages;
+ ctx.restore.tot_pages = ctx.dominfo.tot_pages;
ctx.restore.p2m_size = nr_pfns;
ctx.restore.ops = hvm ? restore_ops_x86_hvm : restore_ops_x86_pv;
--- a/tools/libs/guest/xg_sr_restore_x86_hvm.c
+++ b/tools/libs/guest/xg_sr_restore_x86_hvm.c
@@ -130,6 +130,33 @@ static int x86_hvm_localise_page(struct
return 0;
}
+static bool x86_hvm_expand_sp_bitmaps(struct xc_sr_context *ctx, unsigned long max_pfn)
+{
+ struct sr_bitmap *bm;
+
+ bm = &ctx->x86.hvm.restore.attempted_1g;
+ if ( !sr_bitmap_expand(bm, max_pfn >> SUPERPAGE_1GB_SHIFT) )
+ return false;
+
+ bm = &ctx->x86.hvm.restore.attempted_2m;
+ if ( !sr_bitmap_expand(bm, max_pfn >> SUPERPAGE_2MB_SHIFT) )
+ return false;
+
+ bm = &ctx->x86.hvm.restore.allocated_pfns;
+ if ( !sr_bitmap_expand(bm, max_pfn) )
+ return false;
+
+ return true;
+}
+
+static void x86_hvm_no_superpage(struct xc_sr_context *ctx, unsigned long addr)
+{
+ unsigned long pfn = addr >> XC_PAGE_SHIFT;
+
+ sr_set_bit(pfn >> SUPERPAGE_1GB_SHIFT, &ctx->x86.hvm.restore.attempted_1g);
+ sr_set_bit(pfn >> SUPERPAGE_2MB_SHIFT, &ctx->x86.hvm.restore.attempted_2m);
+}
+
/*
* restore_ops function. Confirms the stream matches the domain.
*/
@@ -164,12 +191,24 @@ static int x86_hvm_setup(struct xc_sr_co
max_pfn = max(ctx->restore.p2m_size, max_pages);
if ( !sr_bitmap_expand(&ctx->restore.populated_pfns, max_pfn) )
- {
- PERROR("Unable to allocate memory for populated_pfns bitmap");
- return -1;
- }
+ goto out;
+
+ if ( !x86_hvm_expand_sp_bitmaps(ctx, max_pfn) )
+ goto out;
+
+ /* FIXME: distinguish between PVH and HVM */
+ /* No superpage in 1st 2MB due to VGA hole */
+ x86_hvm_no_superpage(ctx, 0xA0000u);
+#define LAPIC_BASE_ADDRESS 0xfee00000u
+#define ACPI_INFO_PHYSICAL_ADDRESS 0xfc000000u
+ x86_hvm_no_superpage(ctx, LAPIC_BASE_ADDRESS);
+ x86_hvm_no_superpage(ctx, ACPI_INFO_PHYSICAL_ADDRESS);
return 0;
+
+out:
+ PERROR("Unable to allocate memory for pfn bitmaps");
+ return -1;
}
/*
@@ -250,6 +289,9 @@ static int x86_hvm_stream_complete(struc
static int x86_hvm_cleanup(struct xc_sr_context *ctx)
{
sr_bitmap_free(&ctx->restore.populated_pfns);
+ sr_bitmap_free(&ctx->x86.hvm.restore.attempted_1g);
+ sr_bitmap_free(&ctx->x86.hvm.restore.attempted_2m);
+ sr_bitmap_free(&ctx->x86.hvm.restore.allocated_pfns);
free(ctx->x86.hvm.restore.context.ptr);
free(ctx->x86.restore.cpuid.ptr);
@@ -258,6 +300,336 @@ static int x86_hvm_cleanup(struct xc_sr_
return 0;
}
+/*
+ * Set a range of pfns as allocated
+ */
+static void pfn_set_long_allocated(struct xc_sr_context *ctx, xen_pfn_t base_pfn)
+{
+ sr_set_long_bit(base_pfn, &ctx->x86.hvm.restore.allocated_pfns);
+}
+
+static void pfn_set_allocated(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+ sr_set_bit(pfn, &ctx->x86.hvm.restore.allocated_pfns);
+}
+
+struct x86_hvm_sp {
+ xen_pfn_t pfn;
+ xen_pfn_t base_pfn;
+ unsigned long index;
+ unsigned long count;
+};
+
+/*
+ * Try to allocate a 1GB page for this pfn, but avoid over-allocation.
+ * If this succeeds, mark the range of 2MB pages as busy.
+ */
+static bool x86_hvm_alloc_1g(struct xc_sr_context *ctx, struct x86_hvm_sp *sp)
+{
+ xc_interface *xch = ctx->xch;
+ unsigned int order;
+ int i, done;
+ xen_pfn_t extent;
+
+ /* Only one attempt to avoid overlapping allocation */
+ if ( sr_test_and_set_bit(sp->index, &ctx->x86.hvm.restore.attempted_1g) )
+ return false;
+
+ order = SUPERPAGE_1GB_SHIFT;
+ sp->count = SUPERPAGE_1GB_NR_PFNS;
+
+ /* Allocate only if there is room for another superpage */
+ if ( ctx->restore.tot_pages + sp->count > ctx->restore.max_pages )
+ return false;
+
+ extent = sp->base_pfn = (sp->pfn >> order) << order;
+ done = xc_domain_populate_physmap(xch, ctx->domid, 1, order, 0, &extent);
+ if ( done < 0 ) {
+ PERROR("populate_physmap failed.");
+ return false;
+ }
+ if ( done == 0 )
+ return false;
+
+ DPRINTF("1G %" PRI_xen_pfn "\n", sp->base_pfn);
+
+ /* Mark all 2MB pages as done to avoid overlapping allocation */
+ for ( i = 0; i < (SUPERPAGE_1GB_NR_PFNS/SUPERPAGE_2MB_NR_PFNS); i++ )
+ sr_set_bit((sp->base_pfn >> SUPERPAGE_2MB_SHIFT) + i, &ctx->x86.hvm.restore.attempted_2m);
+
+ return true;
+}
+
+/* Allocate a 2MB page if x86_hvm_alloc_1g failed, avoid over-allocation. */
+static bool x86_hvm_alloc_2m(struct xc_sr_context *ctx, struct x86_hvm_sp *sp)
+{
+ xc_interface *xch = ctx->xch;
+ unsigned int order;
+ int done;
+ xen_pfn_t extent;
+
+ /* Only one attempt to avoid overlapping allocation */
+ if ( sr_test_and_set_bit(sp->index, &ctx->x86.hvm.restore.attempted_2m) )
+ return false;
+
+ order = SUPERPAGE_2MB_SHIFT;
+ sp->count = SUPERPAGE_2MB_NR_PFNS;
+
+ /* Allocate only if there is room for another superpage */
+ if ( ctx->restore.tot_pages + sp->count > ctx->restore.max_pages )
+ return false;
+
+ extent = sp->base_pfn = (sp->pfn >> order) << order;
+ done = xc_domain_populate_physmap(xch, ctx->domid, 1, order, 0, &extent);
+ if ( done < 0 ) {
+ PERROR("populate_physmap failed.");
+ return false;
+ }
+ if ( done == 0 )
+ return false;
+
+ DPRINTF("2M %" PRI_xen_pfn "\n", sp->base_pfn);
+ return true;
+}
+
+/* Allocate a single page if x86_hvm_alloc_2m failed. */
+static bool x86_hvm_alloc_4k(struct xc_sr_context *ctx, struct x86_hvm_sp *sp)
+{
+ xc_interface *xch = ctx->xch;
+ unsigned int order;
+ int done;
+ xen_pfn_t extent;
+
+ order = 0;
+ sp->count = 1UL;
+
+ /* Allocate only if there is room for another page */
+ if ( ctx->restore.tot_pages + sp->count > ctx->restore.max_pages ) {
+ errno = E2BIG;
+ return false;
+ }
+
+ extent = sp->base_pfn = (sp->pfn >> order) << order;
+ done = xc_domain_populate_physmap(xch, ctx->domid, 1, order, 0, &extent);
+ if ( done < 0 ) {
+ PERROR("populate_physmap failed.");
+ return false;
+ }
+ if ( done == 0 ) {
+ errno = ENOMEM;
+ return false;
+ }
+
+ DPRINTF("4K %" PRI_xen_pfn "\n", sp->base_pfn);
+ return true;
+}
+/*
+ * Attempt to allocate a superpage where the pfn resides.
+ */
+static int x86_hvm_allocate_pfn(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+ bool success;
+ unsigned long idx_1g, idx_2m;
+ struct x86_hvm_sp sp = {
+ .pfn = pfn
+ };
+
+ if ( sr_test_bit(pfn, &ctx->x86.hvm.restore.allocated_pfns) )
+ return 0;
+
+ idx_1g = pfn >> SUPERPAGE_1GB_SHIFT;
+ idx_2m = pfn >> SUPERPAGE_2MB_SHIFT;
+
+ sp.index = idx_1g;
+ success = x86_hvm_alloc_1g(ctx, &sp);
+
+ if ( success == false ) {
+ sp.index = idx_2m;
+ success = x86_hvm_alloc_2m(ctx, &sp);
+ }
+
+ if ( success == false ) {
+ sp.index = 0;
+ success = x86_hvm_alloc_4k(ctx, &sp);
+ }
+
+ if ( success == false )
+ return -1;
+
+ do {
+ if ( sp.count >= BITS_PER_LONG && (sp.count % BITS_PER_LONG) == 0 ) {
+ sp.count -= BITS_PER_LONG;
+ ctx->restore.tot_pages += BITS_PER_LONG;
+ pfn_set_long_allocated(ctx, sp.base_pfn + sp.count);
+ } else {
+ sp.count--;
+ ctx->restore.tot_pages++;
+ pfn_set_allocated(ctx, sp.base_pfn + sp.count);
+ }
+ } while ( sp.count );
+
+ return 0;
+}
+
+/*
+ * Deallocate memory.
+ * There was likely an optimistic superpage allocation.
+ * This means more pages may have been allocated past gap_end.
+ * This range is not freed now. Incoming higher pfns will release it.
+ */
+static int x86_hvm_punch_hole(struct xc_sr_context *ctx,
+ xen_pfn_t gap_start, xen_pfn_t gap_end)
+{
+ xc_interface *xch = ctx->xch;
+ xen_pfn_t _pfn, pfn;
+ uint32_t domid, freed = 0;
+ int rc;
+
+ pfn = gap_start >> SUPERPAGE_1GB_SHIFT;
+ do
+ {
+ sr_set_bit(pfn, &ctx->x86.hvm.restore.attempted_1g);
+ } while (++pfn <= gap_end >> SUPERPAGE_1GB_SHIFT);
+
+ pfn = gap_start >> SUPERPAGE_2MB_SHIFT;
+ do
+ {
+ sr_set_bit(pfn, &ctx->x86.hvm.restore.attempted_2m);
+ } while (++pfn <= gap_end >> SUPERPAGE_2MB_SHIFT);
+
+ pfn = gap_start;
+
+ while ( pfn <= gap_end )
+ {
+ if ( sr_test_and_clear_bit(pfn, &ctx->x86.hvm.restore.allocated_pfns) )
+ {
+ domid = ctx->domid;
+ _pfn = pfn;
+ rc = xc_domain_decrease_reservation_exact(xch, domid, 1, 0, &_pfn);
+ if ( rc )
+ {
+ PERROR("Failed to release pfn %" PRI_xen_pfn, pfn);
+ return -1;
+ }
+ ctx->restore.tot_pages--;
+ freed++;
+ }
+ pfn++;
+ }
+ if ( freed )
+ DPRINTF("freed %u between %" PRI_xen_pfn " %" PRI_xen_pfn "\n",
+ freed, gap_start, gap_end);
+ return 0;
+}
+
+static int x86_hvm_unpopulate_page(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+ sr_clear_bit(pfn, &ctx->restore.populated_pfns);
+ return x86_hvm_punch_hole(ctx, pfn, pfn);
+}
+
+static int x86_hvm_populate_page(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+ xen_pfn_t gap_start, gap_end;
+ bool has_gap, first_iteration;
+ int rc;
+
+ /*
+ * Check for a gap between the previous populated pfn and this pfn.
+ * In case a gap exists, it is required to punch a hole to release memory,
+ * starting after the previous pfn and before this pfn.
+ *
+ * But: this can be done only during the first iteration, which is the
+ * only place where superpage allocations are attempted. All following
+ * iterations lack the info to properly maintain prev_populated_pfn.
+ */
+ has_gap = ctx->x86.hvm.restore.prev_populated_pfn + 1 < pfn;
+ first_iteration = ctx->x86.hvm.restore.iteration == 0;
+ if ( has_gap && first_iteration )
+ {
+ gap_start = ctx->x86.hvm.restore.prev_populated_pfn + 1;
+ gap_end = pfn - 1;
+
+ rc = x86_hvm_punch_hole(ctx, gap_start, gap_end);
+ if ( rc )
+ goto err;
+ }
+
+ rc = x86_hvm_allocate_pfn(ctx, pfn);
+ if ( rc )
+ goto err;
+ pfn_set_populated(ctx, pfn);
+ ctx->x86.hvm.restore.prev_populated_pfn = pfn;
+
+ rc = 0;
+err:
+ return rc;
+}
+
+/*
+ * Try to allocate superpages.
+ * This works without memory map because the pfns arrive in incremental order.
+ * All pfn numbers and their type are submitted.
+ * Only pfns with data will also have their page content transmitted.
+ */
+static int x86_hvm_populate_pfns(struct xc_sr_context *ctx, unsigned count,
+ const xen_pfn_t *original_pfns,
+ const uint32_t *types)
+{
+ xc_interface *xch = ctx->xch;
+ xen_pfn_t pfn, min_pfn, max_pfn;
+ bool to_populate, populated;
+ unsigned i;
+ int rc = 0;
+
+ min_pfn = count ? original_pfns[0] : 0;
+ max_pfn = count ? original_pfns[count - 1] : 0;
+ DPRINTF("batch of %u pfns between %" PRI_xen_pfn " %" PRI_xen_pfn "\n",
+ count, min_pfn, max_pfn);
+
+ if ( !x86_hvm_expand_sp_bitmaps(ctx, max_pfn) )
+ {
+ ERROR("Unable to allocate memory for pfn bitmaps");
+ return -1;
+ }
+
+ /*
+ * There is no indicator for a new iteration.
+ * Simulate it by checking if a lower pfn is coming in.
+ * In the end, all that matters is whether this iteration is the first one.
+ */
+ if ( min_pfn < ctx->x86.hvm.restore.iteration_tracker_pfn )
+ ctx->x86.hvm.restore.iteration++;
+ ctx->x86.hvm.restore.iteration_tracker_pfn = min_pfn;
+
+ for ( i = 0; i < count; ++i )
+ {
+ pfn = original_pfns[i];
+
+ to_populate = page_type_to_populate(types[i]);
+ populated = pfn_is_populated(ctx, pfn);
+
+ /*
+ * page has data, pfn populated: nothing to do
+ * page has data, pfn not populated: likely never seen before
+ * page has no data, pfn populated: likely ballooned out during migration
+ * page has no data, pfn not populated: nothing to do
+ */
+ if ( to_populate && !populated )
+ {
+ rc = x86_hvm_populate_page(ctx, pfn);
+ } else if ( !to_populate && populated )
+ {
+ rc = x86_hvm_unpopulate_page(ctx, pfn);
+ }
+ if ( rc )
+ break;
+ }
+
+ return rc;
+}
+
+
struct xc_sr_restore_ops restore_ops_x86_hvm =
{
.pfn_is_valid = x86_hvm_pfn_is_valid,
@@ -266,6 +638,7 @@ struct xc_sr_restore_ops restore_ops_x86
.set_page_type = x86_hvm_set_page_type,
.localise_page = x86_hvm_localise_page,
.setup = x86_hvm_setup,
+ .populate_pfns = x86_hvm_populate_pfns,
.process_record = x86_hvm_process_record,
.static_data_complete = x86_static_data_complete,
.stream_complete = x86_hvm_stream_complete,
--- a/tools/libs/guest/xg_sr_restore_x86_pv.c
+++ b/tools/libs/guest/xg_sr_restore_x86_pv.c
@@ -960,6 +960,64 @@ static void x86_pv_set_gfn(struct xc_sr_
}
/*
+ * Given a set of pfns, obtain memory from Xen to fill the physmap for the
+ * unpopulated subset. If types is NULL, no page type checking is performed
+ * and all unpopulated pfns are populated.
+ */
+static int x86_pv_populate_pfns(struct xc_sr_context *ctx, unsigned count,
+ const xen_pfn_t *original_pfns,
+ const uint32_t *types)
+{
+ xc_interface *xch = ctx->xch;
+ xen_pfn_t *mfns = ctx->restore.pp_mfns,
+ *pfns = ctx->restore.pp_pfns;
+ unsigned int i, nr_pfns = 0;
+ int rc = -1;
+
+ for ( i = 0; i < count; ++i )
+ {
+ if ( (!types ||
+ page_type_has_stream_data(types[i])) &&
+ !pfn_is_populated(ctx, original_pfns[i]) )
+ {
+ rc = pfn_set_populated(ctx, original_pfns[i]);
+ if ( rc )
+ goto err;
+ pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
+ ++nr_pfns;
+ }
+ }
+
+ if ( nr_pfns )
+ {
+ rc = xc_domain_populate_physmap_exact(
+ xch, ctx->domid, nr_pfns, 0, 0, mfns);
+ if ( rc )
+ {
+ PERROR("Failed to populate physmap");
+ goto err;
+ }
+
+ for ( i = 0; i < nr_pfns; ++i )
+ {
+ if ( mfns[i] == INVALID_MFN )
+ {
+ ERROR("Populate physmap failed for pfn %u", i);
+ rc = -1;
+ goto err;
+ }
+
+ ctx->restore.ops.set_gfn(ctx, pfns[i], mfns[i]);
+ }
+ }
+
+ rc = 0;
+
+ err:
+ return rc;
+}
+
+/*
* restore_ops function. Convert pfns back to mfns in pagetables. Possibly
* needs to populate new frames if a PTE is found referring to a frame which
* hasn't yet been seen from PAGE_DATA records.
@@ -1003,7 +1061,7 @@ static int x86_pv_localise_page(struct x
}
}
- if ( to_populate && populate_pfns(ctx, to_populate, pfns, NULL) )
+ if ( to_populate && x86_pv_populate_pfns(ctx, to_populate, pfns, NULL) )
return -1;
for ( i = 0; i < (PAGE_SIZE / sizeof(uint64_t)); ++i )
@@ -1200,6 +1258,7 @@ struct xc_sr_restore_ops restore_ops_x86
.set_gfn = x86_pv_set_gfn,
.localise_page = x86_pv_localise_page,
.setup = x86_pv_setup,
+ .populate_pfns = x86_pv_populate_pfns,
.process_record = x86_pv_process_record,
.static_data_complete = x86_static_data_complete,
.stream_complete = x86_pv_stream_complete,
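
To summarise the deallocation side seen above: x86_hvm_punch_hole() walks a
gap between populated pfns, releases only the pages this restore actually
allocated, and adjusts the accounting. A condensed sketch using the names
from the patch; the marking of the covering 1GB/2MB ranges as attempted
(which prevents re-allocating into the hole) and the freed-page statistics
are trimmed:

static int punch_hole_sketch(struct xc_sr_context *ctx,
                             xen_pfn_t gap_start, xen_pfn_t gap_end)
{
    xen_pfn_t pfn;

    for ( pfn = gap_start; pfn <= gap_end; pfn++ )
    {
        /* Only pfns allocated by this restore may be handed back. */
        if ( !sr_test_and_clear_bit(pfn, &ctx->x86.hvm.restore.allocated_pfns) )
            continue;

        xen_pfn_t one = pfn;
        if ( xc_domain_decrease_reservation_exact(ctx->xch, ctx->domid,
                                                  1, 0, &one) )
            return -1;

        ctx->restore.tot_pages--;
    }

    return 0;
}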

View File

@@ -1,101 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 14:44:09 +0200
Subject: libxc sr restore map_errs
tools: restore: preallocate map_errs array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in an incoming batch.
Allocate the space once.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_restore.c | 22 +++++++---------------
2 files changed, 8 insertions(+), 15 deletions(-)
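
This patch and its siblings below all apply the same shape: size the buffer
for the worst case, allocate it once in setup(), reuse it for every batch,
and free it once in cleanup(). A minimal sketch of the pattern, with an
illustrative context struct:

#include <stdlib.h>

struct bufs {
    int *map_errs;   /* one slot per page of a batch */
};

static int setup(struct bufs *b, size_t max_batch)
{
    /* Allocated once up front instead of per batch in the hot loop. */
    b->map_errs = malloc(max_batch * sizeof(*b->map_errs));
    return b->map_errs ? 0 : -1;
}

static void cleanup(struct bufs *b)
{
    free(b->map_errs);
    b->map_errs = NULL;
}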
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -259,6 +259,7 @@ struct xc_sr_context
xen_pfn_t *pfns;
uint32_t *types;
xen_pfn_t *mfns;
+ int *map_errs;
int send_back_fd;
unsigned long p2m_size;
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -204,21 +204,12 @@ static int process_page_data(struct xc_s
xen_pfn_t *pfns, uint32_t *types, void *page_data)
{
xc_interface *xch = ctx->xch;
- int *map_errs = malloc(count * sizeof(*map_errs));
int rc;
void *mapping = NULL, *guest_page = NULL;
unsigned int i, /* i indexes the pfns from the record. */
j, /* j indexes the subset of pfns we decide to map. */
nr_pages = 0;
- if ( !map_errs )
- {
- rc = -1;
- ERROR("Failed to allocate %zu bytes to process page data",
- count * sizeof(*map_errs));
- goto err;
- }
-
rc = populate_pfns(ctx, count, pfns, types);
if ( rc )
{
@@ -240,7 +231,7 @@ static int process_page_data(struct xc_s
mapping = guest_page = xenforeignmemory_map(
xch->fmem, ctx->domid, PROT_READ | PROT_WRITE,
- nr_pages, ctx->restore.mfns, map_errs);
+ nr_pages, ctx->restore.mfns, ctx->restore.map_errs);
if ( !mapping )
{
rc = -1;
@@ -254,11 +245,11 @@ static int process_page_data(struct xc_s
if ( !page_type_has_stream_data(types[i]) )
continue;
- if ( map_errs[j] )
+ if ( ctx->restore.map_errs[j] )
{
rc = -1;
ERROR("Mapping pfn %#"PRIpfn" (mfn %#"PRIpfn", type %#"PRIx32") failed with %d",
- pfns[i], ctx->restore.mfns[j], types[i], map_errs[j]);
+ pfns[i], ctx->restore.mfns[j], types[i], ctx->restore.map_errs[j]);
goto err;
}
@@ -296,8 +287,6 @@ static int process_page_data(struct xc_s
if ( mapping )
xenforeignmemory_unmap(xch->fmem, mapping, nr_pages);
- free(map_errs);
-
return rc;
}
@@ -704,7 +693,9 @@ static int setup(struct xc_sr_context *c
ctx->restore.pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pfns));
ctx->restore.types = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.types));
ctx->restore.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.mfns));
- if ( !ctx->restore.pfns || !ctx->restore.types || !ctx->restore.mfns )
+ ctx->restore.map_errs = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.map_errs));
+ if ( !ctx->restore.pfns || !ctx->restore.types || !ctx->restore.mfns ||
+ !ctx->restore.map_errs )
{
ERROR("Unable to allocate memory");
rc = -1;
@@ -741,6 +732,7 @@ static void cleanup(struct xc_sr_context
free(ctx->restore.buffered_records);
free(ctx->restore.populated_pfns);
+ free(ctx->restore.map_errs);
free(ctx->restore.mfns);
free(ctx->restore.types);
free(ctx->restore.pfns);

View File

@@ -1,103 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 14:42:19 +0200
Subject: libxc sr restore mfns
tools: restore: preallocate mfns array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in an incoming batch.
Allocate the space once.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_restore.c | 16 ++++++++--------
2 files changed, 9 insertions(+), 8 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -258,6 +258,7 @@ struct xc_sr_context
struct restore_callbacks *callbacks;
xen_pfn_t *pfns;
uint32_t *types;
+ xen_pfn_t *mfns;
int send_back_fd;
unsigned long p2m_size;
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -204,7 +204,6 @@ static int process_page_data(struct xc_s
xen_pfn_t *pfns, uint32_t *types, void *page_data)
{
xc_interface *xch = ctx->xch;
- xen_pfn_t *mfns = malloc(count * sizeof(*mfns));
int *map_errs = malloc(count * sizeof(*map_errs));
int rc;
void *mapping = NULL, *guest_page = NULL;
@@ -212,11 +211,11 @@ static int process_page_data(struct xc_s
j, /* j indexes the subset of pfns we decide to map. */
nr_pages = 0;
- if ( !mfns || !map_errs )
+ if ( !map_errs )
{
rc = -1;
ERROR("Failed to allocate %zu bytes to process page data",
- count * (sizeof(*mfns) + sizeof(*map_errs)));
+ count * sizeof(*map_errs));
goto err;
}
@@ -232,7 +231,7 @@ static int process_page_data(struct xc_s
ctx->restore.ops.set_page_type(ctx, pfns[i], types[i]);
if ( page_type_has_stream_data(types[i]) )
- mfns[nr_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, pfns[i]);
+ ctx->restore.mfns[nr_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, pfns[i]);
}
/* Nothing to do? */
@@ -241,7 +240,7 @@ static int process_page_data(struct xc_s
mapping = guest_page = xenforeignmemory_map(
xch->fmem, ctx->domid, PROT_READ | PROT_WRITE,
- nr_pages, mfns, map_errs);
+ nr_pages, ctx->restore.mfns, map_errs);
if ( !mapping )
{
rc = -1;
@@ -259,7 +258,7 @@ static int process_page_data(struct xc_s
{
rc = -1;
ERROR("Mapping pfn %#"PRIpfn" (mfn %#"PRIpfn", type %#"PRIx32") failed with %d",
- pfns[i], mfns[j], types[i], map_errs[j]);
+ pfns[i], ctx->restore.mfns[j], types[i], map_errs[j]);
goto err;
}
@@ -298,7 +297,6 @@ static int process_page_data(struct xc_s
xenforeignmemory_unmap(xch->fmem, mapping, nr_pages);
free(map_errs);
- free(mfns);
return rc;
}
@@ -705,7 +703,8 @@ static int setup(struct xc_sr_context *c
ctx->restore.pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pfns));
ctx->restore.types = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.types));
- if ( !ctx->restore.pfns || !ctx->restore.types )
+ ctx->restore.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.mfns));
+ if ( !ctx->restore.pfns || !ctx->restore.types || !ctx->restore.mfns )
{
ERROR("Unable to allocate memory");
rc = -1;
@@ -742,6 +741,7 @@ static void cleanup(struct xc_sr_context
free(ctx->restore.buffered_records);
free(ctx->restore.populated_pfns);
+ free(ctx->restore.mfns);
free(ctx->restore.types);
free(ctx->restore.pfns);

View File

@@ -1,108 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 14:39:30 +0200
Subject: libxc sr restore pfns
tools: restore: preallocate pfns array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in an incoming batch.
Allocate the space once.
Adjust the verification for page count. It must be at least one page,
but not more than MAX_BATCH_SIZE.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_restore.c | 23 +++++++++++++++--------
2 files changed, 16 insertions(+), 8 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -256,6 +256,7 @@ struct xc_sr_context
{
struct xc_sr_restore_ops ops;
struct restore_callbacks *callbacks;
+ xen_pfn_t *pfns;
int send_back_fd;
unsigned long p2m_size;
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -314,7 +314,7 @@ static int handle_page_data(struct xc_sr
unsigned int i, pages_of_data = 0;
int rc = -1;
- xen_pfn_t *pfns = NULL, pfn;
+ xen_pfn_t pfn;
uint32_t *types = NULL, type;
/*
@@ -349,9 +349,9 @@ static int handle_page_data(struct xc_sr
goto err;
}
- if ( pages->count < 1 )
+ if ( !pages->count || pages->count > MAX_BATCH_SIZE )
{
- ERROR("Expected at least 1 pfn in PAGE_DATA record");
+ ERROR("Unexpected pfn count %u in PAGE_DATA record", pages->count);
goto err;
}
@@ -362,9 +362,8 @@ static int handle_page_data(struct xc_sr
goto err;
}
- pfns = malloc(pages->count * sizeof(*pfns));
types = malloc(pages->count * sizeof(*types));
- if ( !pfns || !types )
+ if ( !types )
{
ERROR("Unable to allocate enough memory for %u pfns",
pages->count);
@@ -393,7 +392,7 @@ static int handle_page_data(struct xc_sr
* have a page worth of data in the record. */
pages_of_data++;
- pfns[i] = pfn;
+ ctx->restore.pfns[i] = pfn;
types[i] = type;
}
@@ -407,11 +406,10 @@ static int handle_page_data(struct xc_sr
goto err;
}
- rc = process_page_data(ctx, pages->count, pfns, types,
+ rc = process_page_data(ctx, pages->count, ctx->restore.pfns, types,
&pages->pfn[pages->count]);
err:
free(types);
- free(pfns);
return rc;
}
@@ -715,6 +713,14 @@ static int setup(struct xc_sr_context *c
goto err;
}
+ ctx->restore.pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pfns));
+ if ( !ctx->restore.pfns )
+ {
+ ERROR("Unable to allocate memory");
+ rc = -1;
+ goto err;
+ }
+
ctx->restore.buffered_records = malloc(
DEFAULT_BUF_RECORDS * sizeof(struct xc_sr_record));
if ( !ctx->restore.buffered_records )
@@ -745,6 +751,7 @@ static void cleanup(struct xc_sr_context
free(ctx->restore.buffered_records);
free(ctx->restore.populated_pfns);
+ free(ctx->restore.pfns);
if ( ctx->restore.ops.cleanup(ctx) )
PERROR("Failed to clean up");

View File

@@ -1,111 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 14:54:12 +0200
Subject: libxc sr restore populate_pfns mfns
tools: restore: preallocate populate_pfns mfns array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in an incoming batch.
Allocate the space once.
Use a prefix to avoid a conflict with an array used in handle_page_data.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_restore.c | 23 ++++++++---------------
2 files changed, 9 insertions(+), 15 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -261,6 +261,7 @@ struct xc_sr_context
xen_pfn_t *mfns;
int *map_errs;
xen_pfn_t *pp_pfns;
+ xen_pfn_t *pp_mfns;
int send_back_fd;
unsigned long p2m_size;
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -138,17 +138,9 @@ int populate_pfns(struct xc_sr_context *
const xen_pfn_t *original_pfns, const uint32_t *types)
{
xc_interface *xch = ctx->xch;
- xen_pfn_t *mfns = malloc(count * sizeof(*mfns));
unsigned int i, nr_pfns = 0;
int rc = -1;
- if ( !mfns )
- {
- ERROR("Failed to allocate %zu bytes for populating the physmap",
- 2 * count * sizeof(*mfns));
- goto err;
- }
-
for ( i = 0; i < count; ++i )
{
if ( (!types || page_type_to_populate(types[i])) &&
@@ -157,7 +149,7 @@ int populate_pfns(struct xc_sr_context *
rc = pfn_set_populated(ctx, original_pfns[i]);
if ( rc )
goto err;
- ctx->restore.pp_pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
+ ctx->restore.pp_pfns[nr_pfns] = ctx->restore.pp_mfns[nr_pfns] = original_pfns[i];
++nr_pfns;
}
}
@@ -165,7 +157,7 @@ int populate_pfns(struct xc_sr_context *
if ( nr_pfns )
{
rc = xc_domain_populate_physmap_exact(
- xch, ctx->domid, nr_pfns, 0, 0, mfns);
+ xch, ctx->domid, nr_pfns, 0, 0, ctx->restore.pp_mfns);
if ( rc )
{
PERROR("Failed to populate physmap");
@@ -174,22 +166,20 @@ int populate_pfns(struct xc_sr_context *
for ( i = 0; i < nr_pfns; ++i )
{
- if ( mfns[i] == INVALID_MFN )
+ if ( ctx->restore.pp_mfns[i] == INVALID_MFN )
{
ERROR("Populate physmap failed for pfn %u", i);
rc = -1;
goto err;
}
- ctx->restore.ops.set_gfn(ctx, ctx->restore.pp_pfns[i], mfns[i]);
+ ctx->restore.ops.set_gfn(ctx, ctx->restore.pp_pfns[i], ctx->restore.pp_mfns[i]);
}
}
rc = 0;
err:
- free(mfns);
-
return rc;
}
@@ -693,8 +683,10 @@ static int setup(struct xc_sr_context *c
ctx->restore.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.mfns));
ctx->restore.map_errs = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.map_errs));
ctx->restore.pp_pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pp_pfns));
+ ctx->restore.pp_mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pp_mfns));
if ( !ctx->restore.pfns || !ctx->restore.types || !ctx->restore.mfns ||
- !ctx->restore.map_errs || !ctx->restore.pp_pfns )
+ !ctx->restore.map_errs || !ctx->restore.pp_pfns ||
+ !ctx->restore.pp_mfns )
{
ERROR("Unable to allocate memory");
rc = -1;
@@ -731,6 +723,7 @@ static void cleanup(struct xc_sr_context
free(ctx->restore.buffered_records);
free(ctx->restore.populated_pfns);
+ free(ctx->restore.pp_mfns);
free(ctx->restore.pp_pfns);
free(ctx->restore.map_errs);
free(ctx->restore.mfns);

View File

@@ -1,89 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 14:58:53 +0200
Subject: libxc sr restore populate_pfns pfns
tools: restore: preallocate populate_pfns pfns array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in an incoming batch.
Allocate the space once.
Use a prefix to avoid a conflict with an array used in handle_page_data.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_restore.c | 14 +++++++-------
2 files changed, 8 insertions(+), 7 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -260,6 +260,7 @@ struct xc_sr_context
uint32_t *types;
xen_pfn_t *mfns;
int *map_errs;
+ xen_pfn_t *pp_pfns;
int send_back_fd;
unsigned long p2m_size;
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -138,12 +138,11 @@ int populate_pfns(struct xc_sr_context *
const xen_pfn_t *original_pfns, const uint32_t *types)
{
xc_interface *xch = ctx->xch;
- xen_pfn_t *mfns = malloc(count * sizeof(*mfns)),
- *pfns = malloc(count * sizeof(*pfns));
+ xen_pfn_t *mfns = malloc(count * sizeof(*mfns));
unsigned int i, nr_pfns = 0;
int rc = -1;
- if ( !mfns || !pfns )
+ if ( !mfns )
{
ERROR("Failed to allocate %zu bytes for populating the physmap",
2 * count * sizeof(*mfns));
@@ -158,7 +157,7 @@ int populate_pfns(struct xc_sr_context *
rc = pfn_set_populated(ctx, original_pfns[i]);
if ( rc )
goto err;
- pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
+ ctx->restore.pp_pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
++nr_pfns;
}
}
@@ -182,14 +181,13 @@ int populate_pfns(struct xc_sr_context *
goto err;
}
- ctx->restore.ops.set_gfn(ctx, pfns[i], mfns[i]);
+ ctx->restore.ops.set_gfn(ctx, ctx->restore.pp_pfns[i], mfns[i]);
}
}
rc = 0;
err:
- free(pfns);
free(mfns);
return rc;
@@ -694,8 +692,9 @@ static int setup(struct xc_sr_context *c
ctx->restore.types = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.types));
ctx->restore.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.mfns));
ctx->restore.map_errs = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.map_errs));
+ ctx->restore.pp_pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pp_pfns));
if ( !ctx->restore.pfns || !ctx->restore.types || !ctx->restore.mfns ||
- !ctx->restore.map_errs )
+ !ctx->restore.map_errs || !ctx->restore.pp_pfns )
{
ERROR("Unable to allocate memory");
rc = -1;
@@ -732,6 +731,7 @@ static void cleanup(struct xc_sr_context
free(ctx->restore.buffered_records);
free(ctx->restore.populated_pfns);
+ free(ctx->restore.pp_pfns);
free(ctx->restore.map_errs);
free(ctx->restore.mfns);
free(ctx->restore.types);

View File

@@ -1,272 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Mon, 26 Oct 2020 12:19:17 +0100
Subject: libxc sr restore read_record
tools: restore: split record processing
handle_page_data must be able to read directly into mapped guest memory.
This will avoid unnecessary memcpy calls for data which can be consumed verbatim.
Rearrange the code to allow decisions based on the incoming record.
This change is preparation for future changes in handle_page_data;
no change in behavior is intended.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
tools/libs/guest/xg_sr_common.c | 33 ++++++++++++---------
tools/libs/guest/xg_sr_common.h | 4 ++-
tools/libs/guest/xg_sr_restore.c | 49 ++++++++++++++++++++++----------
tools/libs/guest/xg_sr_save.c | 7 ++++-
4 files changed, 63 insertions(+), 30 deletions(-)
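
The resulting call pattern: read the fixed-size record header first, decide
on a strategy from the type, then read the payload. A condensed sketch of
the dispatch using the names introduced by this series
(handle_incoming_page_data() arrives in a later patch and reads its payload
directly into mapped guest memory):

static int process_incoming(struct xc_sr_context *ctx)
{
    struct xc_sr_rhdr rhdr;
    struct xc_sr_record rec;

    if ( read_record_header(ctx, ctx->fd, &rhdr) )
        return -1;

    /* The header alone is enough to pick a strategy for the payload. */
    switch ( rhdr.type )
    {
    case REC_TYPE_PAGE_DATA:
        return handle_incoming_page_data(ctx, &rhdr);
    default:
        if ( read_record_data(ctx, ctx->fd, &rhdr, &rec) )
            return -1;
        return process_buffered_record(ctx, &rec);
    }
}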
--- a/tools/libs/guest/xg_sr_common.c
+++ b/tools/libs/guest/xg_sr_common.c
@@ -91,26 +91,33 @@ int write_split_record(struct xc_sr_cont
return -1;
}
-int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec)
+int read_record_header(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr)
{
xc_interface *xch = ctx->xch;
- struct xc_sr_rhdr rhdr;
- size_t datasz;
- if ( read_exact(fd, &rhdr, sizeof(rhdr)) )
+ if ( read_exact(fd, rhdr, sizeof(*rhdr)) )
{
PERROR("Failed to read Record Header from stream");
return -1;
}
- if ( rhdr.length > REC_LENGTH_MAX )
+ if ( rhdr->length > REC_LENGTH_MAX )
{
- ERROR("Record (0x%08x, %s) length %#x exceeds max (%#x)", rhdr.type,
- rec_type_to_str(rhdr.type), rhdr.length, REC_LENGTH_MAX);
+ ERROR("Record (0x%08x, %s) length %#x exceeds max (%#x)", rhdr->type,
+ rec_type_to_str(rhdr->type), rhdr->length, REC_LENGTH_MAX);
return -1;
}
- datasz = ROUNDUP(rhdr.length, REC_ALIGN_ORDER);
+ return 0;
+}
+
+int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr,
+ struct xc_sr_record *rec)
+{
+ xc_interface *xch = ctx->xch;
+ size_t datasz;
+
+ datasz = ROUNDUP(rhdr->length, REC_ALIGN_ORDER);
if ( datasz )
{
@@ -119,7 +126,7 @@ int read_record(struct xc_sr_context *ct
if ( !rec->data )
{
ERROR("Unable to allocate %zu bytes for record data (0x%08x, %s)",
- datasz, rhdr.type, rec_type_to_str(rhdr.type));
+ datasz, rhdr->type, rec_type_to_str(rhdr->type));
return -1;
}
@@ -128,18 +135,18 @@ int read_record(struct xc_sr_context *ct
free(rec->data);
rec->data = NULL;
PERROR("Failed to read %zu bytes of data for record (0x%08x, %s)",
- datasz, rhdr.type, rec_type_to_str(rhdr.type));
+ datasz, rhdr->type, rec_type_to_str(rhdr->type));
return -1;
}
}
else
rec->data = NULL;
- rec->type = rhdr.type;
- rec->length = rhdr.length;
+ rec->type = rhdr->type;
+ rec->length = rhdr->length;
return 0;
-};
+}
static void __attribute__((unused)) build_assertions(void)
{
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -458,7 +458,9 @@ static inline int write_record(struct xc
*
* On failure, the contents of the record structure are undefined.
*/
-int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec);
+int read_record_header(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr);
+int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr,
+ struct xc_sr_record *rec);
/*
* This would ideally be private in restore.c, but is needed by
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -453,7 +453,7 @@ static int send_checkpoint_dirty_pfn_lis
return rc;
}
-static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec);
+static int process_buffered_record(struct xc_sr_context *ctx, struct xc_sr_record *rec);
static int handle_checkpoint(struct xc_sr_context *ctx)
{
xc_interface *xch = ctx->xch;
@@ -492,7 +492,7 @@ static int handle_checkpoint(struct xc_s
for ( i = 0; i < ctx->restore.buffered_rec_num; i++ )
{
- rc = process_record(ctx, &ctx->restore.buffered_records[i]);
+ rc = process_buffered_record(ctx, &ctx->restore.buffered_records[i]);
if ( rc )
goto err;
}
@@ -553,10 +553,11 @@ static int handle_checkpoint(struct xc_s
return rc;
}
-static int buffer_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
+static int buffer_record(struct xc_sr_context *ctx, struct xc_sr_rhdr *rhdr)
{
xc_interface *xch = ctx->xch;
unsigned int new_alloc_num;
+ struct xc_sr_record rec;
struct xc_sr_record *p;
if ( ctx->restore.buffered_rec_num >= ctx->restore.allocated_rec_num )
@@ -574,8 +575,13 @@ static int buffer_record(struct xc_sr_co
ctx->restore.allocated_rec_num = new_alloc_num;
}
+ if ( read_record_data(ctx, ctx->fd, rhdr, &rec) )
+ {
+ return -1;
+ }
+
memcpy(&ctx->restore.buffered_records[ctx->restore.buffered_rec_num++],
- rec, sizeof(*rec));
+ &rec, sizeof(rec));
return 0;
}
@@ -606,7 +612,7 @@ int handle_static_data_end(struct xc_sr_
return rc;
}
-static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
+static int process_buffered_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
{
xc_interface *xch = ctx->xch;
int rc = 0;
@@ -644,6 +650,19 @@ static int process_record(struct xc_sr_c
return rc;
}
+static int process_incoming_record_header(struct xc_sr_context *ctx, struct xc_sr_rhdr *rhdr)
+{
+ struct xc_sr_record rec;
+ int rc;
+
+ rc = read_record_data(ctx, ctx->fd, rhdr, &rec);
+ if ( rc )
+ return rc;
+
+ return process_buffered_record(ctx, &rec);
+}
+
+
static int setup(struct xc_sr_context *ctx)
{
xc_interface *xch = ctx->xch;
@@ -740,7 +759,7 @@ static void cleanup(struct xc_sr_context
static int restore(struct xc_sr_context *ctx)
{
xc_interface *xch = ctx->xch;
- struct xc_sr_record rec;
+ struct xc_sr_rhdr rhdr;
int rc, saved_rc = 0, saved_errno = 0;
IPRINTF("Restoring domain");
@@ -751,7 +770,7 @@ static int restore(struct xc_sr_context
do
{
- rc = read_record(ctx, ctx->fd, &rec);
+ rc = read_record_header(ctx, ctx->fd, &rhdr);
if ( rc )
{
if ( ctx->restore.buffer_all_records )
@@ -761,25 +780,25 @@ static int restore(struct xc_sr_context
}
if ( ctx->restore.buffer_all_records &&
- rec.type != REC_TYPE_END &&
- rec.type != REC_TYPE_CHECKPOINT )
+ rhdr.type != REC_TYPE_END &&
+ rhdr.type != REC_TYPE_CHECKPOINT )
{
- rc = buffer_record(ctx, &rec);
+ rc = buffer_record(ctx, &rhdr);
if ( rc )
goto err;
}
else
{
- rc = process_record(ctx, &rec);
+ rc = process_incoming_record_header(ctx, &rhdr);
if ( rc == RECORD_NOT_PROCESSED )
{
- if ( rec.type & REC_TYPE_OPTIONAL )
+ if ( rhdr.type & REC_TYPE_OPTIONAL )
DPRINTF("Ignoring optional record %#x (%s)",
- rec.type, rec_type_to_str(rec.type));
+ rhdr.type, rec_type_to_str(rhdr.type));
else
{
ERROR("Mandatory record %#x (%s) not handled",
- rec.type, rec_type_to_str(rec.type));
+ rhdr.type, rec_type_to_str(rhdr.type));
rc = -1;
goto err;
}
@@ -790,7 +809,7 @@ static int restore(struct xc_sr_context
goto err;
}
- } while ( rec.type != REC_TYPE_END );
+ } while ( rhdr.type != REC_TYPE_END );
remus_failover:
if ( ctx->stream_type == XC_STREAM_COLO )
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -590,6 +590,7 @@ static int send_memory_live(struct xc_sr
static int colo_merge_secondary_dirty_bitmap(struct xc_sr_context *ctx)
{
xc_interface *xch = ctx->xch;
+ struct xc_sr_rhdr rhdr;
struct xc_sr_record rec;
uint64_t *pfns = NULL;
uint64_t pfn;
@@ -598,7 +599,11 @@ static int colo_merge_secondary_dirty_bi
DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
&ctx->save.dirty_bitmap_hbuf);
- rc = read_record(ctx, ctx->save.recv_fd, &rec);
+ rc = read_record_header(ctx, ctx->save.recv_fd, &rhdr);
+ if ( rc )
+ goto err;
+
+ rc = read_record_data(ctx, ctx->save.recv_fd, &rhdr, &rec);
if ( rc )
goto err;

View File

@@ -1,93 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 14:39:31 +0200
Subject: libxc sr restore types
tools: restore: preallocate types array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in an incoming batch.
Allocate the space once.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_restore.c | 22 +++++++---------------
2 files changed, 8 insertions(+), 15 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -257,6 +257,7 @@ struct xc_sr_context
struct xc_sr_restore_ops ops;
struct restore_callbacks *callbacks;
xen_pfn_t *pfns;
+ uint32_t *types;
int send_back_fd;
unsigned long p2m_size;
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -315,7 +315,7 @@ static int handle_page_data(struct xc_sr
int rc = -1;
xen_pfn_t pfn;
- uint32_t *types = NULL, type;
+ uint32_t type;
/*
* v2 compatibility only exists for x86 streams. This is a bit of a
@@ -362,14 +362,6 @@ static int handle_page_data(struct xc_sr
goto err;
}
- types = malloc(pages->count * sizeof(*types));
- if ( !types )
- {
- ERROR("Unable to allocate enough memory for %u pfns",
- pages->count);
- goto err;
- }
-
for ( i = 0; i < pages->count; ++i )
{
pfn = pages->pfn[i] & PAGE_DATA_PFN_MASK;
@@ -393,7 +385,7 @@ static int handle_page_data(struct xc_sr
pages_of_data++;
ctx->restore.pfns[i] = pfn;
- types[i] = type;
+ ctx->restore.types[i] = type;
}
if ( rec->length != (sizeof(*pages) +
@@ -406,11 +398,9 @@ static int handle_page_data(struct xc_sr
goto err;
}
- rc = process_page_data(ctx, pages->count, ctx->restore.pfns, types,
- &pages->pfn[pages->count]);
+ rc = process_page_data(ctx, pages->count, ctx->restore.pfns,
+ ctx->restore.types, &pages->pfn[pages->count]);
err:
- free(types);
-
return rc;
}
@@ -714,7 +704,8 @@ static int setup(struct xc_sr_context *c
}
ctx->restore.pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pfns));
- if ( !ctx->restore.pfns )
+ ctx->restore.types = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.types));
+ if ( !ctx->restore.pfns || !ctx->restore.types )
{
ERROR("Unable to allocate memory");
rc = -1;
@@ -751,6 +742,7 @@ static void cleanup(struct xc_sr_context
free(ctx->restore.buffered_records);
free(ctx->restore.populated_pfns);
+ free(ctx->restore.types);
free(ctx->restore.pfns);
if ( ctx->restore.ops.cleanup(ctx) )

View File

@@ -1,109 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 11:26:05 +0200
Subject: libxc sr save errors
tools: save: preallocate errors array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in a batch.
Allocate the space once.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_save.c | 20 ++++++++++----------
2 files changed, 11 insertions(+), 10 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -246,6 +246,7 @@ struct xc_sr_context
xen_pfn_t *batch_pfns;
xen_pfn_t *mfns;
xen_pfn_t *types;
+ int *errors;
unsigned int nr_batch_pfns;
unsigned long *deferred_pages;
unsigned long nr_deferred_pages;
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -91,7 +91,7 @@ static int write_batch(struct xc_sr_cont
void *guest_mapping = NULL;
void **guest_data = NULL;
void **local_pages = NULL;
- int *errors = NULL, rc = -1;
+ int rc = -1;
unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
unsigned int nr_pfns = ctx->save.nr_batch_pfns;
void *page, *orig_page;
@@ -104,8 +104,6 @@ static int write_batch(struct xc_sr_cont
assert(nr_pfns != 0);
- /* Errors from attempting to map the gfns. */
- errors = malloc(nr_pfns * sizeof(*errors));
/* Pointers to page data to send. Mapped gfns or local allocations. */
guest_data = calloc(nr_pfns, sizeof(*guest_data));
/* Pointers to locally allocated pages. Need freeing. */
@@ -113,7 +111,7 @@ static int write_batch(struct xc_sr_cont
/* iovec[] for writev(). */
iov = malloc((nr_pfns + 4) * sizeof(*iov));
- if ( !errors || !guest_data || !local_pages || !iov )
+ if ( !guest_data || !local_pages || !iov )
{
ERROR("Unable to allocate arrays for a batch of %u pages",
nr_pfns);
@@ -158,8 +156,8 @@ static int write_batch(struct xc_sr_cont
if ( nr_pages > 0 )
{
- guest_mapping = xenforeignmemory_map(
- xch->fmem, ctx->domid, PROT_READ, nr_pages, ctx->save.mfns, errors);
+ guest_mapping = xenforeignmemory_map(xch->fmem, ctx->domid, PROT_READ,
+ nr_pages, ctx->save.mfns, ctx->save.errors);
if ( !guest_mapping )
{
PERROR("Failed to map guest pages");
@@ -172,10 +170,11 @@ static int write_batch(struct xc_sr_cont
if ( !page_type_has_stream_data(ctx->save.types[i]) )
continue;
- if ( errors[p] )
+ if ( ctx->save.errors[p] )
{
ERROR("Mapping of pfn %#"PRIpfn" (mfn %#"PRIpfn") failed %d",
- ctx->save.batch_pfns[i], ctx->save.mfns[p], errors[p]);
+ ctx->save.batch_pfns[i], ctx->save.mfns[p],
+ ctx->save.errors[p]);
goto err;
}
@@ -271,7 +270,6 @@ static int write_batch(struct xc_sr_cont
free(iov);
free(local_pages);
free(guest_data);
- free(errors);
return rc;
}
@@ -846,10 +844,11 @@ static int setup(struct xc_sr_context *c
sizeof(*ctx->save.batch_pfns));
ctx->save.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.mfns));
ctx->save.types = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.types));
+ ctx->save.errors = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.errors));
ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
if ( !ctx->save.batch_pfns || !ctx->save.mfns || !ctx->save.types ||
- !dirty_bitmap || !ctx->save.deferred_pages )
+ !ctx->save.errors || !dirty_bitmap || !ctx->save.deferred_pages )
{
ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
" deferred pages");
@@ -880,6 +879,7 @@ static void cleanup(struct xc_sr_context
xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
NRPAGES(bitmap_size(ctx->save.p2m_size)));
free(ctx->save.deferred_pages);
+ free(ctx->save.errors);
free(ctx->save.types);
free(ctx->save.mfns);
free(ctx->save.batch_pfns);

View File

@@ -1,123 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 11:40:45 +0200
Subject: libxc sr save guest_data
tools: save: preallocate guest_data array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in a batch.
Allocate the space once.
Because this buffer was previously zeroed by calloc() on every batch:
adjust the loop to clear unused entries explicitly as needed.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_save.c | 20 +++++++++++---------
2 files changed, 12 insertions(+), 9 deletions(-)
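
The subtlety worth spelling out: calloc() re-zeroed the array on every
batch, while the preallocated buffer is reused across batches, so every slot
that carries no data this time must now be cleared by hand or a stale
pointer from the previous batch would be written out again. Condensed from
the write_batch() hunks below:

for ( i = 0, p = 0; i < nr_pfns; ++i )
{
    if ( !page_type_has_stream_data(ctx->save.types[i]) )
    {
        ctx->save.guest_data[i] = NULL;   /* calloc() used to do this */
        continue;
    }

    /* ... map and possibly normalise page p, then on success: ... */
    ctx->save.guest_data[i] = page;
    ++p;
}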
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -249,6 +249,7 @@ struct xc_sr_context
int *errors;
struct iovec *iov;
uint64_t *rec_pfns;
+ void **guest_data;
unsigned int nr_batch_pfns;
unsigned long *deferred_pages;
unsigned long nr_deferred_pages;
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -89,7 +89,6 @@ static int write_batch(struct xc_sr_cont
{
xc_interface *xch = ctx->xch;
void *guest_mapping = NULL;
- void **guest_data = NULL;
void **local_pages = NULL;
int rc = -1;
unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
@@ -103,12 +102,10 @@ static int write_batch(struct xc_sr_cont
assert(nr_pfns != 0);
- /* Pointers to page data to send. Mapped gfns or local allocations. */
- guest_data = calloc(nr_pfns, sizeof(*guest_data));
/* Pointers to locally allocated pages. Need freeing. */
local_pages = calloc(nr_pfns, sizeof(*local_pages));
- if ( !guest_data || !local_pages )
+ if ( !local_pages )
{
ERROR("Unable to allocate arrays for a batch of %u pages",
nr_pfns);
@@ -165,7 +162,10 @@ static int write_batch(struct xc_sr_cont
for ( i = 0, p = 0; i < nr_pfns; ++i )
{
if ( !page_type_has_stream_data(ctx->save.types[i]) )
+ {
+ ctx->save.guest_data[i] = NULL;
continue;
+ }
if ( ctx->save.errors[p] )
{
@@ -183,6 +183,7 @@ static int write_batch(struct xc_sr_cont
if ( rc )
{
+ ctx->save.guest_data[i] = NULL;
if ( rc == -1 && errno == EAGAIN )
{
set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
@@ -194,7 +195,7 @@ static int write_batch(struct xc_sr_cont
goto err;
}
else
- guest_data[i] = page;
+ ctx->save.guest_data[i] = page;
rc = -1;
++p;
@@ -232,9 +233,9 @@ static int write_batch(struct xc_sr_cont
{
for ( i = 0; i < nr_pfns; ++i )
{
- if ( guest_data[i] )
+ if ( ctx->save.guest_data[i] )
{
- ctx->save.iov[iovcnt].iov_base = guest_data[i];
+ ctx->save.iov[iovcnt].iov_base = ctx->save.guest_data[i];
ctx->save.iov[iovcnt].iov_len = PAGE_SIZE;
iovcnt++;
--nr_pages;
@@ -258,7 +259,6 @@ static int write_batch(struct xc_sr_cont
for ( i = 0; local_pages && i < nr_pfns; ++i )
free(local_pages[i]);
free(local_pages);
- free(guest_data);
return rc;
}
@@ -836,11 +836,12 @@ static int setup(struct xc_sr_context *c
ctx->save.errors = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.errors));
ctx->save.iov = malloc((4 + MAX_BATCH_SIZE) * sizeof(*ctx->save.iov));
ctx->save.rec_pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.rec_pfns));
+ ctx->save.guest_data = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.guest_data));
ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
if ( !ctx->save.batch_pfns || !ctx->save.mfns || !ctx->save.types ||
!ctx->save.errors || !ctx->save.iov || !ctx->save.rec_pfns ||
- !dirty_bitmap || !ctx->save.deferred_pages )
+ !ctx->save.guest_data || !dirty_bitmap || !ctx->save.deferred_pages )
{
ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
" deferred pages");
@@ -871,6 +872,7 @@ static void cleanup(struct xc_sr_context
xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
NRPAGES(bitmap_size(ctx->save.p2m_size)));
free(ctx->save.deferred_pages);
+ free(ctx->save.guest_data);
free(ctx->save.rec_pfns);
free(ctx->save.iov);
free(ctx->save.errors);

View File

@@ -1,124 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 11:30:41 +0200
Subject: libxc sr save iov
tools: save: preallocate iov array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in a batch.
Allocate the space once.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_save.c | 34 ++++++++++++++++-----------------
2 files changed, 18 insertions(+), 17 deletions(-)
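
The (4 + MAX_BATCH_SIZE) sizing reflects the on-the-wire layout of one
PAGE_DATA record: four fixed entries (record type, record length, page-data
header, pfn/type array) followed by at most one entry per page, all flushed
with a single writev(). Condensed from the hunks below:

int iovcnt = 0;

iov[iovcnt].iov_base = &rec.type;   iov[iovcnt++].iov_len = sizeof(rec.type);
iov[iovcnt].iov_base = &rec.length; iov[iovcnt++].iov_len = sizeof(rec.length);
iov[iovcnt].iov_base = &hdr;        iov[iovcnt++].iov_len = sizeof(hdr);
iov[iovcnt].iov_base = rec_pfns;    iov[iovcnt++].iov_len = nr_pfns * sizeof(*rec_pfns);

for ( i = 0; i < nr_pfns; ++i )
    if ( guest_data[i] )   /* one entry per page that has data */
    {
        iov[iovcnt].iov_base = guest_data[i];
        iov[iovcnt++].iov_len = PAGE_SIZE;
    }

if ( writev_exact(ctx->fd, iov, iovcnt) )   /* single syscall per batch */
    goto err;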
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -247,6 +247,7 @@ struct xc_sr_context
xen_pfn_t *mfns;
xen_pfn_t *types;
int *errors;
+ struct iovec *iov;
unsigned int nr_batch_pfns;
unsigned long *deferred_pages;
unsigned long nr_deferred_pages;
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -96,7 +96,7 @@ static int write_batch(struct xc_sr_cont
unsigned int nr_pfns = ctx->save.nr_batch_pfns;
void *page, *orig_page;
uint64_t *rec_pfns = NULL;
- struct iovec *iov = NULL; int iovcnt = 0;
+ int iovcnt = 0;
struct xc_sr_rec_page_data_header hdr = { 0 };
struct xc_sr_record rec = {
.type = REC_TYPE_PAGE_DATA,
@@ -108,10 +108,8 @@ static int write_batch(struct xc_sr_cont
guest_data = calloc(nr_pfns, sizeof(*guest_data));
/* Pointers to locally allocated pages. Need freeing. */
local_pages = calloc(nr_pfns, sizeof(*local_pages));
- /* iovec[] for writev(). */
- iov = malloc((nr_pfns + 4) * sizeof(*iov));
- if ( !guest_data || !local_pages || !iov )
+ if ( !guest_data || !local_pages )
{
ERROR("Unable to allocate arrays for a batch of %u pages",
nr_pfns);
@@ -221,17 +219,17 @@ static int write_batch(struct xc_sr_cont
for ( i = 0; i < nr_pfns; ++i )
rec_pfns[i] = ((uint64_t)(ctx->save.types[i]) << 32) | ctx->save.batch_pfns[i];
- iov[0].iov_base = &rec.type;
- iov[0].iov_len = sizeof(rec.type);
+ ctx->save.iov[0].iov_base = &rec.type;
+ ctx->save.iov[0].iov_len = sizeof(rec.type);
- iov[1].iov_base = &rec.length;
- iov[1].iov_len = sizeof(rec.length);
+ ctx->save.iov[1].iov_base = &rec.length;
+ ctx->save.iov[1].iov_len = sizeof(rec.length);
- iov[2].iov_base = &hdr;
- iov[2].iov_len = sizeof(hdr);
+ ctx->save.iov[2].iov_base = &hdr;
+ ctx->save.iov[2].iov_len = sizeof(hdr);
- iov[3].iov_base = rec_pfns;
- iov[3].iov_len = nr_pfns * sizeof(*rec_pfns);
+ ctx->save.iov[3].iov_base = rec_pfns;
+ ctx->save.iov[3].iov_len = nr_pfns * sizeof(*rec_pfns);
iovcnt = 4;
ctx->save.pages_sent += nr_pages;
@@ -243,15 +241,15 @@ static int write_batch(struct xc_sr_cont
{
if ( guest_data[i] )
{
- iov[iovcnt].iov_base = guest_data[i];
- iov[iovcnt].iov_len = PAGE_SIZE;
+ ctx->save.iov[iovcnt].iov_base = guest_data[i];
+ ctx->save.iov[iovcnt].iov_len = PAGE_SIZE;
iovcnt++;
--nr_pages;
}
}
}
- if ( writev_exact(ctx->fd, iov, iovcnt) )
+ if ( writev_exact(ctx->fd, ctx->save.iov, iovcnt) )
{
PERROR("Failed to write page data to stream");
goto err;
@@ -267,7 +265,6 @@ static int write_batch(struct xc_sr_cont
xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped);
for ( i = 0; local_pages && i < nr_pfns; ++i )
free(local_pages[i]);
- free(iov);
free(local_pages);
free(guest_data);
@@ -845,10 +842,12 @@ static int setup(struct xc_sr_context *c
ctx->save.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.mfns));
ctx->save.types = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.types));
ctx->save.errors = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.errors));
+ ctx->save.iov = malloc((4 + MAX_BATCH_SIZE) * sizeof(*ctx->save.iov));
ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
if ( !ctx->save.batch_pfns || !ctx->save.mfns || !ctx->save.types ||
- !ctx->save.errors || !dirty_bitmap || !ctx->save.deferred_pages )
+ !ctx->save.errors || !ctx->save.iov || !dirty_bitmap ||
+ !ctx->save.deferred_pages )
{
ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
" deferred pages");
@@ -879,6 +878,7 @@ static void cleanup(struct xc_sr_context
xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
NRPAGES(bitmap_size(ctx->save.p2m_size)));
free(ctx->save.deferred_pages);
+ free(ctx->save.iov);
free(ctx->save.errors);
free(ctx->save.types);
free(ctx->save.mfns);

View File

@@ -1,218 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 12:47:56 +0200
Subject: libxc sr save local_pages
tools: save: preallocate local_pages array
Remove repeated allocation from migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in a batch.
Allocate the space once.
Adjust the code to use the unmodified src page in case of HVM.
In case of PV the page may need to be normalised; use a private memory
area for this purpose.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 22 ++++++++++---------
tools/libs/guest/xg_sr_save.c | 26 ++++------------------
tools/libs/guest/xg_sr_save_x86_hvm.c | 5 +++--
tools/libs/guest/xg_sr_save_x86_pv.c | 31 ++++++++++++++++++---------
4 files changed, 40 insertions(+), 44 deletions(-)
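
The interesting part is the changed normalise_page contract: the callee no
longer allocates. Either the read-only source mapping is passed through
unchanged, or the transformed page is written into a caller-owned scratch
area indexed by the batch slot. A condensed sketch of the PV side;
needs_normalising() is an illustrative stand-in for the L1TAB..L4TAB type
test:

static int normalise_page_sketch(struct xc_sr_context *ctx, xen_pfn_t type,
                                 void *src, unsigned int idx, void **ptr)
{
    if ( !needs_normalising(type) )   /* illustrative predicate */
    {
        *ptr = src;   /* read-only guest mapping, used as-is */
        return 0;
    }

    /* One PAGE_SIZE slot per batch entry, allocated once in setup(). */
    void *dst = ctx->x86.pv.save.normalised_pages + (idx * PAGE_SIZE);
    int rc = normalise_pagetable(ctx, src, dst, type);

    *ptr = dst;
    return rc;
}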
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -33,16 +33,12 @@ struct xc_sr_save_ops
* Optionally transform the contents of a page from being specific to the
* sending environment, to being generic for the stream.
*
- * The page of data at the end of 'page' may be a read-only mapping of a
- * running guest; it must not be modified. If no transformation is
- * required, the callee should leave '*pages' untouched.
+ * The page of data '*src' may be a read-only mapping of a running guest;
+ * it must not be modified. If no transformation is required, the callee
+ * should leave '*src' untouched, and return it via '**ptr'.
*
- * If a transformation is required, the callee should allocate themselves
- * a local page using malloc() and return it via '*page'.
- *
- * The caller shall free() '*page' in all cases. In the case that the
- * callee encounters an error, it should *NOT* free() the memory it
- * allocated for '*page'.
+ * If a transformation is required, the callee should provide the
+ * transformed page in a private buffer and return it via '**ptr'.
*
* It is valid to fail with EAGAIN if the transformation is not able to be
* completed at this point. The page shall be retried later.
@@ -50,7 +46,7 @@ struct xc_sr_save_ops
* @returns 0 for success, -1 for failure, with errno appropriately set.
*/
int (*normalise_page)(struct xc_sr_context *ctx, xen_pfn_t type,
- void **page);
+ void *src, unsigned int idx, void **ptr);
/**
* Set up local environment to save a domain. (Typically querying
@@ -359,6 +355,12 @@ struct xc_sr_context
{
struct
{
+ /* Used by write_batch for modified pages. */
+ void *normalised_pages;
+ } save;
+
+ struct
+ {
/* State machine for the order of received records. */
bool seen_pv_info;
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -89,11 +89,10 @@ static int write_batch(struct xc_sr_cont
{
xc_interface *xch = ctx->xch;
void *guest_mapping = NULL;
- void **local_pages = NULL;
int rc = -1;
unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
unsigned int nr_pfns = ctx->save.nr_batch_pfns;
- void *page, *orig_page;
+ void *src;
int iovcnt = 0;
struct xc_sr_rec_page_data_header hdr = { 0 };
struct xc_sr_record rec = {
@@ -102,16 +101,6 @@ static int write_batch(struct xc_sr_cont
assert(nr_pfns != 0);
- /* Pointers to locally allocated pages. Need freeing. */
- local_pages = calloc(nr_pfns, sizeof(*local_pages));
-
- if ( !local_pages )
- {
- ERROR("Unable to allocate arrays for a batch of %u pages",
- nr_pfns);
- goto err;
- }
-
for ( i = 0; i < nr_pfns; ++i )
{
ctx->save.types[i] = ctx->save.mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
@@ -175,11 +164,9 @@ static int write_batch(struct xc_sr_cont
goto err;
}
- orig_page = page = guest_mapping + (p * PAGE_SIZE);
- rc = ctx->save.ops.normalise_page(ctx, ctx->save.types[i], &page);
-
- if ( orig_page != page )
- local_pages[i] = page;
+ src = guest_mapping + (p * PAGE_SIZE);
+ rc = ctx->save.ops.normalise_page(ctx, ctx->save.types[i], src, i,
+ &ctx->save.guest_data[i]);
if ( rc )
{
@@ -194,8 +181,6 @@ static int write_batch(struct xc_sr_cont
else
goto err;
}
- else
- ctx->save.guest_data[i] = page;
rc = -1;
++p;
@@ -256,9 +241,6 @@ static int write_batch(struct xc_sr_cont
err:
if ( guest_mapping )
xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped);
- for ( i = 0; local_pages && i < nr_pfns; ++i )
- free(local_pages[i]);
- free(local_pages);
return rc;
}
--- a/tools/libs/guest/xg_sr_save_x86_hvm.c
+++ b/tools/libs/guest/xg_sr_save_x86_hvm.c
@@ -129,9 +129,10 @@ static xen_pfn_t x86_hvm_pfn_to_gfn(cons
return pfn;
}
-static int x86_hvm_normalise_page(struct xc_sr_context *ctx,
- xen_pfn_t type, void **page)
+static int x86_hvm_normalise_page(struct xc_sr_context *ctx, xen_pfn_t type,
+ void *src, unsigned int idx, void **ptr)
{
+ *ptr = src;
return 0;
}
--- a/tools/libs/guest/xg_sr_save_x86_pv.c
+++ b/tools/libs/guest/xg_sr_save_x86_pv.c
@@ -999,29 +999,31 @@ static xen_pfn_t x86_pv_pfn_to_gfn(const
* save_ops function. Performs pagetable normalisation on appropriate pages.
*/
static int x86_pv_normalise_page(struct xc_sr_context *ctx, xen_pfn_t type,
- void **page)
+ void *src, unsigned int idx, void **ptr)
{
xc_interface *xch = ctx->xch;
- void *local_page;
+ void *dst;
int rc;
type &= XEN_DOMCTL_PFINFO_LTABTYPE_MASK;
if ( type < XEN_DOMCTL_PFINFO_L1TAB || type > XEN_DOMCTL_PFINFO_L4TAB )
+ {
+ *ptr = src;
return 0;
+ }
- local_page = malloc(PAGE_SIZE);
- if ( !local_page )
+ if ( idx >= MAX_BATCH_SIZE )
{
- ERROR("Unable to allocate scratch page");
- rc = -1;
- goto out;
+ ERROR("idx %u out of range", idx);
+ errno = ERANGE;
+ return -1;
}
- rc = normalise_pagetable(ctx, *page, local_page, type);
- *page = local_page;
+ dst = ctx->x86.pv.save.normalised_pages + (idx * PAGE_SIZE);
+ rc = normalise_pagetable(ctx, src, dst, type);
+ *ptr = dst;
- out:
return rc;
}
@@ -1031,8 +1033,16 @@ static int x86_pv_normalise_page(struct
*/
static int x86_pv_setup(struct xc_sr_context *ctx)
{
+ xc_interface *xch = ctx->xch;
int rc;
+ ctx->x86.pv.save.normalised_pages = malloc(MAX_BATCH_SIZE * PAGE_SIZE);
+ if ( !ctx->x86.pv.save.normalised_pages )
+ {
+ PERROR("Failed to allocate normalised_pages");
+ return -1;
+ }
+
rc = x86_pv_domain_info(ctx);
if ( rc )
return rc;
@@ -1118,6 +1128,7 @@ static int x86_pv_check_vm_state(struct
static int x86_pv_cleanup(struct xc_sr_context *ctx)
{
+ free(ctx->x86.pv.save.normalised_pages);
free(ctx->x86.pv.p2m_pfns);
if ( ctx->x86.pv.p2m )
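The reworked normalise_page() contract above removes all per-page allocation:
the callee either returns the read-only guest mapping unchanged or fills slot
'idx' of a scratch area that setup() allocated once. Below is a minimal,
self-contained model of that contract; it is a sketch only, and the names
'normalise', 'scratch' and the memcpy() stand in for the patch's real
normalise_pagetable() logic:

    #include <string.h>

    #define PAGE_SIZE      4096
    #define MAX_BATCH_SIZE 1024

    /* Model of the reworked callback: no malloc() on the page path.
     * 'scratch' is assumed to hold MAX_BATCH_SIZE pages, allocated once. */
    static int normalise(void *scratch, int needs_transform,
                         void *src, unsigned int idx, void **ptr)
    {
        if ( !needs_transform )
        {
            *ptr = src;                 /* read-only guest page, untouched */
            return 0;
        }

        if ( idx >= MAX_BATCH_SIZE )
            return -1;                  /* mirrors the ERANGE check above */

        void *dst = (char *)scratch + (size_t)idx * PAGE_SIZE;
        memcpy(dst, src, PAGE_SIZE);    /* a real transform goes here */
        *ptr = dst;
        return 0;
    }

    int main(void)
    {
        static char guest_page[PAGE_SIZE], scratch[4 * PAGE_SIZE];
        void *out;

        normalise(scratch, 1, guest_page, 0, &out);  /* out == &scratch[0] */
        normalise(scratch, 0, guest_page, 1, &out);  /* out == guest_page  */
        return 0;
    }

The caller no longer free()s anything per page; the scratch area lives for
the whole migration.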

View File

@@ -1,132 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 11:20:36 +0200
Subject: libxc sr save mfns
tools: save: preallocate mfns array
Remove the repeated allocation from the migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in a batch; see add_to_batch.
Allocate the space once.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_save.c | 25 +++++++++++++------------
2 files changed, 14 insertions(+), 12 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -244,6 +244,7 @@ struct xc_sr_context
struct precopy_stats stats;
xen_pfn_t *batch_pfns;
+ xen_pfn_t *mfns;
unsigned int nr_batch_pfns;
unsigned long *deferred_pages;
unsigned long nr_deferred_pages;
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -88,7 +88,7 @@ static int write_checkpoint_record(struc
static int write_batch(struct xc_sr_context *ctx)
{
xc_interface *xch = ctx->xch;
- xen_pfn_t *mfns = NULL, *types = NULL;
+ xen_pfn_t *types = NULL;
void *guest_mapping = NULL;
void **guest_data = NULL;
void **local_pages = NULL;
@@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_cont
assert(nr_pfns != 0);
- /* Mfns of the batch pfns. */
- mfns = malloc(nr_pfns * sizeof(*mfns));
/* Types of the batch pfns. */
types = malloc(nr_pfns * sizeof(*types));
/* Errors from attempting to map the gfns. */
@@ -118,7 +116,7 @@ static int write_batch(struct xc_sr_cont
/* iovec[] for writev(). */
iov = malloc((nr_pfns + 4) * sizeof(*iov));
- if ( !mfns || !types || !errors || !guest_data || !local_pages || !iov )
+ if ( !types || !errors || !guest_data || !local_pages || !iov )
{
ERROR("Unable to allocate arrays for a batch of %u pages",
nr_pfns);
@@ -127,11 +125,11 @@ static int write_batch(struct xc_sr_cont
for ( i = 0; i < nr_pfns; ++i )
{
- types[i] = mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
+ types[i] = ctx->save.mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
ctx->save.batch_pfns[i]);
/* Likely a ballooned page. */
- if ( mfns[i] == INVALID_MFN )
+ if ( ctx->save.mfns[i] == INVALID_MFN )
{
set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
++ctx->save.nr_deferred_pages;
@@ -150,20 +148,21 @@ static int write_batch(struct xc_sr_cont
{
if ( !is_known_page_type(types[i]) )
{
- ERROR("Unknown type %#"PRIpfn" for pfn %#"PRIpfn, types[i], mfns[i]);
+ ERROR("Unknown type %#"PRIpfn" for pfn %#"PRIpfn,
+ types[i], ctx->save.mfns[i]);
goto err;
}
if ( !page_type_has_stream_data(types[i]) )
continue;
- mfns[nr_pages++] = mfns[i];
+ ctx->save.mfns[nr_pages++] = ctx->save.mfns[i];
}
if ( nr_pages > 0 )
{
guest_mapping = xenforeignmemory_map(
- xch->fmem, ctx->domid, PROT_READ, nr_pages, mfns, errors);
+ xch->fmem, ctx->domid, PROT_READ, nr_pages, ctx->save.mfns, errors);
if ( !guest_mapping )
{
PERROR("Failed to map guest pages");
@@ -179,7 +178,7 @@ static int write_batch(struct xc_sr_cont
if ( errors[p] )
{
ERROR("Mapping of pfn %#"PRIpfn" (mfn %#"PRIpfn") failed %d",
- ctx->save.batch_pfns[i], mfns[p], errors[p]);
+ ctx->save.batch_pfns[i], ctx->save.mfns[p], errors[p]);
goto err;
}
@@ -277,7 +276,6 @@ static int write_batch(struct xc_sr_cont
free(guest_data);
free(errors);
free(types);
- free(mfns);
return rc;
}
@@ -850,9 +848,11 @@ static int setup(struct xc_sr_context *c
xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)));
ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
sizeof(*ctx->save.batch_pfns));
+ ctx->save.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.mfns));
ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
- if ( !ctx->save.batch_pfns || !dirty_bitmap || !ctx->save.deferred_pages )
+ if ( !ctx->save.batch_pfns || !ctx->save.mfns ||
+ !dirty_bitmap || !ctx->save.deferred_pages )
{
ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
" deferred pages");
@@ -883,6 +883,7 @@ static void cleanup(struct xc_sr_context
xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
NRPAGES(bitmap_size(ctx->save.p2m_size)));
free(ctx->save.deferred_pages);
+ free(ctx->save.mfns);
free(ctx->save.batch_pfns);
}
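This patch and the ones that follow all apply the same preallocation
pattern: move a per-batch malloc() out of write_batch() into setup(), size
it for the worst case of MAX_BATCH_SIZE entries, and free it exactly once in
cleanup(). A compressed sketch of the pattern, with field names as in the
patches and error handling trimmed:

    /* setup(): allocate the worst-case batch array once */
    ctx->save.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.mfns));
    if ( !ctx->save.mfns )
        return -1;                  /* ERROR(...) in the real code */

    /* write_batch(): reuse the array; nr_pfns is at most MAX_BATCH_SIZE */
    for ( i = 0; i < nr_pfns; ++i )
        ctx->save.mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
                                                     ctx->save.batch_pfns[i]);

    /* cleanup(): free once, after the migration loop has finished */
    free(ctx->save.mfns);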

View File

@@ -1,110 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 11:34:00 +0200
Subject: libxc sr save rec_pfns
tools: save: preallocate rec_pfns array
Remove the repeated allocation from the migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in a batch.
Allocate the space once.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_save.c | 28 +++++++++++-----------------
2 files changed, 12 insertions(+), 17 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -248,6 +248,7 @@ struct xc_sr_context
xen_pfn_t *types;
int *errors;
struct iovec *iov;
+ uint64_t *rec_pfns;
unsigned int nr_batch_pfns;
unsigned long *deferred_pages;
unsigned long nr_deferred_pages;
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -95,7 +95,6 @@ static int write_batch(struct xc_sr_cont
unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
unsigned int nr_pfns = ctx->save.nr_batch_pfns;
void *page, *orig_page;
- uint64_t *rec_pfns = NULL;
int iovcnt = 0;
struct xc_sr_rec_page_data_header hdr = { 0 };
struct xc_sr_record rec = {
@@ -202,22 +201,15 @@ static int write_batch(struct xc_sr_cont
}
}
- rec_pfns = malloc(nr_pfns * sizeof(*rec_pfns));
- if ( !rec_pfns )
- {
- ERROR("Unable to allocate %zu bytes of memory for page data pfn list",
- nr_pfns * sizeof(*rec_pfns));
- goto err;
- }
-
hdr.count = nr_pfns;
rec.length = sizeof(hdr);
- rec.length += nr_pfns * sizeof(*rec_pfns);
+ rec.length += nr_pfns * sizeof(*ctx->save.rec_pfns);
rec.length += nr_pages * PAGE_SIZE;
for ( i = 0; i < nr_pfns; ++i )
- rec_pfns[i] = ((uint64_t)(ctx->save.types[i]) << 32) | ctx->save.batch_pfns[i];
+ ctx->save.rec_pfns[i] = ((uint64_t)(ctx->save.types[i]) << 32) |
+ ctx->save.batch_pfns[i];
ctx->save.iov[0].iov_base = &rec.type;
ctx->save.iov[0].iov_len = sizeof(rec.type);
@@ -228,12 +220,13 @@ static int write_batch(struct xc_sr_cont
ctx->save.iov[2].iov_base = &hdr;
ctx->save.iov[2].iov_len = sizeof(hdr);
- ctx->save.iov[3].iov_base = rec_pfns;
- ctx->save.iov[3].iov_len = nr_pfns * sizeof(*rec_pfns);
+ ctx->save.iov[3].iov_base = ctx->save.rec_pfns;
+ ctx->save.iov[3].iov_len = nr_pfns * sizeof(*ctx->save.rec_pfns);
iovcnt = 4;
ctx->save.pages_sent += nr_pages;
- ctx->save.overhead_sent += sizeof(rec) + sizeof(hdr) + nr_pfns * sizeof(*rec_pfns);
+ ctx->save.overhead_sent += sizeof(rec) + sizeof(hdr) +
+ nr_pfns * sizeof(*ctx->save.rec_pfns);
if ( nr_pages )
{
@@ -260,7 +253,6 @@ static int write_batch(struct xc_sr_cont
rc = ctx->save.nr_batch_pfns = 0;
err:
- free(rec_pfns);
if ( guest_mapping )
xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped);
for ( i = 0; local_pages && i < nr_pfns; ++i )
@@ -843,11 +835,12 @@ static int setup(struct xc_sr_context *c
ctx->save.types = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.types));
ctx->save.errors = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.errors));
ctx->save.iov = malloc((4 + MAX_BATCH_SIZE) * sizeof(*ctx->save.iov));
+ ctx->save.rec_pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.rec_pfns));
ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
if ( !ctx->save.batch_pfns || !ctx->save.mfns || !ctx->save.types ||
- !ctx->save.errors || !ctx->save.iov || !dirty_bitmap ||
- !ctx->save.deferred_pages )
+ !ctx->save.errors || !ctx->save.iov || !ctx->save.rec_pfns ||
+ !dirty_bitmap || !ctx->save.deferred_pages )
{
ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
" deferred pages");
@@ -878,6 +871,7 @@ static void cleanup(struct xc_sr_context
xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
NRPAGES(bitmap_size(ctx->save.p2m_size)));
free(ctx->save.deferred_pages);
+ free(ctx->save.rec_pfns);
free(ctx->save.iov);
free(ctx->save.errors);
free(ctx->save.types);
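Each rec_pfns[] entry packs the 32-bit page type into the upper half of a
uint64_t and the pfn into the lower half, which is what travels in the
PAGE_DATA record. A standalone illustration of the encode and its inverse;
the type and pfn values are arbitrary examples:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t type = 0x5;      /* arbitrary example page type */
        uint64_t pfn  = 0x1234;   /* arbitrary example pfn */
        uint64_t rec  = (type << 32) | pfn;   /* encode, as in write_batch() */

        printf("rec=%#llx type=%#llx pfn=%#llx\n",
               (unsigned long long)rec,
               (unsigned long long)(rec >> 32),           /* decode the type */
               (unsigned long long)(rec & 0xffffffffu));  /* decode the pfn  */
        return 0;
    }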

View File

@@ -1,116 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 15:39:59 +0200
Subject: libxc sr save show_transfer_rate
tools: show migration transfer rate in send_dirty_pages
Show how fast domU pages are transferred in each iteration.
The relevant figure is how fast the pfns travel, not how much protocol
overhead exists, so the reported MiB/sec covers only the pfns.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
v02:
- rearrange MiB_sec calculation (jgross)
---
tools/libs/guest/xg_sr_common.h | 2 ++
tools/libs/guest/xg_sr_save.c | 46 +++++++++++++++++++++++++++++++++
2 files changed, 48 insertions(+)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -238,6 +238,8 @@ struct xc_sr_context
bool debug;
unsigned long p2m_size;
+ size_t pages_sent;
+ size_t overhead_sent;
struct precopy_stats stats;
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -1,5 +1,6 @@
#include <assert.h>
#include <arpa/inet.h>
+#include <time.h>
#include "xg_sr_common.h"
@@ -238,6 +239,8 @@ static int write_batch(struct xc_sr_cont
iov[3].iov_len = nr_pfns * sizeof(*rec_pfns);
iovcnt = 4;
+ ctx->save.pages_sent += nr_pages;
+ ctx->save.overhead_sent += sizeof(rec) + sizeof(hdr) + nr_pfns * sizeof(*rec_pfns);
if ( nr_pages )
{
@@ -356,6 +359,42 @@ static int suspend_domain(struct xc_sr_c
return 0;
}
+static void show_transfer_rate(struct xc_sr_context *ctx, struct timespec *start)
+{
+ xc_interface *xch = ctx->xch;
+ struct timespec end = {}, diff = {};
+ size_t ms, MiB_sec;
+
+ if (!ctx->save.pages_sent)
+ return;
+
+ if ( clock_gettime(CLOCK_MONOTONIC, &end) )
+ PERROR("clock_gettime");
+
+ if ( (end.tv_nsec - start->tv_nsec) < 0 )
+ {
+ diff.tv_sec = end.tv_sec - start->tv_sec - 1;
+ diff.tv_nsec = end.tv_nsec - start->tv_nsec + (1000U*1000U*1000U);
+ }
+ else
+ {
+ diff.tv_sec = end.tv_sec - start->tv_sec;
+ diff.tv_nsec = end.tv_nsec - start->tv_nsec;
+ }
+
+ ms = (diff.tv_nsec / (1000U*1000U));
+ ms += (diff.tv_sec * 1000U);
+ if (!ms)
+ ms = 1;
+
+ MiB_sec = (ctx->save.pages_sent * PAGE_SIZE * 1000U) / ms / (1024U*1024U);
+
+ errno = 0;
+ IPRINTF("%s: %zu bytes + %zu pages in %ld.%09ld sec, %zu MiB/sec", __func__,
+ ctx->save.overhead_sent, ctx->save.pages_sent,
+ diff.tv_sec, diff.tv_nsec, MiB_sec);
+}
+
/*
* Send a subset of pages in the guests p2m, according to the dirty bitmap.
* Used for each subsequent iteration of the live migration loop.
@@ -369,9 +408,15 @@ static int send_dirty_pages(struct xc_sr
xen_pfn_t p;
unsigned long written;
int rc;
+ struct timespec start = {};
DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
&ctx->save.dirty_bitmap_hbuf);
+ ctx->save.pages_sent = 0;
+ ctx->save.overhead_sent = 0;
+ if ( clock_gettime(CLOCK_MONOTONIC, &start) )
+ PERROR("clock_gettime");
+
for ( p = 0, written = 0; p < ctx->save.p2m_size; ++p )
{
if ( !test_bit(p, dirty_bitmap) )
@@ -395,6 +440,7 @@ static int send_dirty_pages(struct xc_sr
if ( written > entries )
DPRINTF("Bitmap contained more entries than expected...");
+ show_transfer_rate(ctx, &start);
xc_report_progress_step(xch, entries, entries);
return ctx->save.ops.check_vm_state(ctx);
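The rate computation stays in integer arithmetic by scaling with 1000 before
dividing by the elapsed milliseconds. As a worked example, assuming a 64-bit
build, 262144 pages of 4 KiB (1 GiB) sent in 2.5 seconds gives
(262144 * 4096 * 1000) / 2500 / (1024 * 1024) = 409 MiB/sec. A standalone
check of the same expression:

    #include <stdio.h>
    #include <stddef.h>

    #define PAGE_SIZE 4096

    int main(void)
    {
        size_t pages_sent = 262144;  /* example: 1 GiB worth of 4 KiB pages */
        size_t ms = 2500;            /* example: 2.5 seconds elapsed */

        /* Same expression as show_transfer_rate(); needs 64-bit size_t */
        size_t MiB_sec = (pages_sent * PAGE_SIZE * 1000U) / ms / (1024U * 1024U);

        printf("%zu MiB/sec\n", MiB_sec);   /* prints: 409 */
        return 0;
    }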

View File

@@ -1,154 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 23 Oct 2020 11:23:51 +0200
Subject: libxc sr save types
tools: save: preallocate types array
Remove the repeated allocation from the migration loop. There will never be
more than MAX_BATCH_SIZE pages to process in a batch.
Allocate the space once.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 1 +
tools/libs/guest/xg_sr_save.c | 28 +++++++++++++---------------
2 files changed, 14 insertions(+), 15 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -245,6 +245,7 @@ struct xc_sr_context
xen_pfn_t *batch_pfns;
xen_pfn_t *mfns;
+ xen_pfn_t *types;
unsigned int nr_batch_pfns;
unsigned long *deferred_pages;
unsigned long nr_deferred_pages;
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -88,7 +88,6 @@ static int write_checkpoint_record(struc
static int write_batch(struct xc_sr_context *ctx)
{
xc_interface *xch = ctx->xch;
- xen_pfn_t *types = NULL;
void *guest_mapping = NULL;
void **guest_data = NULL;
void **local_pages = NULL;
@@ -105,8 +104,6 @@ static int write_batch(struct xc_sr_cont
assert(nr_pfns != 0);
- /* Types of the batch pfns. */
- types = malloc(nr_pfns * sizeof(*types));
/* Errors from attempting to map the gfns. */
errors = malloc(nr_pfns * sizeof(*errors));
/* Pointers to page data to send. Mapped gfns or local allocations. */
@@ -116,7 +113,7 @@ static int write_batch(struct xc_sr_cont
/* iovec[] for writev(). */
iov = malloc((nr_pfns + 4) * sizeof(*iov));
- if ( !types || !errors || !guest_data || !local_pages || !iov )
+ if ( !errors || !guest_data || !local_pages || !iov )
{
ERROR("Unable to allocate arrays for a batch of %u pages",
nr_pfns);
@@ -125,7 +122,7 @@ static int write_batch(struct xc_sr_cont
for ( i = 0; i < nr_pfns; ++i )
{
- types[i] = ctx->save.mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
+ ctx->save.types[i] = ctx->save.mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
ctx->save.batch_pfns[i]);
/* Likely a ballooned page. */
@@ -136,7 +133,7 @@ static int write_batch(struct xc_sr_cont
}
}
- rc = xc_get_pfn_type_batch(xch, ctx->domid, nr_pfns, types);
+ rc = xc_get_pfn_type_batch(xch, ctx->domid, nr_pfns, ctx->save.types);
if ( rc )
{
PERROR("Failed to get types for pfn batch");
@@ -146,14 +143,14 @@ static int write_batch(struct xc_sr_cont
for ( i = 0; i < nr_pfns; ++i )
{
- if ( !is_known_page_type(types[i]) )
+ if ( !is_known_page_type(ctx->save.types[i]) )
{
ERROR("Unknown type %#"PRIpfn" for pfn %#"PRIpfn,
- types[i], ctx->save.mfns[i]);
+ ctx->save.types[i], ctx->save.mfns[i]);
goto err;
}
- if ( !page_type_has_stream_data(types[i]) )
+ if ( !page_type_has_stream_data(ctx->save.types[i]) )
continue;
ctx->save.mfns[nr_pages++] = ctx->save.mfns[i];
@@ -172,7 +169,7 @@ static int write_batch(struct xc_sr_cont
for ( i = 0, p = 0; i < nr_pfns; ++i )
{
- if ( !page_type_has_stream_data(types[i]) )
+ if ( !page_type_has_stream_data(ctx->save.types[i]) )
continue;
if ( errors[p] )
@@ -183,7 +180,7 @@ static int write_batch(struct xc_sr_cont
}
orig_page = page = guest_mapping + (p * PAGE_SIZE);
- rc = ctx->save.ops.normalise_page(ctx, types[i], &page);
+ rc = ctx->save.ops.normalise_page(ctx, ctx->save.types[i], &page);
if ( orig_page != page )
local_pages[i] = page;
@@ -194,7 +191,7 @@ static int write_batch(struct xc_sr_cont
{
set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
++ctx->save.nr_deferred_pages;
- types[i] = XEN_DOMCTL_PFINFO_XTAB;
+ ctx->save.types[i] = XEN_DOMCTL_PFINFO_XTAB;
--nr_pages;
}
else
@@ -223,7 +220,7 @@ static int write_batch(struct xc_sr_cont
rec.length += nr_pages * PAGE_SIZE;
for ( i = 0; i < nr_pfns; ++i )
- rec_pfns[i] = ((uint64_t)(types[i]) << 32) | ctx->save.batch_pfns[i];
+ rec_pfns[i] = ((uint64_t)(ctx->save.types[i]) << 32) | ctx->save.batch_pfns[i];
iov[0].iov_base = &rec.type;
iov[0].iov_len = sizeof(rec.type);
@@ -275,7 +272,6 @@ static int write_batch(struct xc_sr_cont
free(local_pages);
free(guest_data);
free(errors);
- free(types);
return rc;
}
@@ -849,9 +845,10 @@ static int setup(struct xc_sr_context *c
ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
sizeof(*ctx->save.batch_pfns));
ctx->save.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.mfns));
+ ctx->save.types = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.types));
ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
- if ( !ctx->save.batch_pfns || !ctx->save.mfns ||
+ if ( !ctx->save.batch_pfns || !ctx->save.mfns || !ctx->save.types ||
!dirty_bitmap || !ctx->save.deferred_pages )
{
ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
@@ -883,6 +880,7 @@ static void cleanup(struct xc_sr_context
xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
NRPAGES(bitmap_size(ctx->save.p2m_size)));
free(ctx->save.deferred_pages);
+ free(ctx->save.types);
free(ctx->save.mfns);
free(ctx->save.batch_pfns);
}

View File

@@ -1,263 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Thu, 4 Feb 2021 20:33:53 +0100
Subject: libxc sr track migration time
Track live migration state unconditionally in logfiles to see how long a domU was suspended.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/include/xentoollog.h | 1 +
tools/libs/ctrl/xc_domain.c | 12 +++++--
tools/libs/ctrl/xc_private.h | 9 +++++
tools/libs/guest/xg_resume.c | 5 ++-
tools/libs/guest/xg_sr_common.c | 59 ++++++++++++++++++++++++++++++++
tools/libs/guest/xg_sr_common.h | 3 ++
tools/libs/guest/xg_sr_restore.c | 3 ++
tools/libs/guest/xg_sr_save.c | 6 +++-
tools/xl/xl.c | 2 ++
9 files changed, 96 insertions(+), 4 deletions(-)
--- a/tools/include/xentoollog.h
+++ b/tools/include/xentoollog.h
@@ -133,6 +133,7 @@ const char *xtl_level_to_string(xentooll
});
+#define XL_NO_SUSEINFO "XL_NO_SUSEINFO"
#endif /* XENTOOLLOG_H */
/*
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -66,20 +66,28 @@ int xc_domain_cacheflush(xc_interface *x
int xc_domain_pause(xc_interface *xch,
uint32_t domid)
{
+ int ret;
struct xen_domctl domctl = {};
domctl.cmd = XEN_DOMCTL_pausedomain;
domctl.domain = domid;
- return do_domctl(xch, &domctl);
+ ret = do_domctl(xch, &domctl);
+ if (getenv(XL_NO_SUSEINFO) == NULL)
+ SUSEINFO("domid %u: %s returned %d", domid, __func__, ret);
+ return ret;
}
int xc_domain_unpause(xc_interface *xch,
uint32_t domid)
{
+ int ret;
struct xen_domctl domctl = {};
domctl.cmd = XEN_DOMCTL_unpausedomain;
domctl.domain = domid;
- return do_domctl(xch, &domctl);
+ ret = do_domctl(xch, &domctl);
+ if (getenv(XL_NO_SUSEINFO) == NULL)
+ SUSEINFO("domid %u: %s returned %d", domid, __func__, ret);
+ return ret;
}
--- a/tools/libs/ctrl/xc_private.h
+++ b/tools/libs/ctrl/xc_private.h
@@ -42,6 +42,15 @@
#include <xen-tools/common-macros.h>
+/*
+ * Using loglevel ERROR to make sure the intended informational messages appear
+ * in libvirt's libxl-driver.log
+ */
+#define SUSEINFO(_m, _a...) do { int ERROR_errno = errno; \
+ xc_report(xch, xch->error_handler, XTL_ERROR, XC_ERROR_NONE, "SUSEINFO: " _m , ## _a ); \
+ errno = ERROR_errno; \
+ } while (0)
+
#if defined(HAVE_VALGRIND_MEMCHECK_H) && !defined(NDEBUG) && !defined(__MINIOS__)
/* Compile in Valgrind client requests? */
#include <valgrind/memcheck.h>
--- a/tools/libs/guest/xg_resume.c
+++ b/tools/libs/guest/xg_resume.c
@@ -259,7 +259,10 @@ out:
*/
int xc_domain_resume(xc_interface *xch, uint32_t domid, int fast)
{
- return (fast
+ int ret = (fast
? xc_domain_resume_cooperative(xch, domid)
: xc_domain_resume_any(xch, domid));
+ if (getenv(XL_NO_SUSEINFO) == NULL)
+ SUSEINFO("domid %u: %s%s returned %d", domid, __func__, fast ? " fast" : "", ret);
+ return ret;
}
--- a/tools/libs/guest/xg_sr_common.c
+++ b/tools/libs/guest/xg_sr_common.c
@@ -163,6 +163,65 @@ static void __attribute__((unused)) buil
BUILD_BUG_ON(sizeof(struct xc_sr_rec_hvm_params) != 8);
}
+/* Write a two-character hex representation of 'byte' to digits[].
+ Pre-condition: sizeof(digits) >= 2 */
+static void byte_to_hex(char *digits, const uint8_t byte)
+{
+ uint8_t nybble = byte >> 4;
+
+ if ( nybble > 9 )
+ digits[0] = 'a' + nybble-10;
+ else
+ digits[0] = '0' + nybble;
+
+ nybble = byte & 0x0f;
+ if ( nybble > 9 )
+ digits[1] = 'a' + nybble-10;
+ else
+ digits[1] = '0' + nybble;
+}
+
+/* Convert an array of 16 unsigned bytes to a DCE/OSF formatted UUID
+ string.
+
+ Pre-condition: sizeof(dest) >= 37 */
+void sr_uuid_to_string(char *dest, const uint8_t *uuid)
+{
+ int i = 0;
+ char *p = dest;
+
+ for (; i < 4; i++ )
+ {
+ byte_to_hex(p, uuid[i]);
+ p += 2;
+ }
+ *p++ = '-';
+ for (; i < 6; i++ )
+ {
+ byte_to_hex(p, uuid[i]);
+ p += 2;
+ }
+ *p++ = '-';
+ for (; i < 8; i++ )
+ {
+ byte_to_hex(p, uuid[i]);
+ p += 2;
+ }
+ *p++ = '-';
+ for (; i < 10; i++ )
+ {
+ byte_to_hex(p, uuid[i]);
+ p += 2;
+ }
+ *p++ = '-';
+ for (; i < 16; i++ )
+ {
+ byte_to_hex(p, uuid[i]);
+ p += 2;
+ }
+ *p = '\0';
+}
+
/*
* Expand the tracking structures as needed.
* To avoid realloc()ing too excessively, the size is increased to the nearest
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -294,6 +294,7 @@ struct xc_sr_context
xc_stream_type_t stream_type;
xc_domaininfo_t dominfo;
+ char uuid[16*2+4+1];
union /* Common save or restore data. */
{
@@ -505,6 +506,8 @@ extern struct xc_sr_save_ops save_ops_x8
extern struct xc_sr_restore_ops restore_ops_x86_pv;
extern struct xc_sr_restore_ops restore_ops_x86_hvm;
+extern void sr_uuid_to_string(char *dest, const uint8_t *uuid);
+
struct xc_sr_record
{
uint32_t type;
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -871,6 +871,8 @@ static int restore(struct xc_sr_context
struct xc_sr_rhdr rhdr;
int rc, saved_rc = 0, saved_errno = 0;
+ SUSEINFO("domid %u: %s %s start", ctx->domid, ctx->uuid, __func__);
+ DPRINTF("domid %u: max_pages %lx tot_pages %lx p2m_size %lx", ctx->domid, ctx->restore.max_pages, ctx->restore.tot_pages, ctx->restore.p2m_size);
IPRINTF("Restoring domain");
rc = setup(ctx);
@@ -946,6 +948,7 @@ static int restore(struct xc_sr_context
PERROR("Restore failed");
done:
+ SUSEINFO("domid %u: %s done", ctx->domid, __func__);
cleanup(ctx);
if ( saved_rc )
@@ -1011,6 +1014,7 @@ int xc_domain_restore(xc_interface *xch,
io_fd, dom, hvm, stream_type);
ctx.domid = dom;
+ sr_uuid_to_string(ctx.uuid, ctx.dominfo.handle);
if ( read_headers(&ctx) )
return -1;
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -353,7 +353,7 @@ static void show_transfer_rate(struct xc
MiB_sec = (ctx->save.pages_sent * PAGE_SIZE * 1000U) / ms / (1024U*1024U);
errno = 0;
- IPRINTF("%s: %zu bytes + %zu pages in %ld.%09ld sec, %zu MiB/sec", __func__,
+ SUSEINFO("domid %u: %zu bytes + %zu pages in %ld.%09ld sec, %zu MiB/sec", ctx->domid,
ctx->save.overhead_sent, ctx->save.pages_sent,
diff.tv_sec, diff.tv_nsec, MiB_sec);
}
@@ -875,13 +875,16 @@ static int save(struct xc_sr_context *ct
{
xc_interface *xch = ctx->xch;
int rc, saved_rc = 0, saved_errno = 0;
+ unsigned long tot_pages = ctx->dominfo.tot_pages;
+ SUSEINFO("domid %u: %s %s start, %lu pages allocated", ctx->domid, ctx->uuid, __func__, tot_pages);
IPRINTF("Saving domain %d, type %s",
ctx->domid, dhdr_type_to_str(guest_type));
rc = setup(ctx);
if ( rc )
goto err;
+ SUSEINFO("domid %u: p2m_size %lx", ctx->domid, ctx->save.p2m_size);
xc_report_progress_single(xch, "Start of stream");
@@ -995,6 +998,7 @@ static int save(struct xc_sr_context *ct
PERROR("Save failed");
done:
+ SUSEINFO("domid %u: %s done", ctx->domid, __func__);
cleanup(ctx);
if ( saved_rc )
@@ -1054,6 +1058,7 @@ int xc_domain_save(xc_interface *xch, in
io_fd, dom, flags, hvm);
ctx.domid = dom;
+ sr_uuid_to_string(ctx.uuid, ctx.dominfo.handle);
if ( hvm )
{
--- a/tools/xl/xl.c
+++ b/tools/xl/xl.c
@@ -424,6 +424,8 @@ int main(int argc, char **argv)
logger = xtl_createlogger_stdiostream(stderr, minmsglevel, xtl_flags);
if (!logger) exit(EXIT_FAILURE);
+ /* Provide context to libxl and libxc: no SUSEINFO() from xl */
+ setenv(XL_NO_SUSEINFO, "1", 0);
xl_ctx_alloc();
atexit(xl_ctx_free);
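sr_uuid_to_string() renders the 16-byte domain handle in the usual
8-4-4-4-12 grouping. A functionally equivalent standalone sketch using
snprintf(), not the patch's byte_to_hex() loop:

    #include <stdint.h>
    #include <stdio.h>

    /* Equivalent sketch of sr_uuid_to_string(); dest must hold >= 37 bytes. */
    static void uuid_to_string(char *dest, const uint8_t *u)
    {
        snprintf(dest, 37,
                 "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
                 "%02x%02x%02x%02x%02x%02x",
                 u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
                 u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
    }

    int main(void)
    {
        uint8_t handle[16] = { 0xde, 0xad, 0xbe, 0xef };  /* rest zero-filled */
        char buf[37];

        uuid_to_string(buf, handle);
        printf("%s\n", buf);   /* deadbeef-0000-0000-0000-000000000000 */
        return 0;
    }

The fixed 37-byte destination covers 32 hex digits, 4 dashes and the
terminating NUL, matching the patch's char uuid[16*2+4+1] field.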

View File

@@ -1,197 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 5 Feb 2021 20:16:02 +0100
Subject: libxc sr xg_sr_bitmap populated_pfns
tools: use xg_sr_bitmap for populated_pfns
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.h | 20 ++++++-
tools/libs/guest/xg_sr_restore.c | 69 ------------------------
tools/libs/guest/xg_sr_restore_x86_hvm.c | 9 ++++
tools/libs/guest/xg_sr_restore_x86_pv.c | 7 +++
4 files changed, 34 insertions(+), 71 deletions(-)
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -375,8 +375,7 @@ struct xc_sr_context
uint32_t xenstore_domid, console_domid;
/* Bitmap of currently populated PFNs during restore. */
- unsigned long *populated_pfns;
- xen_pfn_t max_populated_pfn;
+ struct sr_bitmap populated_pfns;
/* Sender has invoked verify mode on the stream. */
bool verify;
@@ -632,6 +631,23 @@ static inline bool page_type_has_stream_
}
}
+static inline bool pfn_is_populated(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+ return sr_test_bit(pfn, &ctx->restore.populated_pfns);
+}
+
+static inline int pfn_set_populated(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+ xc_interface *xch = ctx->xch;
+
+ if ( sr_set_bit(pfn, &ctx->restore.populated_pfns) == false )
+ {
+ PERROR("Failed to realloc populated_pfns bitmap");
+ errno = ENOMEM;
+ return -1;
+ }
+ return 0;
+}
#endif
/*
* Local variables:
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -72,64 +72,6 @@ static int read_headers(struct xc_sr_con
}
/*
- * Is a pfn populated?
- */
-static bool pfn_is_populated(const struct xc_sr_context *ctx, xen_pfn_t pfn)
-{
- if ( pfn > ctx->restore.max_populated_pfn )
- return false;
- return test_bit(pfn, ctx->restore.populated_pfns);
-}
-
-/*
- * Set a pfn as populated, expanding the tracking structures if needed. To
- * avoid realloc()ing too excessively, the size increased to the nearest power
- * of two large enough to contain the required pfn.
- */
-static int pfn_set_populated(struct xc_sr_context *ctx, xen_pfn_t pfn)
-{
- xc_interface *xch = ctx->xch;
-
- if ( pfn > ctx->restore.max_populated_pfn )
- {
- xen_pfn_t new_max;
- size_t old_sz, new_sz;
- unsigned long *p;
-
- /* Round up to the nearest power of two larger than pfn, less 1. */
- new_max = pfn;
- new_max |= new_max >> 1;
- new_max |= new_max >> 2;
- new_max |= new_max >> 4;
- new_max |= new_max >> 8;
- new_max |= new_max >> 16;
-#ifdef __x86_64__
- new_max |= new_max >> 32;
-#endif
-
- old_sz = bitmap_size(ctx->restore.max_populated_pfn + 1);
- new_sz = bitmap_size(new_max + 1);
- p = realloc(ctx->restore.populated_pfns, new_sz);
- if ( !p )
- {
- ERROR("Failed to realloc populated bitmap");
- errno = ENOMEM;
- return -1;
- }
-
- memset((uint8_t *)p + old_sz, 0x00, new_sz - old_sz);
-
- ctx->restore.populated_pfns = p;
- ctx->restore.max_populated_pfn = new_max;
- }
-
- assert(!test_bit(pfn, ctx->restore.populated_pfns));
- set_bit(pfn, ctx->restore.populated_pfns);
-
- return 0;
-}
-
-/*
* Given a set of pfns, obtain memory from Xen to fill the physmap for the
* unpopulated subset. If types is NULL, no page type checking is performed
* and all unpopulated pfns are populated.
@@ -911,16 +853,6 @@ static int setup(struct xc_sr_context *c
if ( rc )
goto err;
- ctx->restore.max_populated_pfn = (32 * 1024 / 4) - 1;
- ctx->restore.populated_pfns = bitmap_alloc(
- ctx->restore.max_populated_pfn + 1);
- if ( !ctx->restore.populated_pfns )
- {
- ERROR("Unable to allocate memory for populated_pfns bitmap");
- rc = -1;
- goto err;
- }
-
ctx->restore.pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.pfns));
ctx->restore.types = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.types));
ctx->restore.mfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->restore.mfns));
@@ -969,7 +901,6 @@ static void cleanup(struct xc_sr_context
xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->restore.p2m_size)));
free(ctx->restore.buffered_records);
- free(ctx->restore.populated_pfns);
free(ctx->restore.pages);
free(ctx->restore.iov);
free(ctx->restore.guest_data);
--- a/tools/libs/guest/xg_sr_restore_x86_hvm.c
+++ b/tools/libs/guest/xg_sr_restore_x86_hvm.c
@@ -136,6 +136,7 @@ static int x86_hvm_localise_page(struct
static int x86_hvm_setup(struct xc_sr_context *ctx)
{
xc_interface *xch = ctx->xch;
+ unsigned long max_pfn, max_pages = ctx->dominfo.max_pages;
if ( ctx->restore.guest_type != DHDR_TYPE_X86_HVM )
{
@@ -161,6 +162,13 @@ static int x86_hvm_setup(struct xc_sr_co
}
#endif
+ max_pfn = max(ctx->restore.p2m_size, max_pages);
+ if ( !sr_bitmap_expand(&ctx->restore.populated_pfns, max_pfn) )
+ {
+ PERROR("Unable to allocate memory for populated_pfns bitmap");
+ return -1;
+ }
+
return 0;
}
@@ -241,6 +249,7 @@ static int x86_hvm_stream_complete(struc
static int x86_hvm_cleanup(struct xc_sr_context *ctx)
{
+ sr_bitmap_free(&ctx->restore.populated_pfns);
free(ctx->x86.hvm.restore.context.ptr);
free(ctx->x86.restore.cpuid.ptr);
--- a/tools/libs/guest/xg_sr_restore_x86_pv.c
+++ b/tools/libs/guest/xg_sr_restore_x86_pv.c
@@ -1060,6 +1060,12 @@ static int x86_pv_setup(struct xc_sr_con
if ( rc )
return rc;
+ if ( !sr_bitmap_expand(&ctx->restore.populated_pfns, 32 * 1024 / 4) )
+ {
+ PERROR("Unable to allocate memory for populated_pfns bitmap");
+ return -1;
+ }
+
ctx->x86.pv.restore.nr_vcpus = ctx->dominfo.max_vcpu_id + 1;
ctx->x86.pv.restore.vcpus = calloc(sizeof(struct xc_sr_x86_pv_restore_vcpu),
ctx->x86.pv.restore.nr_vcpus);
@@ -1153,6 +1159,7 @@ static int x86_pv_stream_complete(struct
*/
static int x86_pv_cleanup(struct xc_sr_context *ctx)
{
+ sr_bitmap_free(&ctx->restore.populated_pfns);
free(ctx->x86.pv.p2m);
free(ctx->x86.pv.p2m_pfns);
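With the bitmap behind pfn_is_populated()/pfn_set_populated(), the restore
code only pays for growth when a pfn beyond the current capacity appears, and
x86_hvm_setup() expands the bitmap up front so the common path is a plain
set_bit(). A condensed sketch of the call pattern; populate_one_pfn() is a
hypothetical stand-in for the physmap-populate hypercall wrapper:

    /* Condensed restore-side usage of the helpers added above. */
    for ( i = 0; i < count; ++i )
    {
        xen_pfn_t pfn = pfns[i];

        if ( pfn_is_populated(ctx, pfn) )
            continue;                     /* already backed by guest memory */

        if ( populate_one_pfn(ctx, pfn) ) /* hypothetical hypercall wrapper */
            return -1;

        if ( pfn_set_populated(ctx, pfn) )
            return -1;                    /* sets the bit, growing if needed */
    }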

View File

@@ -1,141 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Fri, 5 Feb 2021 19:50:03 +0100
Subject: libxc sr xg_sr_bitmap
tools: add API for expandable bitmaps
Since the incoming migration stream lacks info about what the highest pfn
will be, some data structures cannot be allocated upfront.
Add an API for expandable bitmaps, loosely based on pfn_set_populated.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
tools/libs/guest/xg_sr_common.c | 39 +++++++++++++++++++
tools/libs/guest/xg_sr_common.h | 67 +++++++++++++++++++++++++++++++++
2 files changed, 106 insertions(+)
--- a/tools/libs/guest/xg_sr_common.c
+++ b/tools/libs/guest/xg_sr_common.c
@@ -164,6 +164,45 @@ static void __attribute__((unused)) buil
}
/*
+ * Expand the tracking structures as needed.
+ * To avoid realloc()ing too excessively, the size is increased to the nearest
+ * power of two large enough to contain the required number of bits.
+ */
+bool _sr_bitmap_expand(struct sr_bitmap *bm, unsigned long bits)
+{
+ size_t new_max;
+ size_t old_sz, new_sz;
+ void *p;
+
+ if (bits <= bm->bits)
+ return true;
+
+ /* Round up to the nearest power of two larger than bits, less 1. */
+ new_max = bits;
+ new_max |= new_max >> 1;
+ new_max |= new_max >> 2;
+ new_max |= new_max >> 4;
+ new_max |= new_max >> 8;
+ new_max |= new_max >> 16;
+ new_max |= sizeof(unsigned long) > 4 ? new_max >> 32 : 0;
+
+ /* Allocate units of unsigned long */
+ new_max = (new_max + BITS_PER_LONG - 1) & ~(BITS_PER_LONG - 1);
+
+ old_sz = bitmap_size(bm->bits);
+ new_sz = bitmap_size(new_max);
+ p = realloc(bm->p, new_sz);
+ if (!p)
+ return false;
+
+ memset(p + old_sz, 0, new_sz - old_sz);
+ bm->p = p;
+ bm->bits = new_max;
+
+ return true;
+}
+
+/*
* Local variables:
* mode: C
* c-file-style: "BSD"
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -18,6 +18,73 @@ const char *rec_type_to_str(uint32_t typ
struct xc_sr_context;
struct xc_sr_record;
+struct sr_bitmap
+{
+ void *p;
+ unsigned long bits;
+};
+
+extern bool _sr_bitmap_expand(struct sr_bitmap *bm, unsigned long bits);
+
+static inline bool sr_bitmap_expand(struct sr_bitmap *bm, unsigned long bits)
+{
+ if (bits > bm->bits)
+ return _sr_bitmap_expand(bm, bits);
+ return true;
+}
+
+static inline void sr_bitmap_free(struct sr_bitmap *bm)
+{
+ free(bm->p);
+ bm->p = NULL;
+}
+
+static inline bool sr_set_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+ if (sr_bitmap_expand(bm, bit + 1) == false)
+ return false;
+
+ set_bit(bit, bm->p);
+ return true;
+}
+
+static inline bool sr_test_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+ if (bit + 1 > bm->bits)
+ return false;
+ return !!test_bit(bit, bm->p);
+}
+
+static inline void sr_clear_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+ if (bit + 1 <= bm->bits)
+ clear_bit(bit, bm->p);
+}
+
+static inline bool sr_test_and_clear_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+ if (bit + 1 > bm->bits)
+ return false;
+ return !!test_and_clear_bit(bit, bm->p);
+}
+
+/* No way to report a potential allocation error; the bitmap must be expanded prior to use */
+static inline bool sr_test_and_set_bit(unsigned long bit, struct sr_bitmap *bm)
+{
+ if (bit + 1 > bm->bits)
+ return false;
+ return !!test_and_set_bit(bit, bm->p);
+}
+
+static inline bool sr_set_long_bit(unsigned long base_bit, struct sr_bitmap *bm)
+{
+ if (sr_bitmap_expand(bm, base_bit + BITS_PER_LONG) == false)
+ return false;
+
+ set_bit_long(base_bit, bm->p);
+ return true;
+}
+
/**
* Save operations. To be implemented for each type of guest, for use by the
* common save algorithm.
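The OR-shift cascade in _sr_bitmap_expand() smears the highest set bit of the
request into every lower position, producing 2^k - 1; rounding that up to
whole unsigned longs then gives the new capacity. A standalone demonstration
of the growth arithmetic with an example request of 1000 bits:

    #include <stdio.h>

    #define BITS_PER_LONG (8 * (int)sizeof(unsigned long))

    int main(void)
    {
        unsigned long bits = 1000;   /* example requested capacity */
        unsigned long new_max = bits;

        /* Smear the highest set bit downwards: 1000 -> 1023 (2^10 - 1). */
        new_max |= new_max >> 1;
        new_max |= new_max >> 2;
        new_max |= new_max >> 4;
        new_max |= new_max >> 8;
        new_max |= new_max >> 16;
        new_max |= sizeof(unsigned long) > 4 ? new_max >> 32 : 0;

        /* Round up to a whole number of unsigned longs: 1023 -> 1024. */
        new_max = (new_max + BITS_PER_LONG - 1) &
                  ~(unsigned long)(BITS_PER_LONG - 1);

        printf("%lu bits requested -> %lu bits allocated\n", bits, new_max);
        return 0;
    }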

View File

@@ -1,46 +0,0 @@
From: Olaf Hering <olaf@aepfle.de>
Date: Thu, 29 Oct 2020 17:00:19 +0100
Subject: libxc sr xl migration debug
xl: fix description of migrate --debug
xl migrate --debug used to track every pfn in every batch of pages,
but those times are gone. The code in xc_domain_save is the consumer
of this knob and may now enable verification mode.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
v03:
- adjust to describe what --debug would do when the code which
consumes this knob is fixed.
v02:
- the option has no effect anymore
---
docs/man/xl.1.pod.in | 4 +++-
tools/xl/xl_cmdtable.c | 2 +-
2 files changed, 4 insertions(+), 2 deletions(-)
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -486,7 +486,9 @@ domain.
=item B<--debug>
-Display huge (!) amount of debug information during the migration process.
+This enables verification mode, which will transfer the entire domU memory
+once more to the receiving host to make sure the content is identical on
+both sides.
=item B<-p>
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -173,7 +173,7 @@ const struct cmd_spec cmd_table[] = {
" migrate-receive [-d -e]\n"
"-e Do not wait in the background (on <host>) for the death\n"
" of the domain.\n"
- "--debug Print huge (!) amount of debug during the migration process.\n"
+ "--debug Enable verification mode.\n"
"-p Do not unpause domain after migrating it.\n"
"-D Preserve the domain id"
},

View File

@@ -6,105 +6,97 @@ Subject: [PATCH] replace obsolete network configuration commands in scripts
Some scripts still use the obsolete network configuration commands ifconfig
and brctl. Replace them with commands from the iproute2 package.
---
README | 3 +--
tools/hotplug/Linux/colo-proxy-setup | 14 ++++++--------
tools/hotplug/Linux/remus-netbuf-setup | 3 ++-
tools/hotplug/Linux/vif-bridge | 7 ++++---
tools/hotplug/Linux/vif-nat | 2 +-
tools/hotplug/Linux/vif-route | 6 ++++--
tools/hotplug/Linux/xen-network-common.sh | 6 ++----
.../i386-dm/qemu-ifup-Linux | 5 +++--
9 files changed, 26 insertions(+), 26 deletions(-)
tools/hotplug/Linux/colo-proxy-setup | 14 --------------
tools/hotplug/Linux/remus-netbuf-setup | 2 +-
tools/hotplug/Linux/vif-bridge | 6 +-----
tools/hotplug/Linux/vif-nat | 2 +-
tools/hotplug/Linux/vif-route | 6 ++++--
tools/hotplug/Linux/xen-network-common.sh | 15 +--------------
6 files changed, 8 insertions(+), 37 deletions(-)
Index: xen-4.19.0-testing/README
===================================================================
--- xen-4.19.0-testing.orig/README
+++ xen-4.19.0-testing/README
@@ -59,8 +59,7 @@ provided by your OS distributor:
* Development install of GLib v2.0 (e.g. libglib2.0-dev)
* Development install of Pixman (e.g. libpixman-1-dev)
* pkg-config
- * bridge-utils package (/sbin/brctl)
- * iproute package (/sbin/ip)
+ * iproute package (/sbin/ip, /sbin/bridge)
* GNU bison and GNU flex
* ACPI ASL compiler (iasl)
--- a/tools/hotplug/Linux/colo-proxy-setup
+++ b/tools/hotplug/Linux/colo-proxy-setup
@@ -76,17 +76,10 @@
Index: xen-4.19.0-testing/tools/hotplug/Linux/remus-netbuf-setup
===================================================================
--- xen-4.19.0-testing.orig/tools/hotplug/Linux/remus-netbuf-setup
+++ xen-4.19.0-testing/tools/hotplug/Linux/remus-netbuf-setup
@@ -76,6 +76,7 @@
#specific setup code such as renaming.
dir=$(dirname "$0")
. "$dir/xen-hotplug-common.sh"
+. "$dir/xen-network-common.sh"
function setup_secondary()
{
- if which brctl >&/dev/null; then
- do_without_error brctl delif $bridge $vifname
- do_without_error brctl addbr $forwardbr
- do_without_error brctl addif $forwardbr $vifname
- do_without_error brctl addif $forwardbr $forwarddev
- else
do_without_error ip link set $vifname nomaster
do_without_error ip link add name $forwardbr type bridge
do_without_error ip link set $vifname master $forwardbr
do_without_error ip link set $forwarddev master $forwardbr
- fi
do_without_error ip link set dev $forwardbr up
do_without_error modprobe xt_SECCOLO
findCommand "$@"
@@ -98,17 +91,10 @@
@@ -139,8 +140,16 @@ check_ifb() {
function teardown_secondary()
{
- if which brctl >&/dev/null; then
- do_without_error brctl delif $forwardbr $forwarddev
- do_without_error brctl delif $forwardbr $vifname
- do_without_error brctl delbr $forwardbr
- do_without_error brctl addif $bridge $vifname
- else
do_without_error ip link set $forwarddev nomaster
do_without_error ip link set $vifname nomaster
do_without_error ip link delete $forwardbr type bridge
do_without_error ip link set $vifname master $bridge
- fi
do_without_error iptables -t mangle -D PREROUTING -m physdev --physdev-in \
$vifname -j SECCOLO --index $index
--- a/tools/hotplug/Linux/remus-netbuf-setup
+++ b/tools/hotplug/Linux/remus-netbuf-setup
@@ -139,7 +139,7 @@
setup_ifb() {
- for ifb in `ifconfig -a -s|egrep ^ifb|cut -d ' ' -f1`
+ if [ "$legacy_tools" ]; then
+ ifbs=`ifconfig -a -s|egrep ^ifb|cut -d ' ' -f1`
+ else
+ ifbs=$(ip --oneline link show type ifb | cut -d ' ' -f2)
+ fi
+ for ifb in $ifbs
+ for ifb in $(ip --oneline link show type ifb | awk -F : '(NR == 1) { print $2; }')
do
+ if [ ! "$legacy_tools" ]; then
+ ifb="${ifb%:}"
+ fi
check_ifb "$ifb" || continue
REMUS_IFB="$ifb"
break
Index: xen-4.19.0-testing/tools/hotplug/Linux/vif-bridge
===================================================================
--- xen-4.19.0-testing.orig/tools/hotplug/Linux/vif-bridge
+++ xen-4.19.0-testing/tools/hotplug/Linux/vif-bridge
@@ -42,7 +42,8 @@ if [ -z "$bridge" ]; then
if which brctl >&/dev/null; then
bridge=$(brctl show | awk 'NR==2{print$1}')
else
--- a/tools/hotplug/Linux/vif-bridge
+++ b/tools/hotplug/Linux/vif-bridge
@@ -39,11 +39,7 @@
bridge=$(xenstore_read_default "$XENBUS_PATH/bridge" "$bridge")
if [ -z "$bridge" ]; then
- if which brctl >&/dev/null; then
- bridge=$(brctl show | awk 'NR==2{print$1}')
- else
- bridge=$(bridge link | cut -d" " -f7)
+ bridge=$(ip --oneline link show type bridge | awk '(NR == 1) { print $2; }')
+ bridge="${bridge%:}"
fi
- fi
+ read bridge < <(ip --oneline link show type bridge | awk -F : '(NR == 1) { print $2; }')
if [ -z "$bridge" ]
then
Index: xen-4.19.0-testing/tools/hotplug/Linux/vif-nat
===================================================================
--- xen-4.19.0-testing.orig/tools/hotplug/Linux/vif-nat
+++ xen-4.19.0-testing/tools/hotplug/Linux/vif-nat
@@ -172,7 +172,11 @@ case "$command" in
fatal "Could not find bridge, and none was specified"
--- a/tools/hotplug/Linux/vif-nat
+++ b/tools/hotplug/Linux/vif-nat
@@ -172,7 +172,7 @@
;;
offline)
[ "$dhcp" != 'no' ] && dhcp_down
- do_without_error ifconfig "${dev}" down
+ if [ "$legacy_tools" ]; then
+ do_without_error ifconfig "${dev}" down
+ else
+ do_without_error ip link set "${dev}" down
+ fi
+ do_without_error ip link set "${dev}" down
;;
esac
Index: xen-4.19.0-testing/tools/hotplug/Linux/vif-route
===================================================================
--- xen-4.19.0-testing.orig/tools/hotplug/Linux/vif-route
+++ xen-4.19.0-testing/tools/hotplug/Linux/vif-route
@@ -23,13 +23,23 @@ main_ip=$(dom0_ip)
--- a/tools/hotplug/Linux/vif-route
+++ b/tools/hotplug/Linux/vif-route
@@ -23,13 +23,15 @@
case "${command}" in
add|online)
- ifconfig ${dev} ${main_ip} netmask 255.255.255.255 up
+ if [ "$legacy_tools" ]; then
+ ifconfig ${dev} ${main_ip} netmask 255.255.255.255 up
+ else
+ ip addr add "${main_ip}/32" dev "$dev"
+ fi
+ ip addr add "${main_ip}/32" dev "$dev"
+ ip link set "dev" up
echo 1 >/proc/sys/net/ipv4/conf/${dev}/proxy_arp
ipcmd='add'
@@ -112,40 +104,49 @@ Index: xen-4.19.0-testing/tools/hotplug/Linux/vif-route
;;
remove|offline)
- do_without_error ifdown ${dev}
+ if [ "$legacy_tools" ]; then
+ do_without_error ifdown ${dev}
+ else
+ do_without_error ip addr flush dev "$dev"
+ do_without_error ip link set "$dev" down
+ fi
+ do_without_error ip addr flush dev "$dev"
+ do_without_error ip link set "$dev" down
ipcmd='del'
cmdprefix='do_without_error'
;;
Index: xen-4.19.0-testing/tools/hotplug/Linux/xen-network-common.sh
===================================================================
--- xen-4.19.0-testing.orig/tools/hotplug/Linux/xen-network-common.sh
+++ xen-4.19.0-testing/tools/hotplug/Linux/xen-network-common.sh
@@ -15,6 +15,12 @@
#
--- a/tools/hotplug/Linux/xen-network-common.sh
+++ b/tools/hotplug/Linux/xen-network-common.sh
@@ -111,13 +111,7 @@
# Don't create the bridge if it already exists.
if [ ! -e "/sys/class/net/${bridge}/bridge" ]; then
- if which brctl >&/dev/null; then
- brctl addbr ${bridge}
- brctl stp ${bridge} off
- brctl setfd ${bridge} 0
- else
ip link add name ${bridge} type bridge stp_state 0 forward_delay 0
- fi
fi
}
+# Use brctl and ifconfig on older systems
+legacy_tools=
+if [ -f /sbin/brctl -a -f /sbin/ifconfig ]; then
+ legacy_tools="true"
+fi
+
# Gentoo doesn't have ifup/ifdown, so we define appropriate alternatives.
# Other platforms just use ifup / ifdown directly.
@@ -152,8 +158,10 @@ remove_from_bridge () {
@@ -129,11 +123,7 @@
# Don't add $dev to $bridge if it's already on the bridge.
if [ ! -e "/sys/class/net/${bridge}/brif/${dev}" ]; then
log debug "adding $dev to bridge $bridge"
- if which brctl >&/dev/null; then
- brctl addif ${bridge} ${dev}
- else
ip link set ${dev} master ${bridge}
- fi
else
log debug "$dev already on bridge $bridge"
fi
@@ -150,11 +140,8 @@
# Don't remove $dev from $bridge if it's not on the bridge.
if [ -e "/sys/class/net/${bridge}/brif/${dev}" ]; then
log debug "removing $dev from bridge $bridge"
if which brctl >&/dev/null; then
do_without_error brctl delif ${bridge} ${dev}
+ do_without_error ifconfig "$dev" down
else
- if which brctl >&/dev/null; then
- do_without_error brctl delif ${bridge} ${dev}
- else
do_without_error ip link set ${dev} nomaster
- fi
+ do_without_error ip link set "$dev" down
fi
else
log debug "$dev not on bridge $bridge"
fi

View File

@@ -1,25 +0,0 @@
References: bsc#985503
Index: xen-4.15.1-testing/tools/hotplug/Linux/vif-route
===================================================================
--- xen-4.15.1-testing.orig/tools/hotplug/Linux/vif-route
+++ xen-4.15.1-testing/tools/hotplug/Linux/vif-route
@@ -57,11 +57,13 @@ case "${type_if}" in
;;
esac
-# If we've been given a list of IP addresses, then add routes from dom0 to
-# the guest using those addresses.
-for addr in ${ip} ; do
- ${cmdprefix} ip route ${ipcmd} ${addr} dev ${dev} src ${main_ip} metric ${metric}
-done
+if [ "${ip}" ] && [ "${ipcmd}" ] ; then
+ # If we've been given a list of IP addresses, then add routes from dom0 to
+ # the guest using those addresses.
+ for addr in ${ip} ; do
+ ${cmdprefix} ip route ${ipcmd} ${addr} dev ${dev} src ${main_ip} metric ${metric}
+ done
+fi
handle_iptable

BIN
xen-4.20.0-testing-src.tar.bz2 (Stored with Git LFS)

Binary file not shown.

View File

@@ -1,3 +1,94 @@
-------------------------------------------------------------------
Wed Mar 5 06:18:13 MST 2025 - carnold@suse.com
- Update to Xen 4.20.0 FCS release (jsc#PED-8907)
* See release candidate changelog entries below for 4.20.0
* Reduce xenstore library dependencies.
* Enable CONFIG_UBSAN (Arm64, x86, PPC, RISC-V) for GitLab CI.
* Support for Intel EPT Paging-Write Feature.
* AMD Zen 5 CPU support, including for new hardware mitigations
for the SRSO speculative vulnerability.
- bsc#1238043 - VUL-0: CVE-2025-1713: xen: deadlock potential with
VT-d and legacy PCI device pass-through (XSA-467)
This fix is part of the final tarball
- Remove references to vm-install from README.SUSE
-------------------------------------------------------------------
Fri Feb 28 09:09:09 UTC 2025 - ohering@suse.de
- refresh replace-obsolete-network-configuration-commands-in-s.patch
to not accidentally enter untested brctl code paths
- bsc#985503 - vif-route.patch is obsolete since Xen 4.15
- bsc#1035231 - remove SUSE-specific changes for save/restore/migrate
to reduce future maintenance overhead. The bottleneck during
migration is the overhead of mapping HVM domU pages into dom0,
which was not addressed by these changes.
The options --abort_if_busy --max_iters --min_remaining will no
longer be recognized by xl or virsh.
libxc-bitmap-long.patch
libxc-sr-xl-migration-debug.patch
libxc-sr-readv_exact.patch
libxc-sr-save-show_transfer_rate.patch
libxc-sr-save-mfns.patch
libxc-sr-save-types.patch
libxc-sr-save-errors.patch
libxc-sr-save-iov.patch
libxc-sr-save-rec_pfns.patch
libxc-sr-save-guest_data.patch
libxc-sr-save-local_pages.patch
libxc-sr-restore-pfns.patch
libxc-sr-restore-types.patch
libxc-sr-restore-mfns.patch
libxc-sr-restore-map_errs.patch
libxc-sr-restore-populate_pfns-pfns.patch
libxc-sr-restore-populate_pfns-mfns.patch
libxc-sr-restore-read_record.patch
libxc-sr-restore-handle_buffered_page_data.patch
libxc-sr-restore-handle_incoming_page_data.patch
libxc-sr-LIBXL_HAVE_DOMAIN_SUSPEND_PROPS.patch
libxc-sr-precopy_policy.patch
libxc-sr-max_iters.patch
libxc-sr-min_remaining.patch
libxc-sr-abort_if_busy.patch
libxc-sr-xg_sr_bitmap.patch
libxc-sr-xg_sr_bitmap-populated_pfns.patch
libxc-sr-restore-hvm-legacy-superpage.patch
libxc-sr-track-migration-time.patch
libxc-sr-number-of-iterations.patch
-------------------------------------------------------------------
Thu Feb 20 10:09:41 MST 2025 - carnold@suse.com
- Update to Xen 4.20.0 RC5 release
* x86/shutdown: offline APs with interrupts disabled on all CPUs
* x86/smp: perform disabling on interrupts ahead of AP shutdown
* x86/pci: disable MSI(-X) on all devices at shutdown
* x86/iommu: disable interrupts at shutdown
* x86/HVM: use XVFREE() in hvmemul_cache_destroy()
* xen/console: Fix truncation of panic() messages
* xen/memory: Make resource_max_frames() to return 0 on unknown
type
* x86/svm: Separate STI and VMRUN instructions in
svm_asm_do_resume()
* x86/MCE-telem: adjust cookie definition
- Drop patch contained in new tarball
x86-shutdown-offline-APs-with-interrupts-disabled-on-all-CPUs.patch
-------------------------------------------------------------------
Tue Feb 11 09:43:18 MST 2025 - carnold@suse.com
- bsc#1233796 - [XEN][15-SP7-BEAT3] Xen call trace and APIC Error
found after reboot operation on AMD machine.
x86-shutdown-offline-APs-with-interrupts-disabled-on-all-CPUs.patch
-------------------------------------------------------------------
Mon Feb 10 06:02:04 MST 2025 - carnold@suse.com
- Update to Xen 4.20.0 RC4 release
* AMD/IOMMU: log IVHD contents
* AMD/IOMMU: drop stray MSI enabling
* radix-tree: introduce RADIX_TREE{,_INIT}()
-------------------------------------------------------------------
Fri Jan 31 09:59:45 MST 2025 - carnold@suse.com

View File

@@ -125,7 +125,7 @@ BuildRequires: pesign-obs-integration
BuildRequires: python-rpm-macros
Provides: installhint(reboot-needed)
-Version: 4.20.0_06
+Version: 4.20.0_08
Release: 0
Summary: Xen Virtualization: Hypervisor (aka VMM aka Microkernel)
License: GPL-2.0-only
@@ -161,37 +161,6 @@ Source10183: xen_maskcalc.py
Source99: baselibs.conf
# Upstream patches
# EMBARGOED security fixes
# libxc
Patch301: libxc-bitmap-long.patch
Patch302: libxc-sr-xl-migration-debug.patch
Patch303: libxc-sr-readv_exact.patch
Patch304: libxc-sr-save-show_transfer_rate.patch
Patch305: libxc-sr-save-mfns.patch
Patch306: libxc-sr-save-types.patch
Patch307: libxc-sr-save-errors.patch
Patch308: libxc-sr-save-iov.patch
Patch309: libxc-sr-save-rec_pfns.patch
Patch310: libxc-sr-save-guest_data.patch
Patch311: libxc-sr-save-local_pages.patch
Patch312: libxc-sr-restore-pfns.patch
Patch313: libxc-sr-restore-types.patch
Patch314: libxc-sr-restore-mfns.patch
Patch315: libxc-sr-restore-map_errs.patch
Patch316: libxc-sr-restore-populate_pfns-pfns.patch
Patch317: libxc-sr-restore-populate_pfns-mfns.patch
Patch318: libxc-sr-restore-read_record.patch
Patch319: libxc-sr-restore-handle_buffered_page_data.patch
Patch320: libxc-sr-restore-handle_incoming_page_data.patch
Patch321: libxc-sr-LIBXL_HAVE_DOMAIN_SUSPEND_PROPS.patch
Patch322: libxc-sr-precopy_policy.patch
Patch323: libxc-sr-max_iters.patch
Patch324: libxc-sr-min_remaining.patch
Patch325: libxc-sr-abort_if_busy.patch
Patch326: libxc-sr-xg_sr_bitmap.patch
Patch327: libxc-sr-xg_sr_bitmap-populated_pfns.patch
Patch328: libxc-sr-restore-hvm-legacy-superpage.patch
Patch329: libxc-sr-track-migration-time.patch
Patch330: libxc-sr-number-of-iterations.patch
# Our platform specific patches
Patch400: xen-destdir.patch
Patch401: vif-bridge-no-iptables.patch
@@ -204,7 +173,6 @@ Patch407: replace-obsolete-network-configuration-commands-in-s.patch
Patch408: ignore-ip-command-script-errors.patch
# Needs to go upstream
Patch420: suspend_evtchn_lock.patch
Patch421: vif-route.patch
# Other bug fixes or features
Patch450: xen.sysconfig-fillup.patch
Patch451: xenconsole-no-multiple-connections.patch

View File

@@ -9,7 +9,7 @@ References: fate#323663 - Run Xenstore in stubdomain
-## Default: daemon
+## Default: domain
#
-# Select type of xentore service.
+# Select type of xenstore service.
#
@@ -80,14 +80,14 @@ XENSTORED_TRACE=
XENSTORE_DOMAIN_KERNEL=

View File

@@ -81,7 +81,7 @@ Keep going if the event type and shutdown reason remains the same.
save_domain_core_writeconfig(fd, filename, config_data, config_len);
- int rc = libxl_domain_suspend_suse(ctx, domid, fd, &props, NULL);
+ int rc = libxl_domain_suspend(ctx, domid, fd, 0, NULL);
close(fd);
+ if (xsh) {