vnc: distiguish between ipv4/ipv6 omitted vs set to off

The VNC code for interpreting QemuOpts does not currently distinguish between ipv4/ipv6 being omitted, and being set to 'off', because historically the 'ipv4' and 'ipv6' options were just flags which did not accept a value. The upshot is that if someone runs $QEMU -vnc localhost:1,ipv6=off QEMU still uses PF_UNSPEC and thus may still bind to IPv6, when it should use PF_INET. This is another instance of the problem previously fixed for chardevs in commit b77e7c8e99 Author: Paolo Bonzini <pbonzini@redhat.com> Date: Mon Oct 12 15:35:16 2015 +0200 qemu-sockets: fix conversion of ipv4/ipv6 JSON to QemuOpts Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1452518225-11751-6-git-send-email-berrange@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
sockets: remove use of QemuOpts from socket_dgram
2016-01-19 15:41:01 +01:00 · 2016-01-19 15:41:01 +01:00 · 2016-01-19 15:41:01 +01:00 · 2016-01-19 15:41:01 +01:00 · 2016-01-19 15:41:01 +01:00 · 2016-01-19 15:39:06 +01:00
978 changed files with 63201 additions and 31610 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -24,6 +24,8 @@
 /*-darwin-user
 /*-linux-user
 /*-bsd-user
+/ivshmem-client
+/ivshmem-server
 /libdis*
 /libuser
 /linux-headers/asm
--- a/95
+++ b/95
@@ -62,14 +62,22 @@ Guest CPU cores (TCG):
 ----------------------
 Overall
 L: qemu-devel@nongnu.org
-S: Odd fixes
+M: Paolo Bonzini <pbonzini@redhat.com>
+M: Peter Crosthwaite <crosthwaite.peter@gmail.com>
+M: Richard Henderson <rth@twiddle.net>
+S: Maintained
 F: cpu-exec.c
+F: cpu-exec-common.c
+F: cpus.c
 F: cputlb.c
+F: exec.c
 F: softmmu_template.h
-F: translate-all.c
-F: include/exec/cpu_ldst.h
-F: include/exec/cpu_ldst_template.h
+F: translate-all.*
+F: translate-common.c
+F: include/exec/cpu*.h
+F: include/exec/exec-all.h
 F: include/exec/helper*.h
+F: include/exec/tb-hash.h

 Alpha
 M: Richard Henderson <rth@twiddle.net>
@@ -81,6 +89,7 @@ F: disas/alpha.c

 ARM
 M: Peter Maydell <peter.maydell@linaro.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: target-arm/
 F: hw/arm/
@@ -216,6 +225,7 @@ F: */kvm.*

 ARM
 M: Peter Maydell <peter.maydell@linaro.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: target-arm/kvm.c

@@ -235,9 +245,14 @@ M: Cornelia Huck <cornelia.huck@de.ibm.com>
 M: Alexander Graf <agraf@suse.de>
 S: Maintained
 F: target-s390x/kvm.c
+F: target-s390x/ioinst.[ch]
+F: target-s390x/machine.c
 F: hw/intc/s390_flic.c
 F: hw/intc/s390_flic_kvm.c
 F: include/hw/s390x/s390_flic.h
+F: gdb-xml/s390*.xml
+T: git git://github.com/cohuck/qemu.git s390-next
+T: git git://github.com/borntraeger/qemu.git s390-next

 X86
 M: Paolo Bonzini <pbonzini@redhat.com>
@@ -287,6 +302,7 @@ ARM Machines
 ------------
 Allwinner-a10
 M: Beniamino Galvani <b.galvani@gmail.com>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/*/allwinner*
 F: include/hw/*/allwinner*
@@ -294,6 +310,7 @@ F: hw/arm/cubieboard.c

 ARM PrimeCell
 M: Peter Maydell <peter.maydell@linaro.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/char/pl011.c
 F: hw/display/pl110*
@@ -308,6 +325,7 @@ F: include/hw/arm/primecell.h

 ARM cores
 M: Peter Maydell <peter.maydell@linaro.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/intc/arm*
 F: hw/intc/gic_internal.h
@@ -327,54 +345,64 @@ M: Evgeny Voevodin <e.voevodin@samsung.com>
 M: Maksim Kozlov <m.kozlov@samsung.com>
 M: Igor Mitsyanko <i.mitsyanko@gmail.com>
 M: Dmitry Solodkiy <d.solodkiy@samsung.com>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/*/exynos*

 Calxeda Highbank
 M: Rob Herring <robh@kernel.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/highbank.c
 F: hw/net/xgmac.c

 Canon DIGIC
 M: Antony Pavlov <antonynpavlov@gmail.com>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: include/hw/arm/digic.h
 F: hw/*/digic*

 Gumstix
 L: qemu-devel@nongnu.org
+L: qemu-arm@nongnu.org
 S: Orphan
 F: hw/arm/gumstix.c

 i.MX31
 M: Peter Chubb <peter.chubb@nicta.com.au>
+L: qemu-arm@nongnu.org
 S: Odd fixes
 F: hw/*/imx*
 F: hw/arm/kzm.c

 Integrator CP
 M: Peter Maydell <peter.maydell@linaro.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/integratorcp.c

 Musicpal
 M: Jan Kiszka <jan.kiszka@web.de>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/musicpal.c

 nSeries
 M: Andrzej Zaborowski <balrogg@gmail.com>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/nseries.c

 Palm
 M: Andrzej Zaborowski <balrogg@gmail.com>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/palm.c

 Real View
 M: Peter Maydell <peter.maydell@linaro.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/realview*
 F: hw/intc/realview_gic.c
@@ -382,6 +410,7 @@ F: include/hw/intc/realview_gic.h

 PXA2XX
 M: Andrzej Zaborowski <balrogg@gmail.com>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/mainstone.c
 F: hw/arm/spitz.c
@@ -391,17 +420,20 @@ F: hw/*/pxa2xx*

 Stellaris
 M: Peter Maydell <peter.maydell@linaro.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/*/stellaris*

 Versatile PB
 M: Peter Maydell <peter.maydell@linaro.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/*/versatile*

 Xilinx Zynq
 M: Alistair Francis <alistair.francis@xilinx.com>
 M: Peter Crosthwaite <crosthwaite.peter@gmail.com>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/xilinx_zynq.c
 F: hw/misc/zynq_slcr.c
@@ -411,6 +443,7 @@ F: hw/ssi/xilinx_spips.c
 Xilinx ZynqMP
 M: Alistair Francis <alistair.francis@xilinx.com>
 M: Peter Crosthwaite <crosthwaite.peter@gmail.com>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/xlnx-zynqmp.c
 F: hw/arm/xlnx-ep108.c
@@ -419,6 +452,7 @@ F: include/hw/arm/xlnx-zynqmp.h
 ARM ACPI Subsystem
 M: Shannon Zhao <zhaoshenglong@huawei.com>
 M: Shannon Zhao <shannon.zhao@linaro.org>
+L: qemu-arm@nongnu.org
 S: Maintained
 F: hw/arm/virt-acpi-build.c
 F: include/hw/arm/virt-acpi-build.h
@@ -616,15 +650,13 @@ M: Christian Borntraeger <borntraeger@de.ibm.com>
 M: Alexander Graf <agraf@suse.de>
 S: Supported
 F: hw/char/sclp*.[hc]
-F: hw/s390x/s390-virtio-ccw.c
-F: hw/s390x/css.[hc]
-F: hw/s390x/sclp*.[hc]
-F: hw/s390x/ipl*.[hc]
-F: hw/s390x/*pci*.[hc]
-F: hw/s390x/s390-skeys*.c
+F: hw/s390x/
+X: hw/s390x/s390-virtio-bus.[ch]
 F: include/hw/s390x/
 F: pc-bios/s390-ccw/
-T: git git://github.com/cohuck/qemu virtio-ccw-upstr
+F: hw/watchdog/wdt_diag288.c
+T: git git://github.com/cohuck/qemu.git s390-next
+T: git git://github.com/borntraeger/qemu.git s390-next

 UniCore32 Machines
 -------------
@@ -831,6 +863,7 @@ F: net/vhost-user.c

 virtio-9p
 M: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
+M: Greg Kurz <gkurz@linux.vnet.ibm.com>
 S: Supported
 F: hw/9pfs/
 F: fsdev/
@@ -851,7 +884,8 @@ M: Cornelia Huck <cornelia.huck@de.ibm.com>
 M: Christian Borntraeger <borntraeger@de.ibm.com>
 S: Supported
 F: hw/s390x/virtio-ccw.[hc]
-T: git git://github.com/cohuck/qemu virtio-ccw-upstr
+T: git git://github.com/cohuck/qemu.git s390-next
+T: git git://github.com/borntraeger/qemu.git s390-next

 virtio-input
 M: Gerd Hoffmann <kraxel@redhat.com>
@@ -907,6 +941,13 @@ M: Jiri Pirko <jiri@resnulli.us>
 S: Maintained
 F: hw/net/rocker/

+NVDIMM
+M: Xiao Guangrong <guangrong.xiao@linux.intel.com>
+S: Maintained
+F: hw/acpi/nvdimm.c
+F: hw/mem/nvdimm.c
+F: include/hw/mem/nvdimm.h
+
 Subsystems
 ----------
 Audio
@@ -995,7 +1036,8 @@ Device Tree
 M: Peter Crosthwaite <crosthwaite.peter@gmail.com>
 M: Alexander Graf <agraf@suse.de>
 S: Maintained
-F: device_tree.[ch]
+F: device_tree.c
+F: include/sysemu/device_tree.h

 Error reporting
 M: Markus Armbruster <armbru@redhat.com>
@@ -1017,6 +1059,7 @@ S: Supported
 F: include/exec/ioport.h
 F: ioport.c
 F: include/exec/memory.h
+F: include/exec/ram_addr.h
 F: memory.c
 F: include/exec/memory-internal.h
 F: exec.c
@@ -1073,8 +1116,9 @@ F: net/netmap.c
 Network Block Device (NBD)
 M: Paolo Bonzini <pbonzini@redhat.com>
 S: Odd Fixes
-F: block/nbd.c
-F: nbd.*
+F: block/nbd*
+F: nbd/
+F: include/block/nbd*
 F: qemu-nbd.c
 T: git git://github.com/bonzini/qemu.git nbd-next

@@ -1139,6 +1183,8 @@ F: include/qom/
 X: include/qom/cpu.h
 F: qom/
 X: qom/cpu.c
+F: tests/check-qom-interface.c
+F: tests/check-qom-proplist.c
 F: tests/qom-test.c

 QMP
@@ -1155,6 +1201,7 @@ SLIRP
 M: Jan Kiszka <jan.kiszka@siemens.com>
 S: Maintained
 F: slirp/
+F: net/slirp.c
 T: git git://git.kiszka.org/qemu.git queues/slirp

 Tracing
@@ -1206,6 +1253,21 @@ S: Odd fixes
 F: util/buffer.c
 F: include/qemu/buffer.h

+I/O Channels
+M: Daniel P. Berrange <berrange@redhat.com>
+S: Maintained
+F: io/
+F: include/io/
+F: tests/test-io-*
+
+Sockets
+M: Daniel P. Berrange <berrange@redhat.com>
+M: Gerd Hoffmann <kraxel@redhat.com>
+M: Paolo Bonzini <pbonzini@redhat.com>
+S: Maintained
+F: include/qemu/sockets.h
+F: util/qemu-sockets.c
+
 Usermode Emulation
 ------------------
 Overall
@@ -1235,6 +1297,7 @@ AArch64 target
 M: Claudio Fontana <claudio.fontana@huawei.com>
 M: Claudio Fontana <claudio.fontana@gmail.com>
 S: Maintained
+L: qemu-arm@nongnu.org
 F: tcg/aarch64/
 F: disas/arm-a64.cc
 F: disas/libvixl/
@@ -1242,6 +1305,7 @@ F: disas/libvixl/
 ARM target
 M: Andrzej Zaborowski <balrogg@gmail.com>
 S: Maintained
+L: qemu-arm@nongnu.org
 F: tcg/arm/
 F: disas/arm.c

@@ -1444,6 +1508,7 @@ M: Denis V. Lunev <den@openvz.org>
 L: qemu-block@nongnu.org
 S: Supported
 F: block/parallels.c
+F: docs/specs/parallels.txt

 qed
 M: Stefan Hajnoczi <stefanha@redhat.com>
--- a/9
+++ b/9
@@ -159,6 +159,7 @@ dummy := $(call unnest-vars,, \
                crypto-obj-y \
                crypto-aes-obj-y \
                qom-obj-y \
+                io-obj-y \
                common-obj-y \
                common-obj-m)

@@ -178,6 +179,7 @@ SOFTMMU_SUBDIR_RULES=$(filter %-softmmu,$(SUBDIR_RULES))

 $(SOFTMMU_SUBDIR_RULES): $(block-obj-y)
 $(SOFTMMU_SUBDIR_RULES): $(crypto-obj-y)
+$(SOFTMMU_SUBDIR_RULES): $(io-obj-y)
 $(SOFTMMU_SUBDIR_RULES): config-all-devices.mak

 subdir-%:
@@ -238,7 +240,7 @@ qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(qom-obj-y) libqemuu

 qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o

-fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o libqemuutil.a libqemustub.a
+fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/9p-marshal.o fsdev/9p-iov-marshal.o libqemuutil.a libqemustub.a
 fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap

 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
@@ -269,7 +271,8 @@ $(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)

 qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
               $(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
-               $(SRC_PATH)/qapi/event.json $(SRC_PATH)/qapi/introspect.json
+               $(SRC_PATH)/qapi/event.json $(SRC_PATH)/qapi/introspect.json \
+               $(SRC_PATH)/qapi/crypto.json

 qapi-types.c qapi-types.h :\
 $(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
@@ -440,7 +443,7 @@ endif
 install: all $(if $(BUILD_DOCS),install-doc) \
 install-datadir install-localstatedir
 ifneq ($(TOOLS),)
-	$(call install-prog,$(TOOLS),$(DESTDIR)$(bindir))
+	$(call install-prog,$(subst qemu-ga,qemu-ga$(EXESUF),$(TOOLS)),$(DESTDIR)$(bindir))
 endif
 ifneq ($(CONFIG_MODULES),)
 	$(INSTALL_DIR) "$(DESTDIR)$(qemu_moddir)"
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -8,7 +8,8 @@ util-obj-y += qmp-introspect.o qapi-types.o qapi-visit.o qapi-event.o
 # block-obj-y is code used by both qemu system emulation and qemu-img

 block-obj-y = async.o thread-pool.o
-block-obj-y += nbd.o block.o blockjob.o
+block-obj-y += nbd/
+block-obj-y += block.o blockjob.o
 block-obj-y += main-loop.o iohandler.o qemu-timer.o
 block-obj-$(CONFIG_POSIX) += aio-posix.o
 block-obj-$(CONFIG_WIN32) += aio-win32.o
@@ -28,6 +29,11 @@ crypto-aes-obj-y = crypto/

 qom-obj-y = qom/

+#######################################################################
+# io-obj-y is code used by both qemu system emulation and qemu-img
+
+io-obj-y = io/
+
 ######################################################################
 # Target independent part of system emulation. The long term path is to
 # suppress *all* target specific code in case of system emulation, i.e. a
@@ -54,6 +60,8 @@ common-obj-y += audio/
 common-obj-y += hw/
 common-obj-y += accel.o

+common-obj-y += replay/
+
 common-obj-y += ui/
 common-obj-y += bt-host.o bt-vhci.o
 bt-host.o-cflags := $(BLUEZ_CFLAGS)
--- a/Makefile.target
+++ b/Makefile.target
@@ -176,6 +176,7 @@ dummy := $(call unnest-vars,.., \
               crypto-obj-y \
               crypto-aes-obj-y \
               qom-obj-y \
+               io-obj-y \
               common-obj-y \
               common-obj-m)
 target-obj-y := $(target-obj-y-save)
@@ -185,6 +186,7 @@ all-obj-y += $(qom-obj-y)
 all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
 all-obj-$(CONFIG_USER_ONLY) += $(crypto-aes-obj-y)
 all-obj-$(CONFIG_SOFTMMU) += $(crypto-obj-y)
+all-obj-$(CONFIG_SOFTMMU) += $(io-obj-y)

 $(QEMU_PROG_BUILD): config-devices.mak

--- a/2
+++ b/2
@@ -1 +1 @@
-2.4.50
+2.5.50
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -17,6 +17,9 @@
 #include "block/block.h"
 #include "qemu/queue.h"
 #include "qemu/sockets.h"
+#ifdef CONFIG_EPOLL
+#include <sys/epoll.h>
+#endif

 struct AioHandler
 {
@@ -29,6 +32,162 @@ struct AioHandler
    QLIST_ENTRY(AioHandler) node;
 };

+#ifdef CONFIG_EPOLL
+
+/* The fd number threashold to switch to epoll */
+#define EPOLL_ENABLE_THRESHOLD 64
+
+static void aio_epoll_disable(AioContext *ctx)
+{
+    ctx->epoll_available = false;
+    if (!ctx->epoll_enabled) {
+        return;
+    }
+    ctx->epoll_enabled = false;
+    close(ctx->epollfd);
+}
+
+static inline int epoll_events_from_pfd(int pfd_events)
+{
+    return (pfd_events & G_IO_IN ? EPOLLIN : 0) |
+           (pfd_events & G_IO_OUT ? EPOLLOUT : 0) |
+           (pfd_events & G_IO_HUP ? EPOLLHUP : 0) |
+           (pfd_events & G_IO_ERR ? EPOLLERR : 0);
+}
+
+static bool aio_epoll_try_enable(AioContext *ctx)
+{
+    AioHandler *node;
+    struct epoll_event event;
+
+    QLIST_FOREACH(node, &ctx->aio_handlers, node) {
+        int r;
+        if (node->deleted || !node->pfd.events) {
+            continue;
+        }
+        event.events = epoll_events_from_pfd(node->pfd.events);
+        event.data.ptr = node;
+        r = epoll_ctl(ctx->epollfd, EPOLL_CTL_ADD, node->pfd.fd, &event);
+        if (r) {
+            return false;
+        }
+    }
+    ctx->epoll_enabled = true;
+    return true;
+}
+
+static void aio_epoll_update(AioContext *ctx, AioHandler *node, bool is_new)
+{
+    struct epoll_event event;
+    int r;
+
+    if (!ctx->epoll_enabled) {
+        return;
+    }
+    if (!node->pfd.events) {
+        r = epoll_ctl(ctx->epollfd, EPOLL_CTL_DEL, node->pfd.fd, &event);
+        if (r) {
+            aio_epoll_disable(ctx);
+        }
+    } else {
+        event.data.ptr = node;
+        event.events = epoll_events_from_pfd(node->pfd.events);
+        if (is_new) {
+            r = epoll_ctl(ctx->epollfd, EPOLL_CTL_ADD, node->pfd.fd, &event);
+            if (r) {
+                aio_epoll_disable(ctx);
+            }
+        } else {
+            r = epoll_ctl(ctx->epollfd, EPOLL_CTL_MOD, node->pfd.fd, &event);
+            if (r) {
+                aio_epoll_disable(ctx);
+            }
+        }
+    }
+}
+
+static int aio_epoll(AioContext *ctx, GPollFD *pfds,
+                     unsigned npfd, int64_t timeout)
+{
+    AioHandler *node;
+    int i, ret = 0;
+    struct epoll_event events[128];
+
+    assert(npfd == 1);
+    assert(pfds[0].fd == ctx->epollfd);
+    if (timeout > 0) {
+        ret = qemu_poll_ns(pfds, npfd, timeout);
+    }
+    if (timeout <= 0 || ret > 0) {
+        ret = epoll_wait(ctx->epollfd, events,
+                         sizeof(events) / sizeof(events[0]),
+                         timeout);
+        if (ret <= 0) {
+            goto out;
+        }
+        for (i = 0; i < ret; i++) {
+            int ev = events[i].events;
+            node = events[i].data.ptr;
+            node->pfd.revents = (ev & EPOLLIN ? G_IO_IN : 0) |
+                (ev & EPOLLOUT ? G_IO_OUT : 0) |
+                (ev & EPOLLHUP ? G_IO_HUP : 0) |
+                (ev & EPOLLERR ? G_IO_ERR : 0);
+        }
+    }
+out:
+    return ret;
+}
+
+static bool aio_epoll_enabled(AioContext *ctx)
+{
+    /* Fall back to ppoll when external clients are disabled. */
+    return !aio_external_disabled(ctx) && ctx->epoll_enabled;
+}
+
+static bool aio_epoll_check_poll(AioContext *ctx, GPollFD *pfds,
+                                 unsigned npfd, int64_t timeout)
+{
+    if (!ctx->epoll_available) {
+        return false;
+    }
+    if (aio_epoll_enabled(ctx)) {
+        return true;
+    }
+    if (npfd >= EPOLL_ENABLE_THRESHOLD) {
+        if (aio_epoll_try_enable(ctx)) {
+            return true;
+        } else {
+            aio_epoll_disable(ctx);
+        }
+    }
+    return false;
+}
+
+#else
+
+static void aio_epoll_update(AioContext *ctx, AioHandler *node, bool is_new)
+{
+}
+
+static int aio_epoll(AioContext *ctx, GPollFD *pfds,
+                     unsigned npfd, int64_t timeout)
+{
+    assert(false);
+}
+
+static bool aio_epoll_enabled(AioContext *ctx)
+{
+    return false;
+}
+
+static bool aio_epoll_check_poll(AioContext *ctx, GPollFD *pfds,
+                          unsigned npfd, int64_t timeout)
+{
+    return false;
+}
+
+#endif
+
 static AioHandler *find_aio_handler(AioContext *ctx, int fd)
 {
    AioHandler *node;
@@ -50,6 +209,8 @@ void aio_set_fd_handler(AioContext *ctx,
                        void *opaque)
 {
    AioHandler *node;
+    bool is_new = false;
+    bool deleted = false;

    node = find_aio_handler(ctx, fd);

@@ -68,7 +229,7 @@ void aio_set_fd_handler(AioContext *ctx,
                 * releasing the walking_handlers lock.
                 */
                QLIST_REMOVE(node, node);
-                g_free(node);
+                deleted = true;
            }
        }
    } else {
@@ -79,6 +240,7 @@ void aio_set_fd_handler(AioContext *ctx,
            QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);

            g_source_add_poll(&ctx->source, &node->pfd);
+            is_new = true;
        }
        /* Update handler with latest information */
        node->io_read = io_read;
@@ -90,7 +252,11 @@ void aio_set_fd_handler(AioContext *ctx,
        node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
    }

+    aio_epoll_update(ctx, node, is_new);
    aio_notify(ctx);
+    if (deleted) {
+        g_free(node);
+    }
 }

 void aio_set_event_notifier(AioContext *ctx,
@@ -262,6 +428,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
    /* fill pollfds */
    QLIST_FOREACH(node, &ctx->aio_handlers, node) {
        if (!node->deleted && node->pfd.events
+            && !aio_epoll_enabled(ctx)
            && aio_node_check(ctx, node->is_external)) {
            add_pollfd(node);
        }
@@ -273,7 +440,17 @@ bool aio_poll(AioContext *ctx, bool blocking)
    if (timeout) {
        aio_context_release(ctx);
    }
-    ret = qemu_poll_ns((GPollFD *)pollfds, npfd, timeout);
+    if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
+        AioHandler epoll_handler;
+
+        epoll_handler.pfd.fd = ctx->epollfd;
+        epoll_handler.pfd.events = G_IO_IN | G_IO_OUT | G_IO_HUP | G_IO_ERR;
+        npfd = 0;
+        add_pollfd(&epoll_handler);
+        ret = aio_epoll(ctx, pollfds, npfd, timeout);
+    } else  {
+        ret = qemu_poll_ns(pollfds, npfd, timeout);
+    }
    if (blocking) {
        atomic_sub(&ctx->notify_me, 2);
    }
@@ -302,3 +479,16 @@ bool aio_poll(AioContext *ctx, bool blocking)

    return progress;
 }
+
+void aio_context_setup(AioContext *ctx, Error **errp)
+{
+#ifdef CONFIG_EPOLL
+    assert(!ctx->epollfd);
+    ctx->epollfd = epoll_create1(EPOLL_CLOEXEC);
+    if (ctx->epollfd == -1) {
+        ctx->epoll_available = false;
+    } else {
+        ctx->epoll_available = true;
+    }
+#endif
+}
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -369,3 +369,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
    aio_context_release(ctx);
    return progress;
 }
+
+void aio_context_setup(AioContext *ctx, Error **errp)
+{
+}
--- a/arch_init.c
+++ b/arch_init.c
@@ -258,9 +258,7 @@ void do_acpitable_option(const QemuOpts *opts)

    acpi_table_add(opts, &err);
    if (err) {
-        error_report("Wrong acpi table provided: %s",
-                     error_get_pretty(err));
-        error_free(err);
+        error_reportf_err(err, "Wrong acpi table provided: ");
        exit(1);
    }
 #endif
--- a/async.c
+++ b/async.c
@@ -59,6 +59,11 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
    return bh;
 }

+void aio_bh_call(QEMUBH *bh)
+{
+    bh->cb(bh->opaque);
+}
+
 /* Multiple occurrences of aio_bh_poll cannot be called concurrently */
 int aio_bh_poll(AioContext *ctx)
 {
@@ -84,7 +89,7 @@ int aio_bh_poll(AioContext *ctx)
                ret = 1;
            }
            bh->idle = 0;
-            bh->cb(bh->opaque);
+            aio_bh_call(bh);
        }
    }

@@ -320,12 +325,18 @@ AioContext *aio_context_new(Error **errp)
 {
    int ret;
    AioContext *ctx;
+    Error *local_err = NULL;
+
    ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
+    aio_context_setup(ctx, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        goto fail;
+    }
    ret = event_notifier_init(&ctx->notifier, false);
    if (ret < 0) {
-        g_source_destroy(&ctx->source);
        error_setg_errno(errp, -ret, "Failed to initialize event notifier");
-        return NULL;
+        goto fail;
    }
    g_source_set_can_recurse(&ctx->source, true);
    aio_set_event_notifier(ctx, &ctx->notifier,
@@ -340,6 +351,9 @@ AioContext *aio_context_new(Error **errp)
    ctx->notify_dummy_bh = aio_bh_new(ctx, notify_dummy_bh, NULL);

    return ctx;
+fail:
+    g_source_destroy(&ctx->source);
+    return NULL;
 }

 void aio_context_ref(AioContext *ctx)
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -1806,9 +1806,6 @@ static void audio_init (void)
    atexit (audio_atexit);

    s->ts = timer_new_ns(QEMU_CLOCK_VIRTUAL, audio_timer, s);
-    if (!s->ts) {
-        hw_error("Could not create audio timer\n");
-    }

    audio_process_options ("AUDIO", audio_options);

@@ -1859,12 +1856,8 @@ static void audio_init (void)

    if (!done) {
        done = !audio_driver_init (s, &no_audio_driver);
-        if (!done) {
-            hw_error("Could not initialize audio subsystem\n");
-        }
-        else {
-            dolog ("warning: Using timer based audio emulation\n");
-        }
+        assert(done);
+        dolog("warning: Using timer based audio emulation\n");
    }

    if (conf.period.hertz <= 0) {
--- a/audio/coreaudio.c
+++ b/audio/coreaudio.c
@@ -32,6 +32,10 @@
 #define AUDIO_CAP "coreaudio"
 #include "audio_int.h"

+#ifndef MAC_OS_X_VERSION_10_6
+#define MAC_OS_X_VERSION_10_6 1060
+#endif
+
 static int isAtexit;

 typedef struct {
@@ -45,11 +49,233 @@ typedef struct coreaudioVoiceOut {
    AudioDeviceID outputDeviceID;
    UInt32 audioDevicePropertyBufferFrameSize;
    AudioStreamBasicDescription outputStreamBasicDescription;
+    AudioDeviceIOProcID ioprocid;
    int live;
    int decr;
    int rpos;
 } coreaudioVoiceOut;

+#if MAC_OS_X_VERSION_MAX_ALLOWED >= MAC_OS_X_VERSION_10_6
+/* The APIs used here only become available from 10.6 */
+
+static OSStatus coreaudio_get_voice(AudioDeviceID *id)
+{
+    UInt32 size = sizeof(*id);
+    AudioObjectPropertyAddress addr = {
+        kAudioHardwarePropertyDefaultOutputDevice,
+        kAudioObjectPropertyScopeGlobal,
+        kAudioObjectPropertyElementMaster
+    };
+
+    return AudioObjectGetPropertyData(kAudioObjectSystemObject,
+                                      &addr,
+                                      0,
+                                      NULL,
+                                      &size,
+                                      id);
+}
+
+static OSStatus coreaudio_get_framesizerange(AudioDeviceID id,
+                                             AudioValueRange *framerange)
+{
+    UInt32 size = sizeof(*framerange);
+    AudioObjectPropertyAddress addr = {
+        kAudioDevicePropertyBufferFrameSizeRange,
+        kAudioDevicePropertyScopeOutput,
+        kAudioObjectPropertyElementMaster
+    };
+
+    return AudioObjectGetPropertyData(id,
+                                      &addr,
+                                      0,
+                                      NULL,
+                                      &size,
+                                      framerange);
+}
+
+static OSStatus coreaudio_get_framesize(AudioDeviceID id, UInt32 *framesize)
+{
+    UInt32 size = sizeof(*framesize);
+    AudioObjectPropertyAddress addr = {
+        kAudioDevicePropertyBufferFrameSize,
+        kAudioDevicePropertyScopeOutput,
+        kAudioObjectPropertyElementMaster
+    };
+
+    return AudioObjectGetPropertyData(id,
+                                      &addr,
+                                      0,
+                                      NULL,
+                                      &size,
+                                      framesize);
+}
+
+static OSStatus coreaudio_set_framesize(AudioDeviceID id, UInt32 *framesize)
+{
+    UInt32 size = sizeof(*framesize);
+    AudioObjectPropertyAddress addr = {
+        kAudioDevicePropertyBufferFrameSize,
+        kAudioDevicePropertyScopeOutput,
+        kAudioObjectPropertyElementMaster
+    };
+
+    return AudioObjectSetPropertyData(id,
+                                      &addr,
+                                      0,
+                                      NULL,
+                                      size,
+                                      framesize);
+}
+
+static OSStatus coreaudio_get_streamformat(AudioDeviceID id,
+                                           AudioStreamBasicDescription *d)
+{
+    UInt32 size = sizeof(*d);
+    AudioObjectPropertyAddress addr = {
+        kAudioDevicePropertyStreamFormat,
+        kAudioDevicePropertyScopeOutput,
+        kAudioObjectPropertyElementMaster
+    };
+
+    return AudioObjectGetPropertyData(id,
+                                      &addr,
+                                      0,
+                                      NULL,
+                                      &size,
+                                      d);
+}
+
+static OSStatus coreaudio_set_streamformat(AudioDeviceID id,
+                                           AudioStreamBasicDescription *d)
+{
+    UInt32 size = sizeof(*d);
+    AudioObjectPropertyAddress addr = {
+        kAudioDevicePropertyStreamFormat,
+        kAudioDevicePropertyScopeOutput,
+        kAudioObjectPropertyElementMaster
+    };
+
+    return AudioObjectSetPropertyData(id,
+                                      &addr,
+                                      0,
+                                      NULL,
+                                      size,
+                                      d);
+}
+
+static OSStatus coreaudio_get_isrunning(AudioDeviceID id, UInt32 *result)
+{
+    UInt32 size = sizeof(*result);
+    AudioObjectPropertyAddress addr = {
+        kAudioDevicePropertyDeviceIsRunning,
+        kAudioDevicePropertyScopeOutput,
+        kAudioObjectPropertyElementMaster
+    };
+
+    return AudioObjectGetPropertyData(id,
+                                      &addr,
+                                      0,
+                                      NULL,
+                                      &size,
+                                      result);
+}
+#else
+/* Legacy versions of functions using deprecated APIs */
+
+static OSStatus coreaudio_get_voice(AudioDeviceID *id)
+{
+    UInt32 size = sizeof(*id);
+
+    return AudioHardwareGetProperty(
+        kAudioHardwarePropertyDefaultOutputDevice,
+        &size,
+        id);
+}
+
+static OSStatus coreaudio_get_framesizerange(AudioDeviceID id,
+                                             AudioValueRange *framerange)
+{
+    UInt32 size = sizeof(*framerange);
+
+    return AudioDeviceGetProperty(
+        id,
+        0,
+        0,
+        kAudioDevicePropertyBufferFrameSizeRange,
+        &size,
+        framerange);
+}
+
+static OSStatus coreaudio_get_framesize(AudioDeviceID id, UInt32 *framesize)
+{
+    UInt32 size = sizeof(*framesize);
+
+    return AudioDeviceGetProperty(
+        id,
+        0,
+        false,
+        kAudioDevicePropertyBufferFrameSize,
+        &size,
+        framesize);
+}
+
+static OSStatus coreaudio_set_framesize(AudioDeviceID id, UInt32 *framesize)
+{
+    UInt32 size = sizeof(*framesize);
+
+    return AudioDeviceSetProperty(
+        id,
+        NULL,
+        0,
+        false,
+        kAudioDevicePropertyBufferFrameSize,
+        size,
+        framesize);
+}
+
+static OSStatus coreaudio_get_streamformat(AudioDeviceID id,
+                                           AudioStreamBasicDescription *d)
+{
+    UInt32 size = sizeof(*d);
+
+    return AudioDeviceGetProperty(
+        id,
+        0,
+        false,
+        kAudioDevicePropertyStreamFormat,
+        &size,
+        d);
+}
+
+static OSStatus coreaudio_set_streamformat(AudioDeviceID id,
+                                           AudioStreamBasicDescription *d)
+{
+    UInt32 size = sizeof(*d);
+
+    return AudioDeviceSetProperty(
+        id,
+        0,
+        0,
+        0,
+        kAudioDevicePropertyStreamFormat,
+        size,
+        d);
+}
+
+static OSStatus coreaudio_get_isrunning(AudioDeviceID id, UInt32 *result)
+{
+    UInt32 size = sizeof(*result);
+
+    return AudioDeviceGetProperty(
+        id,
+        0,
+        0,
+        kAudioDevicePropertyDeviceIsRunning,
+        &size,
+        result);
+}
+#endif
+
 static void coreaudio_logstatus (OSStatus status)
 {
    const char *str = "BUG";
@@ -144,10 +370,7 @@ static inline UInt32 isPlaying (AudioDeviceID outputDeviceID)
 {
    OSStatus status;
    UInt32 result = 0;
-    UInt32 propertySize = sizeof(outputDeviceID);
-    status = AudioDeviceGetProperty(
-        outputDeviceID, 0, 0,
-        kAudioDevicePropertyDeviceIsRunning, &propertySize, &result);
+    status = coreaudio_get_isrunning(outputDeviceID, &result);
    if (status != kAudioHardwareNoError) {
        coreaudio_logerr(status,
                         "Could not determine whether Device is playing\n");
@@ -288,7 +511,6 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
 {
    OSStatus status;
    coreaudioVoiceOut *core = (coreaudioVoiceOut *) hw;
-    UInt32 propertySize;
    int err;
    const char *typ = "playback";
    AudioValueRange frameRange;
@@ -303,12 +525,7 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,

    audio_pcm_init_info (&hw->info, as);

-    /* open default output device */
-    propertySize = sizeof(core->outputDeviceID);
-    status = AudioHardwareGetProperty(
-        kAudioHardwarePropertyDefaultOutputDevice,
-        &propertySize,
-        &core->outputDeviceID);
+    status = coreaudio_get_voice(&core->outputDeviceID);
    if (status != kAudioHardwareNoError) {
        coreaudio_logerr2 (status, typ,
                           "Could not get default output Device\n");
@@ -320,14 +537,8 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
    }

    /* get minimum and maximum buffer frame sizes */
-    propertySize = sizeof(frameRange);
-    status = AudioDeviceGetProperty(
-        core->outputDeviceID,
-        0,
-        0,
-        kAudioDevicePropertyBufferFrameSizeRange,
-        &propertySize,
-        &frameRange);
+    status = coreaudio_get_framesizerange(core->outputDeviceID,
+                                          &frameRange);
    if (status != kAudioHardwareNoError) {
        coreaudio_logerr2 (status, typ,
                           "Could not get device buffer frame range\n");
@@ -347,15 +558,8 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
    }

    /* set Buffer Frame Size */
-    propertySize = sizeof(core->audioDevicePropertyBufferFrameSize);
-    status = AudioDeviceSetProperty(
-        core->outputDeviceID,
-        NULL,
-        0,
-        false,
-        kAudioDevicePropertyBufferFrameSize,
-        propertySize,
-        &core->audioDevicePropertyBufferFrameSize);
+    status = coreaudio_set_framesize(core->outputDeviceID,
+                                     &core->audioDevicePropertyBufferFrameSize);
    if (status != kAudioHardwareNoError) {
        coreaudio_logerr2 (status, typ,
                           "Could not set device buffer frame size %" PRIu32 "\n",
@@ -364,14 +568,8 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
    }

    /* get Buffer Frame Size */
-    propertySize = sizeof(core->audioDevicePropertyBufferFrameSize);
-    status = AudioDeviceGetProperty(
-        core->outputDeviceID,
-        0,
-        false,
-        kAudioDevicePropertyBufferFrameSize,
-        &propertySize,
-        &core->audioDevicePropertyBufferFrameSize);
+    status = coreaudio_get_framesize(core->outputDeviceID,
+                                     &core->audioDevicePropertyBufferFrameSize);
    if (status != kAudioHardwareNoError) {
        coreaudio_logerr2 (status, typ,
                           "Could not get device buffer frame size\n");
@@ -380,14 +578,8 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
    hw->samples = conf->nbuffers * core->audioDevicePropertyBufferFrameSize;

    /* get StreamFormat */
-    propertySize = sizeof(core->outputStreamBasicDescription);
-    status = AudioDeviceGetProperty(
-        core->outputDeviceID,
-        0,
-        false,
-        kAudioDevicePropertyStreamFormat,
-        &propertySize,
-        &core->outputStreamBasicDescription);
+    status = coreaudio_get_streamformat(core->outputDeviceID,
+                                        &core->outputStreamBasicDescription);
    if (status != kAudioHardwareNoError) {
        coreaudio_logerr2 (status, typ,
                           "Could not get Device Stream properties\n");
@@ -397,15 +589,8 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,

    /* set Samplerate */
    core->outputStreamBasicDescription.mSampleRate = (Float64) as->freq;
-    propertySize = sizeof(core->outputStreamBasicDescription);
-    status = AudioDeviceSetProperty(
-        core->outputDeviceID,
-        0,
-        0,
-        0,
-        kAudioDevicePropertyStreamFormat,
-        propertySize,
-        &core->outputStreamBasicDescription);
+    status = coreaudio_set_streamformat(core->outputDeviceID,
+                                        &core->outputStreamBasicDescription);
    if (status != kAudioHardwareNoError) {
        coreaudio_logerr2 (status, typ, "Could not set samplerate %d\n",
                           as->freq);
@@ -414,8 +599,12 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
    }

    /* set Callback */
-    status = AudioDeviceAddIOProc(core->outputDeviceID, audioDeviceIOProc, hw);
-    if (status != kAudioHardwareNoError) {
+    core->ioprocid = NULL;
+    status = AudioDeviceCreateIOProcID(core->outputDeviceID,
+                                       audioDeviceIOProc,
+                                       hw,
+                                       &core->ioprocid);
+    if (status != kAudioHardwareNoError || core->ioprocid == NULL) {
        coreaudio_logerr2 (status, typ, "Could not set IOProc\n");
        core->outputDeviceID = kAudioDeviceUnknown;
        return -1;
@@ -423,10 +612,10 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,

    /* start Playback */
    if (!isPlaying(core->outputDeviceID)) {
-        status = AudioDeviceStart(core->outputDeviceID, audioDeviceIOProc);
+        status = AudioDeviceStart(core->outputDeviceID, core->ioprocid);
        if (status != kAudioHardwareNoError) {
            coreaudio_logerr2 (status, typ, "Could not start playback\n");
-            AudioDeviceRemoveIOProc(core->outputDeviceID, audioDeviceIOProc);
+            AudioDeviceDestroyIOProcID(core->outputDeviceID, core->ioprocid);
            core->outputDeviceID = kAudioDeviceUnknown;
            return -1;
        }
@@ -444,15 +633,15 @@ static void coreaudio_fini_out (HWVoiceOut *hw)
    if (!isAtexit) {
        /* stop playback */
        if (isPlaying(core->outputDeviceID)) {
-            status = AudioDeviceStop(core->outputDeviceID, audioDeviceIOProc);
+            status = AudioDeviceStop(core->outputDeviceID, core->ioprocid);
            if (status != kAudioHardwareNoError) {
                coreaudio_logerr (status, "Could not stop playback\n");
            }
        }

        /* remove callback */
-        status = AudioDeviceRemoveIOProc(core->outputDeviceID,
-                                         audioDeviceIOProc);
+        status = AudioDeviceDestroyIOProcID(core->outputDeviceID,
+                                            core->ioprocid);
        if (status != kAudioHardwareNoError) {
            coreaudio_logerr (status, "Could not remove IOProc\n");
        }
@@ -475,7 +664,7 @@ static int coreaudio_ctl_out (HWVoiceOut *hw, int cmd, ...)
    case VOICE_ENABLE:
        /* start playback */
        if (!isPlaying(core->outputDeviceID)) {
-            status = AudioDeviceStart(core->outputDeviceID, audioDeviceIOProc);
+            status = AudioDeviceStart(core->outputDeviceID, core->ioprocid);
            if (status != kAudioHardwareNoError) {
                coreaudio_logerr (status, "Could not resume playback\n");
            }
@@ -486,7 +675,8 @@ static int coreaudio_ctl_out (HWVoiceOut *hw, int cmd, ...)
        /* stop playback */
        if (!isAtexit) {
            if (isPlaying(core->outputDeviceID)) {
-                status = AudioDeviceStop(core->outputDeviceID, audioDeviceIOProc);
+                status = AudioDeviceStop(core->outputDeviceID,
+                                         core->ioprocid);
                if (status != kAudioHardwareNoError) {
                    coreaudio_logerr (status, "Could not pause playback\n");
                }
--- a/backends/baum.c
+++ b/backends/baum.c
@@ -566,6 +566,7 @@ static CharDriverState *chr_baum_init(const char *id,
                                      ChardevReturn *ret,
                                      Error **errp)
 {
+    ChardevCommon *common = qapi_ChardevDummy_base(backend->u.braille);
    BaumDriverState *baum;
    CharDriverState *chr;
    brlapi_handle_t *handle;
@@ -576,8 +577,12 @@ static CharDriverState *chr_baum_init(const char *id,
 #endif
    int tty;

+    chr = qemu_chr_alloc(common, errp);
+    if (!chr) {
+        return NULL;
+    }
    baum = g_malloc0(sizeof(BaumDriverState));
-    baum->chr = chr = qemu_chr_alloc();
+    baum->chr = chr;

    chr->opaque = baum;
    chr->chr_write = baum_write;
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -313,9 +313,11 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
        assert(maxnode <= MAX_NODES);
        if (mbind(ptr, sz, backend->policy,
                  maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
-            error_setg_errno(errp, errno,
-                             "cannot bind memory to host NUMA nodes");
-            return;
+            if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
+                error_setg_errno(errp, errno,
+                                 "cannot bind memory to host NUMA nodes");
+                return;
+            }
        }
 #endif
        /* Preallocate memory after the NUMA policy has been instantiated.
--- a/backends/msmouse.c
+++ b/backends/msmouse.c
@@ -68,9 +68,13 @@ static CharDriverState *qemu_chr_open_msmouse(const char *id,
                                              ChardevReturn *ret,
                                              Error **errp)
 {
+    ChardevCommon *common = qapi_ChardevDummy_base(backend->u.msmouse);
    CharDriverState *chr;

-    chr = qemu_chr_alloc();
+    chr = qemu_chr_alloc(common, errp);
+    if (!chr) {
+        return NULL;
+    }
    chr->chr_write = msmouse_chr_write;
    chr->chr_close = msmouse_chr_close;
    chr->explicit_be_open = true;
--- a/balloon.c
+++ b/balloon.c
@@ -36,6 +36,17 @@
 static QEMUBalloonEvent *balloon_event_fn;
 static QEMUBalloonStatus *balloon_stat_fn;
 static void *balloon_opaque;
+static bool balloon_inhibited;
+
+bool qemu_balloon_is_inhibited(void)
+{
+    return balloon_inhibited;
+}
+
+void qemu_balloon_inhibit(bool state)
+{
+    balloon_inhibited = state;
+}

 static bool have_balloon(Error **errp)
 {
--- a/block.c
+++ b/block.c
@@ -29,6 +29,7 @@
 #include "qemu/error-report.h"
 #include "qemu/module.h"
 #include "qapi/qmp/qerror.h"
+#include "qapi/qmp/qbool.h"
 #include "qapi/qmp/qjson.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/sysemu.h"
@@ -73,8 +74,7 @@ struct BdrvDirtyBitmap {

 #define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */

-static QTAILQ_HEAD(, BlockDriverState) bdrv_states =
-    QTAILQ_HEAD_INITIALIZER(bdrv_states);
+struct BdrvStates bdrv_states = QTAILQ_HEAD_INITIALIZER(bdrv_states);

 static QTAILQ_HEAD(, BlockDriverState) graph_bdrv_states =
    QTAILQ_HEAD_INITIALIZER(graph_bdrv_states);
@@ -624,6 +624,20 @@ static int refresh_total_sectors(BlockDriverState *bs, int64_t hint)
    return 0;
 }

+/**
+ * Combines a QDict of new block driver @options with any missing options taken
+ * from @old_options, so that leaving out an option defaults to its old value.
+ */
+static void bdrv_join_options(BlockDriverState *bs, QDict *options,
+                              QDict *old_options)
+{
+    if (bs->drv && bs->drv->bdrv_join_options) {
+        bs->drv->bdrv_join_options(options, old_options);
+    } else {
+        qdict_join(options, old_options, false);
+    }
+}
+
 /**
 * Set open flags for a given discard mode
 *
@@ -682,60 +696,81 @@ static int bdrv_temp_snapshot_flags(int flags)
 }

 /*
- * Returns the flags that bs->file should get if a protocol driver is expected,
- * based on the given flags for the parent BDS
+ * Returns the options and flags that bs->file should get if a protocol driver
+ * is expected, based on the given options and flags for the parent BDS
 */
-static int bdrv_inherited_flags(int flags)
+static void bdrv_inherited_options(int *child_flags, QDict *child_options,
+                                   int parent_flags, QDict *parent_options)
 {
+    int flags = parent_flags;
+
    /* Enable protocol handling, disable format probing for bs->file */
    flags |= BDRV_O_PROTOCOL;

+    /* If the cache mode isn't explicitly set, inherit direct and no-flush from
+     * the parent. */
+    qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_DIRECT);
+    qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_NO_FLUSH);
+
    /* Our block drivers take care to send flushes and respect unmap policy,
-     * so we can enable both unconditionally on lower layers. */
-    flags |= BDRV_O_CACHE_WB | BDRV_O_UNMAP;
+     * so we can default to enable both on lower layers regardless of the
+     * corresponding parent options. */
+    qdict_set_default_str(child_options, BDRV_OPT_CACHE_WB, "on");
+    flags |= BDRV_O_UNMAP;

    /* Clear flags that only apply to the top layer */
    flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING | BDRV_O_COPY_ON_READ);

-    return flags;
+    *child_flags = flags;
 }

 const BdrvChildRole child_file = {
-    .inherit_flags = bdrv_inherited_flags,
+    .inherit_options = bdrv_inherited_options,
 };

 /*
- * Returns the flags that bs->file should get if the use of formats (and not
- * only protocols) is permitted for it, based on the given flags for the parent
- * BDS
+ * Returns the options and flags that bs->file should get if the use of formats
+ * (and not only protocols) is permitted for it, based on the given options and
+ * flags for the parent BDS
 */
-static int bdrv_inherited_fmt_flags(int parent_flags)
+static void bdrv_inherited_fmt_options(int *child_flags, QDict *child_options,
+                                       int parent_flags, QDict *parent_options)
 {
-    int flags = child_file.inherit_flags(parent_flags);
-    return flags & ~BDRV_O_PROTOCOL;
+    child_file.inherit_options(child_flags, child_options,
+                               parent_flags, parent_options);
+
+    *child_flags &= ~BDRV_O_PROTOCOL;
 }

 const BdrvChildRole child_format = {
-    .inherit_flags = bdrv_inherited_fmt_flags,
+    .inherit_options = bdrv_inherited_fmt_options,
 };

 /*
- * Returns the flags that bs->backing should get, based on the given flags
- * for the parent BDS
+ * Returns the options and flags that bs->backing should get, based on the
+ * given options and flags for the parent BDS
 */
-static int bdrv_backing_flags(int flags)
+static void bdrv_backing_options(int *child_flags, QDict *child_options,
+                                 int parent_flags, QDict *parent_options)
 {
+    int flags = parent_flags;
+
+    /* The cache mode is inherited unmodified for backing files */
+    qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_WB);
+    qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_DIRECT);
+    qdict_copy_default(child_options, parent_options, BDRV_OPT_CACHE_NO_FLUSH);
+
    /* backing files always opened read-only */
    flags &= ~(BDRV_O_RDWR | BDRV_O_COPY_ON_READ);

    /* snapshot=on is handled on the top layer */
    flags &= ~(BDRV_O_SNAPSHOT | BDRV_O_TEMPORARY);

-    return flags;
+    *child_flags = flags;
 }

 static const BdrvChildRole child_backing = {
-    .inherit_flags = bdrv_backing_flags,
+    .inherit_options = bdrv_backing_options,
 };

 static int bdrv_open_flags(BlockDriverState *bs, int flags)
@@ -758,6 +793,42 @@ static int bdrv_open_flags(BlockDriverState *bs, int flags)
    return open_flags;
 }

+static void update_flags_from_options(int *flags, QemuOpts *opts)
+{
+    *flags &= ~BDRV_O_CACHE_MASK;
+
+    assert(qemu_opt_find(opts, BDRV_OPT_CACHE_WB));
+    if (qemu_opt_get_bool(opts, BDRV_OPT_CACHE_WB, false)) {
+        *flags |= BDRV_O_CACHE_WB;
+    }
+
+    assert(qemu_opt_find(opts, BDRV_OPT_CACHE_NO_FLUSH));
+    if (qemu_opt_get_bool(opts, BDRV_OPT_CACHE_NO_FLUSH, false)) {
+        *flags |= BDRV_O_NO_FLUSH;
+    }
+
+    assert(qemu_opt_find(opts, BDRV_OPT_CACHE_DIRECT));
+    if (qemu_opt_get_bool(opts, BDRV_OPT_CACHE_DIRECT, false)) {
+        *flags |= BDRV_O_NOCACHE;
+    }
+}
+
+static void update_options_from_flags(QDict *options, int flags)
+{
+    if (!qdict_haskey(options, BDRV_OPT_CACHE_WB)) {
+        qdict_put(options, BDRV_OPT_CACHE_WB,
+                  qbool_from_bool(flags & BDRV_O_CACHE_WB));
+    }
+    if (!qdict_haskey(options, BDRV_OPT_CACHE_DIRECT)) {
+        qdict_put(options, BDRV_OPT_CACHE_DIRECT,
+                  qbool_from_bool(flags & BDRV_O_NOCACHE));
+    }
+    if (!qdict_haskey(options, BDRV_OPT_CACHE_NO_FLUSH)) {
+        qdict_put(options, BDRV_OPT_CACHE_NO_FLUSH,
+                  qbool_from_bool(flags & BDRV_O_NO_FLUSH));
+    }
+}
+
 static void bdrv_assign_node_name(BlockDriverState *bs,
                                  const char *node_name,
                                  Error **errp)
@@ -804,6 +875,26 @@ static QemuOptsList bdrv_runtime_opts = {
            .type = QEMU_OPT_STRING,
            .help = "Node name of the block device node",
        },
+        {
+            .name = "driver",
+            .type = QEMU_OPT_STRING,
+            .help = "Block driver to use for the node",
+        },
+        {
+            .name = BDRV_OPT_CACHE_WB,
+            .type = QEMU_OPT_BOOL,
+            .help = "Enable writeback mode",
+        },
+        {
+            .name = BDRV_OPT_CACHE_DIRECT,
+            .type = QEMU_OPT_BOOL,
+            .help = "Bypass software writeback cache on the host",
+        },
+        {
+            .name = BDRV_OPT_CACHE_NO_FLUSH,
+            .type = QEMU_OPT_BOOL,
+            .help = "Ignore flush requests",
+        },
        { /* end of list */ }
    },
 };
@@ -814,18 +905,31 @@ static QemuOptsList bdrv_runtime_opts = {
 * Removes all processed options from *options.
 */
 static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
-    QDict *options, int flags, BlockDriver *drv, Error **errp)
+                            QDict *options, int flags, Error **errp)
 {
    int ret, open_flags;
    const char *filename;
+    const char *driver_name = NULL;
    const char *node_name = NULL;
    QemuOpts *opts;
+    BlockDriver *drv;
    Error *local_err = NULL;

-    assert(drv != NULL);
    assert(bs->file == NULL);
    assert(options != NULL && bs->options != options);

+    opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        ret = -EINVAL;
+        goto fail_opts;
+    }
+
+    driver_name = qemu_opt_get(opts, "driver");
+    drv = bdrv_find_format(driver_name);
+    assert(drv != NULL);
+
    if (file != NULL) {
        filename = file->bs->filename;
    } else {
@@ -835,19 +939,12 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
    if (drv->bdrv_needs_filename && !filename) {
        error_setg(errp, "The '%s' block driver requires a file name",
                   drv->format_name);
-        return -EINVAL;
-    }
-
-    trace_bdrv_open_common(bs, filename ?: "", flags, drv->format_name);
-
-    opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
-    qemu_opts_absorb_qdict(opts, options, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
        ret = -EINVAL;
        goto fail_opts;
    }

+    trace_bdrv_open_common(bs, filename ?: "", flags, drv->format_name);
+
    node_name = qemu_opt_get(opts, "node-name");
    bdrv_assign_node_name(bs, node_name, &local_err);
    if (local_err) {
@@ -892,7 +989,9 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
    bs->drv = drv;
    bs->opaque = g_malloc0(drv->instance_size);

-    bs->enable_write_cache = !!(flags & BDRV_O_CACHE_WB);
+    /* Apply cache mode options */
+    update_flags_from_options(&bs->open_flags, opts);
+    bdrv_set_enable_write_cache(bs, bs->open_flags & BDRV_O_CACHE_WB);

    /* Open the image, either directly or using a protocol */
    if (drv->bdrv_file_open) {
@@ -985,37 +1084,45 @@ static QDict *parse_json_filename(const char *filename, Error **errp)
    return options;
 }

+static void parse_json_protocol(QDict *options, const char **pfilename,
+                                Error **errp)
+{
+    QDict *json_options;
+    Error *local_err = NULL;
+
+    /* Parse json: pseudo-protocol */
+    if (!*pfilename || !g_str_has_prefix(*pfilename, "json:")) {
+        return;
+    }
+
+    json_options = parse_json_filename(*pfilename, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    /* Options given in the filename have lower priority than options
+     * specified directly */
+    qdict_join(options, json_options, false);
+    QDECREF(json_options);
+    *pfilename = NULL;
+}
+
 /*
 * Fills in default options for opening images and converts the legacy
 * filename/flags pair to option QDict entries.
 * The BDRV_O_PROTOCOL flag in *flags will be set or cleared accordingly if a
 * block driver has been specified explicitly.
 */
-static int bdrv_fill_options(QDict **options, const char **pfilename,
+static int bdrv_fill_options(QDict **options, const char *filename,
                             int *flags, Error **errp)
 {
-    const char *filename = *pfilename;
    const char *drvname;
    bool protocol = *flags & BDRV_O_PROTOCOL;
    bool parse_filename = false;
    BlockDriver *drv = NULL;
    Error *local_err = NULL;

-    /* Parse json: pseudo-protocol */
-    if (filename && g_str_has_prefix(filename, "json:")) {
-        QDict *json_options = parse_json_filename(filename, &local_err);
-        if (local_err) {
-            error_propagate(errp, local_err);
-            return -EINVAL;
-        }
-
-        /* Options given in the filename have lower priority than options
-         * specified directly */
-        qdict_join(*options, json_options, false);
-        QDECREF(json_options);
-        *pfilename = filename = NULL;
-    }
-
    drvname = qdict_get_try_str(*options, "driver");
    if (drvname) {
        drv = bdrv_find_format(drvname);
@@ -1034,6 +1141,9 @@ static int bdrv_fill_options(QDict **options, const char **pfilename,
        *flags &= ~BDRV_O_PROTOCOL;
    }

+    /* Translate cache options from flags into options */
+    update_options_from_flags(*options, *flags);
+
    /* Fetch the file name from the options QDict if necessary */
    if (protocol && filename) {
        if (!qdict_haskey(*options, "filename")) {
@@ -1088,11 +1198,13 @@ static int bdrv_fill_options(QDict **options, const char **pfilename,

 static BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
                                    BlockDriverState *child_bs,
+                                    const char *child_name,
                                    const BdrvChildRole *child_role)
 {
    BdrvChild *child = g_new(BdrvChild, 1);
    *child = (BdrvChild) {
        .bs     = child_bs,
+        .name   = g_strdup(child_name),
        .role   = child_role,
    };

@@ -1106,6 +1218,7 @@ static void bdrv_detach_child(BdrvChild *child)
 {
    QLIST_REMOVE(child, next);
    QLIST_REMOVE(child, next_parent);
+    g_free(child->name);
    g_free(child);
 }

@@ -1152,7 +1265,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
        bs->backing = NULL;
        goto out;
    }
-    bs->backing = bdrv_attach_child(bs, backing_hd, &child_backing);
+    bs->backing = bdrv_attach_child(bs, backing_hd, "backing", &child_backing);
    bs->open_flags &= ~BDRV_O_NO_BACKING;
    pstrcpy(bs->backing_file, sizeof(bs->backing_file), backing_hd->filename);
    pstrcpy(bs->backing_format, sizeof(bs->backing_format),
@@ -1169,30 +1282,43 @@ out:
 /*
 * Opens the backing file for a BlockDriverState if not yet open
 *
- * options is a QDict of options to pass to the block drivers, or NULL for an
- * empty set of options. The reference to the QDict is transferred to this
- * function (even on failure), so if the caller intends to reuse the dictionary,
- * it needs to use QINCREF() before calling bdrv_file_open.
+ * bdref_key specifies the key for the image's BlockdevRef in the options QDict.
+ * That QDict has to be flattened; therefore, if the BlockdevRef is a QDict
+ * itself, all options starting with "${bdref_key}." are considered part of the
+ * BlockdevRef.
+ *
+ * TODO Can this be unified with bdrv_open_image()?
 */
-int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
+int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
+                           const char *bdref_key, Error **errp)
 {
    char *backing_filename = g_malloc0(PATH_MAX);
+    char *bdref_key_dot;
+    const char *reference = NULL;
    int ret = 0;
    BlockDriverState *backing_hd;
+    QDict *options;
+    QDict *tmp_parent_options = NULL;
    Error *local_err = NULL;

    if (bs->backing != NULL) {
-        QDECREF(options);
        goto free_exit;
    }

    /* NULL means an empty set of options */
-    if (options == NULL) {
-        options = qdict_new();
+    if (parent_options == NULL) {
+        tmp_parent_options = qdict_new();
+        parent_options = tmp_parent_options;
    }

    bs->open_flags &= ~BDRV_O_NO_BACKING;
-    if (qdict_haskey(options, "file.filename")) {
+
+    bdref_key_dot = g_strdup_printf("%s.", bdref_key);
+    qdict_extract_subqdict(parent_options, &options, bdref_key_dot);
+    g_free(bdref_key_dot);
+
+    reference = qdict_get_try_str(parent_options, bdref_key);
+    if (reference || qdict_haskey(options, "file.filename")) {
        backing_filename[0] = '\0';
    } else if (bs->backing_file[0] == '\0' && qdict_size(options) == 0) {
        QDECREF(options);
@@ -1215,23 +1341,18 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
        goto free_exit;
    }

-    backing_hd = bdrv_new();
-
    if (bs->backing_format[0] != '\0' && !qdict_haskey(options, "driver")) {
        qdict_put(options, "driver", qstring_from_str(bs->backing_format));
    }

-    assert(bs->backing == NULL);
+    backing_hd = NULL;
    ret = bdrv_open_inherit(&backing_hd,
                            *backing_filename ? backing_filename : NULL,
-                            NULL, options, 0, bs, &child_backing, &local_err);
+                            reference, options, 0, bs, &child_backing,
+                            errp);
    if (ret < 0) {
-        bdrv_unref(backing_hd);
-        backing_hd = NULL;
        bs->open_flags |= BDRV_O_NO_BACKING;
-        error_setg(errp, "Could not open backing file: %s",
-                   error_get_pretty(local_err));
-        error_free(local_err);
+        error_prepend(errp, "Could not open backing file: ");
        goto free_exit;
    }

@@ -1240,8 +1361,11 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
    bdrv_set_backing_hd(bs, backing_hd);
    bdrv_unref(backing_hd);

+    qdict_del(parent_options, bdref_key);
+
 free_exit:
    g_free(backing_filename);
+    QDECREF(tmp_parent_options);
    return ret;
 }

@@ -1295,7 +1419,7 @@ BdrvChild *bdrv_open_child(const char *filename,
        goto done;
    }

-    c = bdrv_attach_child(parent, bs, child_role);
+    c = bdrv_attach_child(parent, bs, bdref_key, child_role);

 done:
    qdict_del(options, bdref_key);
@@ -1334,13 +1458,11 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
    opts = qemu_opts_create(bdrv_qcow2.create_opts, NULL, 0,
                            &error_abort);
    qemu_opt_set_number(opts, BLOCK_OPT_SIZE, total_size, &error_abort);
-    ret = bdrv_create(&bdrv_qcow2, tmp_filename, opts, &local_err);
+    ret = bdrv_create(&bdrv_qcow2, tmp_filename, opts, errp);
    qemu_opts_del(opts);
    if (ret < 0) {
-        error_setg_errno(errp, -ret, "Could not create temporary overlay "
-                         "'%s': %s", tmp_filename,
-                         error_get_pretty(local_err));
-        error_free(local_err);
+        error_prepend(errp, "Could not create temporary overlay '%s': ",
+                      tmp_filename);
        goto out;
    }

@@ -1394,6 +1516,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
    BlockDriverState *bs;
    BlockDriver *drv = NULL;
    const char *drvname;
+    const char *backing;
    Error *local_err = NULL;
    int snapshot_flags = 0;

@@ -1437,21 +1560,34 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
        options = qdict_new();
    }

-    if (child_role) {
-        bs->inherits_from = parent;
-        flags = child_role->inherit_flags(parent->open_flags);
+    /* json: syntax counts as explicit options, as if in the QDict */
+    parse_json_protocol(options, &filename, &local_err);
+    if (local_err) {
+        ret = -EINVAL;
+        goto fail;
    }

-    ret = bdrv_fill_options(&options, &filename, &flags, &local_err);
+    bs->explicit_options = qdict_clone_shallow(options);
+
+    if (child_role) {
+        bs->inherits_from = parent;
+        child_role->inherit_options(&flags, options,
+                                    parent->open_flags, parent->options);
+    }
+
+    ret = bdrv_fill_options(&options, filename, &flags, &local_err);
    if (local_err) {
        goto fail;
    }

+    bs->open_flags = flags;
+    bs->options = options;
+    options = qdict_clone_shallow(options);
+
    /* Find the right image format driver */
    drvname = qdict_get_try_str(options, "driver");
    if (drvname) {
        drv = bdrv_find_format(drvname);
-        qdict_del(options, "driver");
        if (!drv) {
            error_setg(errp, "Unknown driver: '%s'", drvname);
            ret = -EINVAL;
@@ -1461,9 +1597,11 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,

    assert(drvname || !(flags & BDRV_O_PROTOCOL));

-    bs->open_flags = flags;
-    bs->options = options;
-    options = qdict_clone_shallow(options);
+    backing = qdict_get_try_str(options, "backing");
+    if (backing && *backing == '\0') {
+        flags |= BDRV_O_NO_BACKING;
+        qdict_del(options, "backing");
+    }

    /* Open image file without format layer */
    if ((flags & BDRV_O_PROTOCOL) == 0) {
@@ -1472,7 +1610,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
        }
        if (flags & BDRV_O_SNAPSHOT) {
            snapshot_flags = bdrv_temp_snapshot_flags(flags);
-            flags = bdrv_backing_flags(flags);
+            bdrv_backing_options(&flags, options, flags, options);
        }

        bs->open_flags = flags;
@@ -1492,6 +1630,19 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
        if (ret < 0) {
            goto fail;
        }
+        /*
+         * This option update would logically belong in bdrv_fill_options(),
+         * but we first need to open bs->file for the probing to work, while
+         * opening bs->file already requires the (mostly) final set of options
+         * so that cache mode etc. can be inherited.
+         *
+         * Adding the driver later is somewhat ugly, but it's not an option
+         * that would ever be inherited, so it's correct. We just need to make
+         * sure to update both bs->options (which has the full effective
+         * options for bs) and options (which has file.* already removed).
+         */
+        qdict_put(bs->options, "driver", qstring_from_str(drv->format_name));
+        qdict_put(options, "driver", qstring_from_str(drv->format_name));
    } else if (!drv) {
        error_setg(errp, "Must specify either driver or file");
        ret = -EINVAL;
@@ -1505,7 +1656,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
    assert(!(flags & BDRV_O_PROTOCOL) || !file);

    /* Open the image */
-    ret = bdrv_open_common(bs, file, options, flags, drv, &local_err);
+    ret = bdrv_open_common(bs, file, options, flags, &local_err);
    if (ret < 0) {
        goto fail;
    }
@@ -1517,10 +1668,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,

    /* If there is a backing file, use it */
    if ((flags & BDRV_O_NO_BACKING) == 0) {
-        QDict *backing_options;
-
-        qdict_extract_subqdict(options, &backing_options, "backing.");
-        ret = bdrv_open_backing_file(bs, backing_options, &local_err);
+        ret = bdrv_open_backing_file(bs, options, "backing", &local_err);
        if (ret < 0) {
            goto close_and_fail;
        }
@@ -1575,6 +1723,7 @@ fail:
    if (file != NULL) {
        bdrv_unref_child(bs, file);
    }
+    QDECREF(bs->explicit_options);
    QDECREF(bs->options);
    QDECREF(options);
    bs->options = NULL;
@@ -1637,15 +1786,19 @@ typedef struct BlockReopenQueueEntry {
 * bs_queue, or the existing bs_queue being used.
 *
 */
-BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
-                                    BlockDriverState *bs,
-                                    QDict *options, int flags)
+static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
+                                                 BlockDriverState *bs,
+                                                 QDict *options,
+                                                 int flags,
+                                                 const BdrvChildRole *role,
+                                                 QDict *parent_options,
+                                                 int parent_flags)
 {
    assert(bs != NULL);

    BlockReopenQueueEntry *bs_entry;
    BdrvChild *child;
-    QDict *old_options;
+    QDict *old_options, *explicit_options;

    if (bs_queue == NULL) {
        bs_queue = g_new0(BlockReopenQueue, 1);
@@ -1656,23 +1809,63 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
        options = qdict_new();
    }

+    /*
+     * Precedence of options:
+     * 1. Explicitly passed in options (highest)
+     * 2. Set in flags (only for top level)
+     * 3. Retained from explicitly set options of bs
+     * 4. Inherited from parent node
+     * 5. Retained from effective options of bs
+     */
+
+    if (!parent_options) {
+        /*
+         * Any setting represented by flags is always updated. If the
+         * corresponding QDict option is set, it takes precedence. Otherwise
+         * the flag is translated into a QDict option. The old setting of bs is
+         * not considered.
+         */
+        update_options_from_flags(options, flags);
+    }
+
+    /* Old explicitly set values (don't overwrite by inherited value) */
+    old_options = qdict_clone_shallow(bs->explicit_options);
+    bdrv_join_options(bs, options, old_options);
+    QDECREF(old_options);
+
+    explicit_options = qdict_clone_shallow(options);
+
+    /* Inherit from parent node */
+    if (parent_options) {
+        assert(!flags);
+        role->inherit_options(&flags, options, parent_flags, parent_options);
+    }
+
+    /* Old values are used for options that aren't set yet */
    old_options = qdict_clone_shallow(bs->options);
-    qdict_join(options, old_options, false);
+    bdrv_join_options(bs, options, old_options);
    QDECREF(old_options);

    /* bdrv_open() masks this flag out */
    flags &= ~BDRV_O_PROTOCOL;

    QLIST_FOREACH(child, &bs->children, next) {
-        int child_flags;
+        QDict *new_child_options;
+        char *child_key_dot;

+        /* reopen can only change the options of block devices that were
+         * implicitly created and inherited options. For other (referenced)
+         * block devices, a syntax like "backing.foo" results in an error. */
        if (child->bs->inherits_from != bs) {
            continue;
        }

-        child_flags = child->role->inherit_flags(flags);
-        /* TODO Pass down child flags (backing.*, extents.*, ...) */
-        bdrv_reopen_queue(bs_queue, child->bs, NULL, child_flags);
+        child_key_dot = g_strdup_printf("%s.", child->name);
+        qdict_extract_subqdict(options, &new_child_options, child_key_dot);
+        g_free(child_key_dot);
+
+        bdrv_reopen_queue_child(bs_queue, child->bs, new_child_options, 0,
+                                child->role, options, flags);
    }

    bs_entry = g_new0(BlockReopenQueueEntry, 1);
@@ -1680,11 +1873,20 @@ BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,

    bs_entry->state.bs = bs;
    bs_entry->state.options = options;
+    bs_entry->state.explicit_options = explicit_options;
    bs_entry->state.flags = flags;

    return bs_queue;
 }

+BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
+                                    BlockDriverState *bs,
+                                    QDict *options, int flags)
+{
+    return bdrv_reopen_queue_child(bs_queue, bs, options, flags,
+                                   NULL, NULL, 0);
+}
+
 /*
 * Reopen multiple BlockDriverStates atomically & transactionally.
 *
@@ -1731,6 +1933,8 @@ cleanup:
    QSIMPLEQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
        if (ret && bs_entry->prepared) {
            bdrv_reopen_abort(&bs_entry->state);
+        } else if (ret) {
+            QDECREF(bs_entry->state.explicit_options);
        }
        QDECREF(bs_entry->state.options);
        g_free(bs_entry);
@@ -1778,11 +1982,47 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
    int ret = -1;
    Error *local_err = NULL;
    BlockDriver *drv;
+    QemuOpts *opts;
+    const char *value;

    assert(reopen_state != NULL);
    assert(reopen_state->bs->drv != NULL);
    drv = reopen_state->bs->drv;

+    /* Process generic block layer options */
+    opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, reopen_state->options, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        ret = -EINVAL;
+        goto error;
+    }
+
+    update_flags_from_options(&reopen_state->flags, opts);
+
+    /* If a guest device is attached, it owns WCE */
+    if (reopen_state->bs->blk && blk_get_attached_dev(reopen_state->bs->blk)) {
+        bool old_wce = bdrv_enable_write_cache(reopen_state->bs);
+        bool new_wce = (reopen_state->flags & BDRV_O_CACHE_WB);
+        if (old_wce != new_wce) {
+            error_setg(errp, "Cannot change cache.writeback: Device attached");
+            ret = -EINVAL;
+            goto error;
+        }
+    }
+
+    /* node-name and driver must be unchanged. Put them back into the QDict, so
+     * that they are checked at the end of this function. */
+    value = qemu_opt_get(opts, "node-name");
+    if (value) {
+        qdict_put(reopen_state->options, "node-name", qstring_from_str(value));
+    }
+
+    value = qemu_opt_get(opts, "driver");
+    if (value) {
+        qdict_put(reopen_state->options, "driver", qstring_from_str(value));
+    }
+
    /* if we are to stay read-only, do not allow permission change
     * to r/w */
    if (!(reopen_state->bs->open_flags & BDRV_O_ALLOW_RDWR) &&
@@ -1795,8 +2035,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,

    ret = bdrv_flush(reopen_state->bs);
    if (ret) {
-        error_set(errp, ERROR_CLASS_GENERIC_ERROR, "Error (%s) flushing drive",
-                  strerror(-ret));
+        error_setg_errno(errp, -ret, "Error flushing drive");
        goto error;
    }

@@ -1844,6 +2083,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
    ret = 0;

 error:
+    qemu_opts_del(opts);
    return ret;
 }

@@ -1866,6 +2106,9 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
    }

    /* set BDS specific flags now */
+    QDECREF(reopen_state->bs->explicit_options);
+
+    reopen_state->bs->explicit_options   = reopen_state->explicit_options;
    reopen_state->bs->open_flags         = reopen_state->flags;
    reopen_state->bs->enable_write_cache = !!(reopen_state->flags &
                                              BDRV_O_CACHE_WB);
@@ -1889,6 +2132,8 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
    if (drv->bdrv_reopen_abort) {
        drv->bdrv_reopen_abort(reopen_state);
    }
+
+    QDECREF(reopen_state->explicit_options);
 }


@@ -1901,13 +2146,14 @@ void bdrv_close(BlockDriverState *bs)
    }

    /* Disable I/O limits and drain all pending throttled requests */
-    if (bs->io_limits_enabled) {
+    if (bs->throttle_state) {
        bdrv_io_limits_disable(bs);
    }

-    bdrv_drain(bs); /* complete I/O */
+    bdrv_drained_begin(bs); /* complete I/O */
    bdrv_flush(bs);
    bdrv_drain(bs); /* in case flush left pending I/O */
+
    notifier_list_notify(&bs->close_notifiers, bs);

    if (bs->blk) {
@@ -1947,6 +2193,7 @@ void bdrv_close(BlockDriverState *bs)
        bs->sg = 0;
        bs->zero_beyond_eof = false;
        QDECREF(bs->options);
+        QDECREF(bs->explicit_options);
        bs->options = NULL;
        QDECREF(bs->full_open_options);
        bs->full_open_options = NULL;
@@ -1956,6 +2203,7 @@ void bdrv_close(BlockDriverState *bs)
        g_free(ban);
    }
    QLIST_INIT(&bs->aio_notifiers);
+    bdrv_drained_end(bs);
 }

 void bdrv_close_all(void)
@@ -2683,12 +2931,12 @@ BlockDriverState *bdrv_lookup_bs(const char *device,
        blk = blk_by_name(device);

        if (blk) {
-            if (!blk_bs(blk)) {
+            bs = blk_bs(blk);
+            if (!bs) {
                error_setg(errp, "Device '%s' has no medium", device);
-                return NULL;
            }

-            return blk_bs(blk);
+            return bs;
        }
    }

@@ -2846,7 +3094,7 @@ ImageInfoSpecific *bdrv_get_specific_info(BlockDriverState *bs)
    return NULL;
 }

-void bdrv_debug_event(BlockDriverState *bs, BlkDebugEvent event)
+void bdrv_debug_event(BlockDriverState *bs, BlkdebugEvent event)
 {
    if (!bs || !bs->drv || !bs->drv->bdrv_debug_event) {
        return;
@@ -3399,10 +3647,25 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
    hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
 }

-void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 {
    assert(bdrv_dirty_bitmap_enabled(bitmap));
-    hbitmap_reset_all(bitmap->bitmap);
+    if (!out) {
+        hbitmap_reset_all(bitmap->bitmap);
+    } else {
+        HBitmap *backup = bitmap->bitmap;
+        bitmap->bitmap = hbitmap_alloc(bitmap->size,
+                                       hbitmap_granularity(backup));
+        *out = backup;
+    }
+}
+
+void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
+{
+    HBitmap *tmp = bitmap->bitmap;
+    assert(bdrv_dirty_bitmap_enabled(bitmap));
+    bitmap->bitmap = in;
+    hbitmap_free(tmp);
 }

 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
@@ -3463,9 +3726,9 @@ bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp)
    if (!QLIST_EMPTY(&bs->op_blockers[op])) {
        blocker = QLIST_FIRST(&bs->op_blockers[op]);
        if (errp) {
-            error_setg(errp, "Node '%s' is busy: %s",
-                       bdrv_get_device_or_node_name(bs),
-                       error_get_pretty(blocker->reason));
+            *errp = error_copy(blocker->reason);
+            error_prepend(errp, "Node '%s' is busy: ",
+                          bdrv_get_device_or_node_name(bs));
        }
        return true;
    }
@@ -3706,7 +3969,7 @@ void bdrv_detach_aio_context(BlockDriverState *bs)
        baf->detach_aio_context(baf->opaque);
    }

-    if (bs->io_limits_enabled) {
+    if (bs->throttle_state) {
        throttle_timers_detach_aio_context(&bs->throttle_timers);
    }
    if (bs->drv->bdrv_detach_aio_context) {
@@ -3742,7 +4005,7 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
    if (bs->drv->bdrv_attach_aio_context) {
        bs->drv->bdrv_attach_aio_context(bs, new_context);
    }
-    if (bs->io_limits_enabled) {
+    if (bs->throttle_state) {
        throttle_timers_attach_aio_context(&bs->throttle_timers, new_context);
    }

@@ -3803,12 +4066,12 @@ void bdrv_remove_aio_context_notifier(BlockDriverState *bs,
 }

 int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts,
-                       BlockDriverAmendStatusCB *status_cb)
+                       BlockDriverAmendStatusCB *status_cb, void *cb_opaque)
 {
    if (!bs->drv->bdrv_amend_options) {
        return -ENOTSUP;
    }
-    return bs->drv->bdrv_amend_options(bs, opts, status_cb);
+    return bs->drv->bdrv_amend_options(bs, opts, status_cb, cb_opaque);
 }

 /* This function will be called by the bdrv_recurse_is_first_non_filter method
@@ -3906,20 +4169,39 @@ out:
 static bool append_open_options(QDict *d, BlockDriverState *bs)
 {
    const QDictEntry *entry;
+    QemuOptDesc *desc;
+    BdrvChild *child;
    bool found_any = false;
+    const char *p;

    for (entry = qdict_first(bs->options); entry;
         entry = qdict_next(bs->options, entry))
    {
-        /* Only take options for this level and exclude all non-driver-specific
-         * options */
-        if (!strchr(qdict_entry_key(entry), '.') &&
-            strcmp(qdict_entry_key(entry), "node-name"))
-        {
-            qobject_incref(qdict_entry_value(entry));
-            qdict_put_obj(d, qdict_entry_key(entry), qdict_entry_value(entry));
-            found_any = true;
+        /* Exclude options for children */
+        QLIST_FOREACH(child, &bs->children, next) {
+            if (strstart(qdict_entry_key(entry), child->name, &p)
+                && (!*p || *p == '.'))
+            {
+                break;
+            }
        }
+        if (child) {
+            continue;
+        }
+
+        /* And exclude all non-driver-specific options */
+        for (desc = bdrv_runtime_opts.desc; desc->name; desc++) {
+            if (!strcmp(qdict_entry_key(entry), desc->name)) {
+                break;
+            }
+        }
+        if (desc->name) {
+            continue;
+        }
+
+        qobject_incref(qdict_entry_value(entry));
+        qdict_put_obj(d, qdict_entry_key(entry), qdict_entry_value(entry));
+        found_any = true;
    }

    return found_any;
@@ -3961,7 +4243,10 @@ void bdrv_refresh_filename(BlockDriverState *bs)
            bs->full_open_options = NULL;
        }

-        drv->bdrv_refresh_filename(bs);
+        opts = qdict_new();
+        append_open_options(opts, bs);
+        drv->bdrv_refresh_filename(bs, opts);
+        QDECREF(opts);
    } else if (bs->file) {
        /* Try to reconstruct valid information from the underlying file */
        bool has_open_options;
--- a/block/accounting.c
+++ b/block/accounting.c
@@ -2,6 +2,7 @@
 * QEMU System Emulator block accounting
 *
 * Copyright (c) 2011 Christoph Hellwig
+ * Copyright (c) 2015 Igalia, S.L.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
@@ -25,6 +26,54 @@
 #include "block/accounting.h"
 #include "block/block_int.h"
 #include "qemu/timer.h"
+#include "sysemu/qtest.h"
+
+static QEMUClockType clock_type = QEMU_CLOCK_REALTIME;
+static const int qtest_latency_ns = NANOSECONDS_PER_SECOND / 1000;
+
+void block_acct_init(BlockAcctStats *stats, bool account_invalid,
+                     bool account_failed)
+{
+    stats->account_invalid = account_invalid;
+    stats->account_failed = account_failed;
+
+    if (qtest_enabled()) {
+        clock_type = QEMU_CLOCK_VIRTUAL;
+    }
+}
+
+void block_acct_cleanup(BlockAcctStats *stats)
+{
+    BlockAcctTimedStats *s, *next;
+    QSLIST_FOREACH_SAFE(s, &stats->intervals, entries, next) {
+        g_free(s);
+    }
+}
+
+void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length)
+{
+    BlockAcctTimedStats *s;
+    unsigned i;
+
+    s = g_new0(BlockAcctTimedStats, 1);
+    s->interval_length = interval_length;
+    QSLIST_INSERT_HEAD(&stats->intervals, s, entries);
+
+    for (i = 0; i < BLOCK_MAX_IOTYPE; i++) {
+        timed_average_init(&s->latency[i], clock_type,
+                           (uint64_t) interval_length * NANOSECONDS_PER_SECOND);
+    }
+}
+
+BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats,
+                                              BlockAcctTimedStats *s)
+{
+    if (s == NULL) {
+        return QSLIST_FIRST(&stats->intervals);
+    } else {
+        return QSLIST_NEXT(s, entries);
+    }
+}

 void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
                      int64_t bytes, enum BlockAcctType type)
@@ -32,20 +81,71 @@ void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
    assert(type < BLOCK_MAX_IOTYPE);

    cookie->bytes = bytes;
-    cookie->start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+    cookie->start_time_ns = qemu_clock_get_ns(clock_type);
    cookie->type = type;
 }

 void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
 {
+    BlockAcctTimedStats *s;
+    int64_t time_ns = qemu_clock_get_ns(clock_type);
+    int64_t latency_ns = time_ns - cookie->start_time_ns;
+
+    if (qtest_enabled()) {
+        latency_ns = qtest_latency_ns;
+    }
+
    assert(cookie->type < BLOCK_MAX_IOTYPE);

    stats->nr_bytes[cookie->type] += cookie->bytes;
    stats->nr_ops[cookie->type]++;
-    stats->total_time_ns[cookie->type] +=
-        qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - cookie->start_time_ns;
+    stats->total_time_ns[cookie->type] += latency_ns;
+    stats->last_access_time_ns = time_ns;
+
+    QSLIST_FOREACH(s, &stats->intervals, entries) {
+        timed_average_account(&s->latency[cookie->type], latency_ns);
+    }
 }

+void block_acct_failed(BlockAcctStats *stats, BlockAcctCookie *cookie)
+{
+    assert(cookie->type < BLOCK_MAX_IOTYPE);
+
+    stats->failed_ops[cookie->type]++;
+
+    if (stats->account_failed) {
+        BlockAcctTimedStats *s;
+        int64_t time_ns = qemu_clock_get_ns(clock_type);
+        int64_t latency_ns = time_ns - cookie->start_time_ns;
+
+        if (qtest_enabled()) {
+            latency_ns = qtest_latency_ns;
+        }
+
+        stats->total_time_ns[cookie->type] += latency_ns;
+        stats->last_access_time_ns = time_ns;
+
+        QSLIST_FOREACH(s, &stats->intervals, entries) {
+            timed_average_account(&s->latency[cookie->type], latency_ns);
+        }
+    }
+}
+
+void block_acct_invalid(BlockAcctStats *stats, enum BlockAcctType type)
+{
+    assert(type < BLOCK_MAX_IOTYPE);
+
+    /* block_acct_done() and block_acct_failed() update
+     * total_time_ns[], but this one does not. The reason is that
+     * invalid requests are accounted during their submission,
+     * therefore there's no actual I/O involved. */
+
+    stats->invalid_ops[type]++;
+
+    if (stats->account_invalid) {
+        stats->last_access_time_ns = qemu_clock_get_ns(clock_type);
+    }
+}

 void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
                      int num_requests)
@@ -53,3 +153,20 @@ void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
    assert(type < BLOCK_MAX_IOTYPE);
    stats->merged[type] += num_requests;
 }
+
+int64_t block_acct_idle_time_ns(BlockAcctStats *stats)
+{
+    return qemu_clock_get_ns(clock_type) - stats->last_access_time_ns;
+}
+
+double block_acct_queue_depth(BlockAcctTimedStats *stats,
+                              enum BlockAcctType type)
+{
+    uint64_t sum, elapsed;
+
+    assert(type < BLOCK_MAX_IOTYPE);
+
+    sum = timed_average_sum(&stats->latency[type], &elapsed);
+
+    return (double) sum / elapsed;
+}
--- a/block/backup.c
+++ b/block/backup.c
@@ -132,7 +132,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
        qemu_iovec_init_external(&bounce_qiov, &iov, 1);

        if (is_write_notifier) {
-            ret = bdrv_co_no_copy_on_readv(bs,
+            ret = bdrv_co_readv_no_serialising(bs,
                                           start * BACKUP_SECTORS_PER_CLUSTER,
                                           n, &bounce_qiov);
        } else {
@@ -221,11 +221,45 @@ static void backup_iostatus_reset(BlockJob *job)
    }
 }

+static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
+{
+    BdrvDirtyBitmap *bm;
+    BlockDriverState *bs = job->common.bs;
+
+    if (ret < 0 || block_job_is_cancelled(&job->common)) {
+        /* Merge the successor back into the parent, delete nothing. */
+        bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
+        assert(bm);
+    } else {
+        /* Everything is fine, delete this bitmap and install the backup. */
+        bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
+        assert(bm);
+    }
+}
+
+static void backup_commit(BlockJob *job)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+    if (s->sync_bitmap) {
+        backup_cleanup_sync_bitmap(s, 0);
+    }
+}
+
+static void backup_abort(BlockJob *job)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+    if (s->sync_bitmap) {
+        backup_cleanup_sync_bitmap(s, -1);
+    }
+}
+
 static const BlockJobDriver backup_job_driver = {
    .instance_size  = sizeof(BackupBlockJob),
    .job_type       = BLOCK_JOB_TYPE_BACKUP,
    .set_speed      = backup_set_speed,
    .iostatus_reset = backup_iostatus_reset,
+    .commit         = backup_commit,
+    .abort          = backup_abort,
 };

 static BlockErrorAction backup_error_action(BackupBlockJob *job,
@@ -441,19 +475,6 @@ static void coroutine_fn backup_run(void *opaque)
    /* wait until pending backup_do_cow() calls have completed */
    qemu_co_rwlock_wrlock(&job->flush_rwlock);
    qemu_co_rwlock_unlock(&job->flush_rwlock);
-
-    if (job->sync_bitmap) {
-        BdrvDirtyBitmap *bm;
-        if (ret < 0 || block_job_is_cancelled(&job->common)) {
-            /* Merge the successor back into the parent, delete nothing. */
-            bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
-            assert(bm);
-        } else {
-            /* Everything is fine, delete this bitmap and install the backup. */
-            bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
-            assert(bm);
-        }
-    }
    hbitmap_free(job->bitmap);

    if (target->blk) {
@@ -472,7 +493,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
                  BlockdevOnError on_source_error,
                  BlockdevOnError on_target_error,
                  BlockCompletionFunc *cb, void *opaque,
-                  Error **errp)
+                  BlockJobTxn *txn, Error **errp)
 {
    int64_t len;

@@ -554,6 +575,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
                       sync_bitmap : NULL;
    job->common.len = len;
    job->common.co = qemu_coroutine_create(backup_run);
+    block_job_txn_add_job(txn, &job->common);
    qemu_coroutine_enter(job->common.co, job);
    return;

--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -30,12 +30,13 @@
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qint.h"
 #include "qapi/qmp/qstring.h"
+#include "sysemu/qtest.h"

 typedef struct BDRVBlkdebugState {
    int state;
    int new_state;

-    QLIST_HEAD(, BlkdebugRule) rules[BLKDBG_EVENT_MAX];
+    QLIST_HEAD(, BlkdebugRule) rules[BLKDBG__MAX];
    QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
    QLIST_HEAD(, BlkdebugSuspendedReq) suspended_reqs;
 } BDRVBlkdebugState;
@@ -63,7 +64,7 @@ enum {
 };

 typedef struct BlkdebugRule {
-    BlkDebugEvent event;
+    BlkdebugEvent event;
    int action;
    int state;
    union {
@@ -142,69 +143,12 @@ static QemuOptsList *config_groups[] = {
    NULL
 };

-static const char *event_names[BLKDBG_EVENT_MAX] = {
-    [BLKDBG_L1_UPDATE]                      = "l1_update",
-    [BLKDBG_L1_GROW_ALLOC_TABLE]            = "l1_grow.alloc_table",
-    [BLKDBG_L1_GROW_WRITE_TABLE]            = "l1_grow.write_table",
-    [BLKDBG_L1_GROW_ACTIVATE_TABLE]         = "l1_grow.activate_table",
-
-    [BLKDBG_L2_LOAD]                        = "l2_load",
-    [BLKDBG_L2_UPDATE]                      = "l2_update",
-    [BLKDBG_L2_UPDATE_COMPRESSED]           = "l2_update_compressed",
-    [BLKDBG_L2_ALLOC_COW_READ]              = "l2_alloc.cow_read",
-    [BLKDBG_L2_ALLOC_WRITE]                 = "l2_alloc.write",
-
-    [BLKDBG_READ_AIO]                       = "read_aio",
-    [BLKDBG_READ_BACKING_AIO]               = "read_backing_aio",
-    [BLKDBG_READ_COMPRESSED]                = "read_compressed",
-
-    [BLKDBG_WRITE_AIO]                      = "write_aio",
-    [BLKDBG_WRITE_COMPRESSED]               = "write_compressed",
-
-    [BLKDBG_VMSTATE_LOAD]                   = "vmstate_load",
-    [BLKDBG_VMSTATE_SAVE]                   = "vmstate_save",
-
-    [BLKDBG_COW_READ]                       = "cow_read",
-    [BLKDBG_COW_WRITE]                      = "cow_write",
-
-    [BLKDBG_REFTABLE_LOAD]                  = "reftable_load",
-    [BLKDBG_REFTABLE_GROW]                  = "reftable_grow",
-    [BLKDBG_REFTABLE_UPDATE]                = "reftable_update",
-
-    [BLKDBG_REFBLOCK_LOAD]                  = "refblock_load",
-    [BLKDBG_REFBLOCK_UPDATE]                = "refblock_update",
-    [BLKDBG_REFBLOCK_UPDATE_PART]           = "refblock_update_part",
-    [BLKDBG_REFBLOCK_ALLOC]                 = "refblock_alloc",
-    [BLKDBG_REFBLOCK_ALLOC_HOOKUP]          = "refblock_alloc.hookup",
-    [BLKDBG_REFBLOCK_ALLOC_WRITE]           = "refblock_alloc.write",
-    [BLKDBG_REFBLOCK_ALLOC_WRITE_BLOCKS]    = "refblock_alloc.write_blocks",
-    [BLKDBG_REFBLOCK_ALLOC_WRITE_TABLE]     = "refblock_alloc.write_table",
-    [BLKDBG_REFBLOCK_ALLOC_SWITCH_TABLE]    = "refblock_alloc.switch_table",
-
-    [BLKDBG_CLUSTER_ALLOC]                  = "cluster_alloc",
-    [BLKDBG_CLUSTER_ALLOC_BYTES]            = "cluster_alloc_bytes",
-    [BLKDBG_CLUSTER_FREE]                   = "cluster_free",
-
-    [BLKDBG_FLUSH_TO_OS]                    = "flush_to_os",
-    [BLKDBG_FLUSH_TO_DISK]                  = "flush_to_disk",
-
-    [BLKDBG_PWRITEV_RMW_HEAD]               = "pwritev_rmw.head",
-    [BLKDBG_PWRITEV_RMW_AFTER_HEAD]         = "pwritev_rmw.after_head",
-    [BLKDBG_PWRITEV_RMW_TAIL]               = "pwritev_rmw.tail",
-    [BLKDBG_PWRITEV_RMW_AFTER_TAIL]         = "pwritev_rmw.after_tail",
-    [BLKDBG_PWRITEV]                        = "pwritev",
-    [BLKDBG_PWRITEV_ZERO]                   = "pwritev_zero",
-    [BLKDBG_PWRITEV_DONE]                   = "pwritev_done",
-
-    [BLKDBG_EMPTY_IMAGE_PREPARE]            = "empty_image_prepare",
-};
-
-static int get_event_by_name(const char *name, BlkDebugEvent *event)
+static int get_event_by_name(const char *name, BlkdebugEvent *event)
 {
    int i;

-    for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
-        if (!strcmp(event_names[i], name)) {
+    for (i = 0; i < BLKDBG__MAX; i++) {
+        if (!strcmp(BlkdebugEvent_lookup[i], name)) {
            *event = i;
            return 0;
        }
@@ -223,7 +167,7 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
    struct add_rule_data *d = opaque;
    BDRVBlkdebugState *s = d->s;
    const char* event_name;
-    BlkDebugEvent event;
+    BlkdebugEvent event;
    struct BlkdebugRule *rule;

    /* Find the right event for the rule */
@@ -563,7 +507,7 @@ static void blkdebug_close(BlockDriverState *bs)
    BlkdebugRule *rule, *next;
    int i;

-    for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
+    for (i = 0; i < BLKDBG__MAX; i++) {
        QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
            remove_rule(rule);
        }
@@ -583,9 +527,13 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
    remove_rule(rule);
    QLIST_INSERT_HEAD(&s->suspended_reqs, &r, next);

-    printf("blkdebug: Suspended request '%s'\n", r.tag);
+    if (!qtest_enabled()) {
+        printf("blkdebug: Suspended request '%s'\n", r.tag);
+    }
    qemu_coroutine_yield();
-    printf("blkdebug: Resuming request '%s'\n", r.tag);
+    if (!qtest_enabled()) {
+        printf("blkdebug: Resuming request '%s'\n", r.tag);
+    }

    QLIST_REMOVE(&r, next);
    g_free(r.tag);
@@ -622,13 +570,13 @@ static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
    return injected;
 }

-static void blkdebug_debug_event(BlockDriverState *bs, BlkDebugEvent event)
+static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
 {
    BDRVBlkdebugState *s = bs->opaque;
    struct BlkdebugRule *rule, *next;
    bool injected;

-    assert((int)event >= 0 && event < BLKDBG_EVENT_MAX);
+    assert((int)event >= 0 && event < BLKDBG__MAX);

    injected = false;
    s->new_state = s->state;
@@ -643,7 +591,7 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
 {
    BDRVBlkdebugState *s = bs->opaque;
    struct BlkdebugRule *rule;
-    BlkDebugEvent blkdebug_event;
+    BlkdebugEvent blkdebug_event;

    if (get_event_by_name(event, &blkdebug_event) < 0) {
        return -ENOENT;
@@ -685,7 +633,7 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
    BlkdebugRule *rule, *next;
    int i, ret = -ENOENT;

-    for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
+    for (i = 0; i < BLKDBG__MAX; i++) {
        QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
            if (rule->action == ACTION_SUSPEND &&
                !strcmp(rule->options.suspend.tag, tag)) {
@@ -726,17 +674,15 @@ static int blkdebug_truncate(BlockDriverState *bs, int64_t offset)
    return bdrv_truncate(bs->file->bs, offset);
 }

-static void blkdebug_refresh_filename(BlockDriverState *bs)
+static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
 {
    QDict *opts;
    const QDictEntry *e;
    bool force_json = false;

-    for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
+    for (e = qdict_first(options); e; e = qdict_next(options, e)) {
        if (strcmp(qdict_entry_key(e), "config") &&
-            strcmp(qdict_entry_key(e), "x-image") &&
-            strcmp(qdict_entry_key(e), "image") &&
-            strncmp(qdict_entry_key(e), "image.", strlen("image.")))
+            strcmp(qdict_entry_key(e), "x-image"))
        {
            force_json = true;
            break;
@@ -752,7 +698,7 @@ static void blkdebug_refresh_filename(BlockDriverState *bs)
    if (!force_json && bs->file->bs->exact_filename[0]) {
        snprintf(bs->exact_filename, sizeof(bs->exact_filename),
                 "blkdebug:%s:%s",
-                 qdict_get_try_str(bs->options, "config") ?: "",
+                 qdict_get_try_str(options, "config") ?: "",
                 bs->file->bs->exact_filename);
    }

@@ -762,11 +708,8 @@ static void blkdebug_refresh_filename(BlockDriverState *bs)
    QINCREF(bs->file->bs->full_open_options);
    qdict_put_obj(opts, "image", QOBJECT(bs->file->bs->full_open_options));

-    for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
-        if (strcmp(qdict_entry_key(e), "x-image") &&
-            strcmp(qdict_entry_key(e), "image") &&
-            strncmp(qdict_entry_key(e), "image.", strlen("image.")))
-        {
+    for (e = qdict_first(options); e; e = qdict_next(options, e)) {
+        if (strcmp(qdict_entry_key(e), "x-image")) {
            qobject_incref(qdict_entry_value(e));
            qdict_put_obj(opts, qdict_entry_key(e), qdict_entry_value(e));
        }
@@ -775,6 +718,12 @@ static void blkdebug_refresh_filename(BlockDriverState *bs)
    bs->full_open_options = opts;
 }

+static int blkdebug_reopen_prepare(BDRVReopenState *reopen_state,
+                                   BlockReopenQueue *queue, Error **errp)
+{
+    return 0;
+}
+
 static BlockDriver bdrv_blkdebug = {
    .format_name            = "blkdebug",
    .protocol_name          = "blkdebug",
@@ -783,6 +732,7 @@ static BlockDriver bdrv_blkdebug = {
    .bdrv_parse_filename    = blkdebug_parse_filename,
    .bdrv_file_open         = blkdebug_open,
    .bdrv_close             = blkdebug_close,
+    .bdrv_reopen_prepare    = blkdebug_reopen_prepare,
    .bdrv_getlength         = blkdebug_getlength,
    .bdrv_truncate          = blkdebug_truncate,
    .bdrv_refresh_filename  = blkdebug_refresh_filename,
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -307,7 +307,7 @@ static void blkverify_attach_aio_context(BlockDriverState *bs,
    bdrv_attach_aio_context(s->test_file->bs, new_context);
 }

-static void blkverify_refresh_filename(BlockDriverState *bs)
+static void blkverify_refresh_filename(BlockDriverState *bs, QDict *options)
 {
    BDRVBlkverifyState *s = bs->opaque;

--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -176,6 +176,7 @@ static void blk_delete(BlockBackend *blk)
    }
    g_free(blk->name);
    drive_info_del(blk->legacy_dinfo);
+    block_acct_cleanup(&blk->stats);
    g_free(blk);
 }

@@ -189,6 +190,11 @@ static void drive_info_del(DriveInfo *dinfo)
    g_free(dinfo);
 }

+int blk_get_refcnt(BlockBackend *blk)
+{
+    return blk ? blk->refcnt : 0;
+}
+
 /*
 * Increment @blk's reference count.
 * @blk must not be null.
@@ -333,6 +339,18 @@ void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend *blk)
    }
 }

+/*
+ * Disassociates the currently associated BlockDriverState from @blk.
+ */
+void blk_remove_bs(BlockBackend *blk)
+{
+    blk_update_root_state(blk);
+
+    blk->bs->blk = NULL;
+    bdrv_unref(blk->bs);
+    blk->bs = NULL;
+}
+
 /*
 * Associates a new BlockDriverState with @blk.
 */
@@ -417,18 +435,15 @@ void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
 void blk_dev_change_media_cb(BlockBackend *blk, bool load)
 {
    if (blk->dev_ops && blk->dev_ops->change_media_cb) {
-        bool tray_was_closed = !blk_dev_is_tray_open(blk);
+        bool tray_was_open, tray_is_open;

+        tray_was_open = blk_dev_is_tray_open(blk);
        blk->dev_ops->change_media_cb(blk->dev_opaque, load);
-        if (tray_was_closed) {
-            /* tray open */
-            qapi_event_send_device_tray_moved(blk_name(blk),
-                                              true, &error_abort);
-        }
-        if (load) {
-            /* tray close */
-            qapi_event_send_device_tray_moved(blk_name(blk),
-                                              false, &error_abort);
+        tray_is_open = blk_dev_is_tray_open(blk);
+
+        if (tray_was_open != tray_is_open) {
+            qapi_event_send_device_tray_moved(blk_name(blk), tray_is_open,
+                                              &error_abort);
        }
    }
 }
@@ -627,8 +642,9 @@ static void error_callback_bh(void *opaque)
    qemu_aio_unref(acb);
 }

-static BlockAIOCB *abort_aio_request(BlockBackend *blk, BlockCompletionFunc *cb,
-                                     void *opaque, int ret)
+BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
+                                  BlockCompletionFunc *cb,
+                                  void *opaque, int ret)
 {
    struct BlockBackendAIOCB *acb;
    QEMUBH *bh;
@@ -650,7 +666,7 @@ BlockAIOCB *blk_aio_write_zeroes(BlockBackend *blk, int64_t sector_num,
 {
    int ret = blk_check_request(blk, sector_num, nb_sectors);
    if (ret < 0) {
-        return abort_aio_request(blk, cb, opaque, ret);
+        return blk_abort_aio_request(blk, cb, opaque, ret);
    }

    return bdrv_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
@@ -710,7 +726,7 @@ BlockAIOCB *blk_aio_readv(BlockBackend *blk, int64_t sector_num,
 {
    int ret = blk_check_request(blk, sector_num, nb_sectors);
    if (ret < 0) {
-        return abort_aio_request(blk, cb, opaque, ret);
+        return blk_abort_aio_request(blk, cb, opaque, ret);
    }

    return bdrv_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
@@ -722,7 +738,7 @@ BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t sector_num,
 {
    int ret = blk_check_request(blk, sector_num, nb_sectors);
    if (ret < 0) {
-        return abort_aio_request(blk, cb, opaque, ret);
+        return blk_abort_aio_request(blk, cb, opaque, ret);
    }

    return bdrv_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
@@ -732,7 +748,7 @@ BlockAIOCB *blk_aio_flush(BlockBackend *blk,
                          BlockCompletionFunc *cb, void *opaque)
 {
    if (!blk_is_available(blk)) {
-        return abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
+        return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
    }

    return bdrv_aio_flush(blk->bs, cb, opaque);
@@ -744,7 +760,7 @@ BlockAIOCB *blk_aio_discard(BlockBackend *blk,
 {
    int ret = blk_check_request(blk, sector_num, nb_sectors);
    if (ret < 0) {
-        return abort_aio_request(blk, cb, opaque, ret);
+        return blk_abort_aio_request(blk, cb, opaque, ret);
    }

    return bdrv_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
@@ -787,7 +803,7 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
                          BlockCompletionFunc *cb, void *opaque)
 {
    if (!blk_is_available(blk)) {
-        return abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
+        return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
    }

    return bdrv_aio_ioctl(blk->bs, req, buf, cb, opaque);
@@ -1007,11 +1023,21 @@ int blk_get_max_transfer_length(BlockBackend *blk)
    }
 }

+int blk_get_max_iov(BlockBackend *blk)
+{
+    return blk->bs->bl.max_iov;
+}
+
 void blk_set_guest_block_size(BlockBackend *blk, int align)
 {
    blk->guest_block_size = align;
 }

+void *blk_try_blockalign(BlockBackend *blk, size_t size)
+{
+    return qemu_try_blockalign(blk ? blk->bs : NULL, size);
+}
+
 void *blk_blockalign(BlockBackend *blk, size_t size)
 {
    return qemu_blockalign(blk ? blk->bs : NULL, size);
@@ -1227,6 +1253,33 @@ void blk_update_root_state(BlockBackend *blk)
    }
 }

+/*
+ * Applies the information in the root state to the given BlockDriverState. This
+ * does not include the flags which have to be specified for bdrv_open(), use
+ * blk_get_open_flags_from_root_state() to inquire them.
+ */
+void blk_apply_root_state(BlockBackend *blk, BlockDriverState *bs)
+{
+    bs->detect_zeroes = blk->root_state.detect_zeroes;
+    if (blk->root_state.throttle_group) {
+        bdrv_io_limits_enable(bs, blk->root_state.throttle_group);
+    }
+}
+
+/*
+ * Returns the flags to be used for bdrv_open() of a BlockDriverState which is
+ * supposed to inherit the root state.
+ */
+int blk_get_open_flags_from_root_state(BlockBackend *blk)
+{
+    int bs_flags;
+
+    bs_flags = blk->root_state.read_only ? 0 : BDRV_O_RDWR;
+    bs_flags |= blk->root_state.open_flags & ~BDRV_O_RDWR;
+
+    return bs_flags;
+}
+
 BlockBackendRootState *blk_get_root_state(BlockBackend *blk)
 {
    return &blk->root_state;
--- a/block/commit.c
+++ b/block/commit.c
@@ -236,14 +236,14 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
    orig_overlay_flags = bdrv_get_flags(overlay_bs);

    /* convert base & overlay_bs to r/w, if necessary */
-    if (!(orig_base_flags & BDRV_O_RDWR)) {
-        reopen_queue = bdrv_reopen_queue(reopen_queue, base, NULL,
-                                         orig_base_flags | BDRV_O_RDWR);
-    }
    if (!(orig_overlay_flags & BDRV_O_RDWR)) {
        reopen_queue = bdrv_reopen_queue(reopen_queue, overlay_bs, NULL,
                                         orig_overlay_flags | BDRV_O_RDWR);
    }
+    if (!(orig_base_flags & BDRV_O_RDWR)) {
+        reopen_queue = bdrv_reopen_queue(reopen_queue, base, NULL,
+                                         orig_base_flags | BDRV_O_RDWR);
+    }
    if (reopen_queue) {
        bdrv_reopen_multiple(reopen_queue, &local_err);
        if (local_err != NULL) {
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -429,28 +429,23 @@ static coroutine_fn int qemu_gluster_co_write_zeroes(BlockDriverState *bs,
        int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
    int ret;
-    GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
+    GlusterAIOCB acb;
    BDRVGlusterState *s = bs->opaque;
    off_t size = nb_sectors * BDRV_SECTOR_SIZE;
    off_t offset = sector_num * BDRV_SECTOR_SIZE;

-    acb->size = size;
-    acb->ret = 0;
-    acb->coroutine = qemu_coroutine_self();
-    acb->aio_context = bdrv_get_aio_context(bs);
+    acb.size = size;
+    acb.ret = 0;
+    acb.coroutine = qemu_coroutine_self();
+    acb.aio_context = bdrv_get_aio_context(bs);

-    ret = glfs_zerofill_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
+    ret = glfs_zerofill_async(s->fd, offset, size, gluster_finish_aiocb, &acb);
    if (ret < 0) {
-        ret = -errno;
-        goto out;
+        return -errno;
    }

    qemu_coroutine_yield();
-    ret = acb->ret;
-
-out:
-    g_slice_free(GlusterAIOCB, acb);
-    return ret;
+    return acb.ret;
 }

 static inline bool gluster_supports_zerofill(void)
@@ -541,35 +536,30 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
        int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
 {
    int ret;
-    GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
+    GlusterAIOCB acb;
    BDRVGlusterState *s = bs->opaque;
    size_t size = nb_sectors * BDRV_SECTOR_SIZE;
    off_t offset = sector_num * BDRV_SECTOR_SIZE;

-    acb->size = size;
-    acb->ret = 0;
-    acb->coroutine = qemu_coroutine_self();
-    acb->aio_context = bdrv_get_aio_context(bs);
+    acb.size = size;
+    acb.ret = 0;
+    acb.coroutine = qemu_coroutine_self();
+    acb.aio_context = bdrv_get_aio_context(bs);

    if (write) {
        ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,
-            &gluster_finish_aiocb, acb);
+            gluster_finish_aiocb, &acb);
    } else {
        ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,
-            &gluster_finish_aiocb, acb);
+            gluster_finish_aiocb, &acb);
    }

    if (ret < 0) {
-        ret = -errno;
-        goto out;
+        return -errno;
    }

    qemu_coroutine_yield();
-    ret = acb->ret;
-
-out:
-    g_slice_free(GlusterAIOCB, acb);
-    return ret;
+    return acb.ret;
 }

 static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset)
@@ -600,26 +590,21 @@ static coroutine_fn int qemu_gluster_co_writev(BlockDriverState *bs,
 static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
 {
    int ret;
-    GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
+    GlusterAIOCB acb;
    BDRVGlusterState *s = bs->opaque;

-    acb->size = 0;
-    acb->ret = 0;
-    acb->coroutine = qemu_coroutine_self();
-    acb->aio_context = bdrv_get_aio_context(bs);
+    acb.size = 0;
+    acb.ret = 0;
+    acb.coroutine = qemu_coroutine_self();
+    acb.aio_context = bdrv_get_aio_context(bs);

-    ret = glfs_fsync_async(s->fd, &gluster_finish_aiocb, acb);
+    ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
    if (ret < 0) {
-        ret = -errno;
-        goto out;
+        return -errno;
    }

    qemu_coroutine_yield();
-    ret = acb->ret;
-
-out:
-    g_slice_free(GlusterAIOCB, acb);
-    return ret;
+    return acb.ret;
 }

 #ifdef CONFIG_GLUSTERFS_DISCARD
@@ -627,28 +612,23 @@ static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
        int64_t sector_num, int nb_sectors)
 {
    int ret;
-    GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
+    GlusterAIOCB acb;
    BDRVGlusterState *s = bs->opaque;
    size_t size = nb_sectors * BDRV_SECTOR_SIZE;
    off_t offset = sector_num * BDRV_SECTOR_SIZE;

-    acb->size = 0;
-    acb->ret = 0;
-    acb->coroutine = qemu_coroutine_self();
-    acb->aio_context = bdrv_get_aio_context(bs);
+    acb.size = 0;
+    acb.ret = 0;
+    acb.coroutine = qemu_coroutine_self();
+    acb.aio_context = bdrv_get_aio_context(bs);

-    ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
+    ret = glfs_discard_async(s->fd, offset, size, gluster_finish_aiocb, &acb);
    if (ret < 0) {
-        ret = -errno;
-        goto out;
+        return -errno;
    }

    qemu_coroutine_yield();
-    ret = acb->ret;
-
-out:
-    g_slice_free(GlusterAIOCB, acb);
-    return ret;
+    return acb.ret;
 }
 #endif

--- a/block/io.c
+++ b/block/io.c
@@ -166,9 +166,13 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
        bs->bl.max_transfer_length = bs->file->bs->bl.max_transfer_length;
        bs->bl.min_mem_alignment = bs->file->bs->bl.min_mem_alignment;
        bs->bl.opt_mem_alignment = bs->file->bs->bl.opt_mem_alignment;
+        bs->bl.max_iov = bs->file->bs->bl.max_iov;
    } else {
        bs->bl.min_mem_alignment = 512;
        bs->bl.opt_mem_alignment = getpagesize();
+
+        /* Safe default since most protocols use readv()/writev()/etc */
+        bs->bl.max_iov = IOV_MAX;
    }

    if (bs->backing) {
@@ -189,6 +193,9 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
        bs->bl.min_mem_alignment =
            MAX(bs->bl.min_mem_alignment,
                bs->backing->bs->bl.min_mem_alignment);
+        bs->bl.max_iov =
+            MIN(bs->bl.max_iov,
+                bs->backing->bs->bl.max_iov);
    }

    /* Then let the driver override it */
@@ -237,8 +244,21 @@ bool bdrv_requests_pending(BlockDriverState *bs)
    return false;
 }

+static void bdrv_drain_recurse(BlockDriverState *bs)
+{
+    BdrvChild *child;
+
+    if (bs->drv && bs->drv->bdrv_drain) {
+        bs->drv->bdrv_drain(bs);
+    }
+    QLIST_FOREACH(child, &bs->children, next) {
+        bdrv_drain_recurse(child->bs);
+    }
+}
+
 /*
- * Wait for pending requests to complete on a single BlockDriverState subtree
+ * Wait for pending requests to complete on a single BlockDriverState subtree,
+ * and suspend block driver's internal I/O until next request arrives.
 *
 * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState
 * AioContext.
@@ -251,6 +271,7 @@ void bdrv_drain(BlockDriverState *bs)
 {
    bool busy = true;

+    bdrv_drain_recurse(bs);
    while (busy) {
        /* Keep iterating */
         bdrv_flush_io_queue(bs);
@@ -348,13 +369,14 @@ static void tracked_request_end(BdrvTrackedRequest *req)
 static void tracked_request_begin(BdrvTrackedRequest *req,
                                  BlockDriverState *bs,
                                  int64_t offset,
-                                  unsigned int bytes, bool is_write)
+                                  unsigned int bytes,
+                                  enum BdrvTrackedRequestType type)
 {
    *req = (BdrvTrackedRequest){
        .bs = bs,
        .offset         = offset,
        .bytes          = bytes,
-        .is_write       = is_write,
+        .type           = type,
        .co             = qemu_coroutine_self(),
        .serialising    = false,
        .overlap_offset = offset,
@@ -848,7 +870,9 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
        mark_request_serialising(req, bdrv_get_cluster_size(bs));
    }

-    wait_serialising_requests(req);
+    if (!(flags & BDRV_REQ_NO_SERIALISING)) {
+        wait_serialising_requests(req);
+    }

    if (flags & BDRV_REQ_COPY_ON_READ) {
        int pnum;
@@ -937,7 +961,7 @@ static int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
    }

    /* Don't do copy-on-read if we read data before write operation */
-    if (bs->copy_on_read && !(flags & BDRV_REQ_NO_COPY_ON_READ)) {
+    if (bs->copy_on_read && !(flags & BDRV_REQ_NO_SERIALISING)) {
        flags |= BDRV_REQ_COPY_ON_READ;
    }

@@ -971,7 +995,7 @@ static int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
        bytes = ROUND_UP(bytes, align);
    }

-    tracked_request_begin(&req, bs, offset, bytes, false);
+    tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_READ);
    ret = bdrv_aligned_preadv(bs, &req, offset, bytes, align,
                              use_local_qiov ? &local_qiov : qiov,
                              flags);
@@ -1006,13 +1030,13 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
    return bdrv_co_do_readv(bs, sector_num, nb_sectors, qiov, 0);
 }

-int coroutine_fn bdrv_co_no_copy_on_readv(BlockDriverState *bs,
+int coroutine_fn bdrv_co_readv_no_serialising(BlockDriverState *bs,
    int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
-    trace_bdrv_co_no_copy_on_readv(bs, sector_num, nb_sectors);
+    trace_bdrv_co_readv_no_serialising(bs, sector_num, nb_sectors);

    return bdrv_co_do_readv(bs, sector_num, nb_sectors, qiov,
-                            BDRV_REQ_NO_COPY_ON_READ);
+                            BDRV_REQ_NO_SERIALISING);
 }

 int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
@@ -1292,7 +1316,7 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
     * Pad qiov with the read parts and be sure to have a tracked request not
     * only for bdrv_aligned_pwritev, but also for the reads of the RMW cycle.
     */
-    tracked_request_begin(&req, bs, offset, bytes, true);
+    tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_WRITE);

    if (!qiov) {
        ret = bdrv_co_do_zero_pwritev(bs, offset, bytes, flags, &req);
@@ -1865,7 +1889,8 @@ static int multiwrite_merge(BlockDriverState *bs, BlockRequest *reqs,
            merge = 1;
        }

-        if (reqs[outidx].qiov->niov + reqs[i].qiov->niov + 1 > IOV_MAX) {
+        if (reqs[outidx].qiov->niov + reqs[i].qiov->niov + 1 >
+            bs->bl.max_iov) {
            merge = 0;
        }

@@ -2317,18 +2342,20 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 {
    int ret;
+    BdrvTrackedRequest req;

    if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
        bdrv_is_sg(bs)) {
        return 0;
    }

+    tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH);
    /* Write back cached data to the OS even with cache=unsafe */
    BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
    if (bs->drv->bdrv_co_flush_to_os) {
        ret = bs->drv->bdrv_co_flush_to_os(bs);
        if (ret < 0) {
-            return ret;
+            goto out;
        }
    }

@@ -2368,14 +2395,17 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
        ret = 0;
    }
    if (ret < 0) {
-        return ret;
+        goto out;
    }

    /* Now flush the underlying protocol.  It will also have BDRV_O_NO_FLUSH
     * in the case of cache=unsafe, so there are no useless flushes.
     */
 flush_parent:
-    return bs->file ? bdrv_co_flush(bs->file->bs) : 0;
+    ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
+out:
+    tracked_request_end(&req);
+    return ret;
 }

 int bdrv_flush(BlockDriverState *bs)
@@ -2418,6 +2448,7 @@ static void coroutine_fn bdrv_discard_co_entry(void *opaque)
 int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
                                 int nb_sectors)
 {
+    BdrvTrackedRequest req;
    int max_discard, ret;

    if (!bs->drv) {
@@ -2440,6 +2471,8 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
        return 0;
    }

+    tracked_request_begin(&req, bs, sector_num, nb_sectors,
+                          BDRV_TRACKED_DISCARD);
    bdrv_set_dirty(bs, sector_num, nb_sectors);

    max_discard = MIN_NON_ZERO(bs->bl.max_discard, BDRV_REQUEST_MAX_SECTORS);
@@ -2473,20 +2506,24 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
            acb = bs->drv->bdrv_aio_discard(bs, sector_num, nb_sectors,
                                            bdrv_co_io_em_complete, &co);
            if (acb == NULL) {
-                return -EIO;
+                ret = -EIO;
+                goto out;
            } else {
                qemu_coroutine_yield();
                ret = co.ret;
            }
        }
        if (ret && ret != -ENOTSUP) {
-            return ret;
+            goto out;
        }

        sector_num += num;
        nb_sectors -= num;
    }
-    return 0;
+    ret = 0;
+out:
+    tracked_request_end(&req);
+    return ret;
 }

 int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
@@ -2515,26 +2552,110 @@ int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
    return rwco.ret;
 }

-/* needed for generic scsi interface */
+typedef struct {
+    CoroutineIOCompletion *co;
+    QEMUBH *bh;
+} BdrvIoctlCompletionData;

-int bdrv_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
+static void bdrv_ioctl_bh_cb(void *opaque)
+{
+    BdrvIoctlCompletionData *data = opaque;
+
+    bdrv_co_io_em_complete(data->co, -ENOTSUP);
+    qemu_bh_delete(data->bh);
+}
+
+static int bdrv_co_do_ioctl(BlockDriverState *bs, int req, void *buf)
 {
    BlockDriver *drv = bs->drv;
+    BdrvTrackedRequest tracked_req;
+    CoroutineIOCompletion co = {
+        .coroutine = qemu_coroutine_self(),
+    };
+    BlockAIOCB *acb;

-    if (drv && drv->bdrv_ioctl)
-        return drv->bdrv_ioctl(bs, req, buf);
-    return -ENOTSUP;
+    tracked_request_begin(&tracked_req, bs, 0, 0, BDRV_TRACKED_IOCTL);
+    if (!drv || !drv->bdrv_aio_ioctl) {
+        co.ret = -ENOTSUP;
+        goto out;
+    }
+
+    acb = drv->bdrv_aio_ioctl(bs, req, buf, bdrv_co_io_em_complete, &co);
+    if (!acb) {
+        BdrvIoctlCompletionData *data = g_new(BdrvIoctlCompletionData, 1);
+        data->bh = aio_bh_new(bdrv_get_aio_context(bs),
+                                bdrv_ioctl_bh_cb, data);
+        data->co = &co;
+        qemu_bh_schedule(data->bh);
+    }
+    qemu_coroutine_yield();
+out:
+    tracked_request_end(&tracked_req);
+    return co.ret;
+}
+
+typedef struct {
+    BlockDriverState *bs;
+    int req;
+    void *buf;
+    int ret;
+} BdrvIoctlCoData;
+
+static void coroutine_fn bdrv_co_ioctl_entry(void *opaque)
+{
+    BdrvIoctlCoData *data = opaque;
+    data->ret = bdrv_co_do_ioctl(data->bs, data->req, data->buf);
+}
+
+/* needed for generic scsi interface */
+int bdrv_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
+{
+    BdrvIoctlCoData data = {
+        .bs = bs,
+        .req = req,
+        .buf = buf,
+        .ret = -EINPROGRESS,
+    };
+
+    if (qemu_in_coroutine()) {
+        /* Fast-path if already in coroutine context */
+        bdrv_co_ioctl_entry(&data);
+    } else {
+        Coroutine *co = qemu_coroutine_create(bdrv_co_ioctl_entry);
+
+        qemu_coroutine_enter(co, &data);
+        while (data.ret == -EINPROGRESS) {
+            aio_poll(bdrv_get_aio_context(bs), true);
+        }
+    }
+    return data.ret;
+}
+
+static void coroutine_fn bdrv_co_aio_ioctl_entry(void *opaque)
+{
+    BlockAIOCBCoroutine *acb = opaque;
+    acb->req.error = bdrv_co_do_ioctl(acb->common.bs,
+                                      acb->req.req, acb->req.buf);
+    bdrv_co_complete(acb);
 }

 BlockAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
        unsigned long int req, void *buf,
        BlockCompletionFunc *cb, void *opaque)
 {
-    BlockDriver *drv = bs->drv;
+    BlockAIOCBCoroutine *acb = qemu_aio_get(&bdrv_em_co_aiocb_info,
+                                            bs, cb, opaque);
+    Coroutine *co;

-    if (drv && drv->bdrv_aio_ioctl)
-        return drv->bdrv_aio_ioctl(bs, req, buf, cb, opaque);
-    return NULL;
+    acb->need_bh = true;
+    acb->req.error = -EINPROGRESS;
+    acb->req.req = req;
+    acb->req.buf = buf;
+    co = qemu_coroutine_create(bdrv_co_aio_ioctl_entry);
+    qemu_coroutine_enter(co, acb);
+
+    bdrv_co_maybe_schedule_bh(acb);
+    return &acb->common;
 }

 void *qemu_blockalign(BlockDriverState *bs, size_t size)
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -84,6 +84,7 @@ typedef struct IscsiTask {
    IscsiLun *iscsilun;
    QEMUTimer retry_timer;
    bool force_next_flush;
+    int err_code;
 } IscsiTask;

 typedef struct IscsiAIOCB {
@@ -96,6 +97,7 @@ typedef struct IscsiAIOCB {
    int status;
    int64_t sector_num;
    int nb_sectors;
+    int ret;
 #ifdef __linux__
    sg_io_hdr_t *ioh;
 #endif
@@ -169,19 +171,70 @@ static inline unsigned exp_random(double mean)
    return -mean * log((double)rand() / RAND_MAX);
 }

-/* SCSI_STATUS_TASK_SET_FULL and SCSI_STATUS_TIMEOUT were introduced
- * in libiscsi 1.10.0 as part of an enum. The LIBISCSI_API_VERSION
- * macro was introduced in 1.11.0. So use the API_VERSION macro as
- * a hint that the macros are defined and define them ourselves
- * otherwise to keep the required libiscsi version at 1.9.0 */
-#if !defined(LIBISCSI_API_VERSION)
-#define QEMU_SCSI_STATUS_TASK_SET_FULL  0x28
-#define QEMU_SCSI_STATUS_TIMEOUT        0x0f000002
-#else
-#define QEMU_SCSI_STATUS_TASK_SET_FULL  SCSI_STATUS_TASK_SET_FULL
-#define QEMU_SCSI_STATUS_TIMEOUT        SCSI_STATUS_TIMEOUT
+/* SCSI_SENSE_ASCQ_INVALID_FIELD_IN_PARAMETER_LIST was introduced in
+ * libiscsi 1.10.0, together with other constants we need.  Use it as
+ * a hint that we have to define them ourselves if needed, to keep the
+ * minimum required libiscsi version at 1.9.0.  We use an ASCQ macro for
+ * the test because SCSI_STATUS_* is an enum.
+ *
+ * To guard against future changes where SCSI_SENSE_ASCQ_* also becomes
+ * an enum, check against the LIBISCSI_API_VERSION macro, which was
+ * introduced in 1.11.0.  If it is present, there is no need to define
+ * anything.
+ */
+#if !defined(SCSI_SENSE_ASCQ_INVALID_FIELD_IN_PARAMETER_LIST) && \
+    !defined(LIBISCSI_API_VERSION)
+#define SCSI_STATUS_TASK_SET_FULL                          0x28
+#define SCSI_STATUS_TIMEOUT                                0x0f000002
+#define SCSI_SENSE_ASCQ_INVALID_FIELD_IN_PARAMETER_LIST    0x2600
+#define SCSI_SENSE_ASCQ_PARAMETER_LIST_LENGTH_ERROR        0x1a00
 #endif

+static int iscsi_translate_sense(struct scsi_sense *sense)
+{
+    int ret;
+
+    switch (sense->key) {
+    case SCSI_SENSE_NOT_READY:
+        return -EBUSY;
+    case SCSI_SENSE_DATA_PROTECTION:
+        return -EACCES;
+    case SCSI_SENSE_COMMAND_ABORTED:
+        return -ECANCELED;
+    case SCSI_SENSE_ILLEGAL_REQUEST:
+        /* Parse ASCQ */
+        break;
+    default:
+        return -EIO;
+    }
+    switch (sense->ascq) {
+    case SCSI_SENSE_ASCQ_PARAMETER_LIST_LENGTH_ERROR:
+    case SCSI_SENSE_ASCQ_INVALID_OPERATION_CODE:
+    case SCSI_SENSE_ASCQ_INVALID_FIELD_IN_CDB:
+    case SCSI_SENSE_ASCQ_INVALID_FIELD_IN_PARAMETER_LIST:
+        ret = -EINVAL;
+        break;
+    case SCSI_SENSE_ASCQ_LBA_OUT_OF_RANGE:
+        ret = -ENOSPC;
+        break;
+    case SCSI_SENSE_ASCQ_LOGICAL_UNIT_NOT_SUPPORTED:
+        ret = -ENOTSUP;
+        break;
+    case SCSI_SENSE_ASCQ_MEDIUM_NOT_PRESENT:
+    case SCSI_SENSE_ASCQ_MEDIUM_NOT_PRESENT_TRAY_CLOSED:
+    case SCSI_SENSE_ASCQ_MEDIUM_NOT_PRESENT_TRAY_OPEN:
+        ret = -ENOMEDIUM;
+        break;
+    case SCSI_SENSE_ASCQ_WRITE_PROTECTED:
+        ret = -EACCES;
+        break;
+    default:
+        ret = -EIO;
+        break;
+    }
+    return ret;
+}
+
 static void
 iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
                        void *command_data, void *opaque)
@@ -203,11 +256,11 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
                goto out;
            }
            if (status == SCSI_STATUS_BUSY ||
-                status == QEMU_SCSI_STATUS_TIMEOUT ||
-                status == QEMU_SCSI_STATUS_TASK_SET_FULL) {
+                status == SCSI_STATUS_TIMEOUT ||
+                status == SCSI_STATUS_TASK_SET_FULL) {
                unsigned retry_time =
                    exp_random(iscsi_retry_times[iTask->retries - 1]);
-                if (status == QEMU_SCSI_STATUS_TIMEOUT) {
+                if (status == SCSI_STATUS_TIMEOUT) {
                    /* make sure the request is rescheduled AFTER the
                     * reconnect is initiated */
                    retry_time = EVENT_INTERVAL * 2;
@@ -226,6 +279,7 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
                return;
            }
        }
+        iTask->err_code = iscsi_translate_sense(&task->sense);
        error_report("iSCSI Failure: %s", iscsi_get_error(iscsi));
    } else {
        iTask->iscsilun->force_next_flush |= iTask->force_next_flush;
@@ -455,7 +509,7 @@ retry:
    }

    if (iTask.status != SCSI_STATUS_GOOD) {
-        return -EIO;
+        return iTask.err_code;
    }

    iscsi_allocationmap_set(iscsilun, sector_num, nb_sectors);
@@ -644,7 +698,7 @@ retry:
    }

    if (iTask.status != SCSI_STATUS_GOOD) {
-        return -EIO;
+        return iTask.err_code;
    }

    return 0;
@@ -683,7 +737,7 @@ retry:
    }

    if (iTask.status != SCSI_STATUS_GOOD) {
-        return -EIO;
+        return iTask.err_code;
    }

    return 0;
@@ -703,7 +757,7 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
    if (status < 0) {
        error_report("Failed to ioctl(SG_IO) to iSCSI lun. %s",
                     iscsi_get_error(iscsi));
-        acb->status = -EIO;
+        acb->status = iscsi_translate_sense(&acb->task->sense);
    }

    acb->ioh->driver_status = 0;
@@ -726,6 +780,38 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
    iscsi_schedule_bh(acb);
 }

+static void iscsi_ioctl_bh_completion(void *opaque)
+{
+    IscsiAIOCB *acb = opaque;
+
+    qemu_bh_delete(acb->bh);
+    acb->common.cb(acb->common.opaque, acb->ret);
+    qemu_aio_unref(acb);
+}
+
+static void iscsi_ioctl_handle_emulated(IscsiAIOCB *acb, int req, void *buf)
+{
+    BlockDriverState *bs = acb->common.bs;
+    IscsiLun *iscsilun = bs->opaque;
+    int ret = 0;
+
+    switch (req) {
+    case SG_GET_VERSION_NUM:
+        *(int *)buf = 30000;
+        break;
+    case SG_GET_SCSI_ID:
+        ((struct sg_scsi_id *)buf)->scsi_type = iscsilun->type;
+        break;
+    default:
+        ret = -EINVAL;
+    }
+    assert(!acb->bh);
+    acb->bh = aio_bh_new(bdrv_get_aio_context(bs),
+                         iscsi_ioctl_bh_completion, acb);
+    acb->ret = ret;
+    qemu_bh_schedule(acb->bh);
+}
+
 static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
        unsigned long int req, void *buf,
        BlockCompletionFunc *cb, void *opaque)
@@ -735,8 +821,6 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
    struct iscsi_data data;
    IscsiAIOCB *acb;

-    assert(req == SG_IO);
-
    acb = qemu_aio_get(&iscsi_aiocb_info, bs, cb, opaque);

    acb->iscsilun = iscsilun;
@@ -745,6 +829,11 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
    acb->buf         = NULL;
    acb->ioh         = buf;

+    if (req != SG_IO) {
+        iscsi_ioctl_handle_emulated(acb, req, buf);
+        return &acb->common;
+    }
+
    acb->task = malloc(sizeof(struct scsi_task));
    if (acb->task == NULL) {
        error_report("iSCSI: Failed to allocate task for scsi command. %s",
@@ -809,38 +898,6 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
    return &acb->common;
 }

-static void ioctl_cb(void *opaque, int status)
-{
-    int *p_status = opaque;
-    *p_status = status;
-}
-
-static int iscsi_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-    IscsiLun *iscsilun = bs->opaque;
-    int status;
-
-    switch (req) {
-    case SG_GET_VERSION_NUM:
-        *(int *)buf = 30000;
-        break;
-    case SG_GET_SCSI_ID:
-        ((struct sg_scsi_id *)buf)->scsi_type = iscsilun->type;
-        break;
-    case SG_IO:
-        status = -EINPROGRESS;
-        iscsi_aio_ioctl(bs, req, buf, ioctl_cb, &status);
-
-        while (status == -EINPROGRESS) {
-            aio_poll(iscsilun->aio_context, true);
-        }
-
-        return 0;
-    default:
-        return -1;
-    }
-    return 0;
-}
 #endif

 static int64_t
@@ -905,7 +962,7 @@ retry:
    }

    if (iTask.status != SCSI_STATUS_GOOD) {
-        return -EIO;
+        return iTask.err_code;
    }

    iscsi_allocationmap_clear(iscsilun, sector_num, nb_sectors);
@@ -999,7 +1056,7 @@ retry:
    }

    if (iTask.status != SCSI_STATUS_GOOD) {
-        return -EIO;
+        return iTask.err_code;
    }

    if (flags & BDRV_REQ_MAY_UNMAP) {
@@ -1186,8 +1243,13 @@ static void iscsi_readcapacity_sync(IscsiLun *iscsilun, Error **errp)
                    iscsilun->lbprz = !!rc16->lbprz;
                    iscsilun->use_16_for_rw = (rc16->returned_lba > 0xffffffff);
                }
+                break;
            }
-            break;
+            if (task != NULL && task->status == SCSI_STATUS_CHECK_CONDITION
+                && task->sense.key == SCSI_SENSE_UNIT_ATTENTION) {
+                break;
+            }
+            /* Fall through and try READ CAPACITY(10) instead.  */
        case TYPE_ROM:
            task = iscsi_readcapacity10_sync(iscsilun->iscsi, iscsilun->lun, 0, 0);
            if (task != NULL && task->status == SCSI_STATUS_GOOD) {
@@ -1213,7 +1275,7 @@ static void iscsi_readcapacity_sync(IscsiLun *iscsilun, Error **errp)
             && retries-- > 0);

    if (task == NULL || task->status != SCSI_STATUS_GOOD) {
-        error_setg(errp, "iSCSI: failed to send readcapacity10 command.");
+        error_setg(errp, "iSCSI: failed to send readcapacity10/16 command");
    } else if (!iscsilun->block_size ||
               iscsilun->block_size % BDRV_SECTOR_SIZE) {
        error_setg(errp, "iSCSI: the target returned an invalid "
@@ -1771,7 +1833,6 @@ static BlockDriver bdrv_iscsi = {
    .bdrv_co_flush_to_disk = iscsi_co_flush,

 #ifdef __linux__
-    .bdrv_ioctl       = iscsi_ioctl,
    .bdrv_aio_ioctl   = iscsi_aio_ioctl,
 #endif

--- a/block/mirror.c
+++ b/block/mirror.c
@@ -18,6 +18,7 @@
 #include "qapi/qmp/qerror.h"
 #include "qemu/ratelimit.h"
 #include "qemu/bitmap.h"
+#include "qemu/error-report.h"

 #define SLICE_TIME    100000000ULL /* ns */
 #define MAX_IN_FLIGHT 16
@@ -160,13 +161,15 @@ static void mirror_read_complete(void *opaque, int ret)
 static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 {
    BlockDriverState *source = s->common.bs;
-    int nb_sectors, sectors_per_chunk, nb_chunks;
+    int nb_sectors, sectors_per_chunk, nb_chunks, max_iov;
    int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
    uint64_t delay_ns = 0;
    MirrorOp *op;
    int pnum;
    int64_t ret;

+    max_iov = MIN(source->bl.max_iov, s->target->bl.max_iov);
+
    s->sector_num = hbitmap_iter_next(&s->hbi);
    if (s->sector_num < 0) {
        bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
@@ -247,7 +250,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
            trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
            break;
        }
-        if (IOV_MAX < nb_chunks + added_chunks) {
+        if (max_iov < nb_chunks + added_chunks) {
            trace_mirror_break_iov_max(s, nb_chunks, added_chunks);
            break;
        }
@@ -370,11 +373,22 @@ static void mirror_exit(BlockJob *job, void *opaque)
        if (s->to_replace) {
            to_replace = s->to_replace;
        }
+
+        /* This was checked in mirror_start_job(), but meanwhile one of the
+         * nodes could have been newly attached to a BlockBackend. */
+        if (to_replace->blk && s->target->blk) {
+            error_report("block job: Can't create node with two BlockBackends");
+            data->ret = -EINVAL;
+            goto out;
+        }
+
        if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
            bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
        }
        bdrv_replace_in_backing_chain(to_replace, s->target);
    }
+
+out:
    if (s->to_replace) {
        bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
        error_free(s->replace_blocker);
@@ -384,9 +398,11 @@ static void mirror_exit(BlockJob *job, void *opaque)
        aio_context_release(replace_aio_context);
    }
    g_free(s->replaces);
+    bdrv_op_unblock_all(s->target, s->common.blocker);
    bdrv_unref(s->target);
    block_job_completed(&s->common, data->ret);
    g_free(data);
+    bdrv_drained_end(src);
    bdrv_unref(src);
 }

@@ -606,6 +622,9 @@ immediate_exit:

    data = g_malloc(sizeof(*data));
    data->ret = ret;
+    /* Before we switch to target in mirror_exit, make sure data doesn't
+     * change. */
+    bdrv_drained_begin(s->common.bs);
    block_job_defer_to_main_loop(&s->common, mirror_exit, data);
 }

@@ -635,7 +654,7 @@ static void mirror_complete(BlockJob *job, Error **errp)
    Error *local_err = NULL;
    int ret;

-    ret = bdrv_open_backing_file(s->target, NULL, &local_err);
+    ret = bdrv_open_backing_file(s->target, NULL, "backing", &local_err);
    if (ret < 0) {
        error_propagate(errp, local_err);
        return;
@@ -700,6 +719,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
                             bool is_none_mode, BlockDriverState *base)
 {
    MirrorBlockJob *s;
+    BlockDriverState *replaced_bs;

    if (granularity == 0) {
        granularity = bdrv_get_default_bitmap_granularity(target);
@@ -723,6 +743,21 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
        buf_size = DEFAULT_MIRROR_BUF_SIZE;
    }

+    /* We can't support this case as long as the block layer can't handle
+     * multiple BlockBackends per BlockDriverState. */
+    if (replaces) {
+        replaced_bs = bdrv_lookup_bs(replaces, replaces, errp);
+        if (replaced_bs == NULL) {
+            return;
+        }
+    } else {
+        replaced_bs = bs;
+    }
+    if (replaced_bs->blk && target->blk) {
+        error_setg(errp, "Can't create node with two BlockBackends");
+        return;
+    }
+
    s = block_job_create(driver, bs, speed, cb, opaque, errp);
    if (!s) {
        return;
@@ -741,9 +776,12 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
    s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
    if (!s->dirty_bitmap) {
        g_free(s->replaces);
-        block_job_release(bs);
+        block_job_unref(&s->common);
        return;
    }
+
+    bdrv_op_block_all(s->target, s->common.blocker);
+
    bdrv_set_enable_write_cache(s->target, true);
    if (s->target->blk) {
        blk_set_on_error(s->target->blk, on_target_error, on_target_error);
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -342,13 +342,13 @@ static void nbd_attach_aio_context(BlockDriverState *bs,
    nbd_client_attach_aio_context(bs, new_context);
 }

-static void nbd_refresh_filename(BlockDriverState *bs)
+static void nbd_refresh_filename(BlockDriverState *bs, QDict *options)
 {
    QDict *opts = qdict_new();
-    const char *path   = qdict_get_try_str(bs->options, "path");
-    const char *host   = qdict_get_try_str(bs->options, "host");
-    const char *port   = qdict_get_try_str(bs->options, "port");
-    const char *export = qdict_get_try_str(bs->options, "export");
+    const char *path   = qdict_get_try_str(options, "path");
+    const char *host   = qdict_get_try_str(options, "host");
+    const char *port   = qdict_get_try_str(options, "port");
+    const char *export = qdict_get_try_str(options, "export");

    qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));

--- a/block/parallels.c
+++ b/block/parallels.c
@@ -61,7 +61,7 @@ typedef struct ParallelsHeader {
 typedef enum ParallelsPreallocMode {
    PRL_PREALLOC_MODE_FALLOCATE = 0,
    PRL_PREALLOC_MODE_TRUNCATE = 1,
-    PRL_PREALLOC_MODE_MAX = 2,
+    PRL_PREALLOC_MODE__MAX = 2,
 } ParallelsPreallocMode;

 static const char *prealloc_mode_lookup[] = {
@@ -220,7 +220,7 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
        s->bat_bitmap[idx + i] = cpu_to_le32(s->data_end / s->off_multiplier);
        s->data_end += s->tracks;
        bitmap_set(s->bat_dirty_bmap,
-                   bat_entry_off(idx) / s->bat_dirty_block, 1);
+                   bat_entry_off(idx + i) / s->bat_dirty_block, 1);
    }

    return bat2sect(s, idx) + sector_num % s->tracks;
@@ -660,7 +660,7 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
    s->prealloc_size = MAX(s->tracks, s->prealloc_size >> BDRV_SECTOR_BITS);
    buf = qemu_opt_get_del(opts, PARALLELS_OPT_PREALLOC_MODE);
    s->prealloc_mode = qapi_enum_parse(prealloc_mode_lookup, buf,
-            PRL_PREALLOC_MODE_MAX, PRL_PREALLOC_MODE_FALLOCATE, &local_err);
+            PRL_PREALLOC_MODE__MAX, PRL_PREALLOC_MODE_FALLOCATE, &local_err);
    g_free(buf);
    if (local_err != NULL) {
        goto fail_options;
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -64,7 +64,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs, Error **errp)
    info->backing_file_depth = bdrv_get_backing_file_depth(bs);
    info->detect_zeroes = bs->detect_zeroes;

-    if (bs->io_limits_enabled) {
+    if (bs->throttle_state) {
        ThrottleConfig cfg;

        throttle_group_get_config(bs, &cfg);
@@ -245,15 +245,18 @@ void bdrv_query_image_info(BlockDriverState *bs,
        info->has_backing_filename = true;
        bdrv_get_full_backing_filename(bs, backing_filename2, PATH_MAX, &err);
        if (err) {
-            error_propagate(errp, err);
-            qapi_free_ImageInfo(info);
+            /* Can't reconstruct the full backing filename, so we must omit
+             * this field and apply a Best Effort to this query. */
            g_free(backing_filename2);
-            return;
+            backing_filename2 = NULL;
+            error_free(err);
+            err = NULL;
        }

-        if (strcmp(backing_filename, backing_filename2) != 0) {
-            info->full_backing_filename =
-                        g_strdup(backing_filename2);
+        /* Always report the full_backing_filename if present, even if it's the
+         * same as backing_filename. That they are same is useful info. */
+        if (backing_filename2) {
+            info->full_backing_filename = g_strdup(backing_filename2);
            info->has_full_backing_filename = true;
        }

@@ -346,17 +349,68 @@ static BlockStats *bdrv_query_stats(const BlockDriverState *bs,
    s->stats = g_malloc0(sizeof(*s->stats));
    if (bs->blk) {
        BlockAcctStats *stats = blk_get_stats(bs->blk);
+        BlockAcctTimedStats *ts = NULL;

        s->stats->rd_bytes = stats->nr_bytes[BLOCK_ACCT_READ];
        s->stats->wr_bytes = stats->nr_bytes[BLOCK_ACCT_WRITE];
        s->stats->rd_operations = stats->nr_ops[BLOCK_ACCT_READ];
        s->stats->wr_operations = stats->nr_ops[BLOCK_ACCT_WRITE];
+
+        s->stats->failed_rd_operations = stats->failed_ops[BLOCK_ACCT_READ];
+        s->stats->failed_wr_operations = stats->failed_ops[BLOCK_ACCT_WRITE];
+        s->stats->failed_flush_operations = stats->failed_ops[BLOCK_ACCT_FLUSH];
+
+        s->stats->invalid_rd_operations = stats->invalid_ops[BLOCK_ACCT_READ];
+        s->stats->invalid_wr_operations = stats->invalid_ops[BLOCK_ACCT_WRITE];
+        s->stats->invalid_flush_operations =
+            stats->invalid_ops[BLOCK_ACCT_FLUSH];
+
        s->stats->rd_merged = stats->merged[BLOCK_ACCT_READ];
        s->stats->wr_merged = stats->merged[BLOCK_ACCT_WRITE];
        s->stats->flush_operations = stats->nr_ops[BLOCK_ACCT_FLUSH];
        s->stats->wr_total_time_ns = stats->total_time_ns[BLOCK_ACCT_WRITE];
        s->stats->rd_total_time_ns = stats->total_time_ns[BLOCK_ACCT_READ];
        s->stats->flush_total_time_ns = stats->total_time_ns[BLOCK_ACCT_FLUSH];
+
+        s->stats->has_idle_time_ns = stats->last_access_time_ns > 0;
+        if (s->stats->has_idle_time_ns) {
+            s->stats->idle_time_ns = block_acct_idle_time_ns(stats);
+        }
+
+        s->stats->account_invalid = stats->account_invalid;
+        s->stats->account_failed = stats->account_failed;
+
+        while ((ts = block_acct_interval_next(stats, ts))) {
+            BlockDeviceTimedStatsList *timed_stats =
+                g_malloc0(sizeof(*timed_stats));
+            BlockDeviceTimedStats *dev_stats = g_malloc0(sizeof(*dev_stats));
+            timed_stats->next = s->stats->timed_stats;
+            timed_stats->value = dev_stats;
+            s->stats->timed_stats = timed_stats;
+
+            TimedAverage *rd = &ts->latency[BLOCK_ACCT_READ];
+            TimedAverage *wr = &ts->latency[BLOCK_ACCT_WRITE];
+            TimedAverage *fl = &ts->latency[BLOCK_ACCT_FLUSH];
+
+            dev_stats->interval_length = ts->interval_length;
+
+            dev_stats->min_rd_latency_ns = timed_average_min(rd);
+            dev_stats->max_rd_latency_ns = timed_average_max(rd);
+            dev_stats->avg_rd_latency_ns = timed_average_avg(rd);
+
+            dev_stats->min_wr_latency_ns = timed_average_min(wr);
+            dev_stats->max_wr_latency_ns = timed_average_max(wr);
+            dev_stats->avg_wr_latency_ns = timed_average_avg(wr);
+
+            dev_stats->min_flush_latency_ns = timed_average_min(fl);
+            dev_stats->max_flush_latency_ns = timed_average_max(fl);
+            dev_stats->avg_flush_latency_ns = timed_average_avg(fl);
+
+            dev_stats->avg_rd_queue_depth =
+                block_acct_queue_depth(ts, BLOCK_ACCT_READ);
+            dev_stats->avg_wr_queue_depth =
+                block_acct_queue_depth(ts, BLOCK_ACCT_WRITE);
+        }
    }

    s->stats->wr_highest_offset = bs->wr_highest_offset;
@@ -385,7 +439,9 @@ BlockInfoList *qmp_query_block(Error **errp)
        bdrv_query_info(blk, &info->value, &local_err);
        if (local_err) {
            error_propagate(errp, local_err);
-            goto err;
+            g_free(info);
+            qapi_free_BlockInfoList(head);
+            return NULL;
        }

        *p_next = info;
@@ -393,10 +449,6 @@ BlockInfoList *qmp_query_block(Error **errp)
    }

    return head;
-
- err:
-    qapi_free_BlockInfoList(head);
-    return NULL;
 }

 BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
@@ -539,7 +591,7 @@ static void dump_qlist(fprintf_function func_fprintf, void *f, int indentation,
    int i = 0;

    for (entry = qlist_first(list); entry; entry = qlist_next(entry), i++) {
-        qtype_code type = qobject_type(entry->value);
+        QType type = qobject_type(entry->value);
        bool composite = (type == QTYPE_QDICT || type == QTYPE_QLIST);
        const char *format = composite ? "%*s[%i]:\n" : "%*s[%i]: ";

@@ -557,7 +609,7 @@ static void dump_qdict(fprintf_function func_fprintf, void *f, int indentation,
    const QDictEntry *entry;

    for (entry = qdict_first(dict); entry; entry = qdict_next(dict, entry)) {
-        qtype_code type = qobject_type(entry->value);
+        QType type = qobject_type(entry->value);
        bool composite = (type == QTYPE_QDICT || type == QTYPE_QLIST);
        const char *format = composite ? "%*s%s:\n" : "%*s%s: ";
        char key[strlen(entry->key) + 1];
@@ -627,7 +679,10 @@ void bdrv_image_info_dump(fprintf_function func_fprintf, void *f,

    if (info->has_backing_filename) {
        func_fprintf(f, "backing file: %s", info->backing_filename);
-        if (info->has_full_backing_filename) {
+        if (!info->has_full_backing_filename) {
+            func_fprintf(f, " (cannot determine actual path)");
+        } else if (strcmp(info->backing_filename,
+                          info->full_backing_filename) != 0) {
            func_fprintf(f, " (actual path: %s)", info->full_backing_filename);
        }
        func_fprintf(f, "\n");
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -312,7 +312,7 @@ static int count_contiguous_clusters(int nb_clusters, int cluster_size,
    if (!offset)
        return 0;

-    assert(qcow2_get_cluster_type(first_entry) != QCOW2_CLUSTER_COMPRESSED);
+    assert(qcow2_get_cluster_type(first_entry) == QCOW2_CLUSTER_NORMAL);

    for (i = 0; i < nb_clusters; i++) {
        uint64_t l2_entry = be64_to_cpu(l2_table[i]) & mask;
@@ -324,14 +324,16 @@ static int count_contiguous_clusters(int nb_clusters, int cluster_size,
 	return i;
 }

-static int count_contiguous_free_clusters(int nb_clusters, uint64_t *l2_table)
+static int count_contiguous_clusters_by_type(int nb_clusters,
+                                             uint64_t *l2_table,
+                                             int wanted_type)
 {
    int i;

    for (i = 0; i < nb_clusters; i++) {
        int type = qcow2_get_cluster_type(be64_to_cpu(l2_table[i]));

-        if (type != QCOW2_CLUSTER_UNALLOCATED) {
+        if (type != wanted_type) {
            break;
        }
    }
@@ -554,13 +556,14 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
            ret = -EIO;
            goto fail;
        }
-        c = count_contiguous_clusters(nb_clusters, s->cluster_size,
-                &l2_table[l2_index], QCOW_OFLAG_ZERO);
+        c = count_contiguous_clusters_by_type(nb_clusters, &l2_table[l2_index],
+                                              QCOW2_CLUSTER_ZERO);
        *cluster_offset = 0;
        break;
    case QCOW2_CLUSTER_UNALLOCATED:
        /* how many empty clusters ? */
-        c = count_contiguous_free_clusters(nb_clusters, &l2_table[l2_index]);
+        c = count_contiguous_clusters_by_type(nb_clusters, &l2_table[l2_index],
+                                              QCOW2_CLUSTER_UNALLOCATED);
        *cluster_offset = 0;
        break;
    case QCOW2_CLUSTER_NORMAL:
@@ -1638,7 +1641,8 @@ fail:
 static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
                                      int l1_size, int64_t *visited_l1_entries,
                                      int64_t l1_entries,
-                                      BlockDriverAmendStatusCB *status_cb)
+                                      BlockDriverAmendStatusCB *status_cb,
+                                      void *cb_opaque)
 {
    BDRVQcow2State *s = bs->opaque;
    bool is_active_l1 = (l1_table == s->l1_table);
@@ -1664,7 +1668,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
            /* unallocated */
            (*visited_l1_entries)++;
            if (status_cb) {
-                status_cb(bs, *visited_l1_entries, l1_entries);
+                status_cb(bs, *visited_l1_entries, l1_entries, cb_opaque);
            }
            continue;
        }
@@ -1801,7 +1805,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,

        (*visited_l1_entries)++;
        if (status_cb) {
-            status_cb(bs, *visited_l1_entries, l1_entries);
+            status_cb(bs, *visited_l1_entries, l1_entries, cb_opaque);
        }
    }

@@ -1825,7 +1829,8 @@ fail:
 * qcow2 version which doesn't yet support metadata zero clusters.
 */
 int qcow2_expand_zero_clusters(BlockDriverState *bs,
-                               BlockDriverAmendStatusCB *status_cb)
+                               BlockDriverAmendStatusCB *status_cb,
+                               void *cb_opaque)
 {
    BDRVQcow2State *s = bs->opaque;
    uint64_t *l1_table = NULL;
@@ -1842,7 +1847,7 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,

    ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
                                     &visited_l1_entries, l1_entries,
-                                     status_cb);
+                                     status_cb, cb_opaque);
    if (ret < 0) {
        goto fail;
    }
@@ -1878,7 +1883,7 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,

        ret = expand_zero_clusters_in_l1(bs, l1_table, s->snapshots[i].l1_size,
                                         &visited_l1_entries, l1_entries,
-                                         status_cb);
+                                         status_cb, cb_opaque);
        if (ret < 0) {
            goto fail;
        }
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -560,13 +560,16 @@ static int alloc_refcount_block(BlockDriverState *bs,
    }

    /* Hook up the new refcount table in the qcow2 header */
-    uint8_t data[12];
-    cpu_to_be64w((uint64_t*)data, table_offset);
-    cpu_to_be32w((uint32_t*)(data + 8), table_clusters);
+    struct QEMU_PACKED {
+        uint64_t d64;
+        uint32_t d32;
+    } data;
+    cpu_to_be64w(&data.d64, table_offset);
+    cpu_to_be32w(&data.d32, table_clusters);
    BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_ALLOC_SWITCH_TABLE);
    ret = bdrv_pwrite_sync(bs->file->bs,
                           offsetof(QCowHeader, refcount_table_offset),
-                           data, sizeof(data));
+                           &data, sizeof(data));
    if (ret < 0) {
        goto fail_table;
    }
@@ -1241,7 +1244,7 @@ fail:
 /* refcount checking functions */


-static size_t refcount_array_byte_size(BDRVQcow2State *s, uint64_t entries)
+static uint64_t refcount_array_byte_size(BDRVQcow2State *s, uint64_t entries)
 {
    /* This assertion holds because there is no way we can address more than
     * 2^(64 - 9) clusters at once (with cluster size 512 = 2^9, and because
@@ -1342,6 +1345,9 @@ static int inc_refcounts(BlockDriverState *bs,
        if (refcount == s->refcount_max) {
            fprintf(stderr, "ERROR: overflow cluster offset=0x%" PRIx64
                    "\n", cluster_offset);
+            fprintf(stderr, "Use qemu-img amend to increase the refcount entry "
+                    "width or qemu-img convert to create a clean copy if the "
+                    "image cannot be opened for writing\n");
            res->corruptions++;
            continue;
        }
@@ -2464,3 +2470,450 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,

    return 0;
 }
+
+/* A pointer to a function of this type is given to walk_over_reftable(). That
+ * function will create refblocks and pass them to a RefblockFinishOp once they
+ * are completed (@refblock). @refblock_empty is set if the refblock is
+ * completely empty.
+ *
+ * Along with the refblock, a corresponding reftable entry is passed, in the
+ * reftable @reftable (which may be reallocated) at @reftable_index.
+ *
+ * @allocated should be set to true if a new cluster has been allocated.
+ */
+typedef int (RefblockFinishOp)(BlockDriverState *bs, uint64_t **reftable,
+                               uint64_t reftable_index, uint64_t *reftable_size,
+                               void *refblock, bool refblock_empty,
+                               bool *allocated, Error **errp);
+
+/**
+ * This "operation" for walk_over_reftable() allocates the refblock on disk (if
+ * it is not empty) and inserts its offset into the new reftable. The size of
+ * this new reftable is increased as required.
+ */
+static int alloc_refblock(BlockDriverState *bs, uint64_t **reftable,
+                          uint64_t reftable_index, uint64_t *reftable_size,
+                          void *refblock, bool refblock_empty, bool *allocated,
+                          Error **errp)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int64_t offset;
+
+    if (!refblock_empty && reftable_index >= *reftable_size) {
+        uint64_t *new_reftable;
+        uint64_t new_reftable_size;
+
+        new_reftable_size = ROUND_UP(reftable_index + 1,
+                                     s->cluster_size / sizeof(uint64_t));
+        if (new_reftable_size > QCOW_MAX_REFTABLE_SIZE / sizeof(uint64_t)) {
+            error_setg(errp,
+                       "This operation would make the refcount table grow "
+                       "beyond the maximum size supported by QEMU, aborting");
+            return -ENOTSUP;
+        }
+
+        new_reftable = g_try_realloc(*reftable, new_reftable_size *
+                                                sizeof(uint64_t));
+        if (!new_reftable) {
+            error_setg(errp, "Failed to increase reftable buffer size");
+            return -ENOMEM;
+        }
+
+        memset(new_reftable + *reftable_size, 0,
+               (new_reftable_size - *reftable_size) * sizeof(uint64_t));
+
+        *reftable      = new_reftable;
+        *reftable_size = new_reftable_size;
+    }
+
+    if (!refblock_empty && !(*reftable)[reftable_index]) {
+        offset = qcow2_alloc_clusters(bs, s->cluster_size);
+        if (offset < 0) {
+            error_setg_errno(errp, -offset, "Failed to allocate refblock");
+            return offset;
+        }
+        (*reftable)[reftable_index] = offset;
+        *allocated = true;
+    }
+
+    return 0;
+}
+
+/**
+ * This "operation" for walk_over_reftable() writes the refblock to disk at the
+ * offset specified by the new reftable's entry. It does not modify the new
+ * reftable or change any refcounts.
+ */
+static int flush_refblock(BlockDriverState *bs, uint64_t **reftable,
+                          uint64_t reftable_index, uint64_t *reftable_size,
+                          void *refblock, bool refblock_empty, bool *allocated,
+                          Error **errp)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int64_t offset;
+    int ret;
+
+    if (reftable_index < *reftable_size && (*reftable)[reftable_index]) {
+        offset = (*reftable)[reftable_index];
+
+        ret = qcow2_pre_write_overlap_check(bs, 0, offset, s->cluster_size);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Overlap check failed");
+            return ret;
+        }
+
+        ret = bdrv_pwrite(bs->file->bs, offset, refblock, s->cluster_size);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to write refblock");
+            return ret;
+        }
+    } else {
+        assert(refblock_empty);
+    }
+
+    return 0;
+}
+
+/**
+ * This function walks over the existing reftable and every referenced refblock;
+ * if @new_set_refcount is non-NULL, it is called for every refcount entry to
+ * create an equal new entry in the passed @new_refblock. Once that
+ * @new_refblock is completely filled, @operation will be called.
+ *
+ * @status_cb and @cb_opaque are used for the amend operation's status callback.
+ * @index is the index of the walk_over_reftable() calls and @total is the total
+ * number of walk_over_reftable() calls per amend operation. Both are used for
+ * calculating the parameters for the status callback.
+ *
+ * @allocated is set to true if a new cluster has been allocated.
+ */
+static int walk_over_reftable(BlockDriverState *bs, uint64_t **new_reftable,
+                              uint64_t *new_reftable_index,
+                              uint64_t *new_reftable_size,
+                              void *new_refblock, int new_refblock_size,
+                              int new_refcount_bits,
+                              RefblockFinishOp *operation, bool *allocated,
+                              Qcow2SetRefcountFunc *new_set_refcount,
+                              BlockDriverAmendStatusCB *status_cb,
+                              void *cb_opaque, int index, int total,
+                              Error **errp)
+{
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t reftable_index;
+    bool new_refblock_empty = true;
+    int refblock_index;
+    int new_refblock_index = 0;
+    int ret;
+
+    for (reftable_index = 0; reftable_index < s->refcount_table_size;
+         reftable_index++)
+    {
+        uint64_t refblock_offset = s->refcount_table[reftable_index]
+                                 & REFT_OFFSET_MASK;
+
+        status_cb(bs, (uint64_t)index * s->refcount_table_size + reftable_index,
+                  (uint64_t)total * s->refcount_table_size, cb_opaque);
+
+        if (refblock_offset) {
+            void *refblock;
+
+            if (offset_into_cluster(s, refblock_offset)) {
+                qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#"
+                                        PRIx64 " unaligned (reftable index: %#"
+                                        PRIx64 ")", refblock_offset,
+                                        reftable_index);
+                error_setg(errp,
+                           "Image is corrupt (unaligned refblock offset)");
+                return -EIO;
+            }
+
+            ret = qcow2_cache_get(bs, s->refcount_block_cache, refblock_offset,
+                                  &refblock);
+            if (ret < 0) {
+                error_setg_errno(errp, -ret, "Failed to retrieve refblock");
+                return ret;
+            }
+
+            for (refblock_index = 0; refblock_index < s->refcount_block_size;
+                 refblock_index++)
+            {
+                uint64_t refcount;
+
+                if (new_refblock_index >= new_refblock_size) {
+                    /* new_refblock is now complete */
+                    ret = operation(bs, new_reftable, *new_reftable_index,
+                                    new_reftable_size, new_refblock,
+                                    new_refblock_empty, allocated, errp);
+                    if (ret < 0) {
+                        qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
+                        return ret;
+                    }
+
+                    (*new_reftable_index)++;
+                    new_refblock_index = 0;
+                    new_refblock_empty = true;
+                }
+
+                refcount = s->get_refcount(refblock, refblock_index);
+                if (new_refcount_bits < 64 && refcount >> new_refcount_bits) {
+                    uint64_t offset;
+
+                    qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
+
+                    offset = ((reftable_index << s->refcount_block_bits)
+                              + refblock_index) << s->cluster_bits;
+
+                    error_setg(errp, "Cannot decrease refcount entry width to "
+                               "%i bits: Cluster at offset %#" PRIx64 " has a "
+                               "refcount of %" PRIu64, new_refcount_bits,
+                               offset, refcount);
+                    return -EINVAL;
+                }
+
+                if (new_set_refcount) {
+                    new_set_refcount(new_refblock, new_refblock_index++,
+                                     refcount);
+                } else {
+                    new_refblock_index++;
+                }
+                new_refblock_empty = new_refblock_empty && refcount == 0;
+            }
+
+            qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
+        } else {
+            /* No refblock means every refcount is 0 */
+            for (refblock_index = 0; refblock_index < s->refcount_block_size;
+                 refblock_index++)
+            {
+                if (new_refblock_index >= new_refblock_size) {
+                    /* new_refblock is now complete */
+                    ret = operation(bs, new_reftable, *new_reftable_index,
+                                    new_reftable_size, new_refblock,
+                                    new_refblock_empty, allocated, errp);
+                    if (ret < 0) {
+                        return ret;
+                    }
+
+                    (*new_reftable_index)++;
+                    new_refblock_index = 0;
+                    new_refblock_empty = true;
+                }
+
+                if (new_set_refcount) {
+                    new_set_refcount(new_refblock, new_refblock_index++, 0);
+                } else {
+                    new_refblock_index++;
+                }
+            }
+        }
+    }
+
+    if (new_refblock_index > 0) {
+        /* Complete the potentially existing partially filled final refblock */
+        if (new_set_refcount) {
+            for (; new_refblock_index < new_refblock_size;
+                 new_refblock_index++)
+            {
+                new_set_refcount(new_refblock, new_refblock_index, 0);
+            }
+        }
+
+        ret = operation(bs, new_reftable, *new_reftable_index,
+                        new_reftable_size, new_refblock, new_refblock_empty,
+                        allocated, errp);
+        if (ret < 0) {
+            return ret;
+        }
+
+        (*new_reftable_index)++;
+    }
+
+    status_cb(bs, (uint64_t)(index + 1) * s->refcount_table_size,
+              (uint64_t)total * s->refcount_table_size, cb_opaque);
+
+    return 0;
+}
+
+int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
+                                BlockDriverAmendStatusCB *status_cb,
+                                void *cb_opaque, Error **errp)
+{
+    BDRVQcow2State *s = bs->opaque;
+    Qcow2GetRefcountFunc *new_get_refcount;
+    Qcow2SetRefcountFunc *new_set_refcount;
+    void *new_refblock = qemu_blockalign(bs->file->bs, s->cluster_size);
+    uint64_t *new_reftable = NULL, new_reftable_size = 0;
+    uint64_t *old_reftable, old_reftable_size, old_reftable_offset;
+    uint64_t new_reftable_index = 0;
+    uint64_t i;
+    int64_t new_reftable_offset = 0, allocated_reftable_size = 0;
+    int new_refblock_size, new_refcount_bits = 1 << refcount_order;
+    int old_refcount_order;
+    int walk_index = 0;
+    int ret;
+    bool new_allocation;
+
+    assert(s->qcow_version >= 3);
+    assert(refcount_order >= 0 && refcount_order <= 6);
+
+    /* see qcow2_open() */
+    new_refblock_size = 1 << (s->cluster_bits - (refcount_order - 3));
+
+    new_get_refcount = get_refcount_funcs[refcount_order];
+    new_set_refcount = set_refcount_funcs[refcount_order];
+
+
+    do {
+        int total_walks;
+
+        new_allocation = false;
+
+        /* At least we have to do this walk and the one which writes the
+         * refblocks; also, at least we have to do this loop here at least
+         * twice (normally), first to do the allocations, and second to
+         * determine that everything is correctly allocated, this then makes
+         * three walks in total */
+        total_walks = MAX(walk_index + 2, 3);
+
+        /* First, allocate the structures so they are present in the refcount
+         * structures */
+        ret = walk_over_reftable(bs, &new_reftable, &new_reftable_index,
+                                 &new_reftable_size, NULL, new_refblock_size,
+                                 new_refcount_bits, &alloc_refblock,
+                                 &new_allocation, NULL, status_cb, cb_opaque,
+                                 walk_index++, total_walks, errp);
+        if (ret < 0) {
+            goto done;
+        }
+
+        new_reftable_index = 0;
+
+        if (new_allocation) {
+            if (new_reftable_offset) {
+                qcow2_free_clusters(bs, new_reftable_offset,
+                                    allocated_reftable_size * sizeof(uint64_t),
+                                    QCOW2_DISCARD_NEVER);
+            }
+
+            new_reftable_offset = qcow2_alloc_clusters(bs, new_reftable_size *
+                                                           sizeof(uint64_t));
+            if (new_reftable_offset < 0) {
+                error_setg_errno(errp, -new_reftable_offset,
+                                 "Failed to allocate the new reftable");
+                ret = new_reftable_offset;
+                goto done;
+            }
+            allocated_reftable_size = new_reftable_size;
+        }
+    } while (new_allocation);
+
+    /* Second, write the new refblocks */
+    ret = walk_over_reftable(bs, &new_reftable, &new_reftable_index,
+                             &new_reftable_size, new_refblock,
+                             new_refblock_size, new_refcount_bits,
+                             &flush_refblock, &new_allocation, new_set_refcount,
+                             status_cb, cb_opaque, walk_index, walk_index + 1,
+                             errp);
+    if (ret < 0) {
+        goto done;
+    }
+    assert(!new_allocation);
+
+
+    /* Write the new reftable */
+    ret = qcow2_pre_write_overlap_check(bs, 0, new_reftable_offset,
+                                        new_reftable_size * sizeof(uint64_t));
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Overlap check failed");
+        goto done;
+    }
+
+    for (i = 0; i < new_reftable_size; i++) {
+        cpu_to_be64s(&new_reftable[i]);
+    }
+
+    ret = bdrv_pwrite(bs->file->bs, new_reftable_offset, new_reftable,
+                      new_reftable_size * sizeof(uint64_t));
+
+    for (i = 0; i < new_reftable_size; i++) {
+        be64_to_cpus(&new_reftable[i]);
+    }
+
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Failed to write the new reftable");
+        goto done;
+    }
+
+
+    /* Empty the refcount cache */
+    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Failed to flush the refblock cache");
+        goto done;
+    }
+
+    /* Update the image header to point to the new reftable; this only updates
+     * the fields which are relevant to qcow2_update_header(); other fields
+     * such as s->refcount_table or s->refcount_bits stay stale for now
+     * (because we have to restore everything if qcow2_update_header() fails) */
+    old_refcount_order  = s->refcount_order;
+    old_reftable_size   = s->refcount_table_size;
+    old_reftable_offset = s->refcount_table_offset;
+
+    s->refcount_order        = refcount_order;
+    s->refcount_table_size   = new_reftable_size;
+    s->refcount_table_offset = new_reftable_offset;
+
+    ret = qcow2_update_header(bs);
+    if (ret < 0) {
+        s->refcount_order        = old_refcount_order;
+        s->refcount_table_size   = old_reftable_size;
+        s->refcount_table_offset = old_reftable_offset;
+        error_setg_errno(errp, -ret, "Failed to update the qcow2 header");
+        goto done;
+    }
+
+    /* Now update the rest of the in-memory information */
+    old_reftable = s->refcount_table;
+    s->refcount_table = new_reftable;
+
+    s->refcount_bits = 1 << refcount_order;
+    s->refcount_max = UINT64_C(1) << (s->refcount_bits - 1);
+    s->refcount_max += s->refcount_max - 1;
+
+    s->refcount_block_bits = s->cluster_bits - (refcount_order - 3);
+    s->refcount_block_size = 1 << s->refcount_block_bits;
+
+    s->get_refcount = new_get_refcount;
+    s->set_refcount = new_set_refcount;
+
+    /* For cleaning up all old refblocks and the old reftable below the "done"
+     * label */
+    new_reftable        = old_reftable;
+    new_reftable_size   = old_reftable_size;
+    new_reftable_offset = old_reftable_offset;
+
+done:
+    if (new_reftable) {
+        /* On success, new_reftable actually points to the old reftable (and
+         * new_reftable_size is the old reftable's size); but that is just
+         * fine */
+        for (i = 0; i < new_reftable_size; i++) {
+            uint64_t offset = new_reftable[i] & REFT_OFFSET_MASK;
+            if (offset) {
+                qcow2_free_clusters(bs, offset, s->cluster_size,
+                                    QCOW2_DISCARD_OTHER);
+            }
+        }
+        g_free(new_reftable);
+
+        if (new_reftable_offset > 0) {
+            qcow2_free_clusters(bs, new_reftable_offset,
+                                new_reftable_size * sizeof(uint64_t),
+                                QCOW2_DISCARD_OTHER);
+        }
+    }
+
+    qemu_vfree(new_refblock);
+    return ret;
+}
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1282,6 +1282,52 @@ static void qcow2_reopen_abort(BDRVReopenState *state)
    g_free(state->opaque);
 }

+static void qcow2_join_options(QDict *options, QDict *old_options)
+{
+    bool has_new_overlap_template =
+        qdict_haskey(options, QCOW2_OPT_OVERLAP) ||
+        qdict_haskey(options, QCOW2_OPT_OVERLAP_TEMPLATE);
+    bool has_new_total_cache_size =
+        qdict_haskey(options, QCOW2_OPT_CACHE_SIZE);
+    bool has_all_cache_options;
+
+    /* New overlap template overrides all old overlap options */
+    if (has_new_overlap_template) {
+        qdict_del(old_options, QCOW2_OPT_OVERLAP);
+        qdict_del(old_options, QCOW2_OPT_OVERLAP_TEMPLATE);
+        qdict_del(old_options, QCOW2_OPT_OVERLAP_MAIN_HEADER);
+        qdict_del(old_options, QCOW2_OPT_OVERLAP_ACTIVE_L1);
+        qdict_del(old_options, QCOW2_OPT_OVERLAP_ACTIVE_L2);
+        qdict_del(old_options, QCOW2_OPT_OVERLAP_REFCOUNT_TABLE);
+        qdict_del(old_options, QCOW2_OPT_OVERLAP_REFCOUNT_BLOCK);
+        qdict_del(old_options, QCOW2_OPT_OVERLAP_SNAPSHOT_TABLE);
+        qdict_del(old_options, QCOW2_OPT_OVERLAP_INACTIVE_L1);
+        qdict_del(old_options, QCOW2_OPT_OVERLAP_INACTIVE_L2);
+    }
+
+    /* New total cache size overrides all old options */
+    if (qdict_haskey(options, QCOW2_OPT_CACHE_SIZE)) {
+        qdict_del(old_options, QCOW2_OPT_L2_CACHE_SIZE);
+        qdict_del(old_options, QCOW2_OPT_REFCOUNT_CACHE_SIZE);
+    }
+
+    qdict_join(options, old_options, false);
+
+    /*
+     * If after merging all cache size options are set, an old total size is
+     * overwritten. Do keep all options, however, if all three are new. The
+     * resulting error message is what we want to happen.
+     */
+    has_all_cache_options =
+        qdict_haskey(options, QCOW2_OPT_CACHE_SIZE) ||
+        qdict_haskey(options, QCOW2_OPT_L2_CACHE_SIZE) ||
+        qdict_haskey(options, QCOW2_OPT_REFCOUNT_CACHE_SIZE);
+
+    if (has_all_cache_options && !has_new_total_cache_size) {
+        qdict_del(options, QCOW2_OPT_CACHE_SIZE);
+    }
+}
+
 static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
        int64_t sector_num, int nb_sectors, int *pnum)
 {
@@ -1716,9 +1762,8 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
    ret = qcow2_open(bs, options, flags, &local_err);
    QDECREF(options);
    if (local_err) {
-        error_setg(errp, "Could not reopen qcow2 layer: %s",
-                   error_get_pretty(local_err));
-        error_free(local_err);
+        error_propagate(errp, local_err);
+        error_prepend(errp, "Could not reopen qcow2 layer: ");
        return;
    } else if (ret < 0) {
        error_setg_errno(errp, -ret, "Could not reopen qcow2 layer");
@@ -2269,7 +2314,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
                                         DEFAULT_CLUSTER_SIZE);
    buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
    prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
-                               PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
+                               PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
                               &local_err);
    if (local_err) {
        error_propagate(errp, local_err);
@@ -2757,6 +2802,10 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
            .has_corrupt        = true,
            .refcount_bits      = s->refcount_bits,
        };
+    } else {
+        /* if this assertion fails, this probably means a new version was
+         * added without having it covered here */
+        assert(false);
    }

    return spec_info;
@@ -2824,7 +2873,7 @@ static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf,
 * have to be removed.
 */
 static int qcow2_downgrade(BlockDriverState *bs, int target_version,
-                           BlockDriverAmendStatusCB *status_cb)
+                           BlockDriverAmendStatusCB *status_cb, void *cb_opaque)
 {
    BDRVQcow2State *s = bs->opaque;
    int current_version = s->qcow_version;
@@ -2839,13 +2888,7 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version,
    }

    if (s->refcount_order != 4) {
-        /* we would have to convert the image to a refcount_order == 4 image
-         * here; however, since qemu (at the time of writing this) does not
-         * support anything different than 4 anyway, there is no point in doing
-         * so right now; however, we should error out (if qemu supports this in
-         * the future and this code has not been adapted) */
-        error_report("qcow2_downgrade: Image refcount orders other than 4 are "
-                     "currently not supported.");
+        error_report("compat=0.10 requires refcount_bits=16");
        return -ENOTSUP;
    }

@@ -2873,7 +2916,7 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version,
    /* clearing autoclear features is trivial */
    s->autoclear_features = 0;

-    ret = qcow2_expand_zero_clusters(bs, status_cb);
+    ret = qcow2_expand_zero_clusters(bs, status_cb, cb_opaque);
    if (ret < 0) {
        return ret;
    }
@@ -2887,8 +2930,79 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version,
    return 0;
 }

+typedef enum Qcow2AmendOperation {
+    /* This is the value Qcow2AmendHelperCBInfo::last_operation will be
+     * statically initialized to so that the helper CB can discern the first
+     * invocation from an operation change */
+    QCOW2_NO_OPERATION = 0,
+
+    QCOW2_CHANGING_REFCOUNT_ORDER,
+    QCOW2_DOWNGRADING,
+} Qcow2AmendOperation;
+
+typedef struct Qcow2AmendHelperCBInfo {
+    /* The code coordinating the amend operations should only modify
+     * these four fields; the rest will be managed by the CB */
+    BlockDriverAmendStatusCB *original_status_cb;
+    void *original_cb_opaque;
+
+    Qcow2AmendOperation current_operation;
+
+    /* Total number of operations to perform (only set once) */
+    int total_operations;
+
+    /* The following fields are managed by the CB */
+
+    /* Number of operations completed */
+    int operations_completed;
+
+    /* Cumulative offset of all completed operations */
+    int64_t offset_completed;
+
+    Qcow2AmendOperation last_operation;
+    int64_t last_work_size;
+} Qcow2AmendHelperCBInfo;
+
+static void qcow2_amend_helper_cb(BlockDriverState *bs,
+                                  int64_t operation_offset,
+                                  int64_t operation_work_size, void *opaque)
+{
+    Qcow2AmendHelperCBInfo *info = opaque;
+    int64_t current_work_size;
+    int64_t projected_work_size;
+
+    if (info->current_operation != info->last_operation) {
+        if (info->last_operation != QCOW2_NO_OPERATION) {
+            info->offset_completed += info->last_work_size;
+            info->operations_completed++;
+        }
+
+        info->last_operation = info->current_operation;
+    }
+
+    assert(info->total_operations > 0);
+    assert(info->operations_completed < info->total_operations);
+
+    info->last_work_size = operation_work_size;
+
+    current_work_size = info->offset_completed + operation_work_size;
+
+    /* current_work_size is the total work size for (operations_completed + 1)
+     * operations (which includes this one), so multiply it by the number of
+     * operations not covered and divide it by the number of operations
+     * covered to get a projection for the operations not covered */
+    projected_work_size = current_work_size * (info->total_operations -
+                                               info->operations_completed - 1)
+                                            / (info->operations_completed + 1);
+
+    info->original_status_cb(bs, info->offset_completed + operation_offset,
+                             current_work_size + projected_work_size,
+                             info->original_cb_opaque);
+}
+
 static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
-                               BlockDriverAmendStatusCB *status_cb)
+                               BlockDriverAmendStatusCB *status_cb,
+                               void *cb_opaque)
 {
    BDRVQcow2State *s = bs->opaque;
    int old_version = s->qcow_version, new_version = old_version;
@@ -2898,8 +3012,10 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
    const char *compat = NULL;
    uint64_t cluster_size = s->cluster_size;
    bool encrypt;
+    int refcount_bits = s->refcount_bits;
    int ret;
    QemuOptDesc *desc = opts->list->desc;
+    Qcow2AmendHelperCBInfo helper_cb_info;

    while (desc && desc->name) {
        if (!qemu_opt_find(opts, desc->name)) {
@@ -2917,11 +3033,11 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
            } else if (!strcmp(compat, "1.1")) {
                new_version = 3;
            } else {
-                fprintf(stderr, "Unknown compatibility level %s.\n", compat);
+                error_report("Unknown compatibility level %s", compat);
                return -EINVAL;
            }
        } else if (!strcmp(desc->name, BLOCK_OPT_PREALLOC)) {
-            fprintf(stderr, "Cannot change preallocation mode.\n");
+            error_report("Cannot change preallocation mode");
            return -ENOTSUP;
        } else if (!strcmp(desc->name, BLOCK_OPT_SIZE)) {
            new_size = qemu_opt_get_size(opts, BLOCK_OPT_SIZE, 0);
@@ -2934,47 +3050,74 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
                                        !!s->cipher);

            if (encrypt != !!s->cipher) {
-                fprintf(stderr, "Changing the encryption flag is not "
-                        "supported.\n");
+                error_report("Changing the encryption flag is not supported");
                return -ENOTSUP;
            }
        } else if (!strcmp(desc->name, BLOCK_OPT_CLUSTER_SIZE)) {
            cluster_size = qemu_opt_get_size(opts, BLOCK_OPT_CLUSTER_SIZE,
                                             cluster_size);
            if (cluster_size != s->cluster_size) {
-                fprintf(stderr, "Changing the cluster size is not "
-                        "supported.\n");
+                error_report("Changing the cluster size is not supported");
                return -ENOTSUP;
            }
        } else if (!strcmp(desc->name, BLOCK_OPT_LAZY_REFCOUNTS)) {
            lazy_refcounts = qemu_opt_get_bool(opts, BLOCK_OPT_LAZY_REFCOUNTS,
                                               lazy_refcounts);
        } else if (!strcmp(desc->name, BLOCK_OPT_REFCOUNT_BITS)) {
-            error_report("Cannot change refcount entry width");
-            return -ENOTSUP;
+            refcount_bits = qemu_opt_get_number(opts, BLOCK_OPT_REFCOUNT_BITS,
+                                                refcount_bits);
+
+            if (refcount_bits <= 0 || refcount_bits > 64 ||
+                !is_power_of_2(refcount_bits))
+            {
+                error_report("Refcount width must be a power of two and may "
+                             "not exceed 64 bits");
+                return -EINVAL;
+            }
        } else {
-            /* if this assertion fails, this probably means a new option was
+            /* if this point is reached, this probably means a new option was
             * added without having it covered here */
-            assert(false);
+            abort();
        }

        desc++;
    }

-    if (new_version != old_version) {
-        if (new_version > old_version) {
-            /* Upgrade */
-            s->qcow_version = new_version;
-            ret = qcow2_update_header(bs);
-            if (ret < 0) {
-                s->qcow_version = old_version;
-                return ret;
-            }
-        } else {
-            ret = qcow2_downgrade(bs, new_version, status_cb);
-            if (ret < 0) {
-                return ret;
-            }
+    helper_cb_info = (Qcow2AmendHelperCBInfo){
+        .original_status_cb = status_cb,
+        .original_cb_opaque = cb_opaque,
+        .total_operations = (new_version < old_version)
+                          + (s->refcount_bits != refcount_bits)
+    };
+
+    /* Upgrade first (some features may require compat=1.1) */
+    if (new_version > old_version) {
+        s->qcow_version = new_version;
+        ret = qcow2_update_header(bs);
+        if (ret < 0) {
+            s->qcow_version = old_version;
+            return ret;
+        }
+    }
+
+    if (s->refcount_bits != refcount_bits) {
+        int refcount_order = ctz32(refcount_bits);
+        Error *local_error = NULL;
+
+        if (new_version < 3 && refcount_bits != 16) {
+            error_report("Different refcount widths than 16 bits require "
+                         "compatibility level 1.1 or above (use compat=1.1 or "
+                         "greater)");
+            return -EINVAL;
+        }
+
+        helper_cb_info.current_operation = QCOW2_CHANGING_REFCOUNT_ORDER;
+        ret = qcow2_change_refcount_order(bs, refcount_order,
+                                          &qcow2_amend_helper_cb,
+                                          &helper_cb_info, &local_error);
+        if (ret < 0) {
+            error_report_err(local_error);
+            return ret;
        }
    }

@@ -2989,9 +3132,9 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,

    if (s->use_lazy_refcounts != lazy_refcounts) {
        if (lazy_refcounts) {
-            if (s->qcow_version < 3) {
-                fprintf(stderr, "Lazy refcounts only supported with compatibility "
-                        "level 1.1 and above (use compat=1.1 or greater)\n");
+            if (new_version < 3) {
+                error_report("Lazy refcounts only supported with compatibility "
+                             "level 1.1 and above (use compat=1.1 or greater)");
                return -EINVAL;
            }
            s->compatible_features |= QCOW2_COMPAT_LAZY_REFCOUNTS;
@@ -3025,6 +3168,16 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
        }
    }

+    /* Downgrade last (so unsupported features can be removed before) */
+    if (new_version < old_version) {
+        helper_cb_info.current_operation = QCOW2_DOWNGRADING;
+        ret = qcow2_downgrade(bs, new_version, &qcow2_amend_helper_cb,
+                              &helper_cb_info);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
    return 0;
 }

@@ -3145,6 +3298,7 @@ BlockDriver bdrv_qcow2 = {
    .bdrv_reopen_prepare  = qcow2_reopen_prepare,
    .bdrv_reopen_commit   = qcow2_reopen_commit,
    .bdrv_reopen_abort    = qcow2_reopen_abort,
+    .bdrv_join_options    = qcow2_join_options,
    .bdrv_create        = qcow2_create,
    .bdrv_has_zero_init = bdrv_has_zero_init_1,
    .bdrv_co_get_block_status = qcow2_co_get_block_status,
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -529,6 +529,10 @@ int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
 int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
                                  int64_t size);

+int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
+                                BlockDriverAmendStatusCB *status_cb,
+                                void *cb_opaque, Error **errp);
+
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                        bool exact_size);
@@ -553,7 +557,8 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
 int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);

 int qcow2_expand_zero_clusters(BlockDriverState *bs,
-                               BlockDriverAmendStatusCB *status_cb);
+                               BlockDriverAmendStatusCB *status_cb,
+                               void *cb_opaque);

 /* qcow2-snapshot.c functions */
 int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);
--- a/block/qed.c
+++ b/block/qed.c
@@ -375,6 +375,18 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
    }
 }

+static void bdrv_qed_drain(BlockDriverState *bs)
+{
+    BDRVQEDState *s = bs->opaque;
+
+    /* Cancel timer and start doing I/O that were meant to happen as if it
+     * fired, that way we get bdrv_drain() taking care of the ongoing requests
+     * correctly. */
+    qed_cancel_need_check_timer(s);
+    qed_plug_allocating_write_reqs(s);
+    bdrv_aio_flush(s->bs, qed_clear_need_check, s);
+}
+
 static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
                         Error **errp)
 {
@@ -1599,9 +1611,8 @@ static void bdrv_qed_invalidate_cache(BlockDriverState *bs, Error **errp)
    memset(s, 0, sizeof(BDRVQEDState));
    ret = bdrv_qed_open(bs, NULL, bs->open_flags, &local_err);
    if (local_err) {
-        error_setg(errp, "Could not reopen qed layer: %s",
-                   error_get_pretty(local_err));
-        error_free(local_err);
+        error_propagate(errp, local_err);
+        error_prepend(errp, "Could not reopen qed layer: ");
        return;
    } else if (ret < 0) {
        error_setg_errno(errp, -ret, "Could not reopen qed layer");
@@ -1676,6 +1687,7 @@ static BlockDriver bdrv_qed = {
    .bdrv_check               = bdrv_qed_check,
    .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
    .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
+    .bdrv_drain               = bdrv_qed_drain,
 };

 static void bdrv_qed_init(void)
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -847,7 +847,7 @@ static int parse_read_pattern(const char *opt)
        return QUORUM_READ_PATTERN_QUORUM;
    }

-    for (i = 0; i < QUORUM_READ_PATTERN_MAX; i++) {
+    for (i = 0; i < QUORUM_READ_PATTERN__MAX; i++) {
        if (!strcmp(opt, QuorumReadPattern_lookup[i])) {
            return i;
        }
@@ -997,7 +997,7 @@ static void quorum_attach_aio_context(BlockDriverState *bs,
    }
 }

-static void quorum_refresh_filename(BlockDriverState *bs)
+static void quorum_refresh_filename(BlockDriverState *bs, QDict *options)
 {
    BDRVQuorumState *s = bs->opaque;
    QDict *opts;
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -500,21 +500,17 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
        goto fail;
    }
    if (!s->use_aio && (bdrv_flags & BDRV_O_NATIVE_AIO)) {
-        error_printf("WARNING: aio=native was specified for '%s', but "
-                     "it requires cache.direct=on, which was not "
-                     "specified. Falling back to aio=threads.\n"
-                     "         This will become an error condition in "
-                     "future QEMU versions.\n",
-                     bs->filename);
+        error_setg(errp, "aio=native was specified, but it requires "
+                         "cache.direct=on, which was not specified.");
+        ret = -EINVAL;
+        goto fail;
    }
 #else
    if (bdrv_flags & BDRV_O_NATIVE_AIO) {
-        error_printf("WARNING: aio=native was specified for '%s', but "
-                     "is not supported in this build. Falling back to "
-                     "aio=threads.\n"
-                     "         This will become an error condition in "
-                     "future QEMU versions.\n",
-                     bs->filename);
+        error_setg(errp, "aio=native was specified, but is not supported "
+                         "in this build.");
+        ret = -EINVAL;
+        goto fail;
    }
 #endif /* !defined(CONFIG_LINUX_AIO) */

@@ -1636,7 +1632,7 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
    nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
    buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
    prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
-                               PREALLOC_MODE_MAX, PREALLOC_MODE_OFF,
+                               PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
                               &local_err);
    g_free(buf);
    if (local_err) {
@@ -1976,8 +1972,8 @@ BlockDriver bdrv_file = {

 #if defined(__APPLE__) && defined(__MACH__)
 static kern_return_t FindEjectableCDMedia( io_iterator_t *mediaIterator );
-static kern_return_t GetBSDPath( io_iterator_t mediaIterator, char *bsdPath, CFIndex maxPathSize );
-
+static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
+                                CFIndex maxPathSize, int flags);
 kern_return_t FindEjectableCDMedia( io_iterator_t *mediaIterator )
 {
    kern_return_t       kernResult;
@@ -2004,7 +2000,8 @@ kern_return_t FindEjectableCDMedia( io_iterator_t *mediaIterator )
    return kernResult;
 }

-kern_return_t GetBSDPath( io_iterator_t mediaIterator, char *bsdPath, CFIndex maxPathSize )
+kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
+                         CFIndex maxPathSize, int flags)
 {
    io_object_t     nextMedia;
    kern_return_t   kernResult = KERN_FAILURE;
@@ -2017,7 +2014,9 @@ kern_return_t GetBSDPath( io_iterator_t mediaIterator, char *bsdPath, CFIndex ma
        if ( bsdPathAsCFString ) {
            size_t devPathLength;
            strcpy( bsdPath, _PATH_DEV );
-            strcat( bsdPath, "r" );
+            if (flags & BDRV_O_NOCACHE) {
+                strcat(bsdPath, "r");
+            }
            devPathLength = strlen( bsdPath );
            if ( CFStringGetCString( bsdPathAsCFString, bsdPath + devPathLength, maxPathSize - devPathLength, kCFStringEncodingASCII ) ) {
                kernResult = KERN_SUCCESS;
@@ -2129,8 +2128,8 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
        int fd;

        kernResult = FindEjectableCDMedia( &mediaIterator );
-        kernResult = GetBSDPath( mediaIterator, bsdPath, sizeof( bsdPath ) );
-
+        kernResult = GetBSDPath(mediaIterator, bsdPath, sizeof(bsdPath),
+                                flags);
        if ( bsdPath[ 0 ] != '\0' ) {
            strcat(bsdPath,"s0");
            /* some CDs don't have a partition 0 */
@@ -2175,12 +2174,6 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
 }

 #if defined(__linux__)
-static int hdev_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-    BDRVRawState *s = bs->opaque;
-
-    return ioctl(s->fd, req, buf);
-}

 static BlockAIOCB *hdev_aio_ioctl(BlockDriverState *bs,
        unsigned long int req, void *buf,
@@ -2338,7 +2331,6 @@ static BlockDriver bdrv_host_device = {

    /* generic scsi device */
 #ifdef __linux__
-    .bdrv_ioctl         = hdev_ioctl,
    .bdrv_aio_ioctl     = hdev_aio_ioctl,
 #endif
 };
@@ -2471,7 +2463,6 @@ static BlockDriver bdrv_host_cdrom = {
    .bdrv_lock_medium   = cdrom_lock_medium,

    /* generic scsi device */
-    .bdrv_ioctl         = hdev_ioctl,
    .bdrv_aio_ioctl     = hdev_aio_ioctl,
 };
 #endif /* __linux__ */
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -169,11 +169,6 @@ static void raw_lock_medium(BlockDriverState *bs, bool locked)
    bdrv_lock_medium(bs->file->bs, locked);
 }

-static int raw_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-    return bdrv_ioctl(bs->file->bs, req, buf);
-}
-
 static BlockAIOCB *raw_aio_ioctl(BlockDriverState *bs,
                                 unsigned long int req, void *buf,
                                 BlockCompletionFunc *cb,
@@ -262,7 +257,6 @@ BlockDriver bdrv_raw = {
    .bdrv_media_changed   = &raw_media_changed,
    .bdrv_eject           = &raw_eject,
    .bdrv_lock_medium     = &raw_lock_medium,
-    .bdrv_ioctl           = &raw_ioctl,
    .bdrv_aio_ioctl       = &raw_aio_ioctl,
    .create_opts          = &raw_create_opts,
    .bdrv_has_zero_init   = &raw_has_zero_init
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -1861,8 +1861,7 @@ static int sd_create(const char *filename, QemuOpts *opts,

        fd = connect_to_sdog(s, &local_err);
        if (fd < 0) {
-            error_report("%s", error_get_pretty(local_err));
-            error_free(local_err);
+            error_report_err(local_err);
            ret = -EIO;
            goto out;
        }
@@ -2406,9 +2405,8 @@ static int sd_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)

    ret = do_sd_create(s, &new_vid, 1, &local_err);
    if (ret < 0) {
-        error_report("failed to create inode for snapshot: %s",
-                     error_get_pretty(local_err));
-        error_free(local_err);
+        error_reportf_err(local_err,
+                          "failed to create inode for snapshot: ");
        goto cleanup;
    }

--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -229,6 +229,8 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
                         Error **errp)
 {
    BlockDriver *drv = bs->drv;
+    int ret;
+
    if (!drv) {
        error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, bdrv_get_device_name(bs));
        return -ENOMEDIUM;
@@ -239,23 +241,26 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
    }

    /* drain all pending i/o before deleting snapshot */
-    bdrv_drain(bs);
+    bdrv_drained_begin(bs);

    if (drv->bdrv_snapshot_delete) {
-        return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
+        ret = drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
+    } else if (bs->file) {
+        ret = bdrv_snapshot_delete(bs->file->bs, snapshot_id, name, errp);
+    } else {
+        error_setg(errp, "Block format '%s' used by device '%s' "
+                   "does not support internal snapshot deletion",
+                   drv->format_name, bdrv_get_device_name(bs));
+        ret = -ENOTSUP;
    }
-    if (bs->file) {
-        return bdrv_snapshot_delete(bs->file->bs, snapshot_id, name, errp);
-    }
-    error_setg(errp, "Block format '%s' used by device '%s' "
-               "does not support internal snapshot deletion",
-               drv->format_name, bdrv_get_device_name(bs));
-    return -ENOTSUP;
+
+    bdrv_drained_end(bs);
+    return ret;
 }

-void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
-                                        const char *id_or_name,
-                                        Error **errp)
+int bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
+                                       const char *id_or_name,
+                                       Error **errp)
 {
    int ret;
    Error *local_err = NULL;
@@ -270,6 +275,7 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
    if (ret < 0) {
        error_propagate(errp, local_err);
    }
+    return ret;
 }

 int bdrv_snapshot_list(BlockDriverState *bs,
@@ -356,3 +362,130 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,

    return ret;
 }
+
+
+/* Group operations. All block drivers are involved.
+ * These functions will properly handle dataplane (take aio_context_acquire
+ * when appropriate for appropriate block drivers) */
+
+bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
+{
+    bool ok = true;
+    BlockDriverState *bs = NULL;
+
+    while (ok && (bs = bdrv_next(bs))) {
+        AioContext *ctx = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(ctx);
+        if (bdrv_is_inserted(bs) && !bdrv_is_read_only(bs)) {
+            ok = bdrv_can_snapshot(bs);
+        }
+        aio_context_release(ctx);
+    }
+
+    *first_bad_bs = bs;
+    return ok;
+}
+
+int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
+                             Error **err)
+{
+    int ret = 0;
+    BlockDriverState *bs = NULL;
+    QEMUSnapshotInfo sn1, *snapshot = &sn1;
+
+    while (ret == 0 && (bs = bdrv_next(bs))) {
+        AioContext *ctx = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(ctx);
+        if (bdrv_can_snapshot(bs) &&
+                bdrv_snapshot_find(bs, snapshot, name) >= 0) {
+            ret = bdrv_snapshot_delete_by_id_or_name(bs, name, err);
+        }
+        aio_context_release(ctx);
+    }
+
+    *first_bad_bs = bs;
+    return ret;
+}
+
+
+int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs)
+{
+    int err = 0;
+    BlockDriverState *bs = NULL;
+
+    while (err == 0 && (bs = bdrv_next(bs))) {
+        AioContext *ctx = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(ctx);
+        if (bdrv_can_snapshot(bs)) {
+            err = bdrv_snapshot_goto(bs, name);
+        }
+        aio_context_release(ctx);
+    }
+
+    *first_bad_bs = bs;
+    return err;
+}
+
+int bdrv_all_find_snapshot(const char *name, BlockDriverState **first_bad_bs)
+{
+    QEMUSnapshotInfo sn;
+    int err = 0;
+    BlockDriverState *bs = NULL;
+
+    while (err == 0 && (bs = bdrv_next(bs))) {
+        AioContext *ctx = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(ctx);
+        if (bdrv_can_snapshot(bs)) {
+            err = bdrv_snapshot_find(bs, &sn, name);
+        }
+        aio_context_release(ctx);
+    }
+
+    *first_bad_bs = bs;
+    return err;
+}
+
+int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
+                             BlockDriverState *vm_state_bs,
+                             uint64_t vm_state_size,
+                             BlockDriverState **first_bad_bs)
+{
+    int err = 0;
+    BlockDriverState *bs = NULL;
+
+    while (err == 0 && (bs = bdrv_next(bs))) {
+        AioContext *ctx = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(ctx);
+        if (bs == vm_state_bs) {
+            sn->vm_state_size = vm_state_size;
+            err = bdrv_snapshot_create(bs, sn);
+        } else if (bdrv_can_snapshot(bs)) {
+            sn->vm_state_size = 0;
+            err = bdrv_snapshot_create(bs, sn);
+        }
+        aio_context_release(ctx);
+    }
+
+    *first_bad_bs = bs;
+    return err;
+}
+
+BlockDriverState *bdrv_all_find_vmstate_bs(void)
+{
+    bool not_found = true;
+    BlockDriverState *bs = NULL;
+
+    while (not_found && (bs = bdrv_next(bs))) {
+        AioContext *ctx = bdrv_get_aio_context(bs);
+
+        aio_context_acquire(ctx);
+        not_found = !bdrv_can_snapshot(bs);
+        aio_context_release(ctx);
+    }
+    return bs;
+}
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -437,6 +437,9 @@ void throttle_group_register_bs(BlockDriverState *bs, const char *groupname)
 * list, destroying the timers and setting the throttle_state pointer
 * to NULL.
 *
+ * The BlockDriverState must not have pending throttled requests, so
+ * the caller has to drain them first.
+ *
 * The group will be destroyed if it's empty after this operation.
 *
 * @bs: the BlockDriverState to remove
@@ -446,6 +449,10 @@ void throttle_group_unregister_bs(BlockDriverState *bs)
    ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
    int i;

+    assert(bs->pending_reqs[0] == 0 && bs->pending_reqs[1] == 0);
+    assert(qemu_co_queue_empty(&bs->throttled_reqs[0]));
+    assert(qemu_co_queue_empty(&bs->throttled_reqs[1]));
+
    qemu_mutex_lock(&tg->lock);
    for (i = 0; i < 2; i++) {
        if (tg->tokens[i] == bs) {
--- a/block/vhdx-log.c
+++ b/block/vhdx-log.c
@@ -784,12 +784,13 @@ int vhdx_parse_log(BlockDriverState *bs, BDRVVHDXState *s, bool *flushed,
    if (logs.valid) {
        if (bs->read_only) {
            ret = -EPERM;
-            error_setg_errno(errp, EPERM,
-                             "VHDX image file '%s' opened read-only, but "
-                             "contains a log that needs to be replayed.  To "
-                             "replay the log, execute:\n qemu-img check -r "
-                             "all '%s'",
-                             bs->filename, bs->filename);
+            error_setg(errp,
+                       "VHDX image file '%s' opened read-only, but "
+                       "contains a log that needs to be replayed",
+                       bs->filename);
+            error_append_hint(errp,  "To replay the log, run:\n"
+                              "qemu-img check -r all '%s'\n",
+                              bs->filename);
            goto exit;
        }
        /* now flush the log */
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -760,6 +760,17 @@ static int vmdk_open_sparse(BlockDriverState *bs, BdrvChild *file, int flags,
    }
 }

+static const char *next_line(const char *s)
+{
+    while (*s) {
+        if (*s == '\n') {
+            return s + 1;
+        }
+        s++;
+    }
+    return s;
+}
+
 static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
                              const char *desc_file_path, QDict *options,
                              Error **errp)
@@ -769,7 +780,7 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
    char access[11];
    char type[11];
    char fname[512];
-    const char *p = desc;
+    const char *p, *np;
    int64_t sectors = 0;
    int64_t flat_offset;
    char *extent_path;
@@ -779,7 +790,7 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
    char extent_opt_prefix[32];
    Error *local_err = NULL;

-    while (*p) {
+    for (p = desc; *p; p = next_line(p)) {
        /* parse extent line in one of below formats:
         *
         * RW [size in sectors] FLAT "file-name.vmdk" OFFSET
@@ -791,29 +802,26 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
        matches = sscanf(p, "%10s %" SCNd64 " %10s \"%511[^\n\r\"]\" %" SCNd64,
                         access, &sectors, type, fname, &flat_offset);
        if (matches < 4 || strcmp(access, "RW")) {
-            goto next_line;
+            continue;
        } else if (!strcmp(type, "FLAT")) {
            if (matches != 5 || flat_offset < 0) {
-                error_setg(errp, "Invalid extent lines: \n%s", p);
-                return -EINVAL;
+                goto invalid;
            }
        } else if (!strcmp(type, "VMFS")) {
            if (matches == 4) {
                flat_offset = 0;
            } else {
-                error_setg(errp, "Invalid extent lines:\n%s", p);
-                return -EINVAL;
+                goto invalid;
            }
        } else if (matches != 4) {
-            error_setg(errp, "Invalid extent lines:\n%s", p);
-            return -EINVAL;
+            goto invalid;
        }

        if (sectors <= 0 ||
            (strcmp(type, "FLAT") && strcmp(type, "SPARSE") &&
             strcmp(type, "VMFS") && strcmp(type, "VMFSSPARSE")) ||
            (strcmp(access, "RW"))) {
-            goto next_line;
+            continue;
        }

        if (!path_is_absolute(fname) && !path_has_protocol(fname) &&
@@ -870,17 +878,17 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
            return -ENOTSUP;
        }
        extent->type = g_strdup(type);
-next_line:
-        /* move to next line */
-        while (*p) {
-            if (*p == '\n') {
-                p++;
-                break;
-            }
-            p++;
-        }
    }
    return 0;
+
+invalid:
+    np = next_line(p);
+    assert(np != p);
+    if (np[-1] == '\n') {
+        np--;
+    }
+    error_setg(errp, "Invalid extent line: %.*s", (int)(np - p), p);
+    return -EINVAL;
 }

 static int vmdk_open_desc_file(BlockDriverState *bs, int flags, char *buf,
@@ -1494,8 +1502,8 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,

    if (sector_num > bs->total_sectors) {
        error_report("Wrong offset: sector_num=0x%" PRIx64
-                " total_sectors=0x%" PRIx64 "\n",
-                sector_num, bs->total_sectors);
+                     " total_sectors=0x%" PRIx64,
+                     sector_num, bs->total_sectors);
        return -EIO;
    }

--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -27,9 +27,8 @@ static void nbd_accept(void *opaque)
    socklen_t addr_len = sizeof(addr);

    int fd = accept(server_fd, (struct sockaddr *)&addr, &addr_len);
-    if (fd >= 0 && !nbd_client_new(NULL, fd, nbd_client_put)) {
-        shutdown(fd, 2);
-        close(fd);
+    if (fd >= 0) {
+        nbd_client_new(NULL, fd, nbd_client_put);
    }
 }

--- a/blockdev.c
+++ b/blockdev.c
--- a/blockjob.c
+++ b/blockjob.c
@@ -37,6 +37,19 @@
 #include "qemu/timer.h"
 #include "qapi-event.h"

+/* Transactional group of block jobs */
+struct BlockJobTxn {
+
+    /* Is this txn being cancelled? */
+    bool aborting;
+
+    /* List of jobs */
+    QLIST_HEAD(, BlockJob) jobs;
+
+    /* Reference count */
+    int refcnt;
+};
+
 void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
                       int64_t speed, BlockCompletionFunc *cb,
                       void *opaque, Error **errp)
@@ -60,6 +73,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
    job->cb            = cb;
    job->opaque        = opaque;
    job->busy          = true;
+    job->refcnt        = 1;
    bs->job = job;

    /* Only set speed when necessary to avoid NotSupported error */
@@ -68,7 +82,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,

        block_job_set_speed(job, speed, &local_err);
        if (local_err) {
-            block_job_release(bs);
+            block_job_unref(job);
            error_propagate(errp, local_err);
            return NULL;
        }
@@ -76,15 +90,101 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
    return job;
 }

-void block_job_release(BlockDriverState *bs)
+void block_job_ref(BlockJob *job)
 {
-    BlockJob *job = bs->job;
+    ++job->refcnt;
+}

-    bs->job = NULL;
-    bdrv_op_unblock_all(bs, job->blocker);
-    error_free(job->blocker);
-    g_free(job->id);
-    g_free(job);
+void block_job_unref(BlockJob *job)
+{
+    if (--job->refcnt == 0) {
+        job->bs->job = NULL;
+        bdrv_op_unblock_all(job->bs, job->blocker);
+        bdrv_unref(job->bs);
+        error_free(job->blocker);
+        g_free(job->id);
+        g_free(job);
+    }
+}
+
+static void block_job_completed_single(BlockJob *job)
+{
+    if (!job->ret) {
+        if (job->driver->commit) {
+            job->driver->commit(job);
+        }
+    } else {
+        if (job->driver->abort) {
+            job->driver->abort(job);
+        }
+    }
+    job->cb(job->opaque, job->ret);
+    if (job->txn) {
+        block_job_txn_unref(job->txn);
+    }
+    block_job_unref(job);
+}
+
+static void block_job_completed_txn_abort(BlockJob *job)
+{
+    AioContext *ctx;
+    BlockJobTxn *txn = job->txn;
+    BlockJob *other_job, *next;
+
+    if (txn->aborting) {
+        /*
+         * We are cancelled by another job, which will handle everything.
+         */
+        return;
+    }
+    txn->aborting = true;
+    /* We are the first failed job. Cancel other jobs. */
+    QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
+        ctx = bdrv_get_aio_context(other_job->bs);
+        aio_context_acquire(ctx);
+    }
+    QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
+        if (other_job == job || other_job->completed) {
+            /* Other jobs are "effectively" cancelled by us, set the status for
+             * them; this job, however, may or may not be cancelled, depending
+             * on the caller, so leave it. */
+            if (other_job != job) {
+                other_job->cancelled = true;
+            }
+            continue;
+        }
+        block_job_cancel_sync(other_job);
+        assert(other_job->completed);
+    }
+    QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+        ctx = bdrv_get_aio_context(other_job->bs);
+        block_job_completed_single(other_job);
+        aio_context_release(ctx);
+    }
+}
+
+static void block_job_completed_txn_success(BlockJob *job)
+{
+    AioContext *ctx;
+    BlockJobTxn *txn = job->txn;
+    BlockJob *other_job, *next;
+    /*
+     * Successful completion, see if there are other running jobs in this
+     * txn.
+     */
+    QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
+        if (!other_job->completed) {
+            return;
+        }
+    }
+    /* We are the last completed job, commit the transaction. */
+    QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+        ctx = bdrv_get_aio_context(other_job->bs);
+        aio_context_acquire(ctx);
+        assert(other_job->ret == 0);
+        block_job_completed_single(other_job);
+        aio_context_release(ctx);
+    }
 }

 void block_job_completed(BlockJob *job, int ret)
@@ -92,8 +192,16 @@ void block_job_completed(BlockJob *job, int ret)
    BlockDriverState *bs = job->bs;

    assert(bs->job == job);
-    job->cb(job->opaque, ret);
-    block_job_release(bs);
+    assert(!job->completed);
+    job->completed = true;
+    job->ret = ret;
+    if (!job->txn) {
+        block_job_completed_single(job);
+    } else if (ret < 0 || block_job_is_cancelled(job)) {
+        block_job_completed_txn_abort(job);
+    } else {
+        block_job_completed_txn_success(job);
+    }
 }

 void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -178,43 +286,29 @@ struct BlockFinishData {
    int ret;
 };

-static void block_job_finish_cb(void *opaque, int ret)
-{
-    struct BlockFinishData *data = opaque;
-
-    data->cancelled = block_job_is_cancelled(data->job);
-    data->ret = ret;
-    data->cb(data->opaque, ret);
-}
-
 static int block_job_finish_sync(BlockJob *job,
                                 void (*finish)(BlockJob *, Error **errp),
                                 Error **errp)
 {
-    struct BlockFinishData data;
    BlockDriverState *bs = job->bs;
    Error *local_err = NULL;
+    int ret;

    assert(bs->job == job);

-    /* Set up our own callback to store the result and chain to
-     * the original callback.
-     */
-    data.job = job;
-    data.cb = job->cb;
-    data.opaque = job->opaque;
-    data.ret = -EINPROGRESS;
-    job->cb = block_job_finish_cb;
-    job->opaque = &data;
+    block_job_ref(job);
    finish(job, &local_err);
    if (local_err) {
        error_propagate(errp, local_err);
+        block_job_unref(job);
        return -EBUSY;
    }
-    while (data.ret == -EINPROGRESS) {
+    while (!job->completed) {
        aio_poll(bdrv_get_aio_context(bs), true);
    }
-    return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
+    ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
+    block_job_unref(job);
+    return ret;
 }

 /* A wrapper around block_job_cancel() taking an Error ** parameter so it may be
@@ -406,3 +500,36 @@ void block_job_defer_to_main_loop(BlockJob *job,

    qemu_bh_schedule(data->bh);
 }
+
+BlockJobTxn *block_job_txn_new(void)
+{
+    BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
+    QLIST_INIT(&txn->jobs);
+    txn->refcnt = 1;
+    return txn;
+}
+
+static void block_job_txn_ref(BlockJobTxn *txn)
+{
+    txn->refcnt++;
+}
+
+void block_job_txn_unref(BlockJobTxn *txn)
+{
+    if (txn && --txn->refcnt == 0) {
+        g_free(txn);
+    }
+}
+
+void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
+{
+    if (!txn) {
+        return;
+    }
+
+    assert(!job->txn);
+    job->txn = txn;
+
+    QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
+    block_job_txn_ref(txn);
+}
--- a/bsd-user/elfload.c
+++ b/bsd-user/elfload.c
@@ -740,8 +740,7 @@ static void padzero(abi_ulong elf_bss, abi_ulong last_bss)
           size must be known */
        if (qemu_real_host_page_size < qemu_host_page_size) {
            abi_ulong end_addr, end_addr1;
-            end_addr1 = (elf_bss + qemu_real_host_page_size - 1) &
-                ~(qemu_real_host_page_size - 1);
+            end_addr1 = REAL_HOST_PAGE_ALIGN(elf_bss);
            end_addr = HOST_PAGE_ALIGN(elf_bss);
            if (end_addr1 < end_addr) {
                mmap((void *)g2h(end_addr1), end_addr - end_addr1,
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -938,7 +938,7 @@ int main(int argc, char **argv)
            unsigned long tmp;
            if (fscanf(fp, "%lu", &tmp) == 1) {
                mmap_min_addr = tmp;
-                qemu_log("host mmap_min_addr=0x%lx\n", mmap_min_addr);
+                qemu_log_mask(CPU_LOG_PAGE, "host mmap_min_addr=0x%lx\n", mmap_min_addr);
            }
            fclose(fp);
        }
@@ -955,7 +955,7 @@ int main(int argc, char **argv)

    free(target_environ);

-    if (qemu_log_enabled()) {
+    if (qemu_loglevel_mask(CPU_LOG_PAGE)) {
        qemu_log("guest_base  0x%lx\n", guest_base);
        log_page_dump();

--- a/bsd-user/signal.c
+++ b/bsd-user/signal.c
@@ -26,8 +26,6 @@
 #include "qemu.h"
 #include "target_signal.h"

-//#define DEBUG_SIGNAL
-
 void signal_init(void)
 {
 }
--- a/112
+++ b/112
@@ -8,6 +8,9 @@
 CLICOLOR_FORCE= GREP_OPTIONS=
 unset CLICOLOR_FORCE GREP_OPTIONS

+# Don't allow CCACHE, if present, to use cached results of compile tests!
+export CCACHE_RECACHE=yes
+
 # Temporary directory used for files created while
 # configure runs. Since it is in the build directory
 # we can safely blow away any previous version of it
@@ -261,6 +264,7 @@ rdma=""
 gprof="no"
 debug_tcg="no"
 debug="no"
+fortify_source=""
 strip_opt="yes"
 tcg_interpreter="no"
 bigendian="no"
@@ -723,6 +727,8 @@ if test "$mingw32" = "yes" ; then
  QEMU_CFLAGS="-DWIN32_LEAN_AND_MEAN -DWINVER=0x501 $QEMU_CFLAGS"
  # enable C99/POSIX format strings (needs mingw32-runtime 3.15 or later)
  QEMU_CFLAGS="-D__USE_MINGW_ANSI_STDIO=1 $QEMU_CFLAGS"
+  # MinGW needs -mthreads for TLS and macro _MT.
+  QEMU_CFLAGS="-mthreads $QEMU_CFLAGS"
  LIBS="-lwinmm -lws2_32 -liphlpapi $LIBS"
  write_c_skeleton;
  if compile_prog "" "-liberty" ; then
@@ -787,6 +793,9 @@ for opt do
  --enable-modules)
      modules="yes"
  ;;
+  --disable-modules)
+      modules="no"
+  ;;
  --cpu=*)
  ;;
  --target-list=*) target_list="$optarg"
@@ -876,6 +885,7 @@ for opt do
      debug_tcg="yes"
      debug="yes"
      strip_opt="no"
+      fortify_source="no"
  ;;
  --enable-sparse) sparse="yes"
  ;;
@@ -1340,7 +1350,6 @@ disabled with --disable-FEATURE, default is enabled if available:
  vte             vte support for the gtk UI
  curses          curses UI
  vnc             VNC UI support
-  vnc-tls         TLS encryption for VNC server
  vnc-sasl        SASL encryption for VNC server
  vnc-jpeg        JPEG lossy compression for VNC server
  vnc-png         PNG compression for VNC server
@@ -1419,6 +1428,9 @@ if compile_object ; then
 else
    error_exit "\"$cc\" either does not exist or does not work"
 fi
+if ! compile_prog ; then
+    error_exit "\"$cc\" cannot build an executable (is your linker broken?)"
+fi

 # Check that the C++ compiler exists and works with the C compiler
 if has $cxx; then
@@ -1479,6 +1491,16 @@ for flag in $gcc_flags; do
 done

 if test "$stack_protector" != "no"; then
+  cat > $TMPC << EOF
+int main(int argc, char *argv[])
+{
+    char arr[64], *p = arr, *c = argv[0];
+    while (*c) {
+        *p++ = *c++;
+    }
+    return 0;
+}
+EOF
  gcc_flags="-fstack-protector-strong -fstack-protector-all"
  sp_on=0
  for flag in $gcc_flags; do
@@ -1881,16 +1903,34 @@ fi
 # libseccomp check

 if test "$seccomp" != "no" ; then
-    if test "$cpu" = "i386" || test "$cpu" = "x86_64" &&
-        $pkg_config --atleast-version=2.1.1 libseccomp; then
+    case "$cpu" in
+    i386|x86_64)
+        libseccomp_minver="2.1.0"
+        ;;
+    arm|aarch64)
+        libseccomp_minver="2.2.3"
+        ;;
+    *)
+        libseccomp_minver=""
+        ;;
+    esac
+
+    if test "$libseccomp_minver" != "" &&
+       $pkg_config --atleast-version=$libseccomp_minver libseccomp ; then
        libs_softmmu="$libs_softmmu `$pkg_config --libs libseccomp`"
        QEMU_CFLAGS="$QEMU_CFLAGS `$pkg_config --cflags libseccomp`"
-	seccomp="yes"
+        seccomp="yes"
    else
-	if test "$seccomp" = "yes"; then
-            feature_not_found "libseccomp" "Install libseccomp devel >= 2.1.1"
-	fi
-	seccomp="no"
+        if test "$seccomp" = "yes" ; then
+            if test "$libseccomp_minver" != "" ; then
+                feature_not_found "libseccomp" \
+                    "Install libseccomp devel >= $libseccomp_minver"
+            else
+                feature_not_found "libseccomp" \
+                    "libseccomp is not supported for host cpu $cpu"
+            fi
+        fi
+        seccomp="no"
    fi
 fi
 ##########################################
@@ -1921,6 +1961,23 @@ EOF
  elif
      cat > $TMPC <<EOF &&
 #include <xenctrl.h>
+#include <stdint.h>
+int main(void) {
+  xc_interface *xc = NULL;
+  xen_domain_handle_t handle;
+  xc_domain_create(xc, 0, handle, 0, NULL, NULL);
+  return 0;
+}
+EOF
+      compile_prog "" "$xen_libs"
+    then
+    xen_ctrl_version=470
+    xen=yes
+
+  # Xen 4.6
+  elif
+      cat > $TMPC <<EOF &&
+#include <xenctrl.h>
 #include <xenstore.h>
 #include <stdint.h>
 #include <xen/hvm/hvm_info_table.h>
@@ -2369,6 +2426,14 @@ else
 fi


+##########################################
+# getifaddrs (for tests/test-io-channel-socket )
+
+have_ifaddrs_h=yes
+if ! check_include "ifaddrs.h" ; then
+  have_ifaddrs_h=no
+fi
+
 ##########################################
 # VTE probe

@@ -4398,6 +4463,7 @@ fi
 # check if ccache is interfering with
 # semantic analysis of macros

+unset CCACHE_CPP2
 ccache_cpp2=no
 cat > $TMPC << EOF
 static const int Z = 1;
@@ -4421,6 +4487,20 @@ if ! compile_object "-Werror"; then
    ccache_cpp2=yes
 fi

+#################################################
+# clang does not support glibc + FORTIFY_SOURCE.
+
+if test "$fortify_source" != "no"; then
+  if echo | $cc -dM -E - | grep __clang__ > /dev/null 2>&1 ; then
+    fortify_source="no";
+  elif test -n "$cxx" &&
+       echo | $cxx -dM -E - | grep __clang__ >/dev/null 2>&1 ; then
+    fortify_source="no";
+  else
+    fortify_source="yes"
+  fi
+fi
+
 ##########################################
 # End of CC checks
 # After here, no more $cc or $ld runs
@@ -4428,8 +4508,10 @@ fi
 if test "$gcov" = "yes" ; then
  CFLAGS="-fprofile-arcs -ftest-coverage -g $CFLAGS"
  LDFLAGS="-fprofile-arcs -ftest-coverage $LDFLAGS"
-elif test "$debug" = "no" ; then
+elif test "$fortify_source" = "yes" ; then
  CFLAGS="-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 $CFLAGS"
+elif test "$debug" = "no"; then
+  CFLAGS="-O2 $CFLAGS"
 fi

 ##########################################
@@ -4684,7 +4766,11 @@ echo "GTK GL support    $gtk_gl"
 echo "GNUTLS support    $gnutls"
 echo "GNUTLS hash       $gnutls_hash"
 echo "libgcrypt         $gcrypt"
-echo "nettle            $nettle ${nettle+($nettle_version)}"
+if test "$nettle" = "yes"; then
+    echo "nettle            $nettle ($nettle_version)"
+else
+    echo "nettle            $nettle"
+fi
 echo "libtasn1          $tasn1"
 echo "VTE support       $vte"
 echo "curses support    $curses"
@@ -4731,7 +4817,7 @@ echo "libcap-ng support $cap_ng"
 echo "vhost-net support $vhost_net"
 echo "vhost-scsi support $vhost_scsi"
 echo "Trace backends    $trace_backends"
-if test "$trace_backend" = "simple"; then
+if have_backend "simple"; then
 echo "Trace output file $trace_file-<pid>"
 fi
 if test "$spice" = "yes"; then
@@ -5063,6 +5149,9 @@ fi
 if test "$tasn1" = "yes" ; then
  echo "CONFIG_TASN1=y" >> $config_host_mak
 fi
+if test "$have_ifaddrs_h" = "yes" ; then
+    echo "HAVE_IFADDRS_H=y" >> $config_host_mak
+fi
 if test "$vte" = "yes" ; then
  echo "CONFIG_VTE=y" >> $config_host_mak
  echo "VTE_CFLAGS=$vte_cflags" >> $config_host_mak
@@ -5639,6 +5728,7 @@ case "$target_name" in
      echo "CONFIG_KVM=y" >> $config_target_mak
      if test "$vhost_net" = "yes" ; then
        echo "CONFIG_VHOST_NET=y" >> $config_target_mak
+        echo "CONFIG_VHOST_NET_TEST_$target_name=y" >> $config_host_mak
      fi
    fi
 esac
--- a/contrib/ivshmem-server/ivshmem-server.c
+++ b/contrib/ivshmem-server/ivshmem-server.c
@@ -168,7 +168,9 @@ ivshmem_server_handle_new_conn(IvshmemServer *server)
    }
    if (i == G_MAXUINT16) {
        IVSHMEM_SERVER_DEBUG(server, "cannot allocate new client id\n");
-        goto fail;
+        close(newfd);
+        g_free(peer);
+        return -1;
    }
    peer->id = server->cur_id++;

--- a/contrib/ivshmem-server/main.c
+++ b/contrib/ivshmem-server/main.c
@@ -65,7 +65,7 @@ ivshmem_server_parse_args(IvshmemServerArgs *args, int argc, char *argv[])
 {
    int c;
    unsigned long long v;
-    Error *errp = NULL;
+    Error *err = NULL;

    while ((c = getopt(argc, argv,
                       "h"  /* help */
@@ -104,11 +104,9 @@ ivshmem_server_parse_args(IvshmemServerArgs *args, int argc, char *argv[])
            break;

        case 'l': /* shm_size */
-            parse_option_size("shm_size", optarg, &args->shm_size, &errp);
-            if (errp) {
-                fprintf(stderr, "cannot parse shm size: %s\n",
-                        error_get_pretty(errp));
-                error_free(errp);
+            parse_option_size("shm_size", optarg, &args->shm_size, &err);
+            if (err) {
+                error_report_err(err);
                ivshmem_server_usage(argv[0], 1);
            }
            break;
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -30,6 +30,7 @@
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
 #include "hw/i386/apic.h"
 #endif
+#include "sysemu/replay.h"

 /* -icount align implementation. */

@@ -184,7 +185,7 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, uint8_t *tb_ptr)
 /* Execute the code without caching the generated code. An interpreter
   could be used if available. */
 static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
-                             TranslationBlock *orig_tb)
+                             TranslationBlock *orig_tb, bool ignore_icount)
 {
    TranslationBlock *tb;

@@ -194,7 +195,8 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
        max_cycles = CF_COUNT_MASK;

    tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags,
-                     max_cycles | CF_NOCACHE);
+                     max_cycles | CF_NOCACHE
+                         | (ignore_icount ? CF_IGNORE_ICOUNT : 0));
    tb->orig_tb = tcg_ctx.tb_ctx.tb_invalidated_flag ? NULL : orig_tb;
    cpu->current_tb = tb;
    /* execute the generated code */
@@ -345,21 +347,25 @@ int cpu_exec(CPUState *cpu)
    uintptr_t next_tb;
    SyncClocks sc;

+    /* replay_interrupt may need current_cpu */
+    current_cpu = cpu;
+
    if (cpu->halted) {
 #if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
-        if (cpu->interrupt_request & CPU_INTERRUPT_POLL) {
+        if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
+            && replay_interrupt()) {
            apic_poll_irq(x86_cpu->apic_state);
            cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
        }
 #endif
        if (!cpu_has_work(cpu)) {
+            current_cpu = NULL;
            return EXCP_HALTED;
        }

        cpu->halted = 0;
    }

-    current_cpu = cpu;
    atomic_mb_set(&tcg_current_cpu, cpu);
    rcu_read_lock();

@@ -401,10 +407,22 @@ int cpu_exec(CPUState *cpu)
                    cpu->exception_index = -1;
                    break;
 #else
-                    cc->do_interrupt(cpu);
-                    cpu->exception_index = -1;
+                    if (replay_exception()) {
+                        cc->do_interrupt(cpu);
+                        cpu->exception_index = -1;
+                    } else if (!replay_has_interrupt()) {
+                        /* give a chance to iothread in replay mode */
+                        ret = EXCP_INTERRUPT;
+                        break;
+                    }
 #endif
                }
+            } else if (replay_has_exception()
+                       && cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
+                /* try to cause an exception pending in the log */
+                cpu_exec_nocache(cpu, 1, tb_find_fast(cpu), true);
+                ret = -1;
+                break;
            }

            next_tb = 0; /* force lookup of first TB */
@@ -420,30 +438,40 @@ int cpu_exec(CPUState *cpu)
                        cpu->exception_index = EXCP_DEBUG;
                        cpu_loop_exit(cpu);
                    }
-                    if (interrupt_request & CPU_INTERRUPT_HALT) {
+                    if (replay_mode == REPLAY_MODE_PLAY
+                        && !replay_has_interrupt()) {
+                        /* Do nothing */
+                    } else if (interrupt_request & CPU_INTERRUPT_HALT) {
+                        replay_interrupt();
                        cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
                        cpu->halted = 1;
                        cpu->exception_index = EXCP_HLT;
                        cpu_loop_exit(cpu);
                    }
 #if defined(TARGET_I386)
-                    if (interrupt_request & CPU_INTERRUPT_INIT) {
+                    else if (interrupt_request & CPU_INTERRUPT_INIT) {
+                        replay_interrupt();
                        cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0);
                        do_cpu_init(x86_cpu);
                        cpu->exception_index = EXCP_HALTED;
                        cpu_loop_exit(cpu);
                    }
 #else
-                    if (interrupt_request & CPU_INTERRUPT_RESET) {
+                    else if (interrupt_request & CPU_INTERRUPT_RESET) {
+                        replay_interrupt();
                        cpu_reset(cpu);
+                        cpu_loop_exit(cpu);
                    }
 #endif
                    /* The target hook has 3 exit conditions:
                       False when the interrupt isn't processed,
                       True when it is, and we should restart on a new TB,
                       and via longjmp via cpu_loop_exit.  */
-                    if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
-                        next_tb = 0;
+                    else {
+                        replay_interrupt();
+                        if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
+                            next_tb = 0;
+                        }
                    }
                    /* Don't use the cached interrupt_request value,
                       do_interrupt may have updated the EXITTB flag. */
@@ -454,7 +482,8 @@ int cpu_exec(CPUState *cpu)
                        next_tb = 0;
                    }
                }
-                if (unlikely(cpu->exit_request)) {
+                if (unlikely(cpu->exit_request
+                             || replay_has_interrupt())) {
                    cpu->exit_request = 0;
                    cpu->exception_index = EXCP_INTERRUPT;
                    cpu_loop_exit(cpu);
@@ -519,7 +548,7 @@ int cpu_exec(CPUState *cpu)
                            if (insns_left > 0) {
                                /* Execute remaining instructions.  */
                                tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
-                                cpu_exec_nocache(cpu, insns_left, tb);
+                                cpu_exec_nocache(cpu, insns_left, tb, false);
                                align_clocks(&sc, cpu);
                            }
                            cpu->exception_index = EXCP_INTERRUPT;
@@ -539,15 +568,27 @@ int cpu_exec(CPUState *cpu)
                   only be set by a memory fault) */
            } /* for(;;) */
        } else {
-            /* Reload env after longjmp - the compiler may have smashed all
-             * local variables as longjmp is marked 'noreturn'. */
+#if defined(__clang__) || !QEMU_GNUC_PREREQ(4, 6)
+            /* Some compilers wrongly smash all local variables after
+             * siglongjmp. There were bug reports for gcc 4.5.0 and clang.
+             * Reload essential local variables here for those compilers.
+             * Newer versions of gcc would complain about this code (-Wclobbered). */
            cpu = current_cpu;
            cc = CPU_GET_CLASS(cpu);
-            cpu->can_do_io = 1;
 #ifdef TARGET_I386
            x86_cpu = X86_CPU(cpu);
            env = &x86_cpu->env;
 #endif
+#else /* buggy compiler */
+            /* Assert that the compiler does not smash local variables. */
+            g_assert(cpu == current_cpu);
+            g_assert(cc == CPU_GET_CLASS(cpu));
+#ifdef TARGET_I386
+            g_assert(x86_cpu == X86_CPU(cpu));
+            g_assert(env == &x86_cpu->env);
+#endif
+#endif /* buggy compiler */
+            cpu->can_do_io = 1;
            tb_lock_reset();
        }
    } /* for(;;) */
--- a/cpus.c
+++ b/cpus.c
@@ -42,6 +42,7 @@
 #include "qemu/seqlock.h"
 #include "qapi-event.h"
 #include "hw/nmi.h"
+#include "sysemu/replay.h"

 #ifndef _WIN32
 #include "qemu/compatfd.h"
@@ -334,7 +335,7 @@ static int64_t qemu_icount_round(int64_t count)
    return (count + (1 << icount_time_shift) - 1) >> icount_time_shift;
 }

-static void icount_warp_rt(void *opaque)
+static void icount_warp_rt(void)
 {
    /* The icount_warp_timer is rescheduled soon after vm_clock_warp_start
     * changes from -1 to another value, so the race here is okay.
@@ -345,7 +346,8 @@ static void icount_warp_rt(void *opaque)

    seqlock_write_lock(&timers_state.vm_clock_seqlock);
    if (runstate_is_running()) {
-        int64_t clock = cpu_get_clock_locked();
+        int64_t clock = REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT,
+                                     cpu_get_clock_locked());
        int64_t warp_delta;

        warp_delta = clock - vm_clock_warp_start;
@@ -368,6 +370,11 @@ static void icount_warp_rt(void *opaque)
    }
 }

+static void icount_dummy_timer(void *opaque)
+{
+    (void)opaque;
+}
+
 void qtest_clock_warp(int64_t dest)
 {
    int64_t clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
@@ -403,6 +410,18 @@ void qemu_clock_warp(QEMUClockType type)
        return;
    }

+    /* Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers
+     * do not fire, so computing the deadline does not make sense.
+     */
+    if (!runstate_is_running()) {
+        return;
+    }
+
+    /* warp clock deterministically in record/replay mode */
+    if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP)) {
+        return;
+    }
+
    if (icount_sleep) {
        /*
         * If the CPUs have been sleeping, advance QEMU_CLOCK_VIRTUAL timer now.
@@ -412,7 +431,7 @@ void qemu_clock_warp(QEMUClockType type)
         * the CPU starts running, in case the CPU is woken by an event other
         * than the earliest QEMU_CLOCK_VIRTUAL timer.
         */
-        icount_warp_rt(NULL);
+        icount_warp_rt();
        timer_del(icount_warp_timer);
    }
    if (!all_cpu_threads_idle()) {
@@ -605,7 +624,7 @@ void configure_icount(QemuOpts *opts, Error **errp)
    icount_sleep = qemu_opt_get_bool(opts, "sleep", true);
    if (icount_sleep) {
        icount_warp_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
-                                         icount_warp_rt, NULL);
+                                         icount_dummy_timer, NULL);
    }

    icount_align_option = qemu_opt_get_bool(opts, "align", false);
@@ -694,15 +713,6 @@ void cpu_synchronize_all_post_init(void)
    }
 }

-void cpu_clean_all_dirty(void)
-{
-    CPUState *cpu;
-
-    CPU_FOREACH(cpu) {
-        cpu_clean_state(cpu);
-    }
-}
-
 static int do_vm_stop(RunState state)
 {
    int ret = 0;
@@ -1405,12 +1415,36 @@ int vm_stop_force_state(RunState state)
        return vm_stop(state);
    } else {
        runstate_set(state);
+
+        bdrv_drain_all();
        /* Make sure to return an error if the flush in a previous vm_stop()
         * failed. */
        return bdrv_flush_all();
    }
 }

+static int64_t tcg_get_icount_limit(void)
+{
+    int64_t deadline;
+
+    if (replay_mode != REPLAY_MODE_PLAY) {
+        deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
+
+        /* Maintain prior (possibly buggy) behaviour where if no deadline
+         * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than
+         * INT32_MAX nanoseconds ahead, we still use INT32_MAX
+         * nanoseconds.
+         */
+        if ((deadline < 0) || (deadline > INT32_MAX)) {
+            deadline = INT32_MAX;
+        }
+
+        return qemu_icount_round(deadline);
+    } else {
+        return replay_get_instructions();
+    }
+}
+
 static int tcg_cpu_exec(CPUState *cpu)
 {
    int ret;
@@ -1423,24 +1457,12 @@ static int tcg_cpu_exec(CPUState *cpu)
 #endif
    if (use_icount) {
        int64_t count;
-        int64_t deadline;
        int decr;
        timers_state.qemu_icount -= (cpu->icount_decr.u16.low
                                    + cpu->icount_extra);
        cpu->icount_decr.u16.low = 0;
        cpu->icount_extra = 0;
-        deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
-
-        /* Maintain prior (possibly buggy) behaviour where if no deadline
-         * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than
-         * INT32_MAX nanoseconds ahead, we still use INT32_MAX
-         * nanoseconds.
-         */
-        if ((deadline < 0) || (deadline > INT32_MAX)) {
-            deadline = INT32_MAX;
-        }
-
-        count = qemu_icount_round(deadline);
+        count = tcg_get_icount_limit();
        timers_state.qemu_icount += count;
        decr = (count > 0xffff) ? 0xffff : count;
        count -= decr;
@@ -1458,6 +1480,7 @@ static int tcg_cpu_exec(CPUState *cpu)
                        + cpu->icount_extra);
        cpu->icount_decr.u32 = 0;
        cpu->icount_extra = 0;
+        replay_account_executed_instructions();
    }
    return ret;
 }
@@ -1535,22 +1558,29 @@ CpuInfoList *qmp_query_cpus(Error **errp)
        info->value->qom_path = object_get_canonical_path(OBJECT(cpu));
        info->value->thread_id = cpu->thread_id;
 #if defined(TARGET_I386)
-        info->value->has_pc = true;
-        info->value->pc = env->eip + env->segs[R_CS].base;
+        info->value->arch = CPU_INFO_ARCH_X86;
+        info->value->u.x86 = g_new0(CpuInfoX86, 1);
+        info->value->u.x86->pc = env->eip + env->segs[R_CS].base;
 #elif defined(TARGET_PPC)
-        info->value->has_nip = true;
-        info->value->nip = env->nip;
+        info->value->arch = CPU_INFO_ARCH_PPC;
+        info->value->u.ppc = g_new0(CpuInfoPPC, 1);
+        info->value->u.ppc->nip = env->nip;
 #elif defined(TARGET_SPARC)
-        info->value->has_pc = true;
-        info->value->pc = env->pc;
-        info->value->has_npc = true;
-        info->value->npc = env->npc;
+        info->value->arch = CPU_INFO_ARCH_SPARC;
+        info->value->u.sparc = g_new0(CpuInfoSPARC, 1);
+        info->value->u.sparc->pc = env->pc;
+        info->value->u.sparc->npc = env->npc;
 #elif defined(TARGET_MIPS)
-        info->value->has_PC = true;
-        info->value->PC = env->active_tc.PC;
+        info->value->arch = CPU_INFO_ARCH_MIPS;
+        info->value->u.mips = g_new0(CpuInfoMIPS, 1);
+        info->value->u.mips->PC = env->active_tc.PC;
 #elif defined(TARGET_TRICORE)
-        info->value->has_PC = true;
-        info->value->PC = env->PC;
+        info->value->arch = CPU_INFO_ARCH_TRICORE;
+        info->value->u.tricore = g_new0(CpuInfoTricore, 1);
+        info->value->u.tricore->PC = env->PC;
+#else
+        info->value->arch = CPU_INFO_ARCH_OTHER;
+        info->value->u.other = g_new0(CpuInfoOther, 1);
 #endif

        /* XXX: waiting for the qapi to support GSList */
--- a/crypto/Makefile.objs
+++ b/crypto/Makefile.objs
@@ -7,6 +7,7 @@ crypto-obj-y += tlscreds.o
 crypto-obj-y += tlscredsanon.o
 crypto-obj-y += tlscredsx509.o
 crypto-obj-y += tlssession.o
+crypto-obj-y += secret.o

 # Let the userspace emulators avoid linking gnutls/etc
 crypto-aes-obj-y = aes.o
--- a/crypto/cipher.c
+++ b/crypto/cipher.c
@@ -21,19 +21,67 @@
 #include "crypto/cipher.h"


-static size_t alg_key_len[QCRYPTO_CIPHER_ALG_LAST] = {
+static size_t alg_key_len[QCRYPTO_CIPHER_ALG__MAX] = {
    [QCRYPTO_CIPHER_ALG_AES_128] = 16,
    [QCRYPTO_CIPHER_ALG_AES_192] = 24,
    [QCRYPTO_CIPHER_ALG_AES_256] = 32,
    [QCRYPTO_CIPHER_ALG_DES_RFB] = 8,
 };

+static size_t alg_block_len[QCRYPTO_CIPHER_ALG__MAX] = {
+    [QCRYPTO_CIPHER_ALG_AES_128] = 16,
+    [QCRYPTO_CIPHER_ALG_AES_192] = 16,
+    [QCRYPTO_CIPHER_ALG_AES_256] = 16,
+    [QCRYPTO_CIPHER_ALG_DES_RFB] = 8,
+};
+
+static bool mode_need_iv[QCRYPTO_CIPHER_MODE__MAX] = {
+    [QCRYPTO_CIPHER_MODE_ECB] = false,
+    [QCRYPTO_CIPHER_MODE_CBC] = true,
+};
+
+
+size_t qcrypto_cipher_get_block_len(QCryptoCipherAlgorithm alg)
+{
+    if (alg >= G_N_ELEMENTS(alg_key_len)) {
+        return 0;
+    }
+    return alg_block_len[alg];
+}
+
+
+size_t qcrypto_cipher_get_key_len(QCryptoCipherAlgorithm alg)
+{
+    if (alg >= G_N_ELEMENTS(alg_key_len)) {
+        return 0;
+    }
+    return alg_key_len[alg];
+}
+
+
+size_t qcrypto_cipher_get_iv_len(QCryptoCipherAlgorithm alg,
+                                 QCryptoCipherMode mode)
+{
+    if (alg >= G_N_ELEMENTS(alg_block_len)) {
+        return 0;
+    }
+    if (mode >= G_N_ELEMENTS(mode_need_iv)) {
+        return 0;
+    }
+
+    if (mode_need_iv[mode]) {
+        return alg_block_len[alg];
+    }
+    return 0;
+}
+
+
 static bool
 qcrypto_cipher_validate_key_length(QCryptoCipherAlgorithm alg,
                                   size_t nkey,
                                   Error **errp)
 {
-    if ((unsigned)alg >= QCRYPTO_CIPHER_ALG_LAST) {
+    if ((unsigned)alg >= QCRYPTO_CIPHER_ALG__MAX) {
        error_setg(errp, "Cipher algorithm %d out of range",
                   alg);
        return false;
@@ -41,7 +89,7 @@ qcrypto_cipher_validate_key_length(QCryptoCipherAlgorithm alg,

    if (alg_key_len[alg] != nkey) {
        error_setg(errp, "Cipher key length %zu should be %zu",
-                   alg_key_len[alg], nkey);
+                   nkey, alg_key_len[alg]);
        return false;
    }
    return true;
--- a/crypto/hash.c
+++ b/crypto/hash.c
@@ -24,12 +24,18 @@
 #include <gnutls/gnutls.h>
 #include <gnutls/crypto.h>

-static int qcrypto_hash_alg_map[QCRYPTO_HASH_ALG_LAST] = {
+static int qcrypto_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {
    [QCRYPTO_HASH_ALG_MD5] = GNUTLS_DIG_MD5,
    [QCRYPTO_HASH_ALG_SHA1] = GNUTLS_DIG_SHA1,
    [QCRYPTO_HASH_ALG_SHA256] = GNUTLS_DIG_SHA256,
 };

+static size_t qcrypto_hash_alg_size[QCRYPTO_HASH_ALG__MAX] = {
+    [QCRYPTO_HASH_ALG_MD5] = 16,
+    [QCRYPTO_HASH_ALG_SHA1] = 20,
+    [QCRYPTO_HASH_ALG_SHA256] = 32,
+};
+
 gboolean qcrypto_hash_supports(QCryptoHashAlgorithm alg)
 {
    if (alg < G_N_ELEMENTS(qcrypto_hash_alg_map)) {
@@ -38,6 +44,15 @@ gboolean qcrypto_hash_supports(QCryptoHashAlgorithm alg)
    return false;
 }

+size_t qcrypto_hash_digest_len(QCryptoHashAlgorithm alg)
+{
+    if (alg >= G_N_ELEMENTS(qcrypto_hash_alg_size)) {
+        return 0;
+    }
+    return qcrypto_hash_alg_size[alg];
+}
+
+
 int qcrypto_hash_bytesv(QCryptoHashAlgorithm alg,
                        const struct iovec *iov,
                        size_t niov,
--- a/crypto/secret.c
+++ b/crypto/secret.c
@@ -0,0 +1,513 @@
+/*
+ * QEMU crypto secret support
+ *
+ * Copyright (c) 2015 Red Hat, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "crypto/secret.h"
+#include "crypto/cipher.h"
+#include "qom/object_interfaces.h"
+#include "qemu/base64.h"
+#include "trace.h"
+
+
+static void
+qcrypto_secret_load_data(QCryptoSecret *secret,
+                         uint8_t **output,
+                         size_t *outputlen,
+                         Error **errp)
+{
+    char *data = NULL;
+    size_t length = 0;
+    GError *gerr = NULL;
+
+    *output = NULL;
+    *outputlen = 0;
+
+    if (secret->file) {
+        if (secret->data) {
+            error_setg(errp,
+                       "'file' and 'data' are mutually exclusive");
+            return;
+        }
+        if (!g_file_get_contents(secret->file, &data, &length, &gerr)) {
+            error_setg(errp,
+                       "Unable to read %s: %s",
+                       secret->file, gerr->message);
+            g_error_free(gerr);
+            return;
+        }
+        *output = (uint8_t *)data;
+        *outputlen = length;
+    } else if (secret->data) {
+        *outputlen = strlen(secret->data);
+        *output = (uint8_t *)g_strdup(secret->data);
+    } else {
+        error_setg(errp, "Either 'file' or 'data' must be provided");
+    }
+}
+
+
+static void qcrypto_secret_decrypt(QCryptoSecret *secret,
+                                   const uint8_t *input,
+                                   size_t inputlen,
+                                   uint8_t **output,
+                                   size_t *outputlen,
+                                   Error **errp)
+{
+    uint8_t *key = NULL, *ciphertext = NULL, *iv = NULL;
+    size_t keylen, ciphertextlen, ivlen;
+    QCryptoCipher *aes = NULL;
+    uint8_t *plaintext = NULL;
+
+    *output = NULL;
+    *outputlen = 0;
+
+    if (qcrypto_secret_lookup(secret->keyid,
+                              &key, &keylen,
+                              errp) < 0) {
+        goto cleanup;
+    }
+
+    if (keylen != 32) {
+        error_setg(errp, "Key should be 32 bytes in length");
+        goto cleanup;
+    }
+
+    if (!secret->iv) {
+        error_setg(errp, "IV is required to decrypt secret");
+        goto cleanup;
+    }
+
+    iv = qbase64_decode(secret->iv, -1, &ivlen, errp);
+    if (!iv) {
+        goto cleanup;
+    }
+    if (ivlen != 16) {
+        error_setg(errp, "IV should be 16 bytes in length not %zu",
+                   ivlen);
+        goto cleanup;
+    }
+
+    aes = qcrypto_cipher_new(QCRYPTO_CIPHER_ALG_AES_256,
+                             QCRYPTO_CIPHER_MODE_CBC,
+                             key, keylen,
+                             errp);
+    if (!aes) {
+        goto cleanup;
+    }
+
+    if (qcrypto_cipher_setiv(aes, iv, ivlen, errp) < 0) {
+        goto cleanup;
+    }
+
+    if (secret->format == QCRYPTO_SECRET_FORMAT_BASE64) {
+        ciphertext = qbase64_decode((const gchar*)input,
+                                    inputlen,
+                                    &ciphertextlen,
+                                    errp);
+        if (!ciphertext) {
+            goto cleanup;
+        }
+        plaintext = g_new0(uint8_t, ciphertextlen + 1);
+    } else {
+        ciphertextlen = inputlen;
+        plaintext = g_new0(uint8_t, inputlen + 1);
+    }
+    if (qcrypto_cipher_decrypt(aes,
+                               ciphertext ? ciphertext : input,
+                               plaintext,
+                               ciphertextlen,
+                               errp) < 0) {
+        plaintext = NULL;
+        goto cleanup;
+    }
+
+    if (plaintext[ciphertextlen - 1] > 16 ||
+        plaintext[ciphertextlen - 1] > ciphertextlen) {
+        error_setg(errp, "Incorrect number of padding bytes (%d) "
+                   "found on decrypted data",
+                   (int)plaintext[ciphertextlen - 1]);
+        g_free(plaintext);
+        plaintext = NULL;
+        goto cleanup;
+    }
+
+    /* Even though plaintext may contain arbitrary NUL
+     * ensure it is explicitly NUL terminated.
+     */
+    ciphertextlen -= plaintext[ciphertextlen - 1];
+    plaintext[ciphertextlen] = '\0';
+
+    *output = plaintext;
+    *outputlen = ciphertextlen;
+
+ cleanup:
+    g_free(ciphertext);
+    g_free(iv);
+    g_free(key);
+    qcrypto_cipher_free(aes);
+}
+
+
+static void qcrypto_secret_decode(const uint8_t *input,
+                                  size_t inputlen,
+                                  uint8_t **output,
+                                  size_t *outputlen,
+                                  Error **errp)
+{
+    *output = qbase64_decode((const gchar*)input,
+                             inputlen,
+                             outputlen,
+                             errp);
+}
+
+
+static void
+qcrypto_secret_prop_set_loaded(Object *obj,
+                               bool value,
+                               Error **errp)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+
+    if (value) {
+        Error *local_err = NULL;
+        uint8_t *input = NULL;
+        size_t inputlen = 0;
+        uint8_t *output = NULL;
+        size_t outputlen = 0;
+
+        qcrypto_secret_load_data(secret, &input, &inputlen, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+
+        if (secret->keyid) {
+            qcrypto_secret_decrypt(secret, input, inputlen,
+                                   &output, &outputlen, &local_err);
+            g_free(input);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                return;
+            }
+            input = output;
+            inputlen = outputlen;
+        } else {
+            if (secret->format != QCRYPTO_SECRET_FORMAT_RAW) {
+                qcrypto_secret_decode(input, inputlen,
+                                      &output, &outputlen, &local_err);
+                g_free(input);
+                if (local_err) {
+                    error_propagate(errp, local_err);
+                    return;
+                }
+                input = output;
+                inputlen = outputlen;
+            }
+        }
+
+        secret->rawdata = input;
+        secret->rawlen = inputlen;
+    } else {
+        g_free(secret->rawdata);
+        secret->rawlen = 0;
+    }
+}
+
+
+static bool
+qcrypto_secret_prop_get_loaded(Object *obj,
+                               Error **errp G_GNUC_UNUSED)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+    return secret->data != NULL;
+}
+
+
+static void
+qcrypto_secret_prop_set_format(Object *obj,
+                               int value,
+                               Error **errp G_GNUC_UNUSED)
+{
+    QCryptoSecret *creds = QCRYPTO_SECRET(obj);
+
+    creds->format = value;
+}
+
+
+static int
+qcrypto_secret_prop_get_format(Object *obj,
+                               Error **errp G_GNUC_UNUSED)
+{
+    QCryptoSecret *creds = QCRYPTO_SECRET(obj);
+
+    return creds->format;
+}
+
+
+static void
+qcrypto_secret_prop_set_data(Object *obj,
+                             const char *value,
+                             Error **errp)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+
+    g_free(secret->data);
+    secret->data = g_strdup(value);
+}
+
+
+static char *
+qcrypto_secret_prop_get_data(Object *obj,
+                             Error **errp)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+    return g_strdup(secret->data);
+}
+
+
+static void
+qcrypto_secret_prop_set_file(Object *obj,
+                             const char *value,
+                             Error **errp)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+
+    g_free(secret->file);
+    secret->file = g_strdup(value);
+}
+
+
+static char *
+qcrypto_secret_prop_get_file(Object *obj,
+                             Error **errp)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+    return g_strdup(secret->file);
+}
+
+
+static void
+qcrypto_secret_prop_set_iv(Object *obj,
+                           const char *value,
+                           Error **errp)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+
+    g_free(secret->iv);
+    secret->iv = g_strdup(value);
+}
+
+
+static char *
+qcrypto_secret_prop_get_iv(Object *obj,
+                           Error **errp)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+    return g_strdup(secret->iv);
+}
+
+
+static void
+qcrypto_secret_prop_set_keyid(Object *obj,
+                              const char *value,
+                              Error **errp)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+
+    g_free(secret->keyid);
+    secret->keyid = g_strdup(value);
+}
+
+
+static char *
+qcrypto_secret_prop_get_keyid(Object *obj,
+                              Error **errp)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+    return g_strdup(secret->keyid);
+}
+
+
+static void
+qcrypto_secret_complete(UserCreatable *uc, Error **errp)
+{
+    object_property_set_bool(OBJECT(uc), true, "loaded", errp);
+}
+
+
+static void
+qcrypto_secret_init(Object *obj)
+{
+    object_property_add_bool(obj, "loaded",
+                             qcrypto_secret_prop_get_loaded,
+                             qcrypto_secret_prop_set_loaded,
+                             NULL);
+    object_property_add_enum(obj, "format",
+                             "QCryptoSecretFormat",
+                             QCryptoSecretFormat_lookup,
+                             qcrypto_secret_prop_get_format,
+                             qcrypto_secret_prop_set_format,
+                             NULL);
+    object_property_add_str(obj, "data",
+                            qcrypto_secret_prop_get_data,
+                            qcrypto_secret_prop_set_data,
+                            NULL);
+    object_property_add_str(obj, "file",
+                            qcrypto_secret_prop_get_file,
+                            qcrypto_secret_prop_set_file,
+                            NULL);
+    object_property_add_str(obj, "keyid",
+                            qcrypto_secret_prop_get_keyid,
+                            qcrypto_secret_prop_set_keyid,
+                            NULL);
+    object_property_add_str(obj, "iv",
+                            qcrypto_secret_prop_get_iv,
+                            qcrypto_secret_prop_set_iv,
+                            NULL);
+}
+
+
+static void
+qcrypto_secret_finalize(Object *obj)
+{
+    QCryptoSecret *secret = QCRYPTO_SECRET(obj);
+
+    g_free(secret->iv);
+    g_free(secret->file);
+    g_free(secret->keyid);
+    g_free(secret->rawdata);
+    g_free(secret->data);
+}
+
+static void
+qcrypto_secret_class_init(ObjectClass *oc, void *data)
+{
+    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+
+    ucc->complete = qcrypto_secret_complete;
+}
+
+
+int qcrypto_secret_lookup(const char *secretid,
+                          uint8_t **data,
+                          size_t *datalen,
+                          Error **errp)
+{
+    Object *obj;
+    QCryptoSecret *secret;
+
+    obj = object_resolve_path_component(
+        object_get_objects_root(), secretid);
+    if (!obj) {
+        error_setg(errp, "No secret with id '%s'", secretid);
+        return -1;
+    }
+
+    secret = (QCryptoSecret *)
+        object_dynamic_cast(obj,
+                            TYPE_QCRYPTO_SECRET);
+    if (!secret) {
+        error_setg(errp, "Object with id '%s' is not a secret",
+                   secretid);
+        return -1;
+    }
+
+    if (!secret->rawdata) {
+        error_setg(errp, "Secret with id '%s' has no data",
+                   secretid);
+        return -1;
+    }
+
+    *data = g_new0(uint8, secret->rawlen + 1);
+    memcpy(*data, secret->rawdata, secret->rawlen);
+    (*data)[secret->rawlen] = '\0';
+    *datalen = secret->rawlen;
+
+    return 0;
+}
+
+
+char *qcrypto_secret_lookup_as_utf8(const char *secretid,
+                                    Error **errp)
+{
+    uint8_t *data;
+    size_t datalen;
+
+    if (qcrypto_secret_lookup(secretid,
+                              &data,
+                              &datalen,
+                              errp) < 0) {
+        return NULL;
+    }
+
+    if (!g_utf8_validate((const gchar*)data, datalen, NULL)) {
+        error_setg(errp,
+                   "Data from secret %s is not valid UTF-8",
+                   secretid);
+        g_free(data);
+        return NULL;
+    }
+
+    return (char *)data;
+}
+
+
+char *qcrypto_secret_lookup_as_base64(const char *secretid,
+                                      Error **errp)
+{
+    uint8_t *data;
+    size_t datalen;
+    char *ret;
+
+    if (qcrypto_secret_lookup(secretid,
+                              &data,
+                              &datalen,
+                              errp) < 0) {
+        return NULL;
+    }
+
+    ret = g_base64_encode(data, datalen);
+    g_free(data);
+    return ret;
+}
+
+
+static const TypeInfo qcrypto_secret_info = {
+    .parent = TYPE_OBJECT,
+    .name = TYPE_QCRYPTO_SECRET,
+    .instance_size = sizeof(QCryptoSecret),
+    .instance_init = qcrypto_secret_init,
+    .instance_finalize = qcrypto_secret_finalize,
+    .class_size = sizeof(QCryptoSecretClass),
+    .class_init = qcrypto_secret_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_USER_CREATABLE },
+        { }
+    }
+};
+
+
+static void
+qcrypto_secret_register_types(void)
+{
+    type_register_static(&qcrypto_secret_info);
+}
+
+
+type_init(qcrypto_secret_register_types);
--- a/crypto/tlscreds.c
+++ b/crypto/tlscreds.c
@@ -123,10 +123,10 @@ qcrypto_tls_creds_get_path(QCryptoTLSCreds *creds,
        goto cleanup;
    }

-    trace_qcrypto_tls_creds_get_path(creds, filename,
-                                     *cred ? *cred : "<none>");
    ret = 0;
 cleanup:
+    trace_qcrypto_tls_creds_get_path(creds, filename,
+                                     *cred ? *cred : "<none>");
    return ret;
 }

--- a/crypto/tlscredsx509.c
+++ b/crypto/tlscredsx509.c
@@ -20,6 +20,7 @@

 #include "crypto/tlscredsx509.h"
 #include "crypto/tlscredspriv.h"
+#include "crypto/secret.h"
 #include "qom/object_interfaces.h"
 #include "trace.h"

@@ -255,6 +256,7 @@ qcrypto_tls_creds_check_cert_key_purpose(QCryptoTLSCredsX509 *creds,
        }

        g_free(buffer);
+        buffer = NULL;
    }

    if (isServer) {
@@ -485,7 +487,8 @@ qcrypto_tls_creds_x509_sanity_check(QCryptoTLSCredsX509 *creds,
    int ret = -1;

    memset(cacerts, 0, sizeof(cacerts));
-    if (access(certFile, R_OK) == 0) {
+    if (certFile &&
+        access(certFile, R_OK) == 0) {
        cert = qcrypto_tls_creds_load_cert(creds,
                                           certFile, isServer,
                                           errp);
@@ -605,9 +608,30 @@ qcrypto_tls_creds_x509_load(QCryptoTLSCredsX509 *creds,
    }

    if (cert != NULL && key != NULL) {
+#if GNUTLS_VERSION_NUMBER >= 0x030111
+        char *password = NULL;
+        if (creds->passwordid) {
+            password = qcrypto_secret_lookup_as_utf8(creds->passwordid,
+                                                     errp);
+            if (!password) {
+                goto cleanup;
+            }
+        }
+        ret = gnutls_certificate_set_x509_key_file2(creds->data,
+                                                    cert, key,
+                                                    GNUTLS_X509_FMT_PEM,
+                                                    password,
+                                                    0);
+        g_free(password);
+#else /* GNUTLS_VERSION_NUMBER < 0x030111 */
+        if (creds->passwordid) {
+            error_setg(errp, "PKCS8 decryption requires GNUTLS >= 3.1.11");
+            goto cleanup;
+        }
        ret = gnutls_certificate_set_x509_key_file(creds->data,
                                                   cert, key,
                                                   GNUTLS_X509_FMT_PEM);
+#endif /* GNUTLS_VERSION_NUMBER < 0x030111 */
        if (ret < 0) {
            error_setg(errp, "Cannot load certificate '%s' & key '%s': %s",
                       cert, key, gnutls_strerror(ret));
@@ -654,6 +678,10 @@ qcrypto_tls_creds_x509_unload(QCryptoTLSCredsX509 *creds)
        gnutls_certificate_free_credentials(creds->data);
        creds->data = NULL;
    }
+    if (creds->parent_obj.dh_params) {
+        gnutls_dh_params_deinit(creds->parent_obj.dh_params);
+        creds->parent_obj.dh_params = NULL;
+    }
 }


@@ -731,6 +759,27 @@ qcrypto_tls_creds_x509_prop_set_sanity(Object *obj,
 }


+static void
+qcrypto_tls_creds_x509_prop_set_passwordid(Object *obj,
+                                           const char *value,
+                                           Error **errp G_GNUC_UNUSED)
+{
+    QCryptoTLSCredsX509 *creds = QCRYPTO_TLS_CREDS_X509(obj);
+
+    creds->passwordid = g_strdup(value);
+}
+
+
+static char *
+qcrypto_tls_creds_x509_prop_get_passwordid(Object *obj,
+                                           Error **errp G_GNUC_UNUSED)
+{
+    QCryptoTLSCredsX509 *creds = QCRYPTO_TLS_CREDS_X509(obj);
+
+    return g_strdup(creds->passwordid);
+}
+
+
 static bool
 qcrypto_tls_creds_x509_prop_get_sanity(Object *obj,
                                       Error **errp G_GNUC_UNUSED)
@@ -763,6 +812,10 @@ qcrypto_tls_creds_x509_init(Object *obj)
                             qcrypto_tls_creds_x509_prop_get_sanity,
                             qcrypto_tls_creds_x509_prop_set_sanity,
                             NULL);
+    object_property_add_str(obj, "passwordid",
+                            qcrypto_tls_creds_x509_prop_get_passwordid,
+                            qcrypto_tls_creds_x509_prop_set_passwordid,
+                            NULL);
 }


@@ -771,6 +824,7 @@ qcrypto_tls_creds_x509_finalize(Object *obj)
 {
    QCryptoTLSCredsX509 *creds = QCRYPTO_TLS_CREDS_X509(obj);

+    g_free(creds->passwordid);
    qcrypto_tls_creds_x509_unload(creds);
 }

--- a/crypto/tlssession.c
+++ b/crypto/tlssession.c
@@ -304,9 +304,9 @@ qcrypto_tls_session_check_certificate(QCryptoTLSSession *session,

                allow = qemu_acl_party_is_allowed(acl, session->peername);

-                error_setg(errp, "TLS x509 ACL check for %s is %s",
-                           session->peername, allow ? "allowed" : "denied");
                if (!allow) {
+                    error_setg(errp, "TLS x509 ACL check for %s is denied",
+                               session->peername);
                    goto error;
                }
            }
--- a/default-configs/aarch64-linux-user.mak
+++ b/default-configs/aarch64-linux-user.mak
@@ -1,3 +1 @@
 # Default configuration for aarch64-linux-user
-
-CONFIG_GDBSTUB_XML=y
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -9,6 +9,11 @@ CONFIG_VGA_CIRRUS=y
 CONFIG_VMWARE_VGA=y
 CONFIG_VIRTIO_VGA=y
 CONFIG_VMMOUSE=y
+CONFIG_IPMI=y
+CONFIG_IPMI_LOCAL=y
+CONFIG_IPMI_EXTERN=y
+CONFIG_ISA_IPMI_KCS=y
+CONFIG_ISA_IPMI_BT=y
 CONFIG_SERIAL=y
 CONFIG_PARALLEL=y
 CONFIG_I8254=y
@@ -46,7 +51,10 @@ CONFIG_APIC=y
 CONFIG_IOAPIC=y
 CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
+CONFIG_NVDIMM=y
+CONFIG_ACPI_NVDIMM=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
+CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -9,6 +9,11 @@ CONFIG_VGA_CIRRUS=y
 CONFIG_VMWARE_VGA=y
 CONFIG_VIRTIO_VGA=y
 CONFIG_VMMOUSE=y
+CONFIG_IPMI=y
+CONFIG_IPMI_LOCAL=y
+CONFIG_IPMI_EXTERN=y
+CONFIG_ISA_IPMI_KCS=y
+CONFIG_ISA_IPMI_BT=y
 CONFIG_SERIAL=y
 CONFIG_PARALLEL=y
 CONFIG_I8254=y
@@ -46,7 +51,10 @@ CONFIG_APIC=y
 CONFIG_IOAPIC=y
 CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
+CONFIG_NVDIMM=y
+CONFIG_ACPI_NVDIMM=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
+CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
--- a/disas/Makefile.objs
+++ b/disas/Makefile.objs
@@ -4,7 +4,10 @@ common-obj-$(CONFIG_ARM_DIS) += arm.o
 common-obj-$(CONFIG_ARM_A64_DIS) += arm-a64.o
 common-obj-$(CONFIG_ARM_A64_DIS) += libvixl/
 libvixldir = $(SRC_PATH)/disas/libvixl
-arm-a64.o-cflags := -I$(libvixldir)
+# The -Wno-sign-compare is needed only for gcc 4.6, which complains about
+# some signed-unsigned equality comparisons in libvixl which later gcc
+# versions do not.
+arm-a64.o-cflags := -I$(libvixldir) -Wno-sign-compare
 common-obj-$(CONFIG_CRIS_DIS) += cris.o
 common-obj-$(CONFIG_HPPA_DIS) += hppa.o
 common-obj-$(CONFIG_I386_DIS) += i386.o
--- a/disas/arm-a64.cc
+++ b/disas/arm-a64.cc
@@ -17,7 +17,7 @@
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */

-#include "a64/disasm-a64.h"
+#include "vixl/a64/disasm-a64.h"

 extern "C" {
 #include "disas/bfd.h"
--- a/disas/arm.c
+++ b/disas/arm.c
@@ -1779,7 +1779,7 @@ print_insn_coprocessor (bfd_vma pc, struct disassemble_info *info, long given,

 			/* Is ``imm'' a negative number?  */
 			if (imm & 0x40)
-			  imm |= (-1 << 7);
+			  imm |= (~0u << 7);

 			func (stream, "%d", imm);
 		      }
--- a/disas/libvixl/Makefile.objs
+++ b/disas/libvixl/Makefile.objs
@@ -1,8 +1,11 @@
-libvixl_OBJS = utils.o \
-               a64/instructions-a64.o \
-               a64/decoder-a64.o \
-               a64/disasm-a64.o
+libvixl_OBJS = vixl/utils.o \
+               vixl/compiler-intrinsics.o \
+               vixl/a64/instructions-a64.o \
+               vixl/a64/decoder-a64.o \
+               vixl/a64/disasm-a64.o

-$(addprefix $(obj)/,$(libvixl_OBJS)): QEMU_CFLAGS := -I$(SRC_PATH)/disas/libvixl $(QEMU_CFLAGS)
+# The -Wno-sign-compare is needed only for gcc 4.6, which complains about
+# some signed-unsigned equality comparisons which later gcc versions do not.
+$(addprefix $(obj)/,$(libvixl_OBJS)): QEMU_CFLAGS := -I$(SRC_PATH)/disas/libvixl $(QEMU_CFLAGS) -Wno-sign-compare

 common-obj-$(CONFIG_ARM_A64_DIS) += $(libvixl_OBJS)
--- a/disas/libvixl/README
+++ b/disas/libvixl/README
@@ -2,11 +2,10 @@
 The code in this directory is a subset of libvixl:
 https://github.com/armvixl/vixl
 (specifically, it is the set of files needed for disassembly only,
-taken from libvixl 1.7).
+taken from libvixl 1.12).
 Bugfixes should preferably be sent upstream initially.

 The disassembler does not currently support the entire A64 instruction
 set. Notably:
- * No Advanced SIMD support.
 * Limited support for system instructions.
 * A few miscellaneous integer and floating point instructions are missing.
--- a/disas/libvixl/a64/assembler-a64.h
+++ b/disas/libvixl/a64/assembler-a64.h
--- a/disas/libvixl/a64/disasm-a64.cc
+++ b/disas/libvixl/a64/disasm-a64.cc
--- a/disas/libvixl/a64/instructions-a64.cc
+++ b/disas/libvixl/a64/instructions-a64.cc
@@ -1,314 +0,0 @@
-// Copyright 2013, ARM Limited
-// All rights reserved.
-//
-// Redistribution and use in source and binary forms, with or without
-// modification, are permitted provided that the following conditions are met:
-//
-//   * Redistributions of source code must retain the above copyright notice,
-//     this list of conditions and the following disclaimer.
-//   * Redistributions in binary form must reproduce the above copyright notice,
-//     this list of conditions and the following disclaimer in the documentation
-//     and/or other materials provided with the distribution.
-//   * Neither the name of ARM Limited nor the names of its contributors may be
-//     used to endorse or promote products derived from this software without
-//     specific prior written permission.
-//
-// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
-// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
-// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
-// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
-// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
-// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
-// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
-// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-#include "a64/instructions-a64.h"
-#include "a64/assembler-a64.h"
-
-namespace vixl {
-
-
-// Floating-point infinity values.
-const float kFP32PositiveInfinity = rawbits_to_float(0x7f800000);
-const float kFP32NegativeInfinity = rawbits_to_float(0xff800000);
-const double kFP64PositiveInfinity =
-    rawbits_to_double(UINT64_C(0x7ff0000000000000));
-const double kFP64NegativeInfinity =
-    rawbits_to_double(UINT64_C(0xfff0000000000000));
-
-
-// The default NaN values (for FPCR.DN=1).
-const double kFP64DefaultNaN = rawbits_to_double(UINT64_C(0x7ff8000000000000));
-const float kFP32DefaultNaN = rawbits_to_float(0x7fc00000);
-
-
-static uint64_t RotateRight(uint64_t value,
-                            unsigned int rotate,
-                            unsigned int width) {
-  VIXL_ASSERT(width <= 64);
-  rotate &= 63;
-  return ((value & ((UINT64_C(1) << rotate) - 1)) <<
-          (width - rotate)) | (value >> rotate);
-}
-
-
-static uint64_t RepeatBitsAcrossReg(unsigned reg_size,
-                                    uint64_t value,
-                                    unsigned width) {
-  VIXL_ASSERT((width == 2) || (width == 4) || (width == 8) || (width == 16) ||
-              (width == 32));
-  VIXL_ASSERT((reg_size == kWRegSize) || (reg_size == kXRegSize));
-  uint64_t result = value & ((UINT64_C(1) << width) - 1);
-  for (unsigned i = width; i < reg_size; i *= 2) {
-    result |= (result << i);
-  }
-  return result;
-}
-
-
-bool Instruction::IsLoad() const {
-  if (Mask(LoadStoreAnyFMask) != LoadStoreAnyFixed) {
-    return false;
-  }
-
-  if (Mask(LoadStorePairAnyFMask) == LoadStorePairAnyFixed) {
-    return Mask(LoadStorePairLBit) != 0;
-  } else {
-    LoadStoreOp op = static_cast<LoadStoreOp>(Mask(LoadStoreOpMask));
-    switch (op) {
-      case LDRB_w:
-      case LDRH_w:
-      case LDR_w:
-      case LDR_x:
-      case LDRSB_w:
-      case LDRSB_x:
-      case LDRSH_w:
-      case LDRSH_x:
-      case LDRSW_x:
-      case LDR_s:
-      case LDR_d: return true;
-      default: return false;
-    }
-  }
-}
-
-
-bool Instruction::IsStore() const {
-  if (Mask(LoadStoreAnyFMask) != LoadStoreAnyFixed) {
-    return false;
-  }
-
-  if (Mask(LoadStorePairAnyFMask) == LoadStorePairAnyFixed) {
-    return Mask(LoadStorePairLBit) == 0;
-  } else {
-    LoadStoreOp op = static_cast<LoadStoreOp>(Mask(LoadStoreOpMask));
-    switch (op) {
-      case STRB_w:
-      case STRH_w:
-      case STR_w:
-      case STR_x:
-      case STR_s:
-      case STR_d: return true;
-      default: return false;
-    }
-  }
-}
-
-
-// Logical immediates can't encode zero, so a return value of zero is used to
-// indicate a failure case. Specifically, where the constraints on imm_s are
-// not met.
-uint64_t Instruction::ImmLogical() const {
-  unsigned reg_size = SixtyFourBits() ? kXRegSize : kWRegSize;
-  int64_t n = BitN();
-  int64_t imm_s = ImmSetBits();
-  int64_t imm_r = ImmRotate();
-
-  // An integer is constructed from the n, imm_s and imm_r bits according to
-  // the following table:
-  //
-  //  N   imms    immr    size        S             R
-  //  1  ssssss  rrrrrr    64    UInt(ssssss)  UInt(rrrrrr)
-  //  0  0sssss  xrrrrr    32    UInt(sssss)   UInt(rrrrr)
-  //  0  10ssss  xxrrrr    16    UInt(ssss)    UInt(rrrr)
-  //  0  110sss  xxxrrr     8    UInt(sss)     UInt(rrr)
-  //  0  1110ss  xxxxrr     4    UInt(ss)      UInt(rr)
-  //  0  11110s  xxxxxr     2    UInt(s)       UInt(r)
-  // (s bits must not be all set)
-  //
-  // A pattern is constructed of size bits, where the least significant S+1
-  // bits are set. The pattern is rotated right by R, and repeated across a
-  // 32 or 64-bit value, depending on destination register width.
-  //
-
-  if (n == 1) {
-    if (imm_s == 0x3F) {
-      return 0;
-    }
-    uint64_t bits = (UINT64_C(1) << (imm_s + 1)) - 1;
-    return RotateRight(bits, imm_r, 64);
-  } else {
-    if ((imm_s >> 1) == 0x1F) {
-      return 0;
-    }
-    for (int width = 0x20; width >= 0x2; width >>= 1) {
-      if ((imm_s & width) == 0) {
-        int mask = width - 1;
-        if ((imm_s & mask) == mask) {
-          return 0;
-        }
-        uint64_t bits = (UINT64_C(1) << ((imm_s & mask) + 1)) - 1;
-        return RepeatBitsAcrossReg(reg_size,
-                                   RotateRight(bits, imm_r & mask, width),
-                                   width);
-      }
-    }
-  }
-  VIXL_UNREACHABLE();
-  return 0;
-}
-
-
-float Instruction::ImmFP32() const {
-  //  ImmFP: abcdefgh (8 bits)
-  // Single: aBbb.bbbc.defg.h000.0000.0000.0000.0000 (32 bits)
-  // where B is b ^ 1
-  uint32_t bits = ImmFP();
-  uint32_t bit7 = (bits >> 7) & 0x1;
-  uint32_t bit6 = (bits >> 6) & 0x1;
-  uint32_t bit5_to_0 = bits & 0x3f;
-  uint32_t result = (bit7 << 31) | ((32 - bit6) << 25) | (bit5_to_0 << 19);
-
-  return rawbits_to_float(result);
-}
-
-
-double Instruction::ImmFP64() const {
-  //  ImmFP: abcdefgh (8 bits)
-  // Double: aBbb.bbbb.bbcd.efgh.0000.0000.0000.0000
-  //         0000.0000.0000.0000.0000.0000.0000.0000 (64 bits)
-  // where B is b ^ 1
-  uint32_t bits = ImmFP();
-  uint64_t bit7 = (bits >> 7) & 0x1;
-  uint64_t bit6 = (bits >> 6) & 0x1;
-  uint64_t bit5_to_0 = bits & 0x3f;
-  uint64_t result = (bit7 << 63) | ((256 - bit6) << 54) | (bit5_to_0 << 48);
-
-  return rawbits_to_double(result);
-}
-
-
-LSDataSize CalcLSPairDataSize(LoadStorePairOp op) {
-  switch (op) {
-    case STP_x:
-    case LDP_x:
-    case STP_d:
-    case LDP_d: return LSDoubleWord;
-    default: return LSWord;
-  }
-}
-
-
-const Instruction* Instruction::ImmPCOffsetTarget() const {
-  const Instruction * base = this;
-  ptrdiff_t offset;
-  if (IsPCRelAddressing()) {
-    // ADR and ADRP.
-    offset = ImmPCRel();
-    if (Mask(PCRelAddressingMask) == ADRP) {
-      base = AlignDown(base, kPageSize);
-      offset *= kPageSize;
-    } else {
-      VIXL_ASSERT(Mask(PCRelAddressingMask) == ADR);
-    }
-  } else {
-    // All PC-relative branches.
-    VIXL_ASSERT(BranchType() != UnknownBranchType);
-    // Relative branch offsets are instruction-size-aligned.
-    offset = ImmBranch() << kInstructionSizeLog2;
-  }
-  return base + offset;
-}
-
-
-inline int Instruction::ImmBranch() const {
-  switch (BranchType()) {
-    case CondBranchType: return ImmCondBranch();
-    case UncondBranchType: return ImmUncondBranch();
-    case CompareBranchType: return ImmCmpBranch();
-    case TestBranchType: return ImmTestBranch();
-    default: VIXL_UNREACHABLE();
-  }
-  return 0;
-}
-
-
-void Instruction::SetImmPCOffsetTarget(const Instruction* target) {
-  if (IsPCRelAddressing()) {
-    SetPCRelImmTarget(target);
-  } else {
-    SetBranchImmTarget(target);
-  }
-}
-
-
-void Instruction::SetPCRelImmTarget(const Instruction* target) {
-  int32_t imm21;
-  if ((Mask(PCRelAddressingMask) == ADR)) {
-    imm21 = target - this;
-  } else {
-    VIXL_ASSERT(Mask(PCRelAddressingMask) == ADRP);
-    uintptr_t this_page = reinterpret_cast<uintptr_t>(this) / kPageSize;
-    uintptr_t target_page = reinterpret_cast<uintptr_t>(target) / kPageSize;
-    imm21 = target_page - this_page;
-  }
-  Instr imm = Assembler::ImmPCRelAddress(imm21);
-
-  SetInstructionBits(Mask(~ImmPCRel_mask) | imm);
-}
-
-
-void Instruction::SetBranchImmTarget(const Instruction* target) {
-  VIXL_ASSERT(((target - this) & 3) == 0);
-  Instr branch_imm = 0;
-  uint32_t imm_mask = 0;
-  int offset = (target - this) >> kInstructionSizeLog2;
-  switch (BranchType()) {
-    case CondBranchType: {
-      branch_imm = Assembler::ImmCondBranch(offset);
-      imm_mask = ImmCondBranch_mask;
-      break;
-    }
-    case UncondBranchType: {
-      branch_imm = Assembler::ImmUncondBranch(offset);
-      imm_mask = ImmUncondBranch_mask;
-      break;
-    }
-    case CompareBranchType: {
-      branch_imm = Assembler::ImmCmpBranch(offset);
-      imm_mask = ImmCmpBranch_mask;
-      break;
-    }
-    case TestBranchType: {
-      branch_imm = Assembler::ImmTestBranch(offset);
-      imm_mask = ImmTestBranch_mask;
-      break;
-    }
-    default: VIXL_UNREACHABLE();
-  }
-  SetInstructionBits(Mask(~imm_mask) | branch_imm);
-}
-
-
-void Instruction::SetImmLLiteral(const Instruction* source) {
-  VIXL_ASSERT(IsWordAligned(source));
-  ptrdiff_t offset = (source - this) >> kLiteralEntrySizeLog2;
-  Instr imm = Assembler::ImmLLiteral(offset);
-  Instr mask = ImmLLiteral_mask;
-
-  SetInstructionBits(Mask(~mask) | imm);
-}
-}  // namespace vixl
-
--- a/disas/libvixl/a64/instructions-a64.h
+++ b/disas/libvixl/a64/instructions-a64.h
@@ -1,384 +0,0 @@
-// Copyright 2013, ARM Limited
-// All rights reserved.
-//
-// Redistribution and use in source and binary forms, with or without
-// modification, are permitted provided that the following conditions are met:
-//
-//   * Redistributions of source code must retain the above copyright notice,
-//     this list of conditions and the following disclaimer.
-//   * Redistributions in binary form must reproduce the above copyright notice,
-//     this list of conditions and the following disclaimer in the documentation
-//     and/or other materials provided with the distribution.
-//   * Neither the name of ARM Limited nor the names of its contributors may be
-//     used to endorse or promote products derived from this software without
-//     specific prior written permission.
-//
-// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
-// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
-// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
-// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
-// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
-// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
-// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
-// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-#ifndef VIXL_A64_INSTRUCTIONS_A64_H_
-#define VIXL_A64_INSTRUCTIONS_A64_H_
-
-#include "globals.h"
-#include "utils.h"
-#include "a64/constants-a64.h"
-
-namespace vixl {
-// ISA constants. --------------------------------------------------------------
-
-typedef uint32_t Instr;
-const unsigned kInstructionSize = 4;
-const unsigned kInstructionSizeLog2 = 2;
-const unsigned kLiteralEntrySize = 4;
-const unsigned kLiteralEntrySizeLog2 = 2;
-const unsigned kMaxLoadLiteralRange = 1 * MBytes;
-
-// This is the nominal page size (as used by the adrp instruction); the actual
-// size of the memory pages allocated by the kernel is likely to differ.
-const unsigned kPageSize = 4 * KBytes;
-const unsigned kPageSizeLog2 = 12;
-
-const unsigned kWRegSize = 32;
-const unsigned kWRegSizeLog2 = 5;
-const unsigned kWRegSizeInBytes = kWRegSize / 8;
-const unsigned kWRegSizeInBytesLog2 = kWRegSizeLog2 - 3;
-const unsigned kXRegSize = 64;
-const unsigned kXRegSizeLog2 = 6;
-const unsigned kXRegSizeInBytes = kXRegSize / 8;
-const unsigned kXRegSizeInBytesLog2 = kXRegSizeLog2 - 3;
-const unsigned kSRegSize = 32;
-const unsigned kSRegSizeLog2 = 5;
-const unsigned kSRegSizeInBytes = kSRegSize / 8;
-const unsigned kSRegSizeInBytesLog2 = kSRegSizeLog2 - 3;
-const unsigned kDRegSize = 64;
-const unsigned kDRegSizeLog2 = 6;
-const unsigned kDRegSizeInBytes = kDRegSize / 8;
-const unsigned kDRegSizeInBytesLog2 = kDRegSizeLog2 - 3;
-const uint64_t kWRegMask = UINT64_C(0xffffffff);
-const uint64_t kXRegMask = UINT64_C(0xffffffffffffffff);
-const uint64_t kSRegMask = UINT64_C(0xffffffff);
-const uint64_t kDRegMask = UINT64_C(0xffffffffffffffff);
-const uint64_t kSSignMask = UINT64_C(0x80000000);
-const uint64_t kDSignMask = UINT64_C(0x8000000000000000);
-const uint64_t kWSignMask = UINT64_C(0x80000000);
-const uint64_t kXSignMask = UINT64_C(0x8000000000000000);
-const uint64_t kByteMask = UINT64_C(0xff);
-const uint64_t kHalfWordMask = UINT64_C(0xffff);
-const uint64_t kWordMask = UINT64_C(0xffffffff);
-const uint64_t kXMaxUInt = UINT64_C(0xffffffffffffffff);
-const uint64_t kWMaxUInt = UINT64_C(0xffffffff);
-const int64_t kXMaxInt = INT64_C(0x7fffffffffffffff);
-const int64_t kXMinInt = INT64_C(0x8000000000000000);
-const int32_t kWMaxInt = INT32_C(0x7fffffff);
-const int32_t kWMinInt = INT32_C(0x80000000);
-const unsigned kLinkRegCode = 30;
-const unsigned kZeroRegCode = 31;
-const unsigned kSPRegInternalCode = 63;
-const unsigned kRegCodeMask = 0x1f;
-
-const unsigned kAddressTagOffset = 56;
-const unsigned kAddressTagWidth = 8;
-const uint64_t kAddressTagMask =
-    ((UINT64_C(1) << kAddressTagWidth) - 1) << kAddressTagOffset;
-VIXL_STATIC_ASSERT(kAddressTagMask == UINT64_C(0xff00000000000000));
-
-// AArch64 floating-point specifics. These match IEEE-754.
-const unsigned kDoubleMantissaBits = 52;
-const unsigned kDoubleExponentBits = 11;
-const unsigned kFloatMantissaBits = 23;
-const unsigned kFloatExponentBits = 8;
-
-// Floating-point infinity values.
-extern const float kFP32PositiveInfinity;
-extern const float kFP32NegativeInfinity;
-extern const double kFP64PositiveInfinity;
-extern const double kFP64NegativeInfinity;
-
-// The default NaN values (for FPCR.DN=1).
-extern const double kFP64DefaultNaN;
-extern const float kFP32DefaultNaN;
-
-
-enum LSDataSize {
-  LSByte        = 0,
-  LSHalfword    = 1,
-  LSWord        = 2,
-  LSDoubleWord  = 3
-};
-
-LSDataSize CalcLSPairDataSize(LoadStorePairOp op);
-
-enum ImmBranchType {
-  UnknownBranchType = 0,
-  CondBranchType    = 1,
-  UncondBranchType  = 2,
-  CompareBranchType = 3,
-  TestBranchType    = 4
-};
-
-enum AddrMode {
-  Offset,
-  PreIndex,
-  PostIndex
-};
-
-enum FPRounding {
-  // The first four values are encodable directly by FPCR<RMode>.
-  FPTieEven = 0x0,
-  FPPositiveInfinity = 0x1,
-  FPNegativeInfinity = 0x2,
-  FPZero = 0x3,
-
-  // The final rounding mode is only available when explicitly specified by the
-  // instruction (such as with fcvta). It cannot be set in FPCR.
-  FPTieAway
-};
-
-enum Reg31Mode {
-  Reg31IsStackPointer,
-  Reg31IsZeroRegister
-};
-
-// Instructions. ---------------------------------------------------------------
-
-class Instruction {
- public:
-  Instr InstructionBits() const {
-    return *(reinterpret_cast<const Instr*>(this));
-  }
-
-  void SetInstructionBits(Instr new_instr) {
-    *(reinterpret_cast<Instr*>(this)) = new_instr;
-  }
-
-  int Bit(int pos) const {
-    return (InstructionBits() >> pos) & 1;
-  }
-
-  uint32_t Bits(int msb, int lsb) const {
-    return unsigned_bitextract_32(msb, lsb, InstructionBits());
-  }
-
-  int32_t SignedBits(int msb, int lsb) const {
-    int32_t bits = *(reinterpret_cast<const int32_t*>(this));
-    return signed_bitextract_32(msb, lsb, bits);
-  }
-
-  Instr Mask(uint32_t mask) const {
-    return InstructionBits() & mask;
-  }
-
-  #define DEFINE_GETTER(Name, HighBit, LowBit, Func)             \
-  int64_t Name() const { return Func(HighBit, LowBit); }
-  INSTRUCTION_FIELDS_LIST(DEFINE_GETTER)
-  #undef DEFINE_GETTER
-
-  // ImmPCRel is a compound field (not present in INSTRUCTION_FIELDS_LIST),
-  // formed from ImmPCRelLo and ImmPCRelHi.
-  int ImmPCRel() const {
-    int const offset = ((ImmPCRelHi() << ImmPCRelLo_width) | ImmPCRelLo());
-    int const width = ImmPCRelLo_width + ImmPCRelHi_width;
-    return signed_bitextract_32(width-1, 0, offset);
-  }
-
-  uint64_t ImmLogical() const;
-  float ImmFP32() const;
-  double ImmFP64() const;
-
-  LSDataSize SizeLSPair() const {
-    return CalcLSPairDataSize(
-             static_cast<LoadStorePairOp>(Mask(LoadStorePairMask)));
-  }
-
-  // Helpers.
-  bool IsCondBranchImm() const {
-    return Mask(ConditionalBranchFMask) == ConditionalBranchFixed;
-  }
-
-  bool IsUncondBranchImm() const {
-    return Mask(UnconditionalBranchFMask) == UnconditionalBranchFixed;
-  }
-
-  bool IsCompareBranch() const {
-    return Mask(CompareBranchFMask) == CompareBranchFixed;
-  }
-
-  bool IsTestBranch() const {
-    return Mask(TestBranchFMask) == TestBranchFixed;
-  }
-
-  bool IsPCRelAddressing() const {
-    return Mask(PCRelAddressingFMask) == PCRelAddressingFixed;
-  }
-
-  bool IsLogicalImmediate() const {
-    return Mask(LogicalImmediateFMask) == LogicalImmediateFixed;
-  }
-
-  bool IsAddSubImmediate() const {
-    return Mask(AddSubImmediateFMask) == AddSubImmediateFixed;
-  }
-
-  bool IsAddSubExtended() const {
-    return Mask(AddSubExtendedFMask) == AddSubExtendedFixed;
-  }
-
-  bool IsLoadOrStore() const {
-    return Mask(LoadStoreAnyFMask) == LoadStoreAnyFixed;
-  }
-
-  bool IsLoad() const;
-  bool IsStore() const;
-
-  bool IsLoadLiteral() const {
-    // This includes PRFM_lit.
-    return Mask(LoadLiteralFMask) == LoadLiteralFixed;
-  }
-
-  bool IsMovn() const {
-    return (Mask(MoveWideImmediateMask) == MOVN_x) ||
-           (Mask(MoveWideImmediateMask) == MOVN_w);
-  }
-
-  // Indicate whether Rd can be the stack pointer or the zero register. This
-  // does not check that the instruction actually has an Rd field.
-  Reg31Mode RdMode() const {
-    // The following instructions use sp or wsp as Rd:
-    //  Add/sub (immediate) when not setting the flags.
-    //  Add/sub (extended) when not setting the flags.
-    //  Logical (immediate) when not setting the flags.
-    // Otherwise, r31 is the zero register.
-    if (IsAddSubImmediate() || IsAddSubExtended()) {
-      if (Mask(AddSubSetFlagsBit)) {
-        return Reg31IsZeroRegister;
-      } else {
-        return Reg31IsStackPointer;
-      }
-    }
-    if (IsLogicalImmediate()) {
-      // Of the logical (immediate) instructions, only ANDS (and its aliases)
-      // can set the flags. The others can all write into sp.
-      // Note that some logical operations are not available to
-      // immediate-operand instructions, so we have to combine two masks here.
-      if (Mask(LogicalImmediateMask & LogicalOpMask) == ANDS) {
-        return Reg31IsZeroRegister;
-      } else {
-        return Reg31IsStackPointer;
-      }
-    }
-    return Reg31IsZeroRegister;
-  }
-
-  // Indicate whether Rn can be the stack pointer or the zero register. This
-  // does not check that the instruction actually has an Rn field.
-  Reg31Mode RnMode() const {
-    // The following instructions use sp or wsp as Rn:
-    //  All loads and stores.
-    //  Add/sub (immediate).
-    //  Add/sub (extended).
-    // Otherwise, r31 is the zero register.
-    if (IsLoadOrStore() || IsAddSubImmediate() || IsAddSubExtended()) {
-      return Reg31IsStackPointer;
-    }
-    return Reg31IsZeroRegister;
-  }
-
-  ImmBranchType BranchType() const {
-    if (IsCondBranchImm()) {
-      return CondBranchType;
-    } else if (IsUncondBranchImm()) {
-      return UncondBranchType;
-    } else if (IsCompareBranch()) {
-      return CompareBranchType;
-    } else if (IsTestBranch()) {
-      return TestBranchType;
-    } else {
-      return UnknownBranchType;
-    }
-  }
-
-  // Find the target of this instruction. 'this' may be a branch or a
-  // PC-relative addressing instruction.
-  const Instruction* ImmPCOffsetTarget() const;
-
-  // Patch a PC-relative offset to refer to 'target'. 'this' may be a branch or
-  // a PC-relative addressing instruction.
-  void SetImmPCOffsetTarget(const Instruction* target);
-  // Patch a literal load instruction to load from 'source'.
-  void SetImmLLiteral(const Instruction* source);
-
-  // Calculate the address of a literal referred to by a load-literal
-  // instruction, and return it as the specified type.
-  //
-  // The literal itself is safely mutable only if the backing buffer is safely
-  // mutable.
-  template <typename T>
-  T LiteralAddress() const {
-    uint64_t base_raw = reinterpret_cast<uintptr_t>(this);
-    ptrdiff_t offset = ImmLLiteral() << kLiteralEntrySizeLog2;
-    uint64_t address_raw = base_raw + offset;
-
-    // Cast the address using a C-style cast. A reinterpret_cast would be
-    // appropriate, but it can't cast one integral type to another.
-    T address = (T)(address_raw);
-
-    // Assert that the address can be represented by the specified type.
-    VIXL_ASSERT((uint64_t)(address) == address_raw);
-
-    return address;
-  }
-
-  uint32_t Literal32() const {
-    uint32_t literal;
-    memcpy(&literal, LiteralAddress<const void*>(), sizeof(literal));
-    return literal;
-  }
-
-  uint64_t Literal64() const {
-    uint64_t literal;
-    memcpy(&literal, LiteralAddress<const void*>(), sizeof(literal));
-    return literal;
-  }
-
-  float LiteralFP32() const {
-    return rawbits_to_float(Literal32());
-  }
-
-  double LiteralFP64() const {
-    return rawbits_to_double(Literal64());
-  }
-
-  const Instruction* NextInstruction() const {
-    return this + kInstructionSize;
-  }
-
-  const Instruction* InstructionAtOffset(int64_t offset) const {
-    VIXL_ASSERT(IsWordAligned(this + offset));
-    return this + offset;
-  }
-
-  template<typename T> static Instruction* Cast(T src) {
-    return reinterpret_cast<Instruction*>(src);
-  }
-
-  template<typename T> static const Instruction* CastConst(T src) {
-    return reinterpret_cast<const Instruction*>(src);
-  }
-
- private:
-  int ImmBranch() const;
-
-  void SetPCRelImmTarget(const Instruction* target);
-  void SetBranchImmTarget(const Instruction* target);
-};
-}  // namespace vixl
-
-#endif  // VIXL_A64_INSTRUCTIONS_A64_H_
--- a/disas/libvixl/vixl/a64/assembler-a64.h
+++ b/disas/libvixl/vixl/a64/assembler-a64.h
--- a/disas/libvixl/vixl/a64/constants-a64.h
+++ b/disas/libvixl/vixl/a64/constants-a64.h
--- a/disas/libvixl/vixl/a64/cpu-a64.h
+++ b/disas/libvixl/vixl/a64/cpu-a64.h
@@ -1,4 +1,4 @@
-// Copyright 2013, ARM Limited
+// Copyright 2014, ARM Limited
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
@@ -27,8 +27,8 @@
 #ifndef VIXL_CPU_A64_H
 #define VIXL_CPU_A64_H

-#include "globals.h"
-#include "instructions-a64.h"
+#include "vixl/globals.h"
+#include "vixl/a64/instructions-a64.h"

 namespace vixl {

--- a/disas/libvixl/vixl/a64/decoder-a64.cc
+++ b/disas/libvixl/vixl/a64/decoder-a64.cc
@@ -1,4 +1,4 @@
-// Copyright 2013, ARM Limited
+// Copyright 2014, ARM Limited
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
@@ -24,9 +24,9 @@
 // OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

-#include "globals.h"
-#include "utils.h"
-#include "a64/decoder-a64.h"
+#include "vixl/globals.h"
+#include "vixl/utils.h"
+#include "vixl/a64/decoder-a64.h"

 namespace vixl {

@@ -271,6 +271,11 @@ void Decoder::DecodeLoadStore(const Instruction* instr) {
              (instr->Bits(27, 24) == 0x9) ||
              (instr->Bits(27, 24) == 0xC) ||
              (instr->Bits(27, 24) == 0xD) );
+  // TODO(all): rearrange the tree to integrate this branch.
+  if ((instr->Bit(28) == 0) && (instr->Bit(29) == 0) && (instr->Bit(26) == 1)) {
+    DecodeNEONLoadStore(instr);
+    return;
+  }

  if (instr->Bit(24) == 0) {
    if (instr->Bit(28) == 0) {
@@ -278,7 +283,7 @@ void Decoder::DecodeLoadStore(const Instruction* instr) {
        if (instr->Bit(26) == 0) {
          VisitLoadStoreExclusive(instr);
        } else {
-          DecodeAdvSIMDLoadStore(instr);
+          VIXL_UNREACHABLE();
        }
      } else {
        if ((instr->Bits(31, 30) == 0x3) ||
@@ -483,6 +488,7 @@ void Decoder::DecodeDataProcessing(const Instruction* instr) {
        case 6: {
          if (instr->Bit(29) == 0x1) {
            VisitUnallocated(instr);
+            VIXL_FALLTHROUGH();
          } else {
            if (instr->Bit(30) == 0) {
              if ((instr->Bit(15) == 0x1) ||
@@ -556,18 +562,15 @@ void Decoder::DecodeDataProcessing(const Instruction* instr) {
 void Decoder::DecodeFP(const Instruction* instr) {
  VIXL_ASSERT((instr->Bits(27, 24) == 0xE) ||
              (instr->Bits(27, 24) == 0xF));
-
  if (instr->Bit(28) == 0) {
-    DecodeAdvSIMDDataProcessing(instr);
+    DecodeNEONVectorDataProcessing(instr);
  } else {
-    if (instr->Bit(29) == 1) {
+    if (instr->Bits(31, 30) == 0x3) {
      VisitUnallocated(instr);
+    } else if (instr->Bits(31, 30) == 0x1) {
+      DecodeNEONScalarDataProcessing(instr);
    } else {
-      if (instr->Bits(31, 30) == 0x3) {
-        VisitUnallocated(instr);
-      } else if (instr->Bits(31, 30) == 0x1) {
-        DecodeAdvSIMDDataProcessing(instr);
-      } else {
+      if (instr->Bit(29) == 0) {
        if (instr->Bit(24) == 0) {
          if (instr->Bit(21) == 0) {
            if ((instr->Bit(23) == 1) ||
@@ -674,23 +677,190 @@ void Decoder::DecodeFP(const Instruction* instr) {
            VisitFPDataProcessing3Source(instr);
          }
        }
+      } else {
+        VisitUnallocated(instr);
      }
    }
  }
 }


-void Decoder::DecodeAdvSIMDLoadStore(const Instruction* instr) {
-  // TODO: Implement Advanced SIMD load/store instruction decode.
+void Decoder::DecodeNEONLoadStore(const Instruction* instr) {
  VIXL_ASSERT(instr->Bits(29, 25) == 0x6);
-  VisitUnimplemented(instr);
+  if (instr->Bit(31) == 0) {
+    if ((instr->Bit(24) == 0) && (instr->Bit(21) == 1)) {
+      VisitUnallocated(instr);
+      return;
+    }
+
+    if (instr->Bit(23) == 0) {
+      if (instr->Bits(20, 16) == 0) {
+        if (instr->Bit(24) == 0) {
+          VisitNEONLoadStoreMultiStruct(instr);
+        } else {
+          VisitNEONLoadStoreSingleStruct(instr);
+        }
+      } else {
+        VisitUnallocated(instr);
+      }
+    } else {
+      if (instr->Bit(24) == 0) {
+        VisitNEONLoadStoreMultiStructPostIndex(instr);
+      } else {
+        VisitNEONLoadStoreSingleStructPostIndex(instr);
+      }
+    }
+  } else {
+    VisitUnallocated(instr);
+  }
 }


-void Decoder::DecodeAdvSIMDDataProcessing(const Instruction* instr) {
-  // TODO: Implement Advanced SIMD data processing instruction decode.
-  VIXL_ASSERT(instr->Bits(27, 25) == 0x7);
-  VisitUnimplemented(instr);
+void Decoder::DecodeNEONVectorDataProcessing(const Instruction* instr) {
+  VIXL_ASSERT(instr->Bits(28, 25) == 0x7);
+  if (instr->Bit(31) == 0) {
+    if (instr->Bit(24) == 0) {
+      if (instr->Bit(21) == 0) {
+        if (instr->Bit(15) == 0) {
+          if (instr->Bit(10) == 0) {
+            if (instr->Bit(29) == 0) {
+              if (instr->Bit(11) == 0) {
+                VisitNEONTable(instr);
+              } else {
+                VisitNEONPerm(instr);
+              }
+            } else {
+              VisitNEONExtract(instr);
+            }
+          } else {
+            if (instr->Bits(23, 22) == 0) {
+              VisitNEONCopy(instr);
+            } else {
+              VisitUnallocated(instr);
+            }
+          }
+        } else {
+          VisitUnallocated(instr);
+        }
+      } else {
+        if (instr->Bit(10) == 0) {
+          if (instr->Bit(11) == 0) {
+            VisitNEON3Different(instr);
+          } else {
+            if (instr->Bits(18, 17) == 0) {
+              if (instr->Bit(20) == 0) {
+                if (instr->Bit(19) == 0) {
+                  VisitNEON2RegMisc(instr);
+                } else {
+                  if (instr->Bits(30, 29) == 0x2) {
+                    VisitCryptoAES(instr);
+                  } else {
+                    VisitUnallocated(instr);
+                  }
+                }
+              } else {
+                if (instr->Bit(19) == 0) {
+                  VisitNEONAcrossLanes(instr);
+                } else {
+                  VisitUnallocated(instr);
+                }
+              }
+            } else {
+              VisitUnallocated(instr);
+            }
+          }
+        } else {
+          VisitNEON3Same(instr);
+        }
+      }
+    } else {
+      if (instr->Bit(10) == 0) {
+        VisitNEONByIndexedElement(instr);
+      } else {
+        if (instr->Bit(23) == 0) {
+          if (instr->Bits(22, 19) == 0) {
+            VisitNEONModifiedImmediate(instr);
+          } else {
+            VisitNEONShiftImmediate(instr);
+          }
+        } else {
+          VisitUnallocated(instr);
+        }
+      }
+    }
+  } else {
+    VisitUnallocated(instr);
+  }
+}
+
+
+void Decoder::DecodeNEONScalarDataProcessing(const Instruction* instr) {
+  VIXL_ASSERT(instr->Bits(28, 25) == 0xF);
+  if (instr->Bit(24) == 0) {
+    if (instr->Bit(21) == 0) {
+      if (instr->Bit(15) == 0) {
+        if (instr->Bit(10) == 0) {
+          if (instr->Bit(29) == 0) {
+            if (instr->Bit(11) == 0) {
+              VisitCrypto3RegSHA(instr);
+            } else {
+              VisitUnallocated(instr);
+            }
+          } else {
+            VisitUnallocated(instr);
+          }
+        } else {
+          if (instr->Bits(23, 22) == 0) {
+            VisitNEONScalarCopy(instr);
+          } else {
+            VisitUnallocated(instr);
+          }
+        }
+      } else {
+        VisitUnallocated(instr);
+      }
+    } else {
+      if (instr->Bit(10) == 0) {
+        if (instr->Bit(11) == 0) {
+          VisitNEONScalar3Diff(instr);
+        } else {
+          if (instr->Bits(18, 17) == 0) {
+            if (instr->Bit(20) == 0) {
+              if (instr->Bit(19) == 0) {
+                VisitNEONScalar2RegMisc(instr);
+              } else {
+                if (instr->Bit(29) == 0) {
+                  VisitCrypto2RegSHA(instr);
+                } else {
+                  VisitUnallocated(instr);
+                }
+              }
+            } else {
+              if (instr->Bit(19) == 0) {
+                VisitNEONScalarPairwise(instr);
+              } else {
+                VisitUnallocated(instr);
+              }
+            }
+          } else {
+            VisitUnallocated(instr);
+          }
+        }
+      } else {
+        VisitNEONScalar3Same(instr);
+      }
+    }
+  } else {
+    if (instr->Bit(10) == 0) {
+      VisitNEONScalarByIndexedElement(instr);
+    } else {
+      if (instr->Bit(23) == 0) {
+        VisitNEONScalarShiftImmediate(instr);
+      } else {
+        VisitUnallocated(instr);
+      }
+    }
+  }
 }


--- a/disas/libvixl/vixl/a64/decoder-a64.h
+++ b/disas/libvixl/vixl/a64/decoder-a64.h
@@ -1,4 +1,4 @@
-// Copyright 2013, ARM Limited
+// Copyright 2014, ARM Limited
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
@@ -29,13 +29,13 @@

 #include <list>

-#include "globals.h"
-#include "a64/instructions-a64.h"
+#include "vixl/globals.h"
+#include "vixl/a64/instructions-a64.h"


 // List macro containing all visitors needed by the decoder class.

-#define VISITOR_LIST(V)             \
+#define VISITOR_LIST_THAT_RETURN(V) \
  V(PCRelAddressing)                \
  V(AddSubImmediate)                \
  V(LogicalImmediate)               \
@@ -79,8 +79,39 @@
  V(FPDataProcessing3Source)        \
  V(FPIntegerConvert)               \
  V(FPFixedPointConvert)            \
-  V(Unallocated)                    \
-  V(Unimplemented)
+  V(Crypto2RegSHA)                  \
+  V(Crypto3RegSHA)                  \
+  V(CryptoAES)                      \
+  V(NEON2RegMisc)                   \
+  V(NEON3Different)                 \
+  V(NEON3Same)                      \
+  V(NEONAcrossLanes)                \
+  V(NEONByIndexedElement)           \
+  V(NEONCopy)                       \
+  V(NEONExtract)                    \
+  V(NEONLoadStoreMultiStruct)       \
+  V(NEONLoadStoreMultiStructPostIndex)  \
+  V(NEONLoadStoreSingleStruct)      \
+  V(NEONLoadStoreSingleStructPostIndex) \
+  V(NEONModifiedImmediate)          \
+  V(NEONScalar2RegMisc)             \
+  V(NEONScalar3Diff)                \
+  V(NEONScalar3Same)                \
+  V(NEONScalarByIndexedElement)     \
+  V(NEONScalarCopy)                 \
+  V(NEONScalarPairwise)             \
+  V(NEONScalarShiftImmediate)       \
+  V(NEONShiftImmediate)             \
+  V(NEONTable)                      \
+  V(NEONPerm)                       \
+
+#define VISITOR_LIST_THAT_DONT_RETURN(V)  \
+  V(Unallocated)                          \
+  V(Unimplemented)                        \
+
+#define VISITOR_LIST(V)             \
+  VISITOR_LIST_THAT_RETURN(V)       \
+  VISITOR_LIST_THAT_DONT_RETURN(V)  \

 namespace vixl {

@@ -222,12 +253,17 @@ class Decoder {
  // Decode the Advanced SIMD (NEON) load/store part of the instruction tree,
  // and call the corresponding visitors.
  // On entry, instruction bits 29:25 = 0x6.
-  void DecodeAdvSIMDLoadStore(const Instruction* instr);
+  void DecodeNEONLoadStore(const Instruction* instr);

-  // Decode the Advanced SIMD (NEON) data processing part of the instruction
-  // tree, and call the corresponding visitors.
-  // On entry, instruction bits 27:25 = 0x7.
-  void DecodeAdvSIMDDataProcessing(const Instruction* instr);
+  // Decode the Advanced SIMD (NEON) vector data processing part of the
+  // instruction tree, and call the corresponding visitors.
+  // On entry, instruction bits 28:25 = 0x7.
+  void DecodeNEONVectorDataProcessing(const Instruction* instr);
+
+  // Decode the Advanced SIMD (NEON) scalar data processing part of the
+  // instruction tree, and call the corresponding visitors.
+  // On entry, instruction bits 28:25 = 0xF.
+  void DecodeNEONScalarDataProcessing(const Instruction* instr);

 private:
  // Visitors are registered in a list.
--- a/disas/libvixl/vixl/a64/disasm-a64.cc
+++ b/disas/libvixl/vixl/a64/disasm-a64.cc
--- a/disas/libvixl/vixl/a64/disasm-a64.h
+++ b/disas/libvixl/vixl/a64/disasm-a64.h
@@ -1,4 +1,4 @@
-// Copyright 2013, ARM Limited
+// Copyright 2015, ARM Limited
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
@@ -27,11 +27,11 @@
 #ifndef VIXL_A64_DISASM_A64_H
 #define VIXL_A64_DISASM_A64_H

-#include "globals.h"
-#include "utils.h"
-#include "instructions-a64.h"
-#include "decoder-a64.h"
-#include "assembler-a64.h"
+#include "vixl/globals.h"
+#include "vixl/utils.h"
+#include "vixl/a64/instructions-a64.h"
+#include "vixl/a64/decoder-a64.h"
+#include "vixl/a64/assembler-a64.h"

 namespace vixl {

@@ -55,6 +55,7 @@ class Disassembler: public DecoderVisitor {
  // customize the disassembly output.

  // Prints the name of a register.
+  // TODO: This currently doesn't allow renaming of V registers.
  virtual void AppendRegisterNameToOutput(const Instruction* instr,
                                          const CPURegister& reg);

@@ -122,7 +123,8 @@ class Disassembler: public DecoderVisitor {
  int SubstituteLSRegOffsetField(const Instruction* instr, const char* format);
  int SubstitutePrefetchField(const Instruction* instr, const char* format);
  int SubstituteBarrierField(const Instruction* instr, const char* format);
-
+  int SubstituteSysOpField(const Instruction* instr, const char* format);
+  int SubstituteCrField(const Instruction* instr, const char* format);
  bool RdIsZROrSP(const Instruction* instr) const {
    return (instr->Rd() == kZeroRegCode);
  }
@@ -163,7 +165,6 @@ class Disassembler: public DecoderVisitor {
 class PrintDisassembler: public Disassembler {
 public:
  explicit PrintDisassembler(FILE* stream) : stream_(stream) { }
-  virtual ~PrintDisassembler() { }

 protected:
  virtual void ProcessOutput(const Instruction* instr);
--- a/disas/libvixl/vixl/a64/instructions-a64.cc
+++ b/disas/libvixl/vixl/a64/instructions-a64.cc
@@ -0,0 +1,622 @@
+// Copyright 2015, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "vixl/a64/instructions-a64.h"
+#include "vixl/a64/assembler-a64.h"
+
+namespace vixl {
+
+
+// Floating-point infinity values.
+const float16 kFP16PositiveInfinity = 0x7c00;
+const float16 kFP16NegativeInfinity = 0xfc00;
+const float kFP32PositiveInfinity = rawbits_to_float(0x7f800000);
+const float kFP32NegativeInfinity = rawbits_to_float(0xff800000);
+const double kFP64PositiveInfinity =
+    rawbits_to_double(UINT64_C(0x7ff0000000000000));
+const double kFP64NegativeInfinity =
+    rawbits_to_double(UINT64_C(0xfff0000000000000));
+
+
+// The default NaN values (for FPCR.DN=1).
+const double kFP64DefaultNaN = rawbits_to_double(UINT64_C(0x7ff8000000000000));
+const float kFP32DefaultNaN = rawbits_to_float(0x7fc00000);
+const float16 kFP16DefaultNaN = 0x7e00;
+
+
+static uint64_t RotateRight(uint64_t value,
+                            unsigned int rotate,
+                            unsigned int width) {
+  VIXL_ASSERT(width <= 64);
+  rotate &= 63;
+  return ((value & ((UINT64_C(1) << rotate) - 1)) <<
+          (width - rotate)) | (value >> rotate);
+}
+
+
+static uint64_t RepeatBitsAcrossReg(unsigned reg_size,
+                                    uint64_t value,
+                                    unsigned width) {
+  VIXL_ASSERT((width == 2) || (width == 4) || (width == 8) || (width == 16) ||
+              (width == 32));
+  VIXL_ASSERT((reg_size == kWRegSize) || (reg_size == kXRegSize));
+  uint64_t result = value & ((UINT64_C(1) << width) - 1);
+  for (unsigned i = width; i < reg_size; i *= 2) {
+    result |= (result << i);
+  }
+  return result;
+}
+
+
+bool Instruction::IsLoad() const {
+  if (Mask(LoadStoreAnyFMask) != LoadStoreAnyFixed) {
+    return false;
+  }
+
+  if (Mask(LoadStorePairAnyFMask) == LoadStorePairAnyFixed) {
+    return Mask(LoadStorePairLBit) != 0;
+  } else {
+    LoadStoreOp op = static_cast<LoadStoreOp>(Mask(LoadStoreMask));
+    switch (op) {
+      case LDRB_w:
+      case LDRH_w:
+      case LDR_w:
+      case LDR_x:
+      case LDRSB_w:
+      case LDRSB_x:
+      case LDRSH_w:
+      case LDRSH_x:
+      case LDRSW_x:
+      case LDR_b:
+      case LDR_h:
+      case LDR_s:
+      case LDR_d:
+      case LDR_q: return true;
+      default: return false;
+    }
+  }
+}
+
+
+bool Instruction::IsStore() const {
+  if (Mask(LoadStoreAnyFMask) != LoadStoreAnyFixed) {
+    return false;
+  }
+
+  if (Mask(LoadStorePairAnyFMask) == LoadStorePairAnyFixed) {
+    return Mask(LoadStorePairLBit) == 0;
+  } else {
+    LoadStoreOp op = static_cast<LoadStoreOp>(Mask(LoadStoreMask));
+    switch (op) {
+      case STRB_w:
+      case STRH_w:
+      case STR_w:
+      case STR_x:
+      case STR_b:
+      case STR_h:
+      case STR_s:
+      case STR_d:
+      case STR_q: return true;
+      default: return false;
+    }
+  }
+}
+
+
+// Logical immediates can't encode zero, so a return value of zero is used to
+// indicate a failure case. Specifically, where the constraints on imm_s are
+// not met.
+uint64_t Instruction::ImmLogical() const {
+  unsigned reg_size = SixtyFourBits() ? kXRegSize : kWRegSize;
+  int32_t n = BitN();
+  int32_t imm_s = ImmSetBits();
+  int32_t imm_r = ImmRotate();
+
+  // An integer is constructed from the n, imm_s and imm_r bits according to
+  // the following table:
+  //
+  //  N   imms    immr    size        S             R
+  //  1  ssssss  rrrrrr    64    UInt(ssssss)  UInt(rrrrrr)
+  //  0  0sssss  xrrrrr    32    UInt(sssss)   UInt(rrrrr)
+  //  0  10ssss  xxrrrr    16    UInt(ssss)    UInt(rrrr)
+  //  0  110sss  xxxrrr     8    UInt(sss)     UInt(rrr)
+  //  0  1110ss  xxxxrr     4    UInt(ss)      UInt(rr)
+  //  0  11110s  xxxxxr     2    UInt(s)       UInt(r)
+  // (s bits must not be all set)
+  //
+  // A pattern is constructed of size bits, where the least significant S+1
+  // bits are set. The pattern is rotated right by R, and repeated across a
+  // 32 or 64-bit value, depending on destination register width.
+  //
+
+  if (n == 1) {
+    if (imm_s == 0x3f) {
+      return 0;
+    }
+    uint64_t bits = (UINT64_C(1) << (imm_s + 1)) - 1;
+    return RotateRight(bits, imm_r, 64);
+  } else {
+    if ((imm_s >> 1) == 0x1f) {
+      return 0;
+    }
+    for (int width = 0x20; width >= 0x2; width >>= 1) {
+      if ((imm_s & width) == 0) {
+        int mask = width - 1;
+        if ((imm_s & mask) == mask) {
+          return 0;
+        }
+        uint64_t bits = (UINT64_C(1) << ((imm_s & mask) + 1)) - 1;
+        return RepeatBitsAcrossReg(reg_size,
+                                   RotateRight(bits, imm_r & mask, width),
+                                   width);
+      }
+    }
+  }
+  VIXL_UNREACHABLE();
+  return 0;
+}
+
+
+uint32_t Instruction::ImmNEONabcdefgh() const {
+  return ImmNEONabc() << 5 | ImmNEONdefgh();
+}
+
+
+float Instruction::Imm8ToFP32(uint32_t imm8) {
+  //   Imm8: abcdefgh (8 bits)
+  // Single: aBbb.bbbc.defg.h000.0000.0000.0000.0000 (32 bits)
+  // where B is b ^ 1
+  uint32_t bits = imm8;
+  uint32_t bit7 = (bits >> 7) & 0x1;
+  uint32_t bit6 = (bits >> 6) & 0x1;
+  uint32_t bit5_to_0 = bits & 0x3f;
+  uint32_t result = (bit7 << 31) | ((32 - bit6) << 25) | (bit5_to_0 << 19);
+
+  return rawbits_to_float(result);
+}
+
+
+float Instruction::ImmFP32() const {
+  return Imm8ToFP32(ImmFP());
+}
+
+
+double Instruction::Imm8ToFP64(uint32_t imm8) {
+  //   Imm8: abcdefgh (8 bits)
+  // Double: aBbb.bbbb.bbcd.efgh.0000.0000.0000.0000
+  //         0000.0000.0000.0000.0000.0000.0000.0000 (64 bits)
+  // where B is b ^ 1
+  uint32_t bits = imm8;
+  uint64_t bit7 = (bits >> 7) & 0x1;
+  uint64_t bit6 = (bits >> 6) & 0x1;
+  uint64_t bit5_to_0 = bits & 0x3f;
+  uint64_t result = (bit7 << 63) | ((256 - bit6) << 54) | (bit5_to_0 << 48);
+
+  return rawbits_to_double(result);
+}
+
+
+double Instruction::ImmFP64() const {
+  return Imm8ToFP64(ImmFP());
+}
+
+
+float Instruction::ImmNEONFP32() const {
+  return Imm8ToFP32(ImmNEONabcdefgh());
+}
+
+
+double Instruction::ImmNEONFP64() const {
+  return Imm8ToFP64(ImmNEONabcdefgh());
+}
+
+
+unsigned CalcLSDataSize(LoadStoreOp op) {
+  VIXL_ASSERT((LSSize_offset + LSSize_width) == (kInstructionSize * 8));
+  unsigned size = static_cast<Instr>(op) >> LSSize_offset;
+  if ((op & LSVector_mask) != 0) {
+    // Vector register memory operations encode the access size in the "size"
+    // and "opc" fields.
+    if ((size == 0) && ((op & LSOpc_mask) >> LSOpc_offset) >= 2) {
+      size = kQRegSizeInBytesLog2;
+    }
+  }
+  return size;
+}
+
+
+unsigned CalcLSPairDataSize(LoadStorePairOp op) {
+  VIXL_STATIC_ASSERT(kXRegSizeInBytes == kDRegSizeInBytes);
+  VIXL_STATIC_ASSERT(kWRegSizeInBytes == kSRegSizeInBytes);
+  switch (op) {
+    case STP_q:
+    case LDP_q: return kQRegSizeInBytesLog2;
+    case STP_x:
+    case LDP_x:
+    case STP_d:
+    case LDP_d: return kXRegSizeInBytesLog2;
+    default: return kWRegSizeInBytesLog2;
+  }
+}
+
+
+int Instruction::ImmBranchRangeBitwidth(ImmBranchType branch_type) {
+  switch (branch_type) {
+    case UncondBranchType:
+      return ImmUncondBranch_width;
+    case CondBranchType:
+      return ImmCondBranch_width;
+    case CompareBranchType:
+      return ImmCmpBranch_width;
+    case TestBranchType:
+      return ImmTestBranch_width;
+    default:
+      VIXL_UNREACHABLE();
+      return 0;
+  }
+}
+
+
+int32_t Instruction::ImmBranchForwardRange(ImmBranchType branch_type) {
+  int32_t encoded_max = 1 << (ImmBranchRangeBitwidth(branch_type) - 1);
+  return encoded_max * kInstructionSize;
+}
+
+
+bool Instruction::IsValidImmPCOffset(ImmBranchType branch_type,
+                                     int64_t offset) {
+  return is_intn(ImmBranchRangeBitwidth(branch_type), offset);
+}
+
+
+const Instruction* Instruction::ImmPCOffsetTarget() const {
+  const Instruction * base = this;
+  ptrdiff_t offset;
+  if (IsPCRelAddressing()) {
+    // ADR and ADRP.
+    offset = ImmPCRel();
+    if (Mask(PCRelAddressingMask) == ADRP) {
+      base = AlignDown(base, kPageSize);
+      offset *= kPageSize;
+    } else {
+      VIXL_ASSERT(Mask(PCRelAddressingMask) == ADR);
+    }
+  } else {
+    // All PC-relative branches.
+    VIXL_ASSERT(BranchType() != UnknownBranchType);
+    // Relative branch offsets are instruction-size-aligned.
+    offset = ImmBranch() << kInstructionSizeLog2;
+  }
+  return base + offset;
+}
+
+
+int Instruction::ImmBranch() const {
+  switch (BranchType()) {
+    case CondBranchType: return ImmCondBranch();
+    case UncondBranchType: return ImmUncondBranch();
+    case CompareBranchType: return ImmCmpBranch();
+    case TestBranchType: return ImmTestBranch();
+    default: VIXL_UNREACHABLE();
+  }
+  return 0;
+}
+
+
+void Instruction::SetImmPCOffsetTarget(const Instruction* target) {
+  if (IsPCRelAddressing()) {
+    SetPCRelImmTarget(target);
+  } else {
+    SetBranchImmTarget(target);
+  }
+}
+
+
+void Instruction::SetPCRelImmTarget(const Instruction* target) {
+  ptrdiff_t imm21;
+  if ((Mask(PCRelAddressingMask) == ADR)) {
+    imm21 = target - this;
+  } else {
+    VIXL_ASSERT(Mask(PCRelAddressingMask) == ADRP);
+    uintptr_t this_page = reinterpret_cast<uintptr_t>(this) / kPageSize;
+    uintptr_t target_page = reinterpret_cast<uintptr_t>(target) / kPageSize;
+    imm21 = target_page - this_page;
+  }
+  Instr imm = Assembler::ImmPCRelAddress(static_cast<int32_t>(imm21));
+
+  SetInstructionBits(Mask(~ImmPCRel_mask) | imm);
+}
+
+
+void Instruction::SetBranchImmTarget(const Instruction* target) {
+  VIXL_ASSERT(((target - this) & 3) == 0);
+  Instr branch_imm = 0;
+  uint32_t imm_mask = 0;
+  int offset = static_cast<int>((target - this) >> kInstructionSizeLog2);
+  switch (BranchType()) {
+    case CondBranchType: {
+      branch_imm = Assembler::ImmCondBranch(offset);
+      imm_mask = ImmCondBranch_mask;
+      break;
+    }
+    case UncondBranchType: {
+      branch_imm = Assembler::ImmUncondBranch(offset);
+      imm_mask = ImmUncondBranch_mask;
+      break;
+    }
+    case CompareBranchType: {
+      branch_imm = Assembler::ImmCmpBranch(offset);
+      imm_mask = ImmCmpBranch_mask;
+      break;
+    }
+    case TestBranchType: {
+      branch_imm = Assembler::ImmTestBranch(offset);
+      imm_mask = ImmTestBranch_mask;
+      break;
+    }
+    default: VIXL_UNREACHABLE();
+  }
+  SetInstructionBits(Mask(~imm_mask) | branch_imm);
+}
+
+
+void Instruction::SetImmLLiteral(const Instruction* source) {
+  VIXL_ASSERT(IsWordAligned(source));
+  ptrdiff_t offset = (source - this) >> kLiteralEntrySizeLog2;
+  Instr imm = Assembler::ImmLLiteral(static_cast<int>(offset));
+  Instr mask = ImmLLiteral_mask;
+
+  SetInstructionBits(Mask(~mask) | imm);
+}
+
+
+VectorFormat VectorFormatHalfWidth(const VectorFormat vform) {
+  VIXL_ASSERT(vform == kFormat8H || vform == kFormat4S || vform == kFormat2D ||
+              vform == kFormatH || vform == kFormatS || vform == kFormatD);
+  switch (vform) {
+    case kFormat8H: return kFormat8B;
+    case kFormat4S: return kFormat4H;
+    case kFormat2D: return kFormat2S;
+    case kFormatH:  return kFormatB;
+    case kFormatS:  return kFormatH;
+    case kFormatD:  return kFormatS;
+    default: VIXL_UNREACHABLE(); return kFormatUndefined;
+  }
+}
+
+
+VectorFormat VectorFormatDoubleWidth(const VectorFormat vform) {
+  VIXL_ASSERT(vform == kFormat8B || vform == kFormat4H || vform == kFormat2S ||
+              vform == kFormatB || vform == kFormatH || vform == kFormatS);
+  switch (vform) {
+    case kFormat8B: return kFormat8H;
+    case kFormat4H: return kFormat4S;
+    case kFormat2S: return kFormat2D;
+    case kFormatB:  return kFormatH;
+    case kFormatH:  return kFormatS;
+    case kFormatS:  return kFormatD;
+    default: VIXL_UNREACHABLE(); return kFormatUndefined;
+  }
+}
+
+
+VectorFormat VectorFormatFillQ(const VectorFormat vform) {
+  switch (vform) {
+    case kFormatB:
+    case kFormat8B:
+    case kFormat16B: return kFormat16B;
+    case kFormatH:
+    case kFormat4H:
+    case kFormat8H:  return kFormat8H;
+    case kFormatS:
+    case kFormat2S:
+    case kFormat4S:  return kFormat4S;
+    case kFormatD:
+    case kFormat1D:
+    case kFormat2D:  return kFormat2D;
+    default: VIXL_UNREACHABLE(); return kFormatUndefined;
+  }
+}
+
+VectorFormat VectorFormatHalfWidthDoubleLanes(const VectorFormat vform) {
+  switch (vform) {
+    case kFormat4H: return kFormat8B;
+    case kFormat8H: return kFormat16B;
+    case kFormat2S: return kFormat4H;
+    case kFormat4S: return kFormat8H;
+    case kFormat1D: return kFormat2S;
+    case kFormat2D: return kFormat4S;
+    default: VIXL_UNREACHABLE(); return kFormatUndefined;
+  }
+}
+
+VectorFormat VectorFormatDoubleLanes(const VectorFormat vform) {
+  VIXL_ASSERT(vform == kFormat8B || vform == kFormat4H || vform == kFormat2S);
+  switch (vform) {
+    case kFormat8B: return kFormat16B;
+    case kFormat4H: return kFormat8H;
+    case kFormat2S: return kFormat4S;
+    default: VIXL_UNREACHABLE(); return kFormatUndefined;
+  }
+}
+
+
+VectorFormat VectorFormatHalfLanes(const VectorFormat vform) {
+  VIXL_ASSERT(vform == kFormat16B || vform == kFormat8H || vform == kFormat4S);
+  switch (vform) {
+    case kFormat16B: return kFormat8B;
+    case kFormat8H: return kFormat4H;
+    case kFormat4S: return kFormat2S;
+    default: VIXL_UNREACHABLE(); return kFormatUndefined;
+  }
+}
+
+
+VectorFormat ScalarFormatFromLaneSize(int laneSize) {
+  switch (laneSize) {
+    case 8:  return kFormatB;
+    case 16: return kFormatH;
+    case 32: return kFormatS;
+    case 64: return kFormatD;
+    default: VIXL_UNREACHABLE(); return kFormatUndefined;
+  }
+}
+
+
+unsigned RegisterSizeInBitsFromFormat(VectorFormat vform) {
+  VIXL_ASSERT(vform != kFormatUndefined);
+  switch (vform) {
+    case kFormatB: return kBRegSize;
+    case kFormatH: return kHRegSize;
+    case kFormatS: return kSRegSize;
+    case kFormatD: return kDRegSize;
+    case kFormat8B:
+    case kFormat4H:
+    case kFormat2S:
+    case kFormat1D: return kDRegSize;
+    default: return kQRegSize;
+  }
+}
+
+
+unsigned RegisterSizeInBytesFromFormat(VectorFormat vform) {
+  return RegisterSizeInBitsFromFormat(vform) / 8;
+}
+
+
+unsigned LaneSizeInBitsFromFormat(VectorFormat vform) {
+  VIXL_ASSERT(vform != kFormatUndefined);
+  switch (vform) {
+    case kFormatB:
+    case kFormat8B:
+    case kFormat16B: return 8;
+    case kFormatH:
+    case kFormat4H:
+    case kFormat8H: return 16;
+    case kFormatS:
+    case kFormat2S:
+    case kFormat4S: return 32;
+    case kFormatD:
+    case kFormat1D:
+    case kFormat2D: return 64;
+    default: VIXL_UNREACHABLE(); return 0;
+  }
+}
+
+
+int LaneSizeInBytesFromFormat(VectorFormat vform) {
+  return LaneSizeInBitsFromFormat(vform) / 8;
+}
+
+
+int LaneSizeInBytesLog2FromFormat(VectorFormat vform) {
+  VIXL_ASSERT(vform != kFormatUndefined);
+  switch (vform) {
+    case kFormatB:
+    case kFormat8B:
+    case kFormat16B: return 0;
+    case kFormatH:
+    case kFormat4H:
+    case kFormat8H: return 1;
+    case kFormatS:
+    case kFormat2S:
+    case kFormat4S: return 2;
+    case kFormatD:
+    case kFormat1D:
+    case kFormat2D: return 3;
+    default: VIXL_UNREACHABLE(); return 0;
+  }
+}
+
+
+int LaneCountFromFormat(VectorFormat vform) {
+  VIXL_ASSERT(vform != kFormatUndefined);
+  switch (vform) {
+    case kFormat16B: return 16;
+    case kFormat8B:
+    case kFormat8H: return 8;
+    case kFormat4H:
+    case kFormat4S: return 4;
+    case kFormat2S:
+    case kFormat2D: return 2;
+    case kFormat1D:
+    case kFormatB:
+    case kFormatH:
+    case kFormatS:
+    case kFormatD: return 1;
+    default: VIXL_UNREACHABLE(); return 0;
+  }
+}
+
+
+int MaxLaneCountFromFormat(VectorFormat vform) {
+  VIXL_ASSERT(vform != kFormatUndefined);
+  switch (vform) {
+    case kFormatB:
+    case kFormat8B:
+    case kFormat16B: return 16;
+    case kFormatH:
+    case kFormat4H:
+    case kFormat8H: return 8;
+    case kFormatS:
+    case kFormat2S:
+    case kFormat4S: return 4;
+    case kFormatD:
+    case kFormat1D:
+    case kFormat2D: return 2;
+    default: VIXL_UNREACHABLE(); return 0;
+  }
+}
+
+
+// Does 'vform' indicate a vector format or a scalar format?
+bool IsVectorFormat(VectorFormat vform) {
+  VIXL_ASSERT(vform != kFormatUndefined);
+  switch (vform) {
+    case kFormatB:
+    case kFormatH:
+    case kFormatS:
+    case kFormatD: return false;
+    default: return true;
+  }
+}
+
+
+int64_t MaxIntFromFormat(VectorFormat vform) {
+  return INT64_MAX >> (64 - LaneSizeInBitsFromFormat(vform));
+}
+
+
+int64_t MinIntFromFormat(VectorFormat vform) {
+  return INT64_MIN >> (64 - LaneSizeInBitsFromFormat(vform));
+}
+
+
+uint64_t MaxUintFromFormat(VectorFormat vform) {
+  return UINT64_MAX >> (64 - LaneSizeInBitsFromFormat(vform));
+}
+}  // namespace vixl
+
--- a/disas/libvixl/vixl/a64/instructions-a64.h
+++ b/disas/libvixl/vixl/a64/instructions-a64.h
@@ -0,0 +1,757 @@
+// Copyright 2015, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_INSTRUCTIONS_A64_H_
+#define VIXL_A64_INSTRUCTIONS_A64_H_
+
+#include "vixl/globals.h"
+#include "vixl/utils.h"
+#include "vixl/a64/constants-a64.h"
+
+namespace vixl {
+// ISA constants. --------------------------------------------------------------
+
+typedef uint32_t Instr;
+const unsigned kInstructionSize = 4;
+const unsigned kInstructionSizeLog2 = 2;
+const unsigned kLiteralEntrySize = 4;
+const unsigned kLiteralEntrySizeLog2 = 2;
+const unsigned kMaxLoadLiteralRange = 1 * MBytes;
+
+// This is the nominal page size (as used by the adrp instruction); the actual
+// size of the memory pages allocated by the kernel is likely to differ.
+const unsigned kPageSize = 4 * KBytes;
+const unsigned kPageSizeLog2 = 12;
+
+const unsigned kBRegSize = 8;
+const unsigned kBRegSizeLog2 = 3;
+const unsigned kBRegSizeInBytes = kBRegSize / 8;
+const unsigned kBRegSizeInBytesLog2 = kBRegSizeLog2 - 3;
+const unsigned kHRegSize = 16;
+const unsigned kHRegSizeLog2 = 4;
+const unsigned kHRegSizeInBytes = kHRegSize / 8;
+const unsigned kHRegSizeInBytesLog2 = kHRegSizeLog2 - 3;
+const unsigned kWRegSize = 32;
+const unsigned kWRegSizeLog2 = 5;
+const unsigned kWRegSizeInBytes = kWRegSize / 8;
+const unsigned kWRegSizeInBytesLog2 = kWRegSizeLog2 - 3;
+const unsigned kXRegSize = 64;
+const unsigned kXRegSizeLog2 = 6;
+const unsigned kXRegSizeInBytes = kXRegSize / 8;
+const unsigned kXRegSizeInBytesLog2 = kXRegSizeLog2 - 3;
+const unsigned kSRegSize = 32;
+const unsigned kSRegSizeLog2 = 5;
+const unsigned kSRegSizeInBytes = kSRegSize / 8;
+const unsigned kSRegSizeInBytesLog2 = kSRegSizeLog2 - 3;
+const unsigned kDRegSize = 64;
+const unsigned kDRegSizeLog2 = 6;
+const unsigned kDRegSizeInBytes = kDRegSize / 8;
+const unsigned kDRegSizeInBytesLog2 = kDRegSizeLog2 - 3;
+const unsigned kQRegSize = 128;
+const unsigned kQRegSizeLog2 = 7;
+const unsigned kQRegSizeInBytes = kQRegSize / 8;
+const unsigned kQRegSizeInBytesLog2 = kQRegSizeLog2 - 3;
+const uint64_t kWRegMask = UINT64_C(0xffffffff);
+const uint64_t kXRegMask = UINT64_C(0xffffffffffffffff);
+const uint64_t kSRegMask = UINT64_C(0xffffffff);
+const uint64_t kDRegMask = UINT64_C(0xffffffffffffffff);
+const uint64_t kSSignMask = UINT64_C(0x80000000);
+const uint64_t kDSignMask = UINT64_C(0x8000000000000000);
+const uint64_t kWSignMask = UINT64_C(0x80000000);
+const uint64_t kXSignMask = UINT64_C(0x8000000000000000);
+const uint64_t kByteMask = UINT64_C(0xff);
+const uint64_t kHalfWordMask = UINT64_C(0xffff);
+const uint64_t kWordMask = UINT64_C(0xffffffff);
+const uint64_t kXMaxUInt = UINT64_C(0xffffffffffffffff);
+const uint64_t kWMaxUInt = UINT64_C(0xffffffff);
+const int64_t kXMaxInt = INT64_C(0x7fffffffffffffff);
+const int64_t kXMinInt = INT64_C(0x8000000000000000);
+const int32_t kWMaxInt = INT32_C(0x7fffffff);
+const int32_t kWMinInt = INT32_C(0x80000000);
+const unsigned kLinkRegCode = 30;
+const unsigned kZeroRegCode = 31;
+const unsigned kSPRegInternalCode = 63;
+const unsigned kRegCodeMask = 0x1f;
+
+const unsigned kAddressTagOffset = 56;
+const unsigned kAddressTagWidth = 8;
+const uint64_t kAddressTagMask =
+    ((UINT64_C(1) << kAddressTagWidth) - 1) << kAddressTagOffset;
+VIXL_STATIC_ASSERT(kAddressTagMask == UINT64_C(0xff00000000000000));
+
+// AArch64 floating-point specifics. These match IEEE-754.
+const unsigned kDoubleMantissaBits = 52;
+const unsigned kDoubleExponentBits = 11;
+const unsigned kFloatMantissaBits = 23;
+const unsigned kFloatExponentBits = 8;
+const unsigned kFloat16MantissaBits = 10;
+const unsigned kFloat16ExponentBits = 5;
+
+// Floating-point infinity values.
+extern const float16 kFP16PositiveInfinity;
+extern const float16 kFP16NegativeInfinity;
+extern const float kFP32PositiveInfinity;
+extern const float kFP32NegativeInfinity;
+extern const double kFP64PositiveInfinity;
+extern const double kFP64NegativeInfinity;
+
+// The default NaN values (for FPCR.DN=1).
+extern const float16 kFP16DefaultNaN;
+extern const float kFP32DefaultNaN;
+extern const double kFP64DefaultNaN;
+
+unsigned CalcLSDataSize(LoadStoreOp op);
+unsigned CalcLSPairDataSize(LoadStorePairOp op);
+
+enum ImmBranchType {
+  UnknownBranchType = 0,
+  CondBranchType    = 1,
+  UncondBranchType  = 2,
+  CompareBranchType = 3,
+  TestBranchType    = 4
+};
+
+enum AddrMode {
+  Offset,
+  PreIndex,
+  PostIndex
+};
+
+enum FPRounding {
+  // The first four values are encodable directly by FPCR<RMode>.
+  FPTieEven = 0x0,
+  FPPositiveInfinity = 0x1,
+  FPNegativeInfinity = 0x2,
+  FPZero = 0x3,
+
+  // The final rounding modes are only available when explicitly specified by
+  // the instruction (such as with fcvta). It cannot be set in FPCR.
+  FPTieAway,
+  FPRoundOdd
+};
+
+enum Reg31Mode {
+  Reg31IsStackPointer,
+  Reg31IsZeroRegister
+};
+
+// Instructions. ---------------------------------------------------------------
+
+class Instruction {
+ public:
+  Instr InstructionBits() const {
+    return *(reinterpret_cast<const Instr*>(this));
+  }
+
+  void SetInstructionBits(Instr new_instr) {
+    *(reinterpret_cast<Instr*>(this)) = new_instr;
+  }
+
+  int Bit(int pos) const {
+    return (InstructionBits() >> pos) & 1;
+  }
+
+  uint32_t Bits(int msb, int lsb) const {
+    return unsigned_bitextract_32(msb, lsb, InstructionBits());
+  }
+
+  int32_t SignedBits(int msb, int lsb) const {
+    int32_t bits = *(reinterpret_cast<const int32_t*>(this));
+    return signed_bitextract_32(msb, lsb, bits);
+  }
+
+  Instr Mask(uint32_t mask) const {
+    return InstructionBits() & mask;
+  }
+
+  #define DEFINE_GETTER(Name, HighBit, LowBit, Func)             \
+  int32_t Name() const { return Func(HighBit, LowBit); }
+  INSTRUCTION_FIELDS_LIST(DEFINE_GETTER)
+  #undef DEFINE_GETTER
+
+  // ImmPCRel is a compound field (not present in INSTRUCTION_FIELDS_LIST),
+  // formed from ImmPCRelLo and ImmPCRelHi.
+  int ImmPCRel() const {
+    int offset =
+        static_cast<int>((ImmPCRelHi() << ImmPCRelLo_width) | ImmPCRelLo());
+    int width = ImmPCRelLo_width + ImmPCRelHi_width;
+    return signed_bitextract_32(width - 1, 0, offset);
+  }
+
+  uint64_t ImmLogical() const;
+  unsigned ImmNEONabcdefgh() const;
+  float ImmFP32() const;
+  double ImmFP64() const;
+  float ImmNEONFP32() const;
+  double ImmNEONFP64() const;
+
+  unsigned SizeLS() const {
+    return CalcLSDataSize(static_cast<LoadStoreOp>(Mask(LoadStoreMask)));
+  }
+
+  unsigned SizeLSPair() const {
+    return CalcLSPairDataSize(
+        static_cast<LoadStorePairOp>(Mask(LoadStorePairMask)));
+  }
+
+  int NEONLSIndex(int access_size_shift) const {
+    int64_t q = NEONQ();
+    int64_t s = NEONS();
+    int64_t size = NEONLSSize();
+    int64_t index = (q << 3) | (s << 2) | size;
+    return static_cast<int>(index >> access_size_shift);
+  }
+
+  // Helpers.
+  bool IsCondBranchImm() const {
+    return Mask(ConditionalBranchFMask) == ConditionalBranchFixed;
+  }
+
+  bool IsUncondBranchImm() const {
+    return Mask(UnconditionalBranchFMask) == UnconditionalBranchFixed;
+  }
+
+  bool IsCompareBranch() const {
+    return Mask(CompareBranchFMask) == CompareBranchFixed;
+  }
+
+  bool IsTestBranch() const {
+    return Mask(TestBranchFMask) == TestBranchFixed;
+  }
+
+  bool IsImmBranch() const {
+    return BranchType() != UnknownBranchType;
+  }
+
+  bool IsPCRelAddressing() const {
+    return Mask(PCRelAddressingFMask) == PCRelAddressingFixed;
+  }
+
+  bool IsLogicalImmediate() const {
+    return Mask(LogicalImmediateFMask) == LogicalImmediateFixed;
+  }
+
+  bool IsAddSubImmediate() const {
+    return Mask(AddSubImmediateFMask) == AddSubImmediateFixed;
+  }
+
+  bool IsAddSubExtended() const {
+    return Mask(AddSubExtendedFMask) == AddSubExtendedFixed;
+  }
+
+  bool IsLoadOrStore() const {
+    return Mask(LoadStoreAnyFMask) == LoadStoreAnyFixed;
+  }
+
+  bool IsLoad() const;
+  bool IsStore() const;
+
+  bool IsLoadLiteral() const {
+    // This includes PRFM_lit.
+    return Mask(LoadLiteralFMask) == LoadLiteralFixed;
+  }
+
+  bool IsMovn() const {
+    return (Mask(MoveWideImmediateMask) == MOVN_x) ||
+           (Mask(MoveWideImmediateMask) == MOVN_w);
+  }
+
+  static int ImmBranchRangeBitwidth(ImmBranchType branch_type);
+  static int32_t ImmBranchForwardRange(ImmBranchType branch_type);
+  static bool IsValidImmPCOffset(ImmBranchType branch_type, int64_t offset);
+
+  // Indicate whether Rd can be the stack pointer or the zero register. This
+  // does not check that the instruction actually has an Rd field.
+  Reg31Mode RdMode() const {
+    // The following instructions use sp or wsp as Rd:
+    //  Add/sub (immediate) when not setting the flags.
+    //  Add/sub (extended) when not setting the flags.
+    //  Logical (immediate) when not setting the flags.
+    // Otherwise, r31 is the zero register.
+    if (IsAddSubImmediate() || IsAddSubExtended()) {
+      if (Mask(AddSubSetFlagsBit)) {
+        return Reg31IsZeroRegister;
+      } else {
+        return Reg31IsStackPointer;
+      }
+    }
+    if (IsLogicalImmediate()) {
+      // Of the logical (immediate) instructions, only ANDS (and its aliases)
+      // can set the flags. The others can all write into sp.
+      // Note that some logical operations are not available to
+      // immediate-operand instructions, so we have to combine two masks here.
+      if (Mask(LogicalImmediateMask & LogicalOpMask) == ANDS) {
+        return Reg31IsZeroRegister;
+      } else {
+        return Reg31IsStackPointer;
+      }
+    }
+    return Reg31IsZeroRegister;
+  }
+
+  // Indicate whether Rn can be the stack pointer or the zero register. This
+  // does not check that the instruction actually has an Rn field.
+  Reg31Mode RnMode() const {
+    // The following instructions use sp or wsp as Rn:
+    //  All loads and stores.
+    //  Add/sub (immediate).
+    //  Add/sub (extended).
+    // Otherwise, r31 is the zero register.
+    if (IsLoadOrStore() || IsAddSubImmediate() || IsAddSubExtended()) {
+      return Reg31IsStackPointer;
+    }
+    return Reg31IsZeroRegister;
+  }
+
+  ImmBranchType BranchType() const {
+    if (IsCondBranchImm()) {
+      return CondBranchType;
+    } else if (IsUncondBranchImm()) {
+      return UncondBranchType;
+    } else if (IsCompareBranch()) {
+      return CompareBranchType;
+    } else if (IsTestBranch()) {
+      return TestBranchType;
+    } else {
+      return UnknownBranchType;
+    }
+  }
+
+  // Find the target of this instruction. 'this' may be a branch or a
+  // PC-relative addressing instruction.
+  const Instruction* ImmPCOffsetTarget() const;
+
+  // Patch a PC-relative offset to refer to 'target'. 'this' may be a branch or
+  // a PC-relative addressing instruction.
+  void SetImmPCOffsetTarget(const Instruction* target);
+  // Patch a literal load instruction to load from 'source'.
+  void SetImmLLiteral(const Instruction* source);
+
+  // The range of a load literal instruction, expressed as 'instr +- range'.
+  // The range is actually the 'positive' range; the branch instruction can
+  // target [instr - range - kInstructionSize, instr + range].
+  static const int kLoadLiteralImmBitwidth = 19;
+  static const int kLoadLiteralRange =
+      (1 << kLoadLiteralImmBitwidth) / 2 - kInstructionSize;
+
+  // Calculate the address of a literal referred to by a load-literal
+  // instruction, and return it as the specified type.
+  //
+  // The literal itself is safely mutable only if the backing buffer is safely
+  // mutable.
+  template <typename T>
+  T LiteralAddress() const {
+    uint64_t base_raw = reinterpret_cast<uint64_t>(this);
+    int64_t offset = ImmLLiteral() << kLiteralEntrySizeLog2;
+    uint64_t address_raw = base_raw + offset;
+
+    // Cast the address using a C-style cast. A reinterpret_cast would be
+    // appropriate, but it can't cast one integral type to another.
+    T address = (T)(address_raw);
+
+    // Assert that the address can be represented by the specified type.
+    VIXL_ASSERT((uint64_t)(address) == address_raw);
+
+    return address;
+  }
+
+  uint32_t Literal32() const {
+    uint32_t literal;
+    memcpy(&literal, LiteralAddress<const void*>(), sizeof(literal));
+    return literal;
+  }
+
+  uint64_t Literal64() const {
+    uint64_t literal;
+    memcpy(&literal, LiteralAddress<const void*>(), sizeof(literal));
+    return literal;
+  }
+
+  float LiteralFP32() const {
+    return rawbits_to_float(Literal32());
+  }
+
+  double LiteralFP64() const {
+    return rawbits_to_double(Literal64());
+  }
+
+  const Instruction* NextInstruction() const {
+    return this + kInstructionSize;
+  }
+
+  const Instruction* InstructionAtOffset(int64_t offset) const {
+    VIXL_ASSERT(IsWordAligned(this + offset));
+    return this + offset;
+  }
+
+  template<typename T> static Instruction* Cast(T src) {
+    return reinterpret_cast<Instruction*>(src);
+  }
+
+  template<typename T> static const Instruction* CastConst(T src) {
+    return reinterpret_cast<const Instruction*>(src);
+  }
+
+ private:
+  int ImmBranch() const;
+
+  static float Imm8ToFP32(uint32_t imm8);
+  static double Imm8ToFP64(uint32_t imm8);
+
+  void SetPCRelImmTarget(const Instruction* target);
+  void SetBranchImmTarget(const Instruction* target);
+};
+
+
+// Functions for handling NEON vector format information.
+enum VectorFormat {
+  kFormatUndefined = 0xffffffff,
+  kFormat8B  = NEON_8B,
+  kFormat16B = NEON_16B,
+  kFormat4H  = NEON_4H,
+  kFormat8H  = NEON_8H,
+  kFormat2S  = NEON_2S,
+  kFormat4S  = NEON_4S,
+  kFormat1D  = NEON_1D,
+  kFormat2D  = NEON_2D,
+
+  // Scalar formats. We add the scalar bit to distinguish between scalar and
+  // vector enumerations; the bit is always set in the encoding of scalar ops
+  // and always clear for vector ops. Although kFormatD and kFormat1D appear
+  // to be the same, their meaning is subtly different. The first is a scalar
+  // operation, the second a vector operation that only affects one lane.
+  kFormatB = NEON_B | NEONScalar,
+  kFormatH = NEON_H | NEONScalar,
+  kFormatS = NEON_S | NEONScalar,
+  kFormatD = NEON_D | NEONScalar
+};
+
+VectorFormat VectorFormatHalfWidth(const VectorFormat vform);
+VectorFormat VectorFormatDoubleWidth(const VectorFormat vform);
+VectorFormat VectorFormatDoubleLanes(const VectorFormat vform);
+VectorFormat VectorFormatHalfLanes(const VectorFormat vform);
+VectorFormat ScalarFormatFromLaneSize(int lanesize);
+VectorFormat VectorFormatHalfWidthDoubleLanes(const VectorFormat vform);
+VectorFormat VectorFormatFillQ(const VectorFormat vform);
+unsigned RegisterSizeInBitsFromFormat(VectorFormat vform);
+unsigned RegisterSizeInBytesFromFormat(VectorFormat vform);
+// TODO: Make the return types of these functions consistent.
+unsigned LaneSizeInBitsFromFormat(VectorFormat vform);
+int LaneSizeInBytesFromFormat(VectorFormat vform);
+int LaneSizeInBytesLog2FromFormat(VectorFormat vform);
+int LaneCountFromFormat(VectorFormat vform);
+int MaxLaneCountFromFormat(VectorFormat vform);
+bool IsVectorFormat(VectorFormat vform);
+int64_t MaxIntFromFormat(VectorFormat vform);
+int64_t MinIntFromFormat(VectorFormat vform);
+uint64_t MaxUintFromFormat(VectorFormat vform);
+
+
+enum NEONFormat {
+  NF_UNDEF = 0,
+  NF_8B    = 1,
+  NF_16B   = 2,
+  NF_4H    = 3,
+  NF_8H    = 4,
+  NF_2S    = 5,
+  NF_4S    = 6,
+  NF_1D    = 7,
+  NF_2D    = 8,
+  NF_B     = 9,
+  NF_H     = 10,
+  NF_S     = 11,
+  NF_D     = 12
+};
+
+static const unsigned kNEONFormatMaxBits = 6;
+
+struct NEONFormatMap {
+  // The bit positions in the instruction to consider.
+  uint8_t bits[kNEONFormatMaxBits];
+
+  // Mapping from concatenated bits to format.
+  NEONFormat map[1 << kNEONFormatMaxBits];
+};
+
+class NEONFormatDecoder {
+ public:
+  enum SubstitutionMode {
+    kPlaceholder,
+    kFormat
+  };
+
+  // Construct a format decoder with increasingly specific format maps for each
+  // subsitution. If no format map is specified, the default is the integer
+  // format map.
+  explicit NEONFormatDecoder(const Instruction* instr) {
+    instrbits_ = instr->InstructionBits();
+    SetFormatMaps(IntegerFormatMap());
+  }
+  NEONFormatDecoder(const Instruction* instr,
+                    const NEONFormatMap* format) {
+    instrbits_ = instr->InstructionBits();
+    SetFormatMaps(format);
+  }
+  NEONFormatDecoder(const Instruction* instr,
+                    const NEONFormatMap* format0,
+                    const NEONFormatMap* format1) {
+    instrbits_ = instr->InstructionBits();
+    SetFormatMaps(format0, format1);
+  }
+  NEONFormatDecoder(const Instruction* instr,
+                    const NEONFormatMap* format0,
+                    const NEONFormatMap* format1,
+                    const NEONFormatMap* format2) {
+    instrbits_ = instr->InstructionBits();
+    SetFormatMaps(format0, format1, format2);
+  }
+
+  // Set the format mapping for all or individual substitutions.
+  void SetFormatMaps(const NEONFormatMap* format0,
+                     const NEONFormatMap* format1 = NULL,
+                     const NEONFormatMap* format2 = NULL) {
+    VIXL_ASSERT(format0 != NULL);
+    formats_[0] = format0;
+    formats_[1] = (format1 == NULL) ? formats_[0] : format1;
+    formats_[2] = (format2 == NULL) ? formats_[1] : format2;
+  }
+  void SetFormatMap(unsigned index, const NEONFormatMap* format) {
+    VIXL_ASSERT(index <= (sizeof(formats_) / sizeof(formats_[0])));
+    VIXL_ASSERT(format != NULL);
+    formats_[index] = format;
+  }
+
+  // Substitute %s in the input string with the placeholder string for each
+  // register, ie. "'B", "'H", etc.
+  const char* SubstitutePlaceholders(const char* string) {
+    return Substitute(string, kPlaceholder, kPlaceholder, kPlaceholder);
+  }
+
+  // Substitute %s in the input string with a new string based on the
+  // substitution mode.
+  const char* Substitute(const char* string,
+                         SubstitutionMode mode0 = kFormat,
+                         SubstitutionMode mode1 = kFormat,
+                         SubstitutionMode mode2 = kFormat) {
+    snprintf(form_buffer_, sizeof(form_buffer_), string,
+             GetSubstitute(0, mode0),
+             GetSubstitute(1, mode1),
+             GetSubstitute(2, mode2));
+    return form_buffer_;
+  }
+
+  // Append a "2" to a mnemonic string based of the state of the Q bit.
+  const char* Mnemonic(const char* mnemonic) {
+    if ((instrbits_ & NEON_Q) != 0) {
+      snprintf(mne_buffer_, sizeof(mne_buffer_), "%s2", mnemonic);
+      return mne_buffer_;
+    }
+    return mnemonic;
+  }
+
+  VectorFormat GetVectorFormat(int format_index = 0) {
+    return GetVectorFormat(formats_[format_index]);
+  }
+
+  VectorFormat GetVectorFormat(const NEONFormatMap* format_map) {
+    static const VectorFormat vform[] = {
+      kFormatUndefined,
+      kFormat8B, kFormat16B, kFormat4H, kFormat8H,
+      kFormat2S, kFormat4S, kFormat1D, kFormat2D,
+      kFormatB, kFormatH, kFormatS, kFormatD
+    };
+    VIXL_ASSERT(GetNEONFormat(format_map) < (sizeof(vform) / sizeof(vform[0])));
+    return vform[GetNEONFormat(format_map)];
+  }
+
+  // Built in mappings for common cases.
+
+  // The integer format map uses three bits (Q, size<1:0>) to encode the
+  // "standard" set of NEON integer vector formats.
+  static const NEONFormatMap* IntegerFormatMap() {
+    static const NEONFormatMap map = {
+      {23, 22, 30},
+      {NF_8B, NF_16B, NF_4H, NF_8H, NF_2S, NF_4S, NF_UNDEF, NF_2D}
+    };
+    return &map;
+  }
+
+  // The long integer format map uses two bits (size<1:0>) to encode the
+  // long set of NEON integer vector formats. These are used in narrow, wide
+  // and long operations.
+  static const NEONFormatMap* LongIntegerFormatMap() {
+    static const NEONFormatMap map = {
+      {23, 22}, {NF_8H, NF_4S, NF_2D}
+    };
+    return &map;
+  }
+
+  // The FP format map uses two bits (Q, size<0>) to encode the NEON FP vector
+  // formats: NF_2S, NF_4S, NF_2D.
+  static const NEONFormatMap* FPFormatMap() {
+    // The FP format map assumes two bits (Q, size<0>) are used to encode the
+    // NEON FP vector formats: NF_2S, NF_4S, NF_2D.
+    static const NEONFormatMap map = {
+      {22, 30}, {NF_2S, NF_4S, NF_UNDEF, NF_2D}
+    };
+    return &map;
+  }
+
+  // The load/store format map uses three bits (Q, 11, 10) to encode the
+  // set of NEON vector formats.
+  static const NEONFormatMap* LoadStoreFormatMap() {
+    static const NEONFormatMap map = {
+      {11, 10, 30},
+      {NF_8B, NF_16B, NF_4H, NF_8H, NF_2S, NF_4S, NF_1D, NF_2D}
+    };
+    return &map;
+  }
+
+  // The logical format map uses one bit (Q) to encode the NEON vector format:
+  // NF_8B, NF_16B.
+  static const NEONFormatMap* LogicalFormatMap() {
+    static const NEONFormatMap map = {
+      {30}, {NF_8B, NF_16B}
+    };
+    return &map;
+  }
+
+  // The triangular format map uses between two and five bits to encode the NEON
+  // vector format:
+  // xxx10->8B, xxx11->16B, xx100->4H, xx101->8H
+  // x1000->2S, x1001->4S,  10001->2D, all others undefined.
+  static const NEONFormatMap* TriangularFormatMap() {
+    static const NEONFormatMap map = {
+      {19, 18, 17, 16, 30},
+      {NF_UNDEF, NF_UNDEF, NF_8B, NF_16B, NF_4H, NF_8H, NF_8B, NF_16B, NF_2S,
+       NF_4S, NF_8B, NF_16B, NF_4H, NF_8H, NF_8B, NF_16B, NF_UNDEF, NF_2D,
+       NF_8B, NF_16B, NF_4H, NF_8H, NF_8B, NF_16B, NF_2S, NF_4S, NF_8B, NF_16B,
+       NF_4H, NF_8H, NF_8B, NF_16B}
+    };
+    return &map;
+  }
+
+  // The scalar format map uses two bits (size<1:0>) to encode the NEON scalar
+  // formats: NF_B, NF_H, NF_S, NF_D.
+  static const NEONFormatMap* ScalarFormatMap() {
+    static const NEONFormatMap map = {
+      {23, 22}, {NF_B, NF_H, NF_S, NF_D}
+    };
+    return &map;
+  }
+
+  // The long scalar format map uses two bits (size<1:0>) to encode the longer
+  // NEON scalar formats: NF_H, NF_S, NF_D.
+  static const NEONFormatMap* LongScalarFormatMap() {
+    static const NEONFormatMap map = {
+      {23, 22}, {NF_H, NF_S, NF_D}
+    };
+    return &map;
+  }
+
+  // The FP scalar format map assumes one bit (size<0>) is used to encode the
+  // NEON FP scalar formats: NF_S, NF_D.
+  static const NEONFormatMap* FPScalarFormatMap() {
+    static const NEONFormatMap map = {
+      {22}, {NF_S, NF_D}
+    };
+    return &map;
+  }
+
+  // The triangular scalar format map uses between one and four bits to encode
+  // the NEON FP scalar formats:
+  // xxx1->B, xx10->H, x100->S, 1000->D, all others undefined.
+  static const NEONFormatMap* TriangularScalarFormatMap() {
+    static const NEONFormatMap map = {
+      {19, 18, 17, 16},
+      {NF_UNDEF, NF_B, NF_H, NF_B, NF_S, NF_B, NF_H, NF_B,
+       NF_D,     NF_B, NF_H, NF_B, NF_S, NF_B, NF_H, NF_B}
+    };
+    return &map;
+  }
+
+ private:
+  // Get a pointer to a string that represents the format or placeholder for
+  // the specified substitution index, based on the format map and instruction.
+  const char* GetSubstitute(int index, SubstitutionMode mode) {
+    if (mode == kFormat) {
+      return NEONFormatAsString(GetNEONFormat(formats_[index]));
+    }
+    VIXL_ASSERT(mode == kPlaceholder);
+    return NEONFormatAsPlaceholder(GetNEONFormat(formats_[index]));
+  }
+
+  // Get the NEONFormat enumerated value for bits obtained from the
+  // instruction based on the specified format mapping.
+  NEONFormat GetNEONFormat(const NEONFormatMap* format_map) {
+    return format_map->map[PickBits(format_map->bits)];
+  }
+
+  // Convert a NEONFormat into a string.
+  static const char* NEONFormatAsString(NEONFormat format) {
+    static const char* formats[] = {
+      "undefined",
+      "8b", "16b", "4h", "8h", "2s", "4s", "1d", "2d",
+      "b", "h", "s", "d"
+    };
+    VIXL_ASSERT(format < (sizeof(formats) / sizeof(formats[0])));
+    return formats[format];
+  }
+
+  // Convert a NEONFormat into a register placeholder string.
+  static const char* NEONFormatAsPlaceholder(NEONFormat format) {
+    VIXL_ASSERT((format == NF_B) || (format == NF_H) ||
+                (format == NF_S) || (format == NF_D) ||
+                (format == NF_UNDEF));
+    static const char* formats[] = {
+      "undefined",
+      "undefined", "undefined", "undefined", "undefined",
+      "undefined", "undefined", "undefined", "undefined",
+      "'B", "'H", "'S", "'D"
+    };
+    return formats[format];
+  }
+
+  // Select bits from instrbits_ defined by the bits array, concatenate them,
+  // and return the value.
+  uint8_t PickBits(const uint8_t bits[]) {
+    uint8_t result = 0;
+    for (unsigned b = 0; b < kNEONFormatMaxBits; b++) {
+      if (bits[b] == 0) break;
+      result <<= 1;
+      result |= ((instrbits_ & (1 << bits[b])) == 0) ? 0 : 1;
+    }
+    return result;
+  }
+
+  Instr instrbits_;
+  const NEONFormatMap* formats_[3];
+  char form_buffer_[64];
+  char mne_buffer_[16];
+};
+}  // namespace vixl
+
+#endif  // VIXL_A64_INSTRUCTIONS_A64_H_
--- a/disas/libvixl/vixl/code-buffer.h
+++ b/disas/libvixl/vixl/code-buffer.h
@@ -28,7 +28,7 @@
 #define VIXL_CODE_BUFFER_H

 #include <string.h>
-#include "globals.h"
+#include "vixl/globals.h"

 namespace vixl {

--- a/disas/libvixl/vixl/compiler-intrinsics.cc
+++ b/disas/libvixl/vixl/compiler-intrinsics.cc
@@ -1,4 +1,4 @@
-// Copyright 2013, ARM Limited
+// Copyright 2015, ARM Limited
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
@@ -24,53 +24,13 @@
 // OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

-#include "utils.h"
-#include <stdio.h>
+#include "compiler-intrinsics.h"

 namespace vixl {

-uint32_t float_to_rawbits(float value) {
-  uint32_t bits = 0;
-  memcpy(&bits, &value, 4);
-  return bits;
-}

-
-uint64_t double_to_rawbits(double value) {
-  uint64_t bits = 0;
-  memcpy(&bits, &value, 8);
-  return bits;
-}
-
-
-float rawbits_to_float(uint32_t bits) {
-  float value = 0.0;
-  memcpy(&value, &bits, 4);
-  return value;
-}
-
-
-double rawbits_to_double(uint64_t bits) {
-  double value = 0.0;
-  memcpy(&value, &bits, 8);
-  return value;
-}
-
-
-int CountLeadingZeros(uint64_t value, int width) {
-  VIXL_ASSERT((width == 32) || (width == 64));
-  int count = 0;
-  uint64_t bit_test = UINT64_C(1) << (width - 1);
-  while ((count < width) && ((bit_test & value) == 0)) {
-    count++;
-    bit_test >>= 1;
-  }
-  return count;
-}
-
-
-int CountLeadingSignBits(int64_t value, int width) {
-  VIXL_ASSERT((width == 32) || (width == 64));
+int CountLeadingSignBitsFallBack(int64_t value, int width) {
+  VIXL_ASSERT(IsPowerOf2(width) && (width <= 64));
  if (value >= 0) {
    return CountLeadingZeros(value, width) - 1;
  } else {
@@ -79,23 +39,46 @@ int CountLeadingSignBits(int64_t value, int width) {
 }


-int CountTrailingZeros(uint64_t value, int width) {
-  VIXL_ASSERT((width == 32) || (width == 64));
-  int count = 0;
-  while ((count < width) && (((value >> count) & 1) == 0)) {
-    count++;
+int CountLeadingZerosFallBack(uint64_t value, int width) {
+  VIXL_ASSERT(IsPowerOf2(width) && (width <= 64));
+  if (value == 0) {
+    return width;
  }
+  int count = 0;
+  value = value << (64 - width);
+  if ((value & UINT64_C(0xffffffff00000000)) == 0) {
+    count += 32;
+    value = value << 32;
+  }
+  if ((value & UINT64_C(0xffff000000000000)) == 0) {
+    count += 16;
+    value = value << 16;
+  }
+  if ((value & UINT64_C(0xff00000000000000)) == 0) {
+    count += 8;
+    value = value << 8;
+  }
+  if ((value & UINT64_C(0xf000000000000000)) == 0) {
+    count += 4;
+    value = value << 4;
+  }
+  if ((value & UINT64_C(0xc000000000000000)) == 0) {
+    count += 2;
+    value = value << 2;
+  }
+  if ((value & UINT64_C(0x8000000000000000)) == 0) {
+    count += 1;
+  }
+  count += (value == 0);
  return count;
 }


-int CountSetBits(uint64_t value, int width) {
-  // TODO: Other widths could be added here, as the implementation already
-  // supports them.
-  VIXL_ASSERT((width == 32) || (width == 64));
+int CountSetBitsFallBack(uint64_t value, int width) {
+  VIXL_ASSERT(IsPowerOf2(width) && (width <= 64));

  // Mask out unused bits to ensure that they are not counted.
-  value &= (UINT64_C(0xffffffffffffffff) >> (64-width));
+  value &= (UINT64_C(0xffffffffffffffff) >> (64 - width));

  // Add up the set bits.
  // The algorithm works by adding pairs of bit fields together iteratively,
@@ -122,30 +105,40 @@ int CountSetBits(uint64_t value, int width) {
    value = ((value >> shift) & kMasks[i]) + (value & kMasks[i]);
  }

-  return value;
+  return static_cast<int>(value);
 }


-uint64_t LowestSetBit(uint64_t value) {
-  return value & -value;
-}
-
-
-bool IsPowerOf2(int64_t value) {
-  return (value != 0) && ((value & (value - 1)) == 0);
-}
-
-
-unsigned CountClearHalfWords(uint64_t imm, unsigned reg_size) {
-  VIXL_ASSERT((reg_size % 8) == 0);
+int CountTrailingZerosFallBack(uint64_t value, int width) {
+  VIXL_ASSERT(IsPowerOf2(width) && (width <= 64));
  int count = 0;
-  for (unsigned i = 0; i < (reg_size / 16); i++) {
-    if ((imm & 0xffff) == 0) {
-      count++;
-    }
-    imm >>= 16;
+  value = value << (64 - width);
+  if ((value & UINT64_C(0xffffffff)) == 0) {
+    count += 32;
+    value = value >> 32;
  }
-  return count;
+  if ((value & 0xffff) == 0) {
+    count += 16;
+    value = value >> 16;
+  }
+  if ((value & 0xff) == 0) {
+    count += 8;
+    value = value >> 8;
+  }
+  if ((value & 0xf) == 0) {
+    count += 4;
+    value = value >> 4;
+  }
+  if ((value & 0x3) == 0) {
+    count += 2;
+    value = value >> 2;
+  }
+  if ((value & 0x1) == 0) {
+    count += 1;
+  }
+  count += (value == 0);
+  return count - (64 - width);
 }

+
 }  // namespace vixl
--- a/disas/libvixl/vixl/compiler-intrinsics.h
+++ b/disas/libvixl/vixl/compiler-intrinsics.h
@@ -0,0 +1,155 @@
+// Copyright 2015, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+#ifndef VIXL_COMPILER_INTRINSICS_H
+#define VIXL_COMPILER_INTRINSICS_H
+
+#include "globals.h"
+
+namespace vixl {
+
+// Helper to check whether the version of GCC used is greater than the specified
+// requirement.
+#define MAJOR 1000000
+#define MINOR 1000
+#if defined(__GNUC__) && defined(__GNUC_MINOR__) && defined(__GNUC_PATCHLEVEL__)
+#define GCC_VERSION_OR_NEWER(major, minor, patchlevel)                         \
+    ((__GNUC__ * MAJOR + __GNUC_MINOR__ * MINOR + __GNUC_PATCHLEVEL__) >=      \
+     ((major) * MAJOR + (minor) * MINOR + (patchlevel)))
+#elif defined(__GNUC__) && defined(__GNUC_MINOR__)
+#define GCC_VERSION_OR_NEWER(major, minor, patchlevel)                         \
+    ((__GNUC__ * MAJOR + __GNUC_MINOR__ * MINOR) >=                            \
+     ((major) * MAJOR + (minor) * MINOR + (patchlevel)))
+#else
+#define GCC_VERSION_OR_NEWER(major, minor, patchlevel) 0
+#endif
+
+
+#if defined(__clang__) && !defined(VIXL_NO_COMPILER_BUILTINS)
+
+#define COMPILER_HAS_BUILTIN_CLRSB    (__has_builtin(__builtin_clrsb))
+#define COMPILER_HAS_BUILTIN_CLZ      (__has_builtin(__builtin_clz))
+#define COMPILER_HAS_BUILTIN_CTZ      (__has_builtin(__builtin_ctz))
+#define COMPILER_HAS_BUILTIN_FFS      (__has_builtin(__builtin_ffs))
+#define COMPILER_HAS_BUILTIN_POPCOUNT (__has_builtin(__builtin_popcount))
+
+#elif defined(__GNUC__) && !defined(VIXL_NO_COMPILER_BUILTINS)
+// The documentation for these builtins is available at:
+// https://gcc.gnu.org/onlinedocs/gcc-$MAJOR.$MINOR.$PATCHLEVEL/gcc//Other-Builtins.html
+
+# define COMPILER_HAS_BUILTIN_CLRSB    (GCC_VERSION_OR_NEWER(4, 7, 0))
+# define COMPILER_HAS_BUILTIN_CLZ      (GCC_VERSION_OR_NEWER(3, 4, 0))
+# define COMPILER_HAS_BUILTIN_CTZ      (GCC_VERSION_OR_NEWER(3, 4, 0))
+# define COMPILER_HAS_BUILTIN_FFS      (GCC_VERSION_OR_NEWER(3, 4, 0))
+# define COMPILER_HAS_BUILTIN_POPCOUNT (GCC_VERSION_OR_NEWER(3, 4, 0))
+
+#else
+// One can define VIXL_NO_COMPILER_BUILTINS to force using the manually
+// implemented C++ methods.
+
+#define COMPILER_HAS_BUILTIN_BSWAP    false
+#define COMPILER_HAS_BUILTIN_CLRSB    false
+#define COMPILER_HAS_BUILTIN_CLZ      false
+#define COMPILER_HAS_BUILTIN_CTZ      false
+#define COMPILER_HAS_BUILTIN_FFS      false
+#define COMPILER_HAS_BUILTIN_POPCOUNT false
+
+#endif
+
+
+template<typename V>
+inline bool IsPowerOf2(V value) {
+  return (value != 0) && ((value & (value - 1)) == 0);
+}
+
+
+// Declaration of fallback functions.
+int CountLeadingSignBitsFallBack(int64_t value, int width);
+int CountLeadingZerosFallBack(uint64_t value, int width);
+int CountSetBitsFallBack(uint64_t value, int width);
+int CountTrailingZerosFallBack(uint64_t value, int width);
+
+
+// Implementation of intrinsics functions.
+// TODO: The implementations could be improved for sizes different from 32bit
+// and 64bit: we could mask the values and call the appropriate builtin.
+
+template<typename V>
+inline int CountLeadingSignBits(V value, int width = (sizeof(V) * 8)) {
+#if COMPILER_HAS_BUILTIN_CLRSB
+  if (width == 32) {
+    return __builtin_clrsb(value);
+  } else if (width == 64) {
+    return __builtin_clrsbll(value);
+  }
+#endif
+  return CountLeadingSignBitsFallBack(value, width);
+}
+
+
+template<typename V>
+inline int CountLeadingZeros(V value, int width = (sizeof(V) * 8)) {
+#if COMPILER_HAS_BUILTIN_CLZ
+  if (width == 32) {
+    return (value == 0) ? 32 : __builtin_clz(static_cast<unsigned>(value));
+  } else if (width == 64) {
+    return (value == 0) ? 64 : __builtin_clzll(value);
+  }
+#endif
+  return CountLeadingZerosFallBack(value, width);
+}
+
+
+template<typename V>
+inline int CountSetBits(V value, int width = (sizeof(V) * 8)) {
+#if COMPILER_HAS_BUILTIN_POPCOUNT
+  if (width == 32) {
+    return __builtin_popcount(static_cast<unsigned>(value));
+  } else if (width == 64) {
+    return __builtin_popcountll(value);
+  }
+#endif
+  return CountSetBitsFallBack(value, width);
+}
+
+
+template<typename V>
+inline int CountTrailingZeros(V value, int width = (sizeof(V) * 8)) {
+#if COMPILER_HAS_BUILTIN_CTZ
+  if (width == 32) {
+    return (value == 0) ? 32 : __builtin_ctz(static_cast<unsigned>(value));
+  } else if (width == 64) {
+    return (value == 0) ? 64 : __builtin_ctzll(value);
+  }
+#endif
+  return CountTrailingZerosFallBack(value, width);
+}
+
+}  // namespace vixl
+
+#endif  // VIXL_COMPILER_INTRINSICS_H
+
--- a/disas/libvixl/vixl/globals.h
+++ b/disas/libvixl/vixl/globals.h
@@ -1,4 +1,4 @@
-// Copyright 2013, ARM Limited
+// Copyright 2015, ARM Limited
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
@@ -49,20 +49,26 @@
 #include <stdint.h>
 #include <stdlib.h>
 #include <stddef.h>
-#include "platform.h"
+#include "vixl/platform.h"


 typedef uint8_t byte;

+// Type for half-precision (16 bit) floating point numbers.
+typedef uint16_t float16;
+
 const int KBytes = 1024;
 const int MBytes = 1024 * KBytes;

-#define VIXL_ABORT() printf("in %s, line %i", __FILE__, __LINE__); abort()
+#define VIXL_ABORT() \
+    do { printf("in %s, line %i", __FILE__, __LINE__); abort(); } while (false)
 #ifdef VIXL_DEBUG
  #define VIXL_ASSERT(condition) assert(condition)
  #define VIXL_CHECK(condition) VIXL_ASSERT(condition)
-  #define VIXL_UNIMPLEMENTED() printf("UNIMPLEMENTED\t"); VIXL_ABORT()
-  #define VIXL_UNREACHABLE() printf("UNREACHABLE\t"); VIXL_ABORT()
+  #define VIXL_UNIMPLEMENTED() \
+    do { fprintf(stderr, "UNIMPLEMENTED\t"); VIXL_ABORT(); } while (false)
+  #define VIXL_UNREACHABLE() \
+    do { fprintf(stderr, "UNREACHABLE\t"); VIXL_ABORT(); } while (false)
 #else
  #define VIXL_ASSERT(condition) ((void) 0)
  #define VIXL_CHECK(condition) assert(condition)
@@ -76,10 +82,70 @@ const int MBytes = 1024 * KBytes;
 #define VIXL_STATIC_ASSERT_LINE(line, condition) \
  typedef char VIXL_CONCAT(STATIC_ASSERT_LINE_, line)[(condition) ? 1 : -1] \
  __attribute__((unused))
-#define VIXL_STATIC_ASSERT(condition) VIXL_STATIC_ASSERT_LINE(__LINE__, condition) //NOLINT
+#define VIXL_STATIC_ASSERT(condition) \
+    VIXL_STATIC_ASSERT_LINE(__LINE__, condition)

-template <typename T> inline void USE(T) {}
+template <typename T1>
+inline void USE(T1) {}

-#define VIXL_ALIGNMENT_EXCEPTION() printf("ALIGNMENT EXCEPTION\t"); VIXL_ABORT()
+template <typename T1, typename T2>
+inline void USE(T1, T2) {}
+
+template <typename T1, typename T2, typename T3>
+inline void USE(T1, T2, T3) {}
+
+template <typename T1, typename T2, typename T3, typename T4>
+inline void USE(T1, T2, T3, T4) {}
+
+#define VIXL_ALIGNMENT_EXCEPTION() \
+    do { fprintf(stderr, "ALIGNMENT EXCEPTION\t"); VIXL_ABORT(); } while (0)
+
+// The clang::fallthrough attribute is used along with the Wimplicit-fallthrough
+// argument to annotate intentional fall-through between switch labels.
+// For more information please refer to:
+// http://clang.llvm.org/docs/AttributeReference.html#fallthrough-clang-fallthrough
+#ifndef __has_warning
+  #define __has_warning(x)  0
+#endif
+
+// Note: This option is only available for Clang. And will only be enabled for
+// C++11(201103L).
+#if __has_warning("-Wimplicit-fallthrough") && __cplusplus >= 201103L
+  #define VIXL_FALLTHROUGH() [[clang::fallthrough]] //NOLINT
+#else
+  #define VIXL_FALLTHROUGH() do {} while (0)
+#endif
+
+#if __cplusplus >= 201103L
+  #define VIXL_NO_RETURN  [[noreturn]] //NOLINT
+#else
+  #define VIXL_NO_RETURN  __attribute__((noreturn))
+#endif
+
+// Some functions might only be marked as "noreturn" for the DEBUG build. This
+// macro should be used for such cases (for more details see what
+// VIXL_UNREACHABLE expands to).
+#ifdef VIXL_DEBUG
+  #define VIXL_DEBUG_NO_RETURN  VIXL_NO_RETURN
+#else
+  #define VIXL_DEBUG_NO_RETURN
+#endif
+
+#ifdef VIXL_INCLUDE_SIMULATOR
+#ifndef VIXL_GENERATE_SIMULATOR_INSTRUCTIONS_VALUE
+  #define VIXL_GENERATE_SIMULATOR_INSTRUCTIONS_VALUE  1
+#endif
+#else
+#ifndef VIXL_GENERATE_SIMULATOR_INSTRUCTIONS_VALUE
+  #define VIXL_GENERATE_SIMULATOR_INSTRUCTIONS_VALUE  0
+#endif
+#if VIXL_GENERATE_SIMULATOR_INSTRUCTIONS_VALUE
+  #warning "Generating Simulator instructions without Simulator support."
+#endif
+#endif
+
+#ifdef USE_SIMULATOR
+  #error "Please see the release notes for USE_SIMULATOR."
+#endif

 #endif  // VIXL_GLOBALS_H
--- a/disas/libvixl/vixl/invalset.h
+++ b/disas/libvixl/vixl/invalset.h
@@ -0,0 +1,775 @@
+// Copyright 2015, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_INVALSET_H_
+#define VIXL_INVALSET_H_
+
+#include <string.h>
+
+#include <algorithm>
+#include <vector>
+
+#include "vixl/globals.h"
+
+namespace vixl {
+
+// We define a custom data structure template and its iterator as `std`
+// containers do not fit the performance requirements for some of our use cases.
+//
+// The structure behaves like an iterable unordered set with special properties
+// and restrictions. "InvalSet" stands for "Invalidatable Set".
+//
+// Restrictions and requirements:
+// - Adding an element already present in the set is illegal. In debug mode,
+//   this is checked at insertion time.
+// - The templated class `ElementType` must provide comparison operators so that
+//   `std::sort()` can be used.
+// - A key must be available to represent invalid elements.
+// - Elements with an invalid key must compare higher or equal to any other
+//   element.
+//
+// Use cases and performance considerations:
+// Our use cases present two specificities that allow us to design this
+// structure to provide fast insertion *and* fast search and deletion
+// operations:
+// - Elements are (generally) inserted in order (sorted according to their key).
+// - A key is available to mark elements as invalid (deleted).
+// The backing `std::vector` allows for fast insertions. When
+// searching for an element we ensure the elements are sorted (this is generally
+// the case) and perform a binary search. When deleting an element we do not
+// free the associated memory immediately. Instead, an element to be deleted is
+// marked with the 'invalid' key. Other methods of the container take care of
+// ignoring entries marked as invalid.
+// To avoid the overhead of the `std::vector` container when only few entries
+// are used, a number of elements are preallocated.
+
+// 'ElementType' and 'KeyType' are respectively the types of the elements and
+// their key.  The structure only reclaims memory when safe to do so, if the
+// number of elements that can be reclaimed is greater than `RECLAIM_FROM` and
+// greater than `<total number of elements> / RECLAIM_FACTOR.
+#define TEMPLATE_INVALSET_P_DECL                                               \
+  class ElementType,                                                           \
+  unsigned N_PREALLOCATED_ELEMENTS,                                            \
+  class KeyType,                                                               \
+  KeyType INVALID_KEY,                                                         \
+  size_t RECLAIM_FROM,                                                         \
+  unsigned RECLAIM_FACTOR
+
+#define TEMPLATE_INVALSET_P_DEF                                                \
+ElementType, N_PREALLOCATED_ELEMENTS,                                          \
+KeyType, INVALID_KEY, RECLAIM_FROM, RECLAIM_FACTOR
+
+template<class S> class InvalSetIterator;  // Forward declaration.
+
+template<TEMPLATE_INVALSET_P_DECL> class InvalSet {
+ public:
+  InvalSet();
+  ~InvalSet();
+
+  static const size_t kNPreallocatedElements = N_PREALLOCATED_ELEMENTS;
+  static const KeyType kInvalidKey = INVALID_KEY;
+
+  // It is illegal to insert an element already present in the set.
+  void insert(const ElementType& element);
+
+  // Looks for the specified element in the set and - if found - deletes it.
+  void erase(const ElementType& element);
+
+  // This indicates the number of (valid) elements stored in this set.
+  size_t size() const;
+
+  // Returns true if no elements are stored in the set.
+  // Note that this does not mean the the backing storage is empty: it can still
+  // contain invalid elements.
+  bool empty() const;
+
+  void clear();
+
+  const ElementType min_element();
+
+  // This returns the key of the minimum element in the set.
+  KeyType min_element_key();
+
+  static bool IsValid(const ElementType& element);
+  static KeyType Key(const ElementType& element);
+  static void SetKey(ElementType* element, KeyType key);
+
+ protected:
+  // Returns a pointer to the element in vector_ if it was found, or NULL
+  // otherwise.
+  ElementType* Search(const ElementType& element);
+
+  // The argument *must* point to an element stored in *this* set.
+  // This function is not allowed to move elements in the backing vector
+  // storage.
+  void EraseInternal(ElementType* element);
+
+  // The elements in the range searched must be sorted.
+  ElementType* BinarySearch(const ElementType& element,
+                            ElementType* start,
+                            ElementType* end) const;
+
+  // Sort the elements.
+  enum SortType {
+    // The 'hard' version guarantees that invalid elements are moved to the end
+    // of the container.
+    kHardSort,
+    // The 'soft' version only guarantees that the elements will be sorted.
+    // Invalid elements may still be present anywhere in the set.
+    kSoftSort
+  };
+  void Sort(SortType sort_type);
+
+  // Delete the elements that have an invalid key. The complexity is linear
+  // with the size of the vector.
+  void Clean();
+
+  const ElementType Front() const;
+  const ElementType Back() const;
+
+  // Delete invalid trailing elements and return the last valid element in the
+  // set.
+  const ElementType CleanBack();
+
+  // Returns a pointer to the start or end of the backing storage.
+  const ElementType* StorageBegin() const;
+  const ElementType* StorageEnd() const;
+  ElementType* StorageBegin();
+  ElementType* StorageEnd();
+
+  // Returns the index of the element within the backing storage. The element
+  // must belong to the backing storage.
+  size_t ElementIndex(const ElementType* element) const;
+
+  // Returns the element at the specified index in the backing storage.
+  const ElementType* ElementAt(size_t index) const;
+  ElementType* ElementAt(size_t index);
+
+  static const ElementType* FirstValidElement(const ElementType* from,
+                                              const ElementType* end);
+
+  void CacheMinElement();
+  const ElementType CachedMinElement() const;
+
+  bool ShouldReclaimMemory() const;
+  void ReclaimMemory();
+
+  bool IsUsingVector() const { return vector_ != NULL; }
+  void set_sorted(bool sorted) { sorted_ = sorted; }
+
+  // We cache some data commonly required by users to improve performance.
+  // We cannot cache pointers to elements as we do not control the backing
+  // storage.
+  bool valid_cached_min_;
+  size_t cached_min_index_;  // Valid iff `valid_cached_min_` is true.
+  KeyType cached_min_key_;         // Valid iff `valid_cached_min_` is true.
+
+  // Indicates whether the elements are sorted.
+  bool sorted_;
+
+  // This represents the number of (valid) elements in this set.
+  size_t size_;
+
+  // The backing storage is either the array of preallocated elements or the
+  // vector. The structure starts by using the preallocated elements, and
+  // transitions (permanently) to using the vector once more than
+  // kNPreallocatedElements are used.
+  // Elements are only invalidated when using the vector. The preallocated
+  // storage always only contains valid elements.
+  ElementType preallocated_[kNPreallocatedElements];
+  std::vector<ElementType>* vector_;
+
+#ifdef VIXL_DEBUG
+  // Iterators acquire and release this monitor. While a set is acquired,
+  // certain operations are illegal to ensure that the iterator will
+  // correctly iterate over the elements in the set.
+  int monitor_;
+  int monitor() const { return monitor_; }
+  void Acquire() { monitor_++; }
+  void Release() {
+    monitor_--;
+    VIXL_ASSERT(monitor_ >= 0);
+  }
+#endif
+
+  friend class InvalSetIterator<InvalSet<TEMPLATE_INVALSET_P_DEF> >;
+  typedef ElementType _ElementType;
+  typedef KeyType _KeyType;
+};
+
+
+template<class S> class InvalSetIterator {
+ private:
+  // Redefine types to mirror the associated set types.
+  typedef typename S::_ElementType ElementType;
+  typedef typename S::_KeyType KeyType;
+
+ public:
+  explicit InvalSetIterator(S* inval_set);
+  ~InvalSetIterator();
+
+  ElementType* Current() const;
+  void Advance();
+  bool Done() const;
+
+  // Mark this iterator as 'done'.
+  void Finish();
+
+  // Delete the current element and advance the iterator to point to the next
+  // element.
+  void DeleteCurrentAndAdvance();
+
+  static bool IsValid(const ElementType& element);
+  static KeyType Key(const ElementType& element);
+
+ protected:
+  void MoveToValidElement();
+
+  // Indicates if the iterator is looking at the vector or at the preallocated
+  // elements.
+  const bool using_vector_;
+  // Used when looking at the preallocated elements, or in debug mode when using
+  // the vector to track how many times the iterator has advanced.
+  size_t index_;
+  typename std::vector<ElementType>::iterator iterator_;
+  S* inval_set_;
+};
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+InvalSet<TEMPLATE_INVALSET_P_DEF>::InvalSet()
+  : valid_cached_min_(false),
+    sorted_(true), size_(0), vector_(NULL) {
+#ifdef VIXL_DEBUG
+  monitor_ = 0;
+#endif
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+InvalSet<TEMPLATE_INVALSET_P_DEF>::~InvalSet() {
+  VIXL_ASSERT(monitor_ == 0);
+  delete vector_;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+void InvalSet<TEMPLATE_INVALSET_P_DEF>::insert(const ElementType& element) {
+  VIXL_ASSERT(monitor() == 0);
+  VIXL_ASSERT(IsValid(element));
+  VIXL_ASSERT(Search(element) == NULL);
+  set_sorted(empty() || (sorted_ && (element > CleanBack())));
+  if (IsUsingVector()) {
+    vector_->push_back(element);
+  } else {
+    if (size_ < kNPreallocatedElements) {
+      preallocated_[size_] = element;
+    } else {
+      // Transition to using the vector.
+      vector_ = new std::vector<ElementType>(preallocated_,
+                                             preallocated_ + size_);
+      vector_->push_back(element);
+    }
+  }
+  size_++;
+
+  if (valid_cached_min_ && (element < min_element())) {
+    cached_min_index_ = IsUsingVector() ? vector_->size() - 1 : size_ - 1;
+    cached_min_key_ = Key(element);
+    valid_cached_min_ = true;
+  }
+
+  if (ShouldReclaimMemory()) {
+    ReclaimMemory();
+  }
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+void InvalSet<TEMPLATE_INVALSET_P_DEF>::erase(const ElementType& element) {
+  VIXL_ASSERT(monitor() == 0);
+  VIXL_ASSERT(IsValid(element));
+  ElementType* local_element = Search(element);
+  if (local_element != NULL) {
+    EraseInternal(local_element);
+  }
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+ElementType* InvalSet<TEMPLATE_INVALSET_P_DEF>::Search(
+    const ElementType& element) {
+  VIXL_ASSERT(monitor() == 0);
+  if (empty()) {
+    return NULL;
+  }
+  if (ShouldReclaimMemory()) {
+    ReclaimMemory();
+  }
+  if (!sorted_) {
+    Sort(kHardSort);
+  }
+  if (!valid_cached_min_) {
+    CacheMinElement();
+  }
+  return BinarySearch(element, ElementAt(cached_min_index_), StorageEnd());
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+size_t InvalSet<TEMPLATE_INVALSET_P_DEF>::size() const {
+  return size_;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+bool InvalSet<TEMPLATE_INVALSET_P_DEF>::empty() const {
+  return size_ == 0;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+void InvalSet<TEMPLATE_INVALSET_P_DEF>::clear() {
+  VIXL_ASSERT(monitor() == 0);
+  size_ = 0;
+  if (IsUsingVector()) {
+    vector_->clear();
+  }
+  set_sorted(true);
+  valid_cached_min_ = false;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+const ElementType InvalSet<TEMPLATE_INVALSET_P_DEF>::min_element() {
+  VIXL_ASSERT(monitor() == 0);
+  VIXL_ASSERT(!empty());
+  CacheMinElement();
+  return *ElementAt(cached_min_index_);
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+KeyType InvalSet<TEMPLATE_INVALSET_P_DEF>::min_element_key() {
+  VIXL_ASSERT(monitor() == 0);
+  if (valid_cached_min_) {
+    return cached_min_key_;
+  } else {
+    return Key(min_element());
+  }
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+bool InvalSet<TEMPLATE_INVALSET_P_DEF>::IsValid(const ElementType& element) {
+  return Key(element) != kInvalidKey;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+void InvalSet<TEMPLATE_INVALSET_P_DEF>::EraseInternal(ElementType* element) {
+  // Note that this function must be safe even while an iterator has acquired
+  // this set.
+  VIXL_ASSERT(element != NULL);
+  size_t deleted_index = ElementIndex(element);
+  if (IsUsingVector()) {
+    VIXL_ASSERT((&(vector_->front()) <= element) &&
+                (element <= &(vector_->back())));
+    SetKey(element, kInvalidKey);
+  } else {
+    VIXL_ASSERT((preallocated_ <= element) &&
+                (element < (preallocated_ + kNPreallocatedElements)));
+    ElementType* end = preallocated_ + kNPreallocatedElements;
+    size_t copy_size = sizeof(*element) * (end - element - 1);
+    memmove(element, element + 1, copy_size);
+  }
+  size_--;
+
+  if (valid_cached_min_ &&
+      (deleted_index == cached_min_index_)) {
+    if (sorted_ && !empty()) {
+      const ElementType* min = FirstValidElement(element, StorageEnd());
+      cached_min_index_ = ElementIndex(min);
+      cached_min_key_ = Key(*min);
+      valid_cached_min_ = true;
+    } else {
+      valid_cached_min_ = false;
+    }
+  }
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+ElementType* InvalSet<TEMPLATE_INVALSET_P_DEF>::BinarySearch(
+    const ElementType& element, ElementType* start, ElementType* end) const {
+  if (start == end) {
+    return NULL;
+  }
+  VIXL_ASSERT(sorted_);
+  VIXL_ASSERT(start < end);
+  VIXL_ASSERT(!empty());
+
+  // Perform a binary search through the elements while ignoring invalid
+  // elements.
+  ElementType* elements = start;
+  size_t low = 0;
+  size_t high = (end - start) - 1;
+  while (low < high) {
+    // Find valid bounds.
+    while (!IsValid(elements[low]) && (low < high)) ++low;
+    while (!IsValid(elements[high]) && (low < high)) --high;
+    VIXL_ASSERT(low <= high);
+    // Avoid overflow when computing the middle index.
+    size_t middle = low / 2 + high / 2 + (low & high & 1);
+    if ((middle == low) || (middle == high)) {
+      break;
+    }
+    while (!IsValid(elements[middle]) && (middle < high - 1)) ++middle;
+    while (!IsValid(elements[middle]) && (low + 1 < middle)) --middle;
+    if (!IsValid(elements[middle])) {
+      break;
+    }
+    if (elements[middle] < element) {
+      low = middle;
+    } else {
+      high = middle;
+    }
+  }
+
+  if (elements[low] == element) return &elements[low];
+  if (elements[high] == element) return &elements[high];
+  return NULL;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+void InvalSet<TEMPLATE_INVALSET_P_DEF>::Sort(SortType sort_type) {
+  VIXL_ASSERT(monitor() == 0);
+  if (sort_type == kSoftSort) {
+    if (sorted_) {
+      return;
+    }
+  }
+  if (empty()) {
+    return;
+  }
+
+  Clean();
+  std::sort(StorageBegin(), StorageEnd());
+
+  set_sorted(true);
+  cached_min_index_ = 0;
+  cached_min_key_ = Key(Front());
+  valid_cached_min_ = true;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+void InvalSet<TEMPLATE_INVALSET_P_DEF>::Clean() {
+  VIXL_ASSERT(monitor() == 0);
+  if (empty() || !IsUsingVector()) {
+    return;
+  }
+  // Manually iterate through the vector storage to discard invalid elements.
+  ElementType* start = &(vector_->front());
+  ElementType* end = start + vector_->size();
+  ElementType* c = start;
+  ElementType* first_invalid;
+  ElementType* first_valid;
+  ElementType* next_invalid;
+
+  while (c < end && IsValid(*c)) { c++; }
+  first_invalid = c;
+
+  while (c < end) {
+    while (c < end && !IsValid(*c)) { c++; }
+    first_valid = c;
+    while (c < end && IsValid(*c)) { c++; }
+    next_invalid = c;
+
+    ptrdiff_t n_moved_elements = (next_invalid - first_valid);
+    memmove(first_invalid, first_valid,  n_moved_elements * sizeof(*c));
+    first_invalid = first_invalid + n_moved_elements;
+    c = next_invalid;
+  }
+
+  // Delete the trailing invalid elements.
+  vector_->erase(vector_->begin() + (first_invalid - start), vector_->end());
+  VIXL_ASSERT(vector_->size() == size_);
+
+  if (sorted_) {
+    valid_cached_min_ = true;
+    cached_min_index_ = 0;
+    cached_min_key_ = Key(*ElementAt(0));
+  } else {
+    valid_cached_min_ = false;
+  }
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+const ElementType InvalSet<TEMPLATE_INVALSET_P_DEF>::Front() const {
+  VIXL_ASSERT(!empty());
+  return IsUsingVector() ? vector_->front() : preallocated_[0];
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+const ElementType InvalSet<TEMPLATE_INVALSET_P_DEF>::Back() const {
+  VIXL_ASSERT(!empty());
+  return IsUsingVector() ? vector_->back() : preallocated_[size_ - 1];
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+const ElementType InvalSet<TEMPLATE_INVALSET_P_DEF>::CleanBack() {
+  VIXL_ASSERT(monitor() == 0);
+  if (IsUsingVector()) {
+    // Delete the invalid trailing elements.
+    typename std::vector<ElementType>::reverse_iterator it = vector_->rbegin();
+    while (!IsValid(*it)) {
+      it++;
+    }
+    vector_->erase(it.base(), vector_->end());
+  }
+  return Back();
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+const ElementType* InvalSet<TEMPLATE_INVALSET_P_DEF>::StorageBegin() const {
+  return IsUsingVector() ? &(vector_->front()) : preallocated_;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+const ElementType* InvalSet<TEMPLATE_INVALSET_P_DEF>::StorageEnd() const {
+  return IsUsingVector() ? &(vector_->back()) + 1 : preallocated_ + size_;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+ElementType* InvalSet<TEMPLATE_INVALSET_P_DEF>::StorageBegin() {
+  return IsUsingVector() ? &(vector_->front()) : preallocated_;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+ElementType* InvalSet<TEMPLATE_INVALSET_P_DEF>::StorageEnd() {
+  return IsUsingVector() ? &(vector_->back()) + 1 : preallocated_ + size_;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+size_t InvalSet<TEMPLATE_INVALSET_P_DEF>::ElementIndex(
+    const ElementType* element) const {
+  VIXL_ASSERT((StorageBegin() <= element) && (element < StorageEnd()));
+  return element - StorageBegin();
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+const ElementType* InvalSet<TEMPLATE_INVALSET_P_DEF>::ElementAt(
+    size_t index) const {
+  VIXL_ASSERT(
+      (IsUsingVector() && (index < vector_->size())) || (index < size_));
+  return StorageBegin() + index;
+}
+
+template<TEMPLATE_INVALSET_P_DECL>
+ElementType* InvalSet<TEMPLATE_INVALSET_P_DEF>::ElementAt(size_t index) {
+  VIXL_ASSERT(
+      (IsUsingVector() && (index < vector_->size())) || (index < size_));
+  return StorageBegin() + index;
+}
+
+template<TEMPLATE_INVALSET_P_DECL>
+const ElementType* InvalSet<TEMPLATE_INVALSET_P_DEF>::FirstValidElement(
+    const ElementType* from, const ElementType* end) {
+  while ((from < end) && !IsValid(*from)) {
+    from++;
+  }
+  return from;
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+void InvalSet<TEMPLATE_INVALSET_P_DEF>::CacheMinElement() {
+  VIXL_ASSERT(monitor() == 0);
+  VIXL_ASSERT(!empty());
+
+  if (valid_cached_min_) {
+    return;
+  }
+
+  if (sorted_) {
+    const ElementType* min = FirstValidElement(StorageBegin(), StorageEnd());
+    cached_min_index_ = ElementIndex(min);
+    cached_min_key_ = Key(*min);
+    valid_cached_min_ = true;
+  } else {
+    Sort(kHardSort);
+  }
+  VIXL_ASSERT(valid_cached_min_);
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+bool InvalSet<TEMPLATE_INVALSET_P_DEF>::ShouldReclaimMemory() const {
+  if (!IsUsingVector()) {
+    return false;
+  }
+  size_t n_invalid_elements = vector_->size() - size_;
+  return (n_invalid_elements > RECLAIM_FROM) &&
+         (n_invalid_elements > vector_->size() / RECLAIM_FACTOR);
+}
+
+
+template<TEMPLATE_INVALSET_P_DECL>
+void InvalSet<TEMPLATE_INVALSET_P_DEF>::ReclaimMemory() {
+  VIXL_ASSERT(monitor() == 0);
+  Clean();
+}
+
+
+template<class S>
+InvalSetIterator<S>::InvalSetIterator(S* inval_set)
+    : using_vector_((inval_set != NULL) && inval_set->IsUsingVector()),
+      index_(0),
+      inval_set_(inval_set) {
+  if (inval_set != NULL) {
+    inval_set->Sort(S::kSoftSort);
+#ifdef VIXL_DEBUG
+    inval_set->Acquire();
+#endif
+    if (using_vector_) {
+      iterator_ = typename std::vector<ElementType>::iterator(
+          inval_set_->vector_->begin());
+    }
+    MoveToValidElement();
+  }
+}
+
+
+template<class S>
+InvalSetIterator<S>::~InvalSetIterator() {
+#ifdef VIXL_DEBUG
+  if (inval_set_ != NULL) {
+    inval_set_->Release();
+  }
+#endif
+}
+
+
+template<class S>
+typename S::_ElementType* InvalSetIterator<S>::Current() const {
+  VIXL_ASSERT(!Done());
+  if (using_vector_) {
+    return &(*iterator_);
+  } else {
+    return &(inval_set_->preallocated_[index_]);
+  }
+}
+
+
+template<class S>
+void InvalSetIterator<S>::Advance() {
+  VIXL_ASSERT(!Done());
+  if (using_vector_) {
+    iterator_++;
+#ifdef VIXL_DEBUG
+    index_++;
+#endif
+    MoveToValidElement();
+  } else {
+    index_++;
+  }
+}
+
+
+template<class S>
+bool InvalSetIterator<S>::Done() const {
+  if (using_vector_) {
+    bool done = (iterator_ == inval_set_->vector_->end());
+    VIXL_ASSERT(done == (index_ == inval_set_->size()));
+    return done;
+  } else {
+    return index_ == inval_set_->size();
+  }
+}
+
+
+template<class S>
+void InvalSetIterator<S>::Finish() {
+  VIXL_ASSERT(inval_set_->sorted_);
+  if (using_vector_) {
+    iterator_ = inval_set_->vector_->end();
+  }
+  index_ = inval_set_->size();
+}
+
+
+template<class S>
+void InvalSetIterator<S>::DeleteCurrentAndAdvance() {
+  if (using_vector_) {
+    inval_set_->EraseInternal(&(*iterator_));
+    MoveToValidElement();
+  } else {
+    inval_set_->EraseInternal(inval_set_->preallocated_ + index_);
+  }
+}
+
+
+template<class S>
+bool InvalSetIterator<S>::IsValid(const ElementType& element) {
+  return S::IsValid(element);
+}
+
+
+template<class S>
+typename S::_KeyType InvalSetIterator<S>::Key(const ElementType& element) {
+  return S::Key(element);
+}
+
+
+template<class S>
+void InvalSetIterator<S>::MoveToValidElement() {
+  if (using_vector_) {
+    while ((iterator_ != inval_set_->vector_->end()) && !IsValid(*iterator_)) {
+      iterator_++;
+    }
+  } else {
+    VIXL_ASSERT(inval_set_->empty() || IsValid(inval_set_->preallocated_[0]));
+    // Nothing to do.
+  }
+}
+
+#undef TEMPLATE_INVALSET_P_DECL
+#undef TEMPLATE_INVALSET_P_DEF
+
+}  // namespace vixl
+
+#endif  // VIXL_INVALSET_H_
--- a/disas/libvixl/vixl/platform.h
+++ b/disas/libvixl/vixl/platform.h
@@ -1,4 +1,4 @@
-// Copyright 2013, ARM Limited
+// Copyright 2014, ARM Limited
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
--- a/disas/libvixl/vixl/utils.cc
+++ b/disas/libvixl/vixl/utils.cc
@@ -0,0 +1,142 @@
+// Copyright 2015, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "vixl/utils.h"
+#include <stdio.h>
+
+namespace vixl {
+
+uint32_t float_to_rawbits(float value) {
+  uint32_t bits = 0;
+  memcpy(&bits, &value, 4);
+  return bits;
+}
+
+
+uint64_t double_to_rawbits(double value) {
+  uint64_t bits = 0;
+  memcpy(&bits, &value, 8);
+  return bits;
+}
+
+
+float rawbits_to_float(uint32_t bits) {
+  float value = 0.0;
+  memcpy(&value, &bits, 4);
+  return value;
+}
+
+
+double rawbits_to_double(uint64_t bits) {
+  double value = 0.0;
+  memcpy(&value, &bits, 8);
+  return value;
+}
+
+
+uint32_t float_sign(float val) {
+  uint32_t rawbits = float_to_rawbits(val);
+  return unsigned_bitextract_32(31, 31, rawbits);
+}
+
+
+uint32_t float_exp(float val) {
+  uint32_t rawbits = float_to_rawbits(val);
+  return unsigned_bitextract_32(30, 23, rawbits);
+}
+
+
+uint32_t float_mantissa(float val) {
+  uint32_t rawbits = float_to_rawbits(val);
+  return unsigned_bitextract_32(22, 0, rawbits);
+}
+
+
+uint32_t double_sign(double val) {
+  uint64_t rawbits = double_to_rawbits(val);
+  return static_cast<uint32_t>(unsigned_bitextract_64(63, 63, rawbits));
+}
+
+
+uint32_t double_exp(double val) {
+  uint64_t rawbits = double_to_rawbits(val);
+  return static_cast<uint32_t>(unsigned_bitextract_64(62, 52, rawbits));
+}
+
+
+uint64_t double_mantissa(double val) {
+  uint64_t rawbits = double_to_rawbits(val);
+  return unsigned_bitextract_64(51, 0, rawbits);
+}
+
+
+float float_pack(uint32_t sign, uint32_t exp, uint32_t mantissa) {
+  uint32_t bits = (sign << 31) | (exp << 23) | mantissa;
+  return rawbits_to_float(bits);
+}
+
+
+double double_pack(uint64_t sign, uint64_t exp, uint64_t mantissa) {
+  uint64_t bits = (sign << 63) | (exp << 52) | mantissa;
+  return rawbits_to_double(bits);
+}
+
+
+int float16classify(float16 value) {
+  uint16_t exponent_max = (1 << 5) - 1;
+  uint16_t exponent_mask = exponent_max << 10;
+  uint16_t mantissa_mask = (1 << 10) - 1;
+
+  uint16_t exponent = (value & exponent_mask) >> 10;
+  uint16_t mantissa = value & mantissa_mask;
+  if (exponent == 0) {
+    if (mantissa == 0) {
+      return FP_ZERO;
+    }
+    return FP_SUBNORMAL;
+  } else if (exponent == exponent_max) {
+    if (mantissa == 0) {
+      return FP_INFINITE;
+    }
+    return FP_NAN;
+  }
+  return FP_NORMAL;
+}
+
+
+unsigned CountClearHalfWords(uint64_t imm, unsigned reg_size) {
+  VIXL_ASSERT((reg_size % 8) == 0);
+  int count = 0;
+  for (unsigned i = 0; i < (reg_size / 16); i++) {
+    if ((imm & 0xffff) == 0) {
+      count++;
+    }
+    imm >>= 16;
+  }
+  return count;
+}
+
+}  // namespace vixl
--- a/disas/libvixl/vixl/utils.h
+++ b/disas/libvixl/vixl/utils.h
@@ -1,4 +1,4 @@
-// Copyright 2013, ARM Limited
+// Copyright 2015, ARM Limited
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
@@ -27,16 +27,17 @@
 #ifndef VIXL_UTILS_H
 #define VIXL_UTILS_H

-#include <math.h>
 #include <string.h>
-#include "globals.h"
+#include <cmath>
+#include "vixl/globals.h"
+#include "vixl/compiler-intrinsics.h"

 namespace vixl {

 // Macros for compile-time format checking.
-#if defined(__GNUC__)
+#if GCC_VERSION_OR_NEWER(4, 4, 0)
 #define PRINTF_CHECK(format_index, varargs_index) \
-  __attribute__((format(printf, format_index, varargs_index)))
+  __attribute__((format(gnu_printf, format_index, varargs_index)))
 #else
 #define PRINTF_CHECK(format_index, varargs_index)
 #endif
@@ -53,9 +54,9 @@ inline bool is_uintn(unsigned n, int64_t x) {
  return !(x >> n);
 }

-inline unsigned truncate_to_intn(unsigned n, int64_t x) {
+inline uint32_t truncate_to_intn(unsigned n, int64_t x) {
  VIXL_ASSERT((0 < n) && (n < 64));
-  return (x & ((INT64_C(1) << n) - 1));
+  return static_cast<uint32_t>(x & ((INT64_C(1) << n) - 1));
 }

 #define INT_1_TO_63_LIST(V)                                                    \
@@ -73,7 +74,7 @@ inline bool is_int##N(int64_t x) { return is_intn(N, x); }
 #define DECLARE_IS_UINT_N(N)                                                   \
 inline bool is_uint##N(int64_t x) { return is_uintn(N, x); }
 #define DECLARE_TRUNCATE_TO_INT_N(N)                                           \
-inline int truncate_to_int##N(int x) { return truncate_to_intn(N, x); }
+inline uint32_t truncate_to_int##N(int x) { return truncate_to_intn(N, x); }
 INT_1_TO_63_LIST(DECLARE_IS_INT_N)
 INT_1_TO_63_LIST(DECLARE_IS_UINT_N)
 INT_1_TO_63_LIST(DECLARE_TRUNCATE_TO_INT_N)
@@ -104,12 +105,24 @@ uint64_t double_to_rawbits(double value);
 float rawbits_to_float(uint32_t bits);
 double rawbits_to_double(uint64_t bits);

+uint32_t float_sign(float val);
+uint32_t float_exp(float val);
+uint32_t float_mantissa(float val);
+uint32_t double_sign(double val);
+uint32_t double_exp(double val);
+uint64_t double_mantissa(double val);
+
+float float_pack(uint32_t sign, uint32_t exp, uint32_t mantissa);
+double double_pack(uint64_t sign, uint64_t exp, uint64_t mantissa);
+
+// An fpclassify() function for 16-bit half-precision floats.
+int float16classify(float16 value);

 // NaN tests.
 inline bool IsSignallingNaN(double num) {
  const uint64_t kFP64QuietNaNMask = UINT64_C(0x0008000000000000);
  uint64_t raw = double_to_rawbits(num);
-  if (isnan(num) && ((raw & kFP64QuietNaNMask) == 0)) {
+  if (std::isnan(num) && ((raw & kFP64QuietNaNMask) == 0)) {
    return true;
  }
  return false;
@@ -119,30 +132,37 @@ inline bool IsSignallingNaN(double num) {
 inline bool IsSignallingNaN(float num) {
  const uint32_t kFP32QuietNaNMask = 0x00400000;
  uint32_t raw = float_to_rawbits(num);
-  if (isnan(num) && ((raw & kFP32QuietNaNMask) == 0)) {
+  if (std::isnan(num) && ((raw & kFP32QuietNaNMask) == 0)) {
    return true;
  }
  return false;
 }


+inline bool IsSignallingNaN(float16 num) {
+  const uint16_t kFP16QuietNaNMask = 0x0200;
+  return (float16classify(num) == FP_NAN) &&
+         ((num & kFP16QuietNaNMask) == 0);
+}
+
+
 template <typename T>
 inline bool IsQuietNaN(T num) {
-  return isnan(num) && !IsSignallingNaN(num);
+  return std::isnan(num) && !IsSignallingNaN(num);
 }


 // Convert the NaN in 'num' to a quiet NaN.
 inline double ToQuietNaN(double num) {
  const uint64_t kFP64QuietNaNMask = UINT64_C(0x0008000000000000);
-  VIXL_ASSERT(isnan(num));
+  VIXL_ASSERT(std::isnan(num));
  return rawbits_to_double(double_to_rawbits(num) | kFP64QuietNaNMask);
 }


 inline float ToQuietNaN(float num) {
  const uint32_t kFP32QuietNaNMask = 0x00400000;
-  VIXL_ASSERT(isnan(num));
+  VIXL_ASSERT(std::isnan(num));
  return rawbits_to_float(float_to_rawbits(num) | kFP32QuietNaNMask);
 }

@@ -158,16 +178,71 @@ inline float FusedMultiplyAdd(float op1, float op2, float a) {
 }


-// Bit counting.
-int CountLeadingZeros(uint64_t value, int width);
-int CountLeadingSignBits(int64_t value, int width);
-int CountTrailingZeros(uint64_t value, int width);
-int CountSetBits(uint64_t value, int width);
-uint64_t LowestSetBit(uint64_t value);
-bool IsPowerOf2(int64_t value);
+inline uint64_t LowestSetBit(uint64_t value) {
+  return value & -value;
+}
+
+
+template<typename T>
+inline int HighestSetBitPosition(T value) {
+  VIXL_ASSERT(value != 0);
+  return (sizeof(value) * 8 - 1) - CountLeadingZeros(value);
+}
+
+
+template<typename V>
+inline int WhichPowerOf2(V value) {
+  VIXL_ASSERT(IsPowerOf2(value));
+  return CountTrailingZeros(value);
+}
+

 unsigned CountClearHalfWords(uint64_t imm, unsigned reg_size);

+
+template <typename T>
+T ReverseBits(T value) {
+  VIXL_ASSERT((sizeof(value) == 1) || (sizeof(value) == 2) ||
+              (sizeof(value) == 4) || (sizeof(value) == 8));
+  T result = 0;
+  for (unsigned i = 0; i < (sizeof(value) * 8); i++) {
+    result = (result << 1) | (value & 1);
+    value >>= 1;
+  }
+  return result;
+}
+
+
+template <typename T>
+T ReverseBytes(T value, int block_bytes_log2) {
+  VIXL_ASSERT((sizeof(value) == 4) || (sizeof(value) == 8));
+  VIXL_ASSERT((1U << block_bytes_log2) <= sizeof(value));
+  // Split the 64-bit value into an 8-bit array, where b[0] is the least
+  // significant byte, and b[7] is the most significant.
+  uint8_t bytes[8];
+  uint64_t mask = UINT64_C(0xff00000000000000);
+  for (int i = 7; i >= 0; i--) {
+    bytes[i] = (static_cast<uint64_t>(value) & mask) >> (i * 8);
+    mask >>= 8;
+  }
+
+  // Permutation tables for REV instructions.
+  //  permute_table[0] is used by REV16_x, REV16_w
+  //  permute_table[1] is used by REV32_x, REV_w
+  //  permute_table[2] is used by REV_x
+  VIXL_ASSERT((0 < block_bytes_log2) && (block_bytes_log2 < 4));
+  static const uint8_t permute_table[3][8] = { {6, 7, 4, 5, 2, 3, 0, 1},
+                                               {4, 5, 6, 7, 0, 1, 2, 3},
+                                               {0, 1, 2, 3, 4, 5, 6, 7} };
+  T result = 0;
+  for (int i = 0; i < 8; i++) {
+    result <<= 8;
+    result |= bytes[permute_table[block_bytes_log2 - 1][i]];
+  }
+  return result;
+}
+
+
 // Pointer alignment
 // TODO: rename/refactor to make it specific to instructions.
 template<typename T>
--- a/docs/bitmaps.md
+++ b/docs/bitmaps.md
@@ -19,12 +19,20 @@ which is included at the end of this document.
 * A dirty bitmap's name is unique to the node, but bitmaps attached to different
  nodes can share the same name.

+* Dirty bitmaps created for internal use by QEMU may be anonymous and have no
+  name, but any user-created bitmaps may not be. There can be any number of
+  anonymous bitmaps per node.
+
+* The name of a user-created bitmap must not be empty ("").
+
 ## Bitmap Modes

 * A Bitmap can be "frozen," which means that it is currently in-use by a backup
  operation and cannot be deleted, renamed, written to, reset,
  etc.

+* The normal operating mode for a bitmap is "active."
+
 ## Basic QMP Usage

 ### Supported Commands ###
@@ -97,11 +105,7 @@ which is included at the end of this document.
 }
 ```

-## Transactions (Not yet implemented)
-
-* Transactional commands are forthcoming in a future version,
-  and are not yet available for use. This section serves as
-  documentation of intent for their design and usage.
+## Transactions

 ### Justification

@@ -323,6 +327,155 @@ full backup as a backing image.
      "event": "BLOCK_JOB_COMPLETED" }
    ```

+### Partial Transactional Failures
+
+* Sometimes, a transaction will succeed in launching and return success,
+  but then later the backup jobs themselves may fail. It is possible that
+  a management application may have to deal with a partial backup failure
+  after a successful transaction.
+
+* If multiple backup jobs are specified in a single transaction, when one of
+  them fails, it will not interact with the other backup jobs in any way.
+
+* The job(s) that succeeded will clear the dirty bitmap associated with the
+  operation, but the job(s) that failed will not. It is not "safe" to delete
+  any incremental backups that were created successfully in this scenario,
+  even though others failed.
+
+#### Example
+
+* QMP example highlighting two backup jobs:
+
+    ```json
+    { "execute": "transaction",
+      "arguments": {
+        "actions": [
+          { "type": "drive-backup",
+            "data": { "device": "drive0", "bitmap": "bitmap0",
+                      "format": "qcow2", "mode": "existing",
+                      "sync": "incremental", "target": "d0-incr-1.qcow2" } },
+          { "type": "drive-backup",
+            "data": { "device": "drive1", "bitmap": "bitmap1",
+                      "format": "qcow2", "mode": "existing",
+                      "sync": "incremental", "target": "d1-incr-1.qcow2" } },
+        ]
+      }
+    }
+    ```
+
+* QMP example response, highlighting one success and one failure:
+    * Acknowledgement that the Transaction was accepted and jobs were launched:
+        ```json
+        { "return": {} }
+        ```
+
+    * Later, QEMU sends notice that the first job was completed:
+        ```json
+        { "timestamp": { "seconds": 1447192343, "microseconds": 615698 },
+          "data": { "device": "drive0", "type": "backup",
+                     "speed": 0, "len": 67108864, "offset": 67108864 },
+          "event": "BLOCK_JOB_COMPLETED"
+        }
+        ```
+
+    * Later yet, QEMU sends notice that the second job has failed:
+        ```json
+        { "timestamp": { "seconds": 1447192399, "microseconds": 683015 },
+          "data": { "device": "drive1", "action": "report",
+                    "operation": "read" },
+          "event": "BLOCK_JOB_ERROR" }
+        ```
+
+        ```json
+        { "timestamp": { "seconds": 1447192399, "microseconds": 685853 },
+          "data": { "speed": 0, "offset": 0, "len": 67108864,
+                    "error": "Input/output error",
+                    "device": "drive1", "type": "backup" },
+          "event": "BLOCK_JOB_COMPLETED" }
+
+* In the above example, "d0-incr-1.qcow2" is valid and must be kept,
+  but "d1-incr-1.qcow2" is invalid and should be deleted. If a VM-wide
+  incremental backup of all drives at a point-in-time is to be made,
+  new backups for both drives will need to be made, taking into account
+  that a new incremental backup for drive0 needs to be based on top of
+  "d0-incr-1.qcow2."
+
+### Grouped Completion Mode
+
+* While jobs launched by transactions normally complete or fail on their own,
+  it is possible to instruct them to complete or fail together as a group.
+
+* QMP transactions take an optional properties structure that can affect
+  the semantics of the transaction.
+
+* The "completion-mode" transaction property can be either "individual"
+  which is the default, legacy behavior described above, or "grouped,"
+  a new behavior detailed below.
+
+* Delayed Completion: In grouped completion mode, no jobs will report
+  success until all jobs are ready to report success.
+
+* Grouped failure: If any job fails in grouped completion mode, all remaining
+  jobs will be cancelled. Any incremental backups will restore their dirty
+  bitmap objects as if no backup command was ever issued.
+
+    * Regardless of if QEMU reports a particular incremental backup job as
+      CANCELLED or as an ERROR, the in-memory bitmap will be restored.
+
+#### Example
+
+* Here's the same example scenario from above with the new property:
+
+    ```json
+    { "execute": "transaction",
+      "arguments": {
+        "actions": [
+          { "type": "drive-backup",
+            "data": { "device": "drive0", "bitmap": "bitmap0",
+                      "format": "qcow2", "mode": "existing",
+                      "sync": "incremental", "target": "d0-incr-1.qcow2" } },
+          { "type": "drive-backup",
+            "data": { "device": "drive1", "bitmap": "bitmap1",
+                      "format": "qcow2", "mode": "existing",
+                      "sync": "incremental", "target": "d1-incr-1.qcow2" } },
+        ],
+        "properties": {
+          "completion-mode": "grouped"
+        }
+      }
+    }
+    ```
+
+* QMP example response, highlighting a failure for drive2:
+    * Acknowledgement that the Transaction was accepted and jobs were launched:
+        ```json
+        { "return": {} }
+        ```
+
+    * Later, QEMU sends notice that the second job has errored out,
+      but that the first job was also cancelled:
+        ```json
+        { "timestamp": { "seconds": 1447193702, "microseconds": 632377 },
+          "data": { "device": "drive1", "action": "report",
+                    "operation": "read" },
+          "event": "BLOCK_JOB_ERROR" }
+        ```
+
+        ```json
+        { "timestamp": { "seconds": 1447193702, "microseconds": 640074 },
+          "data": { "speed": 0, "offset": 0, "len": 67108864,
+                    "error": "Input/output error",
+                    "device": "drive1", "type": "backup" },
+          "event": "BLOCK_JOB_COMPLETED" }
+        ```
+
+        ```json
+        { "timestamp": { "seconds": 1447193702, "microseconds": 640163 },
+          "data": { "device": "drive0", "type": "backup", "speed": 0,
+                    "len": 67108864, "offset": 16777216 },
+          "event": "BLOCK_JOB_CANCELLED" }
+        ```
+
 <!--
 The FreeBSD Documentation License

--- a/docs/blkdebug.txt
+++ b/docs/blkdebug.txt
@@ -1,6 +1,6 @@
 Block I/O error injection using blkdebug
 ----------------------------------------
-Copyright (C) 2014 Red Hat Inc
+Copyright (C) 2014-2015 Red Hat Inc

 This work is licensed under the terms of the GNU GPL, version 2 or later.  See
 the COPYING file in the top-level directory.
@@ -92,8 +92,9 @@ The core events are:

  flush_to_disk - flush the host block device's disk cache

-See block/blkdebug.c:event_names[] for the full list of events.  You may need
-to grep block driver source code to understand the meaning of specific events.
+See qapi/block-core.json:BlkdebugEvent for the full list of events.
+You may need to grep block driver source code to understand the
+meaning of specific events.

 State transitions
 -----------------
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -291,3 +291,194 @@ save/send this state when we are in the middle of a pio operation
 (that is what ide_drive_pio_state_needed() checks).  If DRQ_STAT is
 not enabled, the values on that fields are garbage and don't need to
 be sent.
+
+= Return path =
+
+In most migration scenarios there is only a single data path that runs
+from the source VM to the destination, typically along a single fd (although
+possibly with another fd or similar for some fast way of throwing pages across).
+
+However, some uses need two way communication; in particular the Postcopy
+destination needs to be able to request pages on demand from the source.
+
+For these scenarios there is a 'return path' from the destination to the source;
+qemu_file_get_return_path(QEMUFile* fwdpath) gives the QEMUFile* for the return
+path.
+
+  Source side
+     Forward path - written by migration thread
+     Return path  - opened by main thread, read by return-path thread
+
+  Destination side
+     Forward path - read by main thread
+     Return path  - opened by main thread, written by main thread AND postcopy
+                    thread (protected by rp_mutex)
+
+= Postcopy =
+'Postcopy' migration is a way to deal with migrations that refuse to converge
+(or take too long to converge) its plus side is that there is an upper bound on
+the amount of migration traffic and time it takes, the down side is that during
+the postcopy phase, a failure of *either* side or the network connection causes
+the guest to be lost.
+
+In postcopy the destination CPUs are started before all the memory has been
+transferred, and accesses to pages that are yet to be transferred cause
+a fault that's translated by QEMU into a request to the source QEMU.
+
+Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
+doesn't finish in a given time the switch is made to postcopy.
+
+=== Enabling postcopy ===
+
+To enable postcopy, issue this command on the monitor prior to the
+start of migration:
+
+migrate_set_capability x-postcopy-ram on
+
+The normal commands are then used to start a migration, which is still
+started in precopy mode.  Issuing:
+
+migrate_start_postcopy
+
+will now cause the transition from precopy to postcopy.
+It can be issued immediately after migration is started or any
+time later on.  Issuing it after the end of a migration is harmless.
+
+Note: During the postcopy phase, the bandwidth limits set using
+migrate_set_speed is ignored (to avoid delaying requested pages that
+the destination is waiting for).
+
+=== Postcopy device transfer ===
+
+Loading of device data may cause the device emulation to access guest RAM
+that may trigger faults that have to be resolved by the source, as such
+the migration stream has to be able to respond with page data *during* the
+device load, and hence the device data has to be read from the stream completely
+before the device load begins to free the stream up.  This is achieved by
+'packaging' the device data into a blob that's read in one go.
+
+Source behaviour
+
+Until postcopy is entered the migration stream is identical to normal
+precopy, except for the addition of a 'postcopy advise' command at
+the beginning, to tell the destination that postcopy might happen.
+When postcopy starts the source sends the page discard data and then
+forms the 'package' containing:
+
+   Command: 'postcopy listen'
+   The device state
+      A series of sections, identical to the precopy streams device state stream
+      containing everything except postcopiable devices (i.e. RAM)
+   Command: 'postcopy run'
+
+The 'package' is sent as the data part of a Command: 'CMD_PACKAGED', and the
+contents are formatted in the same way as the main migration stream.
+
+During postcopy the source scans the list of dirty pages and sends them
+to the destination without being requested (in much the same way as precopy),
+however when a page request is received from the destination, the dirty page
+scanning restarts from the requested location.  This causes requested pages
+to be sent quickly, and also causes pages directly after the requested page
+to be sent quickly in the hope that those pages are likely to be used
+by the destination soon.
+
+Destination behaviour
+
+Initially the destination looks the same as precopy, with a single thread
+reading the migration stream; the 'postcopy advise' and 'discard' commands
+are processed to change the way RAM is managed, but don't affect the stream
+processing.
+
+------------------------------------------------------------------------------
+                        1      2   3     4 5                      6   7
+main -----DISCARD-CMD_PACKAGED ( LISTEN  DEVICE     DEVICE DEVICE RUN )
+thread                             |       |
+                                   |     (page request)
+                                   |        \___
+                                   v            \
+listen thread:                     --- page -- page -- page -- page -- page --
+
+                                   a   b        c
+------------------------------------------------------------------------------
+
+On receipt of CMD_PACKAGED (1)
+   All the data associated with the package - the ( ... ) section in the
+diagram - is read into memory (into a QEMUSizedBuffer), and the main thread
+recurses into qemu_loadvm_state_main to process the contents of the package (2)
+which contains commands (3,6) and devices (4...)
+
+On receipt of 'postcopy listen' - 3 -(i.e. the 1st command in the package)
+a new thread (a) is started that takes over servicing the migration stream,
+while the main thread carries on loading the package.   It loads normal
+background page data (b) but if during a device load a fault happens (5) the
+returned page (c) is loaded by the listen thread allowing the main threads
+device load to carry on.
+
+The last thing in the CMD_PACKAGED is a 'RUN' command (6) letting the destination
+CPUs start running.
+At the end of the CMD_PACKAGED (7) the main thread returns to normal running behaviour
+and is no longer used by migration, while the listen thread carries
+on servicing page data until the end of migration.
+
+=== Postcopy states ===
+
+Postcopy moves through a series of states (see postcopy_state) from
+ADVISE->DISCARD->LISTEN->RUNNING->END
+
+  Advise:  Set at the start of migration if postcopy is enabled, even
+           if it hasn't had the start command; here the destination
+           checks that its OS has the support needed for postcopy, and performs
+           setup to ensure the RAM mappings are suitable for later postcopy.
+           The destination will fail early in migration at this point if the
+           required OS support is not present.
+           (Triggered by reception of POSTCOPY_ADVISE command)
+
+  Discard: Entered on receipt of the first 'discard' command; prior to
+           the first Discard being performed, hugepages are switched off
+           (using madvise) to ensure that no new huge pages are created
+           during the postcopy phase, and to cause any huge pages that
+           have discards on them to be broken.
+
+  Listen:  The first command in the package, POSTCOPY_LISTEN, switches
+           the destination state to Listen, and starts a new thread
+           (the 'listen thread') which takes over the job of receiving
+           pages off the migration stream, while the main thread carries
+           on processing the blob.  With this thread able to process page
+           reception, the destination now 'sensitises' the RAM to detect
+           any access to missing pages (on Linux using the 'userfault'
+           system).
+
+  Running: POSTCOPY_RUN causes the destination to synchronise all
+           state and start the CPUs and IO devices running.  The main
+           thread now finishes processing the migration package and
+           now carries on as it would for normal precopy migration
+           (although it can't do the cleanup it would do as it
+           finishes a normal migration).
+
+  End:     The listen thread can now quit, and perform the cleanup of migration
+           state, the migration is now complete.
+
+=== Source side page maps ===
+
+The source side keeps two bitmaps during postcopy; 'the migration bitmap'
+and 'unsent map'.  The 'migration bitmap' is basically the same as in
+the precopy case, and holds a bit to indicate that page is 'dirty' -
+i.e. needs sending.  During the precopy phase this is updated as the CPU
+dirties pages, however during postcopy the CPUs are stopped and nothing
+should dirty anything any more.
+
+The 'unsent map' is used for the transition to postcopy. It is a bitmap that
+has a bit cleared whenever a page is sent to the destination, however during
+the transition to postcopy mode it is combined with the migration bitmap
+to form a set of pages that:
+   a) Have been sent but then redirtied (which must be discarded)
+   b) Have not yet been sent - which also must be discarded to cause any
+      transparent huge pages built during precopy to be broken.
+
+Note that the contents of the unsentmap are sacrificed during the calculation
+of the discard set and thus aren't valid once in postcopy.  The dirtymap
+is still valid and is used to ensure that no page is sent more than once.  Any
+request for a page that has already been sent is ignored.  Duplicate requests
+such as this can happen as a page is sent at about the same time the
+destination accesses it.
+
--- a/docs/pci_expander_bridge.txt
+++ b/docs/pci_expander_bridge.txt
@@ -23,9 +23,9 @@ A detailed command line would be:
 -m 2G
 -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 -numa node,nodeid=0,cpus=0,memdev=ram-node0
 -object memory-backend-ram,size=1024M,policy=bind,host-nodes=1,id=ram-node1 -numa node,nodeid=1,cpus=1,memdev=ram-node1
-device pxb,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 -netdev user,id=nd-device e1000,bus=bridge1,addr=0x4,netdev=nd
-device pxb,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8,bus=pci.0 -device e1000,bus=bridge2,addr=0x3
-device pxb,id=bridge3,bus=pci.0,bus_nr=40,bus=pci.0 -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1
+-device pxb,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 -netdev user,id=nd -device e1000,bus=bridge1,addr=0x4,netdev=nd
+-device pxb,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8, -device e1000,bus=bridge2,addr=0x3
+-device pxb,id=bridge3,bus=pci.0,bus_nr=40, -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1

 Here you have:
 - 2 NUMA nodes for the guest, 0 and 1. (both mapped to the same NUMA node in host, but you can and should put it in different host NUMA nodes)
--- a/docs/qapi-code-gen.txt
+++ b/docs/qapi-code-gen.txt
@@ -118,17 +118,17 @@ tracking optional fields.

 Any name (command, event, type, field, or enum value) beginning with
 "x-" is marked experimental, and may be withdrawn or changed
-incompatibly in a future release.  Downstream vendors may add
-extensions; such extensions should begin with a prefix matching
-"__RFQDN_" (for the reverse-fully-qualified-domain-name of the
-vendor), even if the rest of the name uses dash (example:
-__com.redhat_drive-mirror).  Other than downstream extensions (with
-leading underscore and the use of dots), all names should begin with a
-letter, and contain only ASCII letters, digits, dash, and underscore.
-Names beginning with 'q_' are reserved for the generator: QMP names
-that resemble C keywords or other problematic strings will be munged
-in C to use this prefix.  For example, a field named "default" in
-qapi becomes "q_default" in the generated C code.
+incompatibly in a future release.  All names must begin with a letter,
+and contain only ASCII letters, digits, dash, and underscore.  There
+are two exceptions: enum values may start with a digit, and any
+extensions added by downstream vendors should start with a prefix
+matching "__RFQDN_" (for the reverse-fully-qualified-domain-name of
+the vendor), even if the rest of the name uses dash (example:
+__com.redhat_drive-mirror).  Names beginning with 'q_' are reserved
+for the generator: QMP names that resemble C keywords or other
+problematic strings will be munged in C to use this prefix.  For
+example, a field named "default" in qapi becomes "q_default" in the
+generated C code.

 In the rest of this document, usage lines are given for each
 expression type, with literal strings written in lower case and
@@ -160,6 +160,7 @@ The following types are predefined, and map to C as follows:
                       accepts size suffixes
  bool      bool       JSON true or false
  any       QObject *  any JSON value
+  QType     QType      JSON string matching enum QType values


 === Includes ===
@@ -383,9 +384,6 @@ where each branch of the union names a QAPI type.  For example:
   'data': { 'definition': 'BlockdevOptions',
             'reference': 'str' } }

-Just like for a simple union, an implicit C enum 'NameKind' is created
-to enumerate the branches for the alternate 'Name'.
-
 Unlike a union, the discriminator string is never passed on the wire
 for the Client JSON Protocol.  Instead, the value's JSON type serves
 as an implicit discriminator, which in turn means that an alternate
@@ -514,8 +512,23 @@ exactly the server (QEMU) supports.
 For this purpose, QMP provides introspection via command
 query-qmp-schema.  QGA currently doesn't support introspection.

+While Client JSON Protocol wire compatibility should be maintained
+between qemu versions, we cannot make the same guarantees for
+introspection stability.  For example, one version of qemu may provide
+a non-variant optional member of a struct, and a later version rework
+the member to instead be non-optional and associated with a variant.
+Likewise, one version of qemu may list a member with open-ended type
+'str', and a later version could convert it to a finite set of strings
+via an enum type; or a member may be converted from a specific type to
+an alternate that represents a choice between the original type and
+something else.
+
 query-qmp-schema returns a JSON array of SchemaInfo objects.  These
 objects together describe the wire ABI, as defined in the QAPI schema.
+There is no specified order to the SchemaInfo objects returned; a
+client must search for a particular name throughout the entire array
+to learn more about that name, but is at least guaranteed that there
+will be no collisions between type, command, and event names.

 However, the SchemaInfo can't reflect all the rules and restrictions
 that apply to QMP.  It's interface introspection (figuring out what's
@@ -596,7 +609,9 @@ any.  Each element is a JSON object with members "name" (the member's
 name), "type" (the name of its type), and optionally "default".  The
 member is optional if "default" is present.  Currently, "default" can
 only have value null.  Other values are reserved for future
-extensions.
+extensions.  The "members" array is in no particular order; clients
+must search the entire object when learning whether a particular
+member is supported.

 Example: the SchemaInfo for MyType from section Struct types

@@ -610,7 +625,9 @@ Example: the SchemaInfo for MyType from section Struct types
 "variants" is a JSON array describing the object's variant members.
 Each element is a JSON object with members "case" (the value of type
 tag this element applies to) and "type" (the name of an object type
-that provides the variant members for this type tag value).
+that provides the variant members for this type tag value).  The
+"variants" array is in no particular order, and is not guaranteed to
+list cases in the same order as the corresponding "tag" enum type.

 Example: the SchemaInfo for flat union BlockdevOptions from section
 Union types
@@ -651,7 +668,8 @@ Union types
 The SchemaInfo for an alternate type has meta-type "alternate", and
 variant member "members".  "members" is a JSON array.  Each element is
 a JSON object with member "type", which names a type.  Values of the
-alternate type conform to exactly one of its member types.
+alternate type conform to exactly one of its member types.  There is
+no guarantee on the order in which "members" will be listed.

 Example: the SchemaInfo for BlockRef from section Alternate types

@@ -662,15 +680,20 @@ Example: the SchemaInfo for BlockRef from section Alternate types

 The SchemaInfo for an array type has meta-type "array", and variant
 member "element-type", which names the array's element type.  Array
-types are implicitly defined.
+types are implicitly defined.  For convenience, the array's name may
+resemble the element type; however, clients should examine member
+"element-type" instead of making assumptions based on parsing member
+"name".

 Example: the SchemaInfo for ['str']

-    { "name": "strList", "meta-type": "array",
+    { "name": "[str]", "meta-type": "array",
      "element-type": "str" }

 The SchemaInfo for an enumeration type has meta-type "enum" and
-variant member "values".
+variant member "values".  The values are listed in no particular
+order; clients must search the entire enum when learning whether a
+particular value is supported.

 Example: the SchemaInfo for MyEnum from section Enumeration types

@@ -1028,7 +1051,7 @@ Example:

    const char *const example_QAPIEvent_lookup[] = {
        [EXAMPLE_QAPI_EVENT_MY_EVENT] = "MY_EVENT",
-        [EXAMPLE_QAPI_EVENT_MAX] = NULL,
+        [EXAMPLE_QAPI_EVENT__MAX] = NULL,
    };
    $ cat qapi-generated/example-qapi-event.h
 [Uninteresting stuff omitted...]
@@ -1045,7 +1068,7 @@ Example:

    typedef enum example_QAPIEvent {
        EXAMPLE_QAPI_EVENT_MY_EVENT = 0,
-        EXAMPLE_QAPI_EVENT_MAX = 1,
+        EXAMPLE_QAPI_EVENT__MAX = 1,
    } example_QAPIEvent;

    extern const char *const example_QAPIEvent_lookup[];
--- a/docs/qmp-events.txt
+++ b/docs/qmp-events.txt
@@ -496,6 +496,20 @@ Example:
 {"timestamp": {"seconds": 1432121972, "microseconds": 744001},
 "event": "MIGRATION", "data": {"status": "completed"}}

+MIGRATION_PASS
+--------------
+
+Emitted from the source side of a migration at the start of each pass
+(when it syncs the dirty bitmap)
+
+Data: None.
+
+  - "pass": An incrementing count (starting at 1 on the first pass)
+
+Example:
+{"timestamp": {"seconds": 1449669631, "microseconds": 239225},
+ "event": "MIGRATION_PASS", "data": {"pass": 2}}
+
 STOP
 ----

--- a/docs/replay.txt
+++ b/docs/replay.txt
@@ -0,0 +1,168 @@
+Copyright (c) 2010-2015 Institute for System Programming
+                        of the Russian Academy of Sciences.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+Record/replay
+-------------
+
+Record/replay functions are used for the reverse execution and deterministic
+replay of qemu execution. This implementation of deterministic replay can
+be used for deterministic debugging of guest code through a gdb remote
+interface.
+
+Execution recording writes a non-deterministic events log, which can be later
+used for replaying the execution anywhere and for unlimited number of times.
+It also supports checkpointing for faster rewinding during reverse debugging.
+Execution replaying reads the log and replays all non-deterministic events
+including external input, hardware clocks, and interrupts.
+
+Deterministic replay has the following features:
+ * Deterministically replays whole system execution and all contents of
+   the memory, state of the hardware devices, clocks, and screen of the VM.
+ * Writes execution log into the file for later replaying for multiple times
+   on different machines.
+ * Supports i386, x86_64, and ARM hardware platforms.
+ * Performs deterministic replay of all operations with keyboard and mouse
+   input devices.
+
+Usage of the record/replay:
+ * First, record the execution, by adding the following arguments to the command line:
+   '-icount shift=7,rr=record,rrfile=replay.bin -net none'.
+   Block devices' images are not actually changed in the recording mode,
+   because all of the changes are written to the temporary overlay file.
+ * Then you can replay it by using another command
+   line option: '-icount shift=7,rr=replay,rrfile=replay.bin -net none'
+ * '-net none' option should also be specified if network replay patches
+   are not applied.
+
+Papers with description of deterministic replay implementation:
+http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html
+http://dl.acm.org/citation.cfm?id=2786805.2803179
+
+Modifications of qemu include:
+ * wrappers for clock and time functions to save their return values in the log
+ * saving different asynchronous events (e.g. system shutdown) into the log
+ * synchronization of the bottom halves execution
+ * synchronization of the threads from thread pool
+ * recording/replaying user input (mouse and keyboard)
+ * adding internal checkpoints for cpu and io synchronization
+
+Non-deterministic events
+------------------------
+
+Our record/replay system is based on saving and replaying non-deterministic
+events (e.g. keyboard input) and simulating deterministic ones (e.g. reading
+from HDD or memory of the VM). Saving only non-deterministic events makes
+log file smaller, simulation faster, and allows using reverse debugging even
+for realtime applications.
+
+The following non-deterministic data from peripheral devices is saved into
+the log: mouse and keyboard input, network packets, audio controller input,
+USB packets, serial port input, and hardware clocks (they are non-deterministic
+too, because their values are taken from the host machine). Inputs from
+simulated hardware, memory of VM, software interrupts, and execution of
+instructions are not saved into the log, because they are deterministic and
+can be replayed by simulating the behavior of virtual machine starting from
+initial state.
+
+We had to solve three tasks to implement deterministic replay: recording
+non-deterministic events, replaying non-deterministic events, and checking
+that there is no divergence between record and replay modes.
+
+We changed several parts of QEMU to make event log recording and replaying.
+Devices' models that have non-deterministic input from external devices were
+changed to write every external event into the execution log immediately.
+E.g. network packets are written into the log when they arrive into the virtual
+network adapter.
+
+All non-deterministic events are coming from these devices. But to
+replay them we need to know at which moments they occur. We specify
+these moments by counting the number of instructions executed between
+every pair of consecutive events.
+
+Instruction counting
+--------------------
+
+QEMU should work in icount mode to use record/replay feature. icount was
+designed to allow deterministic execution in absence of external inputs
+of the virtual machine. We also use icount to control the occurrence of the
+non-deterministic events. The number of instructions elapsed from the last event
+is written to the log while recording the execution. In replay mode we
+can predict when to inject that event using the instruction counter.
+
+Timers
+------
+
+Timers are used to execute callbacks from different subsystems of QEMU
+at the specified moments of time. There are several kinds of timers:
+ * Real time clock. Based on host time and used only for callbacks that
+   do not change the virtual machine state. For this reason real time
+   clock and timers does not affect deterministic replay at all.
+ * Virtual clock. These timers run only during the emulation. In icount
+   mode virtual clock value is calculated using executed instructions counter.
+   That is why it is completely deterministic and does not have to be recorded.
+ * Host clock. This clock is used by device models that simulate real time
+   sources (e.g. real time clock chip). Host clock is the one of the sources
+   of non-determinism. Host clock read operations should be logged to
+   make the execution deterministic.
+ * Real time clock for icount. This clock is similar to real time clock but
+   it is used only for increasing virtual clock while virtual machine is
+   sleeping. Due to its nature it is also non-deterministic as the host clock
+   and has to be logged too.
+
+Checkpoints
+-----------
+
+Replaying of the execution of virtual machine is bound by sources of
+non-determinism. These are inputs from clock and peripheral devices,
+and QEMU thread scheduling. Thread scheduling affect on processing events
+from timers, asynchronous input-output, and bottom halves.
+
+Invocations of timers are coupled with clock reads and changing the state
+of the virtual machine. Reads produce non-deterministic data taken from
+host clock. And VM state changes should preserve their order. Their relative
+order in replay mode must replicate the order of callbacks in record mode.
+To preserve this order we use checkpoints. When a specific clock is processed
+in record mode we save to the log special "checkpoint" event.
+Checkpoints here do not refer to virtual machine snapshots. They are just
+record/replay events used for synchronization.
+
+QEMU in replay mode will try to invoke timers processing in random moment
+of time. That's why we do not process a group of timers until the checkpoint
+event will be read from the log. Such an event allows synchronizing CPU
+execution and timer events.
+
+Another checkpoints application in record/replay is instruction counting
+while the virtual machine is idle. This function (qemu_clock_warp) is called
+from the wait loop. It changes virtual machine state and must be deterministic
+then. That is why we added checkpoint to this function to prevent its
+operation in replay mode when it does not correspond to record mode.
+
+Bottom halves
+-------------
+
+Disk I/O events are completely deterministic in our model, because
+in both record and replay modes we start virtual machine from the same
+disk state. But callbacks that virtual disk controller uses for reading and
+writing the disk may occur at different moments of time in record and replay
+modes.
+
+Reading and writing requests are created by CPU thread of QEMU. Later these
+requests proceed to block layer which creates "bottom halves". Bottom
+halves consist of callback and its parameters. They are processed when
+main loop locks the global mutex. These locks are not synchronized with
+replaying process because main loop also processes the events that do not
+affect the virtual machine state (like user interaction with monitor).
+
+That is why we had to implement saving and replaying bottom halves callbacks
+synchronously to the CPU execution. When the callback is about to execute
+it is added to the queue in the replay module. This queue is written to the
+log when its callbacks are executed. In replay mode callbacks are not processed
+until the corresponding event is read from the events log file.
+
+Sometimes the block layer uses asynchronous callbacks for its internal purposes
+(like reading or writing VM snapshots or disk image cluster tables). In this
+case bottom halves are not marked as "replayable" and do not saved
+into the log.
--- a/docs/specs/fw_cfg.txt
+++ b/docs/specs/fw_cfg.txt
@@ -192,90 +192,7 @@ To check the result, read the "control" field:
                            today due to implementation not being async,
                            but may in the future).

-= Host-side API =
-
-The following functions are available to the QEMU programmer for adding
-data to a fw_cfg device during guest initialization (see fw_cfg.h for
-each function's complete prototype):
-
-== fw_cfg_add_bytes() ==
-
-Given a selector key value, starting pointer, and size, create an item
-as a raw "blob" of the given size, available by selecting the given key.
-The data referenced by the starting pointer is only linked, NOT copied,
-into the data structure of the fw_cfg device.
-
-== fw_cfg_add_string() ==
-
-Instead of a starting pointer and size, this function accepts a pointer
-to a NUL-terminated ascii string, and inserts a newly allocated copy of
-the string (including the NUL terminator) into the fw_cfg device data
-structure.
-
-== fw_cfg_add_iXX() ==
-
-Insert an XX-bit item, where XX may be 16, 32, or 64. These functions
-will convert a 16-, 32-, or 64-bit integer to little-endian, then add
-a dynamically allocated copy of the appropriately sized item to fw_cfg
-under the given selector key value.
-
-== fw_cfg_modify_iXX() ==
-
-Modify the value of an XX-bit item (where XX may be 16, 32, or 64).
-Similarly to the corresponding fw_cfg_add_iXX() function set, convert
-a 16-, 32-, or 64-bit integer to little endian, create a dynamically
-allocated copy of the required size, and replace the existing item at
-the given selector key value with the newly allocated one. The previous
-item, assumed to have been allocated during an earlier call to
-fw_cfg_add_iXX() or fw_cfg_modify_iXX() (of the same width XX), is freed
-before the function returns.
-
-== fw_cfg_add_file() ==
-
-Given a filename (i.e., fw_cfg item name), starting pointer, and size,
-create an item as a raw "blob" of the given size. Unlike fw_cfg_add_bytes()
-above, the next available selector key (above 0x0020, FW_CFG_FILE_FIRST)
-will be used, and a new entry will be added to the file directory structure
-(at key 0x0019), containing the item name, blob size, and automatically
-assigned selector key value. The data referenced by the starting pointer
-is only linked, NOT copied, into the fw_cfg data structure.
-
-== fw_cfg_add_file_callback() ==
-
-Like fw_cfg_add_file(), but additionally sets pointers to a callback
-function (and opaque argument), which will be executed host-side by
-QEMU each time a byte is read by the guest from this particular item.
-
-NOTE: The callback function is given the opaque argument set by
-fw_cfg_add_file_callback(), but also the current data offset,
-allowing it the option of only acting upon specific offset values
-(e.g., 0, before the first data byte of the selected item is
-returned to the guest).
-
-== fw_cfg_modify_file() ==
-
-Given a filename (i.e., fw_cfg item name), starting pointer, and size,
-completely replace the configuration item referenced by the given item
-name with the new given blob. If an existing blob is found, its
-callback information is removed, and a pointer to the old data is
-returned to allow the caller to free it, helping avoid memory leaks.
-If a configuration item does not already exist under the given item
-name, a new item will be created as with fw_cfg_add_file(), and NULL
-is returned to the caller. In any case, the data referenced by the
-starting pointer is only linked, NOT copied, into the fw_cfg data
-structure.
-
-== fw_cfg_add_callback() ==
-
-Like fw_cfg_add_bytes(), but additionally sets pointers to a callback
-function (and opaque argument), which will be executed host-side by
-QEMU each time a guest-side write operation to this particular item
-completes fully overwriting the item's data.
-
-NOTE: This function is deprecated, and will be completely removed
-starting with QEMU v2.4.
-
-== Externally Provided Items ==
+= Externally Provided Items =

 As of v2.4, "file" fw_cfg items (i.e., items with selector keys above
 FW_CFG_FILE_FIRST, and with a corresponding entry in the fw_cfg file
--- a/docs/specs/parallels.txt
+++ b/docs/specs/parallels.txt
@@ -0,0 +1,228 @@
+= License =
+
+Copyright (c) 2015 Denis Lunev
+Copyright (c) 2015 Vladimir Sementsov-Ogievskiy
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+= Parallels Expandable Image File Format =
+
+A Parallels expandable image file consists of three consecutive parts:
+    * header
+    * BAT
+    * data area
+
+All numbers in a Parallels expandable image are stored in little-endian byte
+order.
+
+
+== Definitions ==
+
+    Sector    A 512-byte data chunk.
+
+    Cluster   A data chunk of the size specified in the image header.
+              Currently, the default size is 1MiB (2048 sectors). In previous
+              versions, cluster sizes of 63 sectors, 256 and 252 kilobytes were
+              used.
+
+    BAT       Block Allocation Table, an entity that contains information for
+              guest-to-host I/O data address translation.
+
+
+== Header ==
+
+The header is placed at the start of an image and contains the following
+fields:
+
+Bytes:
+   0 - 15:    magic
+              Must contain "WithoutFreeSpace" or "WithouFreSpacExt".
+
+  16 - 19:    version
+              Must be 2.
+
+  20 - 23:    heads
+              Disk geometry parameter for guest.
+
+  24 - 27:    cylinders
+              Disk geometry parameter for guest.
+
+  28 - 31:    tracks
+              Cluster size, in sectors.
+
+  32 - 35:    nb_bat_entries
+              Disk size, in clusters (BAT size).
+
+  36 - 43:    nb_sectors
+              Disk size, in sectors.
+
+              For "WithoutFreeSpace" images:
+              Only the lowest 4 bytes are used. The highest 4 bytes must be
+              cleared in this case.
+
+              For "WithouFreSpacExt" images, there are no such
+              restrictions.
+
+  44 - 47:    in_use
+              Set to 0x746F6E59 when the image is opened by software in R/W
+              mode; set to 0x312e3276 when the image is closed.
+
+              A zero in this field means that the image was opened by an old
+              version of the software that doesn't support Format Extension
+              (see below).
+
+              Other values are not allowed.
+
+  48 - 51:    data_off
+              An offset, in sectors, from the start of the file to the start of
+              the data area.
+
+              For "WithoutFreeSpace" images:
+              - If data_off is zero, the offset is calculated as the end of BAT
+                table plus some padding to ensure sector size alignment.
+              - If data_off is non-zero, the offset should be aligned to sector
+                size. However it is recommended to align it to cluster size for
+                newly created images.
+
+              For "WithouFreSpacExt" images:
+              data_off must be non-zero and aligned to cluster size.
+
+  52 - 55:    flags
+              Miscellaneous flags.
+
+              Bit 0: Empty Image bit. If set, the image should be
+                     considered clear.
+
+              Bits 2-31: Unused.
+
+  56 - 63:    ext_off
+              Format Extension offset, an offset, in sectors, from the start of
+              the file to the start of the Format Extension Cluster.
+
+              ext_off must meet the same requirements as cluster offsets
+              defined by BAT entries (see below).
+
+
+== BAT ==
+
+BAT is placed immediately after the image header. In the file, BAT is a
+contiguous array of 32-bit unsigned little-endian integers with
+(bat_entries * 4) bytes size.
+
+Each BAT entry contains an offset from the start of the file to the
+corresponding cluster. The offset set in clusters for "WithouFreSpacExt" images
+and in sectors for "WithoutFreeSpace" images.
+
+If a BAT entry is zero, the corresponding cluster is not allocated and should
+be considered as filled with zeroes.
+
+Cluster offsets specified by BAT entries must meet the following requirements:
+    - the value must not be lower than data offset (provided by header.data_off
+      or calculated as specified above),
+    - the value must be lower than the desired file size,
+    - the value must be unique among all BAT entries,
+    - the result of (cluster offset - data offset) must be aligned to cluster
+      size.
+
+
+== Data Area ==
+
+The data area is an area from the data offset (provided by header.data_off or
+calculated as specified above) to the end of the file. It represents a
+contiguous array of clusters. Most of them are allocated by the BAT, some may
+be allocated by the ext_off field in the header while other may be allocated by
+extensions. All clusters allocated by ext_off and extensions should meet the
+same requirements as clusters specified by BAT entries.
+
+
+== Format Extension ==
+
+The Format Extension is an area 1 cluster in size that provides additional
+format features. This cluster is addressed by the ext_off field in the header.
+The format of the Format Extension area is the following:
+
+   0 -  7:    magic
+              Must be 0xAB234CEF23DCEA87
+
+   8 - 23:    m_CheckSum
+              The MD5 checksum of the entire Header Extension cluster except
+              the first 24 bytes.
+
+    The above are followed by feature sections or "extensions". The last
+    extension must be "End of features" (see below).
+
+Each feature section has the following format:
+
+   0 -  7:    magic
+              The identifier of the feature:
+              0x0000000000000000 - End of features
+              0x20385FAE252CB34A - Dirty bitmap
+
+   8 - 15:    flags
+              External flags for extension:
+
+              Bit 0: NECESSARY
+                     If the software cannot load the extension (due to an
+                     unknown magic number or error), the file should not be
+                     changed. If this flag is unset and there is an error on
+                     loading the extension, said extension should be dropped.
+
+              Bit 1: TRANSIT
+                     If there is an unknown extension with this flag set,
+                     said extension should be left as is.
+
+              If neither NECESSARY nor TRANSIT are set, the extension should be
+              dropped.
+
+  16 - 19:    data_size
+              The size of the following feature data, in bytes.
+
+  20 - 23:    unused32
+              Align header to 8 bytes boundary.
+
+  variable:   data (data_size bytes)
+
+    The above is followed by padding to the next 8 bytes boundary, then the
+    next extension starts.
+
+    The last extension must be "End of features" with all the fields set to 0.
+
+
+=== Dirty bitmaps feature ===
+
+This feature provides a way of storing dirty bitmaps in the image. The fields
+of its data area are:
+
+   0 -  7:    size
+              The bitmap size, should be equal to disk size in sectors.
+
+   8 - 23:    id
+              An identifier for backup consistency checking.
+
+  24 - 27:    granularity
+              Bitmap granularity, in sectors. I.e., the number of sectors
+              corresponding to one bit of the bitmap. Granularity must be
+              a power of 2.
+
+  28 - 31:    l1_size
+              The number of entries in the L1 table of the bitmap.
+
+  variable:   l1 (64 * l1_size bytes)
+              L1 offset table (in bytes)
+
+A dirty bitmap is stored using a one-level structure for the mapping to host
+clusters - an L1 table.
+
+Given an offset in bytes into the bitmap data, the offset in bytes into the
+image file can be obtained as follows:
+
+    offset = l1_table[offset / cluster_size] + (offset % cluster_size)
+
+If an L1 table entry is 0, the corresponding cluster of the bitmap is assumed
+to be zero.
+
+If an L1 table entry is 1, the corresponding cluster of the bitmap is assumed
+to have all bits set.
+
+If an L1 table entry is not 0 or 1, it allocates a cluster from the data area.
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -87,6 +87,14 @@ Depending on the request type, payload can be:
   User address: a 64-bit user address
   mmap offset: 64-bit offset where region starts in the mapped memory

+* Log description
+   ---------------------------
+   | log size | log offset |
+   ---------------------------
+   log size: size of area used for logging
+   log offset: offset from start of supplied file descriptor
+       where logging starts (i.e. where guest address 0 would be logged)
+
 In QEMU the vhost-user message is implemented with the following struct:

 typedef struct VhostUserMsg {
@@ -98,6 +106,7 @@ typedef struct VhostUserMsg {
        struct vhost_vring_state state;
        struct vhost_vring_addr addr;
        VhostUserMemory memory;
+        VhostUserLog log;
    };
 } QEMU_PACKED VhostUserMsg;

@@ -137,6 +146,39 @@ As older slaves don't support negotiating protocol features,
 a feature bit was dedicated for this purpose:
 #define VHOST_USER_F_PROTOCOL_FEATURES 30

+Starting and stopping rings
+----------------------
+Client must only process each ring when it is started.
+
+Client must only pass data between the ring and the
+backend, when the ring is enabled.
+
+If ring is started but disabled, client must process the
+ring without talking to the backend.
+
+For example, for a networking device, in the disabled state
+client must not supply any new RX packets, but must process
+and discard any TX packets.
+
+If VHOST_USER_F_PROTOCOL_FEATURES has not been negotiated, the ring is initialized
+in an enabled state.
+
+If VHOST_USER_F_PROTOCOL_FEATURES has been negotiated, the ring is initialized
+in a disabled state. Client must not pass data to/from the backend until ring is enabled by
+VHOST_USER_SET_VRING_ENABLE with parameter 1, or after it has been disabled by
+VHOST_USER_SET_VRING_ENABLE with parameter 0.
+
+Each ring is initialized in a stopped state, client must not process it until
+ring is started, or after it has been stopped.
+
+Client must start ring upon receiving a kick (that is, detecting that file
+descriptor is readable) on the descriptor specified by
+VHOST_USER_SET_VRING_KICK, and stop ring upon receiving
+VHOST_USER_GET_VRING_BASE.
+
+While processing the rings (whether they are enabled or not), client must
+support changing some configuration aspects on the fly.
+
 Multiple queue support
 ----------------------

@@ -161,9 +203,13 @@ the slave makes to the memory mapped regions. The client should mark
 the dirty pages in a log. Once it complies to this logging, it may
 declare the VHOST_F_LOG_ALL vhost feature.

+To start/stop logging of data/used ring writes, server may send messages
+VHOST_USER_SET_FEATURES with VHOST_F_LOG_ALL and VHOST_USER_SET_VRING_ADDR with
+VHOST_VRING_F_LOG in ring's flags set to 1/0, respectively.
+
 All the modifications to memory pointed by vring "descriptor" should
 be marked. Modifications to "used" vring should be marked if
-VHOST_VRING_F_LOG is part of ring's features.
+VHOST_VRING_F_LOG is part of ring's flags.

 Dirty pages are of size:
 #define VHOST_LOG_PAGE 0x1000
@@ -172,22 +218,35 @@ The log memory fd is provided in the ancillary data of
 VHOST_USER_SET_LOG_BASE message when the slave has
 VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol feature.

-The size of the log may be computed by using all the known guest
-addresses. The log covers from address 0 to the maximum of guest
+The size of the log is supplied as part of VhostUserMsg
+which should be large enough to cover all known guest
+addresses. Log starts at the supplied offset in the
+supplied file descriptor.
+The log covers from address 0 to the maximum of guest
 regions. In pseudo-code, to mark page at "addr" as dirty:

 page = addr / VHOST_LOG_PAGE
 log[page / 8] |= 1 << page % 8

+Where addr is the guest physical address.
+
 Use atomic operations, as the log may be concurrently manipulated.

+Note that when logging modifications to the used ring (when VHOST_VRING_F_LOG
+is set for this ring), log_guest_addr should be used to calculate the log
+offset: the write to first byte of the used ring is logged at this offset from
+log start. Also note that this value might be outside the legal guest physical
+address range (i.e. does not have to be covered by the VhostUserMemory table),
+but the bit offset of the last byte of the ring must fall within
+the size supplied by VhostUserLog.
+
 VHOST_USER_SET_LOG_FD is an optional message with an eventfd in
 ancillary data, it may be used to inform the master that the log has
 been modified.

-Once the source has finished migration, VHOST_USER_RESET_OWNER message
-will be sent by the source. No further update must be done before the
-destination takes over with new regions & rings.
+Once the source has finished migration, rings will be stopped by
+the source. No further update must be done before rings are
+restarted.

 Protocol features
 -----------------
@@ -255,14 +314,16 @@ Message types
      as an owner of the session. This can be used on the Slave as a
      "session start" flag.

- * VHOST_USER_RESET_DEVICE
+ * VHOST_USER_RESET_OWNER

      Id: 4
-      Equivalent ioctl: VHOST_RESET_DEVICE
      Master payload: N/A

-      Issued when a new connection is about to be closed. The Master will no
-      longer own this connection (and will usually close it).
+      This is no longer used. Used to be sent to request disabling
+      all rings, but some clients interpreted it to also discard
+      connection state (this interpretation would lead to bugs).
+      It is recommended that clients either ignore this message,
+      or use it to disable all rings.

 * VHOST_USER_SET_MEM_TABLE

@@ -282,7 +343,12 @@ Message types
      Master payload: u64
      Slave payload: N/A

-      Sets the logging base address.
+      Sets logging shared memory space.
+      When slave has VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol
+      feature, the log memory fd is provided in the ancillary data of
+      VHOST_USER_SET_LOG_BASE message, the size and offset of shared
+      memory area provided in the message.
+

 * VHOST_USER_SET_LOG_FD

@@ -382,6 +448,8 @@ Message types
      Master payload: vring state description

      Signal slave to enable or disable corresponding vring.
+      This request should be sent only when VHOST_USER_F_PROTOCOL_FEATURES
+      has been negotiated.

 * VHOST_USER_SEND_RARP

--- a/Show More
+++ b/Show More
@@ -1 +1 @@
 .4.50
 .5.50