Compare commits


30 Commits

Author SHA1 Message Date
Michael Roth
04024dea26 update VERSION for v1.3.1
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-28 10:38:28 -06:00
Markus Armbruster
1bd4397e8d qxl: Fix SPICE_RING_PROD_ITEM(), SPICE_RING_CONS_ITEM() sanity check
The pointer arithmetic there is safe, but ugly.  Coverity grouses
about it.  However, the actual comparison is off by one: <= end
instead of < end.  Fix by rewriting the check in a cleaner way.
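
A minimal sketch of the corrected shape (illustrative names, not the
actual SPICE_RING_*_ITEM macros): compare the index against the element
count instead of doing pointer arithmetic against a one-past-the-end
address.

    #include <assert.h>
    #include <stdint.h>

    /* Illustrative only -- not the real qxl ring code. */
    static void *ring_item(void *items, uint32_t item_size,
                           uint32_t idx, uint32_t num_items)
    {
        /* the off-by-one variant accepted idx == num_items,
         * i.e. "<= end" instead of "< end" */
        assert(idx < num_items);
        return (uint8_t *)items + (uint64_t)idx * item_size;
    }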

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit bc5f92e5db)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-21 14:44:31 -06:00
Sander Eikelenboom
e76672424d Fix compile errors when enabling Xen debug logging.
Signed-off-by: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
(cherry picked from commit f1b8caf1d9)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-21 14:08:52 -06:00
Stefano Stabellini
df50a7e0cb xen: fix trivial PCI passthrough MSI-X bug
We are currently passing entry->data as the address parameter. Pass
entry->addr instead.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Xen-devel: http://marc.info/?l=xen-devel&m=135515462613715
(cherry picked from commit 044b99c655)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-21 14:07:21 -06:00
Roger Pau Monne
90c96d33c4 xen_disk: fix memory leak
On ioreq_release the full ioreq was memset to 0, losing all the data
and memory allocations inside the QEMUIOVector, which leads to a
memory leak. Create a new function to specifically reset the ioreq.
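
A hedged sketch of the pattern (stand-in types and field names, not the
real hw/xen_disk.c structures): reset the plain state field by field so
the QEMUIOVector's allocations survive for reuse instead of being wiped
by a whole-struct memset.

    #include <stddef.h>

    typedef struct { void *iov; int niov; size_t nalloc; } QEMUIOVectorModel;

    struct ioreq_model {
        int req;                /* plain request state */
        int status;
        QEMUIOVectorModel v;    /* owns heap allocations */
    };

    /* memset(ioreq, 0, sizeof(*ioreq)) also zeroed v.iov, leaking
     * whatever it pointed to; clearing field by field keeps v intact. */
    static void ioreq_reset(struct ioreq_model *ioreq)
    {
        ioreq->req = 0;
        ioreq->status = 0;
        /* ioreq->v deliberately left untouched */
    }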

Reported-by: Maik Wessler <maik.wessler@yahoo.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
(cherry picked from commit 282c6a2f29)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-21 14:06:50 -06:00
Peter Maydell
4ee28799d4 tcg/target-arm: Add missing parens to assertions
Silence a (legitimate) complaint about missing parentheses:

tcg/arm/tcg-target.c: In function ‘tcg_out_qemu_ld’:
tcg/arm/tcg-target.c:1148:5: error: suggest parentheses around
comparison in operand of ‘&’ [-Werror=parentheses]
tcg/arm/tcg-target.c: In function ‘tcg_out_qemu_st’:
tcg/arm/tcg-target.c:1357:5: error: suggest parentheses around
comparison in operand of ‘&’ [-Werror=parentheses]

which meant that we would mistakenly always assert if running
a QEMU built with debug enabled on ARM.
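
A self-contained illustration of the precedence trap: in C, == binds
more tightly than &, so the unparenthesized form masks with the result
of the comparison and the assertion can never hold.

    #include <assert.h>

    int main(void)
    {
        int cond = 0x1e;

        /* without parentheses this parses as cond & (0x0f == 0x0e),
         * i.e. cond & 0, so the assertion always fails:
         *     assert(cond & 0x0f == 0x0e);
         */
        assert((cond & 0x0f) == 0x0e);   /* the intended test */
        return 0;
    }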

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 5256a7208a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-21 13:52:23 -06:00
Kevin Wolf
563068a8b2 win32-aio: Fix memory leak
The buffer is allocated for both reads and writes, and obviously it
should be freed even if an error occurs.
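
A sketch of the intended control flow (names hypothetical; plain free()
keeps the example self-contained, where this branch used g_free() and
upstream uses qemu_vfree()):

    #include <stdio.h>
    #include <stdlib.h>

    static void aio_complete(void *bounce_buffer, int ret)
    {
        if (ret < 0) {
            fprintf(stderr, "aio request failed: %d\n", ret);
            /* the buggy version returned here, leaking bounce_buffer */
        }
        free(bounce_buffer);   /* release on every path, error or not */
    }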

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e8bccad5ac)

Conflicts:

	block/win32-aio.c

*addressed conflict due to buggy g_free() still in use instead of
qemu_vfree() as it is upstream (via commit 7479acdb)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-21 13:43:10 -06:00
Kevin Wolf
cdb483457c win32-aio: Fix vectored reads
Copying data in the right direction really helps a lot!

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit bcbbd234d4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-21 13:24:00 -06:00
Kevin Wolf
9d173df553 aio: Fix return value of aio_poll()
aio_poll() must return true if any work is still pending, even if it
didn't make progress, so that bdrv_drain_all() doesn't stop waiting too
early. The possibility of stopping early occasionally led to a failed
assertion in bdrv_drain_all(), when some in-flight request was missed
and the function didn't really drain all requests.

In order to make that change, the return value as specified in the
function comment must change for blocking = false; fortunately, the
return value of blocking = false callers is only used in test cases, so
this change shouldn't cause any trouble.
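
A minimal model of the contract (not the real AioContext API): the
drain loop below terminates correctly only if aio_poll() keeps
returning true while requests remain in flight, whether or not the
current iteration made progress.

    #include <stdbool.h>

    static int in_flight = 3;   /* pretend-requests */

    static bool aio_poll_model(bool blocking)
    {
        (void)blocking;
        if (in_flight == 0) {
            return false;       /* nothing pending: draining may stop */
        }
        in_flight--;            /* pretend one request completes */
        return true;            /* work was pending: caller must loop */
    }

    static void drain_all(void)
    {
        while (aio_poll_model(true)) {
            /* keep waiting until nothing is in flight */
        }
    }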

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2ea9b58f0b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-21 13:23:52 -06:00
Paolo Bonzini
204dd38c2d raw-posix: fix bdrv_aio_ioctl
When the raw-posix aio=thread code was moved from posix-aio-compat.c
to block/raw-posix.c, there was an unintended change to the ioctl code.
The code used to return the ioctl command, which posix_aio_read()
would later morph into a zero.  This hack is not necessary anymore,
and in fact breaks scsi-generic (which expects a zero return code).
Remove it.
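
A hedged sketch of the worker's return convention (names hypothetical,
errno conversion elided): scsi-generic expects 0 on success, so the
worker must not echo the ioctl command number back as its result.

    #include <sys/ioctl.h>

    static int handle_aiocb_ioctl(int fd, unsigned long cmd, void *buf)
    {
        if (ioctl(fd, cmd, buf) == -1) {
            return -1;   /* the real code converts errno; elided here */
        }
        return 0;        /* the regressed path effectively returned 'cmd' */
    }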

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b608c8dc02)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 23:05:45 -06:00
Alex Williamson
86bab45948 vfio-pci: Loosen sanity checks to allow future features
VFIO_PCI_NUM_REGIONS and VFIO_PCI_NUM_IRQS should never have been
used in this manner, as it locks QEMU to a specific kernel implementation.
Future features may introduce new regions or interrupt entries
(VGA may add legacy ranges, AER might add an IRQ for error
signalling).  Fix this before it gets us into trouble.
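
A sketch of the loosened check against the vfio_device_info probe
(surrounding setup elided): require at least the regions and IRQs this
QEMU knows about, rather than an exact count.

    #include <errno.h>
    #include <linux/vfio.h>

    static int vfio_check_device_info(const struct vfio_device_info *info)
    {
        /* an equality test here would reject newer kernels that
         * expose additional regions or interrupt entries */
        if (info->num_regions < VFIO_PCI_NUM_REGIONS ||
            info->num_irqs < VFIO_PCI_NUM_IRQS) {
            return -EINVAL;   /* kernel too old for this QEMU */
        }
        return 0;
    }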

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 8fc94e5a80)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 23:05:45 -06:00
Alex Williamson
006c747440 pci-assign: Enable MSIX on device to match guest
When a guest enables MSIX on a device, we evaluate the MSIX vector
table, typically find no unmasked vectors, and don't switch the device
to MSIX mode.  This generally works fine and the device will be
switched once the guest enables and therefore unmasks a vector.
Unfortunately some drivers enable MSIX, then use interfaces to send
commands between VF & PF or PF & firmware that act based on the host
state of the device.  These therefore may break when MSIX is managed
lazily.  This change re-enables the previous test used to enable MSIX
(see qemu-kvm a6b402c9), which basically guesses whether a vector
will be used based on the data field of the vector table.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit feb9a2ab4b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 23:05:44 -06:00
Alex Williamson
f042cca009 vfio-pci: Make host MSI-X enable track guest
Guests typically enable MSI-X with all of the vectors in the MSI-X
vector table masked.  Only when the vector is enabled does the vector
get unmasked, resulting in a vector_use callback.  These two points,
enable and unmask, correspond to pci_enable_msix() and request_irq()
for Linux guests.  Some drivers rely on VF/PF or PF/fw communication
channels that expect the physical state of the device to match the
guest visible state of the device.  They don't appreciate lazily
enabling MSI-X on the physical device.

To solve this, enable MSI-X with a single vector when the MSI-X
capability is enabled and immediately disable the vector.  This leaves
the physical device in exactly the same state between host and guest.
Furthermore, during the brief gap where vector 0 is enabled, it fires into
userspace, not KVM, so the guest doesn't get spurious interrupts.
Ideally we could call VFIO_DEVICE_SET_IRQS with the right parameters
to enable MSI-X with zero vectors, but this will currently return an
error as the Linux MSI-X interfaces do not allow it.
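
A compile-only sketch of the trick (helper names modeled on the vfio
driver's vector_use/vector_release callbacks; types stubbed out):

    typedef struct VFIODevice VFIODevice;

    void vfio_msix_vector_use(VFIODevice *vdev, unsigned nr);
    void vfio_msix_vector_release(VFIODevice *vdev, unsigned nr);

    /* called when the guest sets the MSI-X enable bit */
    static void vfio_msix_enable(VFIODevice *vdev)
    {
        vfio_msix_vector_use(vdev, 0);      /* host enables MSI-X, 1 vector */
        vfio_msix_vector_release(vdev, 0);  /* vector off again; MSI-X stays on */
    }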

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org
(cherry picked from commit b0223e29af)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 23:05:44 -06:00
Max Filippov
1205b8080f target-xtensa: fix search_pc for the last TB opcode
Zero out tcg_ctx.gen_opc_instr_start for instructions representing the
last guest opcode in the TB.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 36f25d2537)

*modified to use older global version of gen_opc_instr_start

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 23:04:53 -06:00
Paolo Bonzini
ff0c079c14 buffered_file: do not send more than s->bytes_xfer bytes per tick
Sending more was possible if the buffer was large.
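
A minimal model of the clamp (field names assumed, not the real
buffered-file state):

    #include <stddef.h>

    struct buffered_model {
        size_t buffer_size;   /* bytes currently queued */
        size_t bytes_xfer;    /* rate-limit budget left this tick */
    };

    static size_t bytes_to_send(const struct buffered_model *s)
    {
        /* the bug flushed buffer_size unconditionally when it was large */
        return s->buffer_size < s->bytes_xfer ? s->buffer_size
                                              : s->bytes_xfer;
    }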

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit bde54c08b4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 22:38:00 -06:00
Paolo Bonzini
d745511fc9 migration: fix migration_bitmap leak
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 244eaa7514)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 22:37:38 -06:00
Michael Contreras
5afd0ecaa6 e1000: Discard oversized packets based on SBP|LPE
Discard packets longer than 16384 when !SBP to match the hardware behavior.
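
A sketch of the receive-path filter (the constant's name is assumed):
frames above the 16384-byte ceiling are dropped unless the guest set
the "store bad packets" bit.

    #include <stdbool.h>
    #include <stddef.h>

    #define ETH_OVERSIZE_LIMIT 16384   /* illustrative name for the limit */

    static bool e1000_discard_oversized(size_t size, bool sbp)
    {
        return size > ETH_OVERSIZE_LIMIT && !sbp;
    }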

Signed-off-by: Michael Contreras <michael@inetric.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2c0331f4f7)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 22:37:17 -06:00
Uri Lublin
c4cd5b0f6d qxl+vnc: register a vm state change handler for dummy spice_server
When qxl + vnc are used, a dummy spice_server is initialized.
The spice_server has to be told when the VM runstate changes,
which is what this patch does.

Without it, from qxl_send_events(), the following error message is shown:
  qxl_send_events: spice-server bug: guest stopped, ignoring
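
A compile-only sketch of such a handler (spice_server_vm_start/stop are
the real libspice-server entry points; the stand-in types keep the
fragment self-contained):

    typedef struct SpiceServer SpiceServer;   /* opaque, from spice-server */
    typedef int RunState;                     /* stand-in for QEMU's enum */

    void spice_server_vm_start(SpiceServer *s);
    void spice_server_vm_stop(SpiceServer *s);

    static void vm_change_state_handler(void *opaque, int running,
                                        RunState state)
    {
        SpiceServer *server = opaque;
        (void)state;
        if (running) {
            spice_server_vm_start(server);   /* guest resumed */
        } else {
            spice_server_vm_stop(server);    /* guest stopped */
        }
    }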

Cc: qemu-stable@nongnu.org
Signed-off-by: Uri Lublin <uril@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 938b8a36b6)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 22:36:22 -06:00
Gerd Hoffmann
7ca2496588 qxl: save qemu_create_displaysurface_from result
Spotted by Coverity.

https://bugzilla.redhat.com/show_bug.cgi?id=885644

Cc: qemu-stable@nongnu.org
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2f464b5a32)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 22:35:40 -06:00
Max Filippov
bfae9374f1 target-xtensa: fix ITLB/DTLB page protection flags
With the MMU option, the xtensa architecture has two TLBs: ITLB and DTLB.
The ITLB is used only for code access, the DTLB only for data. However, TLB
entries in both TLBs have an attribute field controlling write and exec
access. These bits need to be properly masked off depending on the TLB type
before being used as the tlb_set_page prot argument. Otherwise the
following happens:

(1) ITLB entry for some PFN gets invalidated
(2) DTLB entry for the same PFN gets updated, attributes allow code
    execution
(3) code at the page with that PFN is executed (possible due to step 2),
    entry for the TB is written into the jump cache
(4) QEMU TLB entry for the PFN gets replaced with an entry for some
    other PFN
(5) code in the TB from step 3 is executed (possible due to jump cache)
    and it accesses data, for which there's no DTLB entry, causing DTLB
    miss exception
(6) re-translation of the TB from step 5 is attempted, but there's no
    QEMU TLB entry nor xtensa ITLB entry for that PFN, which causes ITLB
    miss exception at the TB start address
(7) ITLB miss exception is handled by the guest, but execution is
    resumed from the beginning of the faulting TB (the point where the
    ITLB miss occurred), not from the point where the DTLB miss occurred,
    which is wrong.

With this fix, the above scenario causes the ITLB miss exception (formerly
step 7) at step 3, right at the beginning of the TB.
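
An illustrative reduction of the fix (the PAGE_* values mirror QEMU's
flags; the TLB lookup machinery is elided): filter the entry's rights by
which TLB is being refilled before handing them to tlb_set_page().

    #define PAGE_READ  0x1   /* values mirror QEMU's PAGE_* flags */
    #define PAGE_WRITE 0x2
    #define PAGE_EXEC  0x4

    static int xtensa_tlb_prot(int attr_prot, int is_itlb)
    {
        /* an ITLB entry may only grant execute rights and a DTLB entry
         * only read/write -- otherwise a DTLB update can make a page
         * executable, enabling step (3) above */
        return attr_prot & (is_itlb ? PAGE_EXEC : (PAGE_READ | PAGE_WRITE));
    }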

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 659f807c0a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 22:34:54 -06:00
Gerd Hoffmann
b68c48ff01 pixman: fix vnc tight png/jpeg support
This patch adds an x argument to qemu_pixman_linebuf_fill so it can
also be used to convert a partial scanline.  It then fixes tight + png/jpeg
encoding by passing in the x+y offset, so the data is read from the
correct screen location instead of the upper left corner.
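
A sketch of the changed helper's shape (simplified; the real prototype
takes pixman image arguments rather than void pointers):

    /* before: conversion always started at column 0 of the scanline */
    void qemu_pixman_linebuf_fill_old(void *linebuf, void *fb,
                                      int width, int y);

    /* after: an x offset selects where in the scanline to start, so a
     * partial rectangle can be converted in place */
    void qemu_pixman_linebuf_fill(void *linebuf, void *fb,
                                  int width, int x, int y);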

Cc: 1087974@bugs.launchpad.net
Cc: qemu-stable@nongnu.org
Reported-by: Tim Hardeneck <thardeck@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit bc210eb163)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 22:33:57 -06:00
Gerd Hoffmann
36fd8179b6 Update seabios to a810e4e72a0d42c7bc04eda57382f8e019add901
git shortlog:

Kevin O'Connor (6):
      floppy: Minor - reduce handle_0e code size when CONFIG_FLOPPY is disabled.
      vga: Minor comment spelling fix.
      Don't recursively evaluate CFLAGS variables.
      Don't use gcc's -combine option.
      Add compile checking phase to build.
      acpi: Use prt_slot() macro to describe irq pins of first PCI device.

Laszlo Ersek (1):
      maininit(): print machine UUID under seabios version message

Paolo Bonzini (1):
      acpi: reintroduce LNKS

Paolo's patch fixes the FreeBSD boot failure.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 15faf946f7)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 10:47:39 -06:00
Gerd Hoffmann
0bc5f4ad63 seabios: update to e8a76b0f225bba5ba9d63ab227e0a37b3beb1059
This patch updates seabios to the latest git master.  Changes:

  (1) q35 patches merged.
  (2) some acpi cleanups.
  (3) fixes an irq 8 conflict.

Item (3) makes this a candidate for the stable branch.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ff1562908d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-15 10:46:57 -06:00
Alex Williamson
37e1428cc7 vfio-pci: Don't use kvm_irqchip_in_kernel
kvm_irqchip_in_kernel() has an architecture-specific meaning, so
we shouldn't be using it to determine whether to enable KVM INTx
bypass.  kvm_irqfds_enabled() seems most appropriate.  Also use this
to protect our other call to kvm_check_extension() as that explodes
when KVM isn't enabled.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org
(cherry picked from commit d281084d3e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-14 15:45:06 -06:00
Petar Jovanovic
518799a3e7 target-mips: Fix incorrect shift for SHILO and SHILOV
helper_shilo has not been shifting the accumulator value correctly for
negative values in the 'shift' field. It also includes a minor
optimization for the shift=0 case.
This change also adds tests that will trigger the issue and check for
regressions.
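
The core of the issue reduces to sign-extending the 6-bit 'shift' field
before use; a standalone sketch of just that step (the accumulator
handling is elided):

    /* Sign-extend a 6-bit immediate: raw values 0x20..0x3f map to
     * -32..-1, so e.g. 0x3f must shift by -1, not by 63. */
    static int sextract6(int raw)
    {
        return (raw ^ 0x20) - 0x20;
    }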

Signed-off-by: Petar Jovanovic <petarj@mips.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Eric Johnson <ericj@mips.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 19e6c50d2d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-14 15:43:29 -06:00
Petar Jovanovic
16c5fe49de target-mips: Fix incorrect code and test for INSV
The content of register rs should be shifted by pos before the mask is
applied. This change contains both the fix for the instruction and a fix
to the existing test.
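
A self-contained model of the corrected insert (operand decoding
elided; a size of a full 32 bits is not handled by this sketch):

    #include <stdint.h>

    /* Insert the low 'size' bits of rs into rt at bit 'pos'.  The bug
     * masked rs in place instead of first shifting it to 'pos'. */
    static uint32_t insv_model(uint32_t rt, uint32_t rs, int pos, int size)
    {
        uint32_t mask = ((UINT32_C(1) << size) - 1) << pos;   /* size < 32 */
        return (rt & ~mask) | ((rs << pos) & mask);
    }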

Signed-off-by: Petar Jovanovic <petarj@mips.com>
Reviewed-by: Eric Johnson <ericj@mips.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 34f5606ee1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-14 15:42:38 -06:00
David Gibson
f1a2195ec3 migration: Fix madvise breakage if host and guest have different page sizes
madvise(DONTNEED) will throw away the contents of the whole page at the
given address, even if the given length is less than the page size.  One
can argue about whether that's the correct behaviour, but that's what it
has done for a long time, in Linux at least.

That means that the madvise() in ram_load(), on a setup where
TARGET_PAGE_SIZE is smaller than the host page size, can throw away data
in guest pages adjacent to the one it's actually processing right now,
leading to guest memory corruption on an incoming migration.

This patch therefore disables the madvise() if the host page size is
larger than TARGET_PAGE_SIZE.  This means we don't get the benefits of that
madvise() in this case, but a more complete fix is more difficult to
accomplish.  This at least fixes the guest memory corruption.
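
A sketch of the guard (TARGET_PAGE_SIZE is fixed here to keep the
fragment self-contained; in QEMU it is per-target, and the real
condition also takes KVM's sync-mmu support into account):

    #include <sys/mman.h>
    #include <unistd.h>

    #define TARGET_PAGE_SIZE 4096   /* per-target in QEMU; fixed for the sketch */

    static void maybe_discard_page(void *host_addr)
    {
        /* only safe when DONTNEED cannot take neighbouring guest
         * pages down along with the one being discarded */
        if (getpagesize() <= TARGET_PAGE_SIZE) {
            madvise(host_addr, TARGET_PAGE_SIZE, MADV_DONTNEED);
        }
    }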

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 45e6cee42b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-14 15:41:11 -06:00
David Gibson
3b4fc1f9d2 Fix off-by-1 error in RAM migration code
The code for migrating (or savevm-ing) memory pages starts off by creating
a dirty bitmap and filling it with 1s.  Except, actually, because bit
addresses are 0-based, it fills every bit except bit 0 with 1s and puts an
extra 1 beyond the end of the bitmap, potentially corrupting unrelated
memory.  Oops.  This patch fixes it.
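
A minimal model of the helper and the off-by-one call (QEMU's real
bitmap_set has the same start/count signature):

    #include <limits.h>

    #define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

    /* Model of bitmap_set(map, start, count): set 'count' bits from 'start'. */
    static void bitmap_set(unsigned long *map, long start, long count)
    {
        for (long i = start; i < start + count; i++) {
            map[i / BITS_PER_LONG] |= 1UL << (i % BITS_PER_LONG);
        }
    }

    /* bitmap_set(map, 1, nbits) touches bits 1..nbits: bit 0 stays
     * clear and bit 'nbits' lands one past a map sized for nbits bits.
     * The fix is the call bitmap_set(map, 0, nbits). */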

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 7ec81e56ed)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-14 15:40:38 -06:00
Brad Smith
d67d95f24e Disable semaphores fallback code for OpenBSD
Disable the semaphores fallback code for OpenBSD as modern OpenBSD
releases now have sem_timedwait().

Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 927fa909d5)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-14 15:36:36 -06:00
Brad Smith
0a7ad69a0f Fix semaphores fallback code
As reported in bug 1087114, the semaphores fallback code is broken, which
results in QEMU crashing and renders it unusable.

This patch is from Paolo.

This needs to be backported to the 1.3 stable tree as well.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit a795ef8dcb)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-01-14 15:36:28 -06:00
3456 changed files with 245110 additions and 583506 deletions

.gitignore (142 lines changed)

@@ -1,69 +1,64 @@
/config-devices.*
/config-all-devices.*
/config-all-disas.*
/config-host.*
/config-target.*
/config.status
/config-temp
/trace/generated-tracers.h
/trace/generated-tracers.c
/trace/generated-tracers-dtrace.h
/trace/generated-tracers.dtrace
/trace/generated-events.h
/trace/generated-events.c
/trace/generated-helpers-wrappers.h
/trace/generated-helpers.h
/trace/generated-helpers.c
/trace/generated-tcg-tracers.h
/trace/generated-ust-provider.h
/trace/generated-ust.c
/libcacard/trace/generated-tracers.c
config-devices.*
config-all-devices.*
config-host.*
config-target.*
trace.h
trace.c
trace-dtrace.h
trace-dtrace.dtrace
*-timestamp
/*-softmmu
/*-darwin-user
/*-linux-user
/*-bsd-user
/libdis*
/libuser
/linux-headers/asm
/qga/qapi-generated
/qapi-generated
/qapi-types.[ch]
/qapi-visit.[ch]
/qapi-event.[ch]
/qmp-commands.h
/qmp-marshal.c
/qemu-doc.html
/qemu-tech.html
/qemu-doc.info
/qemu-tech.info
/qemu-img
/qemu-nbd
/qemu-options.def
/qemu-options.texi
/qemu-img-cmds.texi
/qemu-img-cmds.h
/qemu-io
/qemu-ga
/qemu-bridge-helper
/qemu-monitor.texi
/qmp-commands.txt
/vscclient
/fsdev/virtfs-proxy-helper
*.[1-9]
*-softmmu
*-darwin-user
*-linux-user
*-bsd-user
libdis*
libuser
linux-headers/asm
qapi-generated
qapi-types.[ch]
qapi-visit.[ch]
qmp-commands.h
qmp-marshal.c
qemu-doc.html
qemu-tech.html
qemu-doc.info
qemu-tech.info
qemu.1
qemu.pod
qemu-img.1
qemu-img.pod
qemu-img
qemu-nbd
qemu-nbd.8
qemu-nbd.pod
qemu-options.def
qemu-options.texi
qemu-img-cmds.texi
qemu-img-cmds.h
qemu-io
qemu-ga
qemu-bridge-helper
qemu-monitor.texi
vscclient
QMP/qmp-commands.txt
test-coroutine
test-qmp-input-visitor
test-qmp-output-visitor
test-string-input-visitor
test-string-output-visitor
test-visitor-serialization
fsdev/virtfs-proxy-helper.1
fsdev/virtfs-proxy-helper.pod
.gdbinit
*.a
*.aux
*.cp
*.dvi
*.exe
*.dll
*.so
*.mo
*.fn
*.ky
*.log
*.pdf
*.pod
*.cps
*.fns
*.kys
@@ -73,31 +68,26 @@
*.tp
*.vr
*.d
!/scripts/qemu-guest-agent/fsfreeze-hook.d
*.o
*.lo
*.la
*.pc
.libs
.sdk
*.gcda
*.gcno
/pc-bios/bios-pq/status
/pc-bios/vgabios-pq/status
/pc-bios/optionrom/linuxboot.asm
/pc-bios/optionrom/linuxboot.bin
/pc-bios/optionrom/linuxboot.raw
/pc-bios/optionrom/linuxboot.img
/pc-bios/optionrom/multiboot.asm
/pc-bios/optionrom/multiboot.bin
/pc-bios/optionrom/multiboot.raw
/pc-bios/optionrom/multiboot.img
/pc-bios/optionrom/kvmvapic.asm
/pc-bios/optionrom/kvmvapic.bin
/pc-bios/optionrom/kvmvapic.raw
/pc-bios/optionrom/kvmvapic.img
/pc-bios/s390-ccw/s390-ccw.elf
/pc-bios/s390-ccw/s390-ccw.img
*.swp
*.orig
.pc
patches
pc-bios/bios-pq/status
pc-bios/vgabios-pq/status
pc-bios/optionrom/linuxboot.bin
pc-bios/optionrom/linuxboot.raw
pc-bios/optionrom/linuxboot.img
pc-bios/optionrom/multiboot.bin
pc-bios/optionrom/multiboot.raw
pc-bios/optionrom/multiboot.img
pc-bios/optionrom/kvmvapic.bin
pc-bios/optionrom/kvmvapic.raw
pc-bios/optionrom/kvmvapic.img
.stgit-*
cscope.*
tags

.gitmodules (23 lines changed)

@@ -1,33 +1,24 @@
[submodule "roms/vgabios"]
path = roms/vgabios
url = git://git.qemu-project.org/vgabios.git/
url = git://git.qemu.org/vgabios.git/
[submodule "roms/seabios"]
path = roms/seabios
url = git://git.qemu-project.org/seabios.git/
url = git://git.qemu.org/seabios.git/
[submodule "roms/SLOF"]
path = roms/SLOF
url = git://git.qemu-project.org/SLOF.git
url = git://git.qemu.org/SLOF.git
[submodule "roms/ipxe"]
path = roms/ipxe
url = git://git.qemu-project.org/ipxe.git
url = git://git.qemu.org/ipxe.git
[submodule "roms/openbios"]
path = roms/openbios
url = git://git.qemu-project.org/openbios.git
[submodule "roms/openhackware"]
path = roms/openhackware
url = git://git.qemu-project.org/openhackware.git
url = git://git.qemu.org/openbios.git
[submodule "roms/qemu-palcode"]
path = roms/qemu-palcode
url = git://github.com/rth7680/qemu-palcode.git
url = git://repo.or.cz/qemu-palcode.git
[submodule "roms/sgabios"]
path = roms/sgabios
url = git://git.qemu-project.org/sgabios.git
url = git://git.qemu.org/sgabios.git
[submodule "pixman"]
path = pixman
url = git://anongit.freedesktop.org/pixman
[submodule "dtc"]
path = dtc
url = git://git.qemu-project.org/dtc.git
[submodule "roms/u-boot"]
path = roms/u-boot
url = git://git.qemu-project.org/u-boot.git

.mailmap

@@ -2,8 +2,7 @@
# into proper addresses so that they are counted properly in git shortlog output.
#
Andrzej Zaborowski <balrogg@gmail.com> balrog <balrog@c046a42c-6fe2-441c-8c8c-71466251a162>
Anthony Liguori <anthony@codemonkey.ws> aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
Anthony Liguori <anthony@codemonkey.ws> Anthony Liguori <aliguori@us.ibm.com>
Anthony Liguori <aliguori@us.ibm.com> aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
Aurelien Jarno <aurelien@aurel32.net> aurel32 <aurel32@c046a42c-6fe2-441c-8c8c-71466251a162>
Blue Swirl <blauwirbel@gmail.com> blueswir1 <blueswir1@c046a42c-6fe2-441c-8c8c-71466251a162>
Edgar E. Iglesias <edgar.iglesias@gmail.com> edgar_igl <edgar_igl@c046a42c-6fe2-441c-8c8c-71466251a162>

.travis.yml

@@ -1,103 +0,0 @@
language: c
python:
- "2.4"
compiler:
- gcc
- clang
notifications:
irc:
channels:
- "irc.oftc.net#qemu"
on_success: change
on_failure: always
env:
global:
- TEST_CMD=""
- EXTRA_CONFIG=""
# Development packages, EXTRA_PKGS saved for additional builds
- CORE_PKGS="libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev"
- NET_PKGS="libseccomp-dev libgnutls-dev libssh2-1-dev libspice-server-dev libspice-protocol-dev libnss3-dev"
- GUI_PKGS="libgtk-3-dev libvte-2.90-dev libsdl1.2-dev libpng12-dev libpixman-1-dev"
- EXTRA_PKGS=""
matrix:
# Group major targets together with their linux-user counterparts
- TARGETS=alpha-softmmu,alpha-linux-user
- TARGETS=arm-softmmu,arm-linux-user,armeb-linux-user,aarch64-softmmu,aarch64-linux-user
- TARGETS=cris-softmmu,cris-linux-user
- TARGETS=i386-softmmu,i386-linux-user,x86_64-softmmu,x86_64-linux-user
- TARGETS=m68k-softmmu,m68k-linux-user
- TARGETS=microblaze-softmmu,microblazeel-softmmu,microblaze-linux-user,microblazeel-linux-user
- TARGETS=mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu
- TARGETS=mips-linux-user,mips64-linux-user,mips64el-linux-user,mipsel-linux-user,mipsn32-linux-user,mipsn32el-linux-user
- TARGETS=or32-softmmu,or32-linux-user
- TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,ppc-linux-user,ppc64-linux-user,ppc64abi32-linux-user,ppc64le-linux-user
- TARGETS=s390x-softmmu,s390x-linux-user
- TARGETS=sh4-softmmu,sh4eb-softmmu,sh4-linux-user sh4eb-linux-user
- TARGETS=sparc-softmmu,sparc64-softmmu,sparc-linux-user,sparc32plus-linux-user,sparc64-linux-user
- TARGETS=unicore32-softmmu,unicore32-linux-user
# Group remaining softmmu only targets into one build
- TARGETS=lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
git:
# we want to do this ourselves
submodules: false
before_install:
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
- sudo apt-get update -qq
- sudo apt-get install -qq ${CORE_PKGS} ${NET_PKGS} ${GUI_PKGS} ${EXTRA_PKGS}
before_script:
- ./configure --target-list=${TARGETS} --enable-debug-tcg ${EXTRA_CONFIG}
script:
- make -j2 && ${TEST_CMD}
matrix:
# We manually include a number of additional build for non-standard bits
include:
# Make check target (we only do this once)
- env:
- TARGETS=alpha-softmmu,arm-softmmu,aarch64-softmmu,cris-softmmu,
i386-softmmu,x86_64-softmmu,m68k-softmmu,microblaze-softmmu,
microblazeel-softmmu,mips-softmmu,mips64-softmmu,
mips64el-softmmu,mipsel-softmmu,or32-softmmu,ppc-softmmu,
ppc64-softmmu,ppcemb-softmmu,s390x-softmmu,sh4-softmmu,
sh4eb-softmmu,sparc-softmmu,sparc64-softmmu,
unicore32-softmmu,unicore32-linux-user,
lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,
xtensaeb-softmmu
TEST_CMD="make check"
compiler: gcc
# Debug related options
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-debug"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-debug --enable-tcg-interpreter"
compiler: gcc
# All the extra -dev packages
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="libaio-dev libcap-ng-dev libattr1-dev libbrlapi-dev uuid-dev libusb-1.0.0-dev"
compiler: gcc
# Currently configure doesn't force --disable-pie
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-gprof --enable-gcov --disable-pie"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="sparse"
EXTRA_CONFIG="--enable-sparse"
compiler: gcc
# All the trace backends (apart from dtrace)
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=stderr"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=simple"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=ftrace"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="liblttng-ust-dev liburcu-dev"
EXTRA_CONFIG="--enable-trace-backends=ust"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-modules"
compiler: gcc

CODING_STYLE

@@ -84,24 +84,3 @@ and clarity it comes on a line by itself:
Rationale: a consistent (except for functions...) bracing style reduces
ambiguity and avoids needless churn when lines are added or removed.
Furthermore, it is the QEMU coding style.
5. Declarations
Mixed declarations (interleaving statements and declarations within blocks)
are not allowed; declarations should be at the beginning of blocks. In other
words, the code should not generate warnings if using GCC's
-Wdeclaration-after-statement option.
6. Conditional statements
When comparing a variable for (in)equality with a constant, list the
constant on the right, as in:
if (a == 1) {
/* Reads like: "If a equals 1" */
do_something();
}
Rationale: Yoda conditions (as in 'if (1 == a)') are awkward to read.
Besides, good compilers already warn users when '==' is mis-typed as '=',
even when the constant is on the right.

Changelog

@@ -1,6 +1,6 @@
This file documents changes for QEMU releases 0.12 and earlier.
For changelog information for later releases, see
http://wiki.qemu-project.org/ChangeLog or look at the git history for
http://wiki.qemu.org/ChangeLog or look at the git history for
more detailed information.

HACKING (46 lines changed)

@@ -40,23 +40,8 @@ speaking, the size of guest memory can always fit into ram_addr_t but
it would not be correct to store an actual guest physical address in a
ram_addr_t.
For CPU virtual addresses there are several possible types.
vaddr is the best type to use to hold a CPU virtual address in
target-independent code. It is guaranteed to be large enough to hold a
virtual address for any target, and it does not change size from target
to target. It is always unsigned.
target_ulong is a type the size of a virtual address on the CPU; this means
it may be 32 or 64 bits depending on which target is being built. It should
therefore be used only in target-specific code, and in some
performance-critical built-per-target core code such as the TLB code.
There is also a signed version, target_long.
abi_ulong is for the *-user targets, and represents a type the size of
'void *' in that target's ABI. (This may not be the same as the size of a
full CPU virtual address in the case of target ABIs which use 32 bit pointers
on 64 bit CPUs, like sparc32plus.) Definitions of structures that must match
the target's ABI must use this type for anything that on the target is defined
to be an 'unsigned long' or a pointer type.
There is also a signed version, abi_long.
Use target_ulong (or abi_ulong) for CPU virtual addresses, however
devices should not need to use target_ulong.
Of course, take all of the above with a grain of salt. If you're about
to use some system interface that requires a type like size_t, pid_t or
@@ -93,15 +78,16 @@ avoided.
Use of the malloc/free/realloc/calloc/valloc/memalign/posix_memalign
APIs is not allowed in the QEMU codebase. Instead of these routines,
use the GLib memory allocation routines g_malloc/g_malloc0/g_new/
g_new0/g_realloc/g_free or QEMU's qemu_memalign/qemu_blockalign/qemu_vfree
g_new0/g_realloc/g_free or QEMU's qemu_vmalloc/qemu_memalign/qemu_vfree
APIs.
Please note that g_malloc will exit on allocation failure, so there
is no need to test for failure (as you would have to with malloc).
Calling g_malloc with a zero size is valid and will return NULL.
Memory allocated by qemu_memalign or qemu_blockalign must be freed with
qemu_vfree, since breaking this will cause problems on Win32.
Memory allocated by qemu_vmalloc or qemu_memalign must be freed with
qemu_vfree, since breaking this will cause problems on Win32 and user
emulators.
4. String manipulation
@@ -137,23 +123,3 @@ gcc's printf attribute directive in the prototype.
This makes it so gcc's -Wformat and -Wformat-security options can do
their jobs and cross-check format strings with the number and types
of arguments.
6. C standard, implementation defined and undefined behaviors
C code in QEMU should be written to the C99 language specification. A copy
of the final version of the C99 standard with corrigenda TC1, TC2, and TC3
included, formatted as a draft, can be downloaded from:
http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf
The C language specification defines regions of undefined behavior and
implementation defined behavior (to give compiler authors enough leeway to
produce better code). In general, code in QEMU should follow the language
specification and avoid both undefined and implementation defined
constructs. ("It works fine on the gcc I tested it with" is not a valid
argument...) However there are a few areas where we allow ourselves to
assume certain behaviors because in practice all the platforms we care about
behave in the same way and writing strictly conformant code would be
painful. These are:
* you may assume that integers are 2s complement representation
* you may assume that right shift of a signed integer duplicates
the sign bit (ie it is an arithmetic shift, not a logical shift)

LICENSE (15 lines changed)

@@ -1,21 +1,16 @@
The following points clarify the QEMU license:
1) QEMU as a whole is released under the GNU General Public License,
version 2.
1) QEMU as a whole is released under the GNU General Public License
2) Parts of QEMU have specific licenses which are compatible with the
GNU General Public License, version 2. Hence each source file contains
its own licensing information. Source files with no licensing information
are released under the GNU General Public License, version 2 or (at your
option) any later version.
GNU General Public License. Hence each source file contains its own
licensing information.
As of July 2013, contributions under version 2 of the GNU General Public
License (and no later version) are only accepted for the following files
or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*.
Many hardware device emulation sources are released under the BSD license.
3) The Tiny Code Generator (TCG) is released under the BSD license
(see license headers in files).
4) QEMU is a trademark of Fabrice Bellard.
Fabrice Bellard and the QEMU team
Fabrice Bellard.

File diff suppressed because it is too large.

Makefile (379 lines changed)

@@ -19,23 +19,10 @@ seems to have been used for an in-tree build. You can fix this by running \
endif
endif
CONFIG_SOFTMMU := $(if $(filter %-softmmu,$(TARGET_DIRS)),y)
CONFIG_USER_ONLY := $(if $(filter %-user,$(TARGET_DIRS)),y)
CONFIG_ALL=y
-include config-all-devices.mak
-include config-all-disas.mak
include $(SRC_PATH)/rules.mak
config-host.mak: $(SRC_PATH)/configure
@echo $@ is out-of-date, running configure
@# TODO: The next lines include code which supports a smooth
@# transition from old configurations without config.status.
@# This code can be removed after QEMU 1.7.
@if test -x config.status; then \
./config.status; \
else \
sed -n "/.*Configured with/s/[^:]*: //p" $@ | sh; \
fi
@sed -n "/.*Configured with/s/[^:]*: //p" $@ | sh
else
config-host.mak:
ifneq ($(filter-out %clean,$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
@@ -44,29 +31,12 @@ ifneq ($(filter-out %clean,$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
endif
endif
GENERATED_HEADERS = config-host.h qemu-options.def
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_HEADERS += trace/generated-events.h
GENERATED_SOURCES += trace/generated-events.c
GENERATED_HEADERS += trace/generated-tracers.h
ifeq ($(findstring dtrace,$(TRACE_BACKENDS)),dtrace)
GENERATED_HEADERS += trace/generated-tracers-dtrace.h
endif
GENERATED_SOURCES += trace/generated-tracers.c
GENERATED_HEADERS += trace/generated-tcg-tracers.h
GENERATED_HEADERS += trace/generated-helpers-wrappers.h
GENERATED_HEADERS += trace/generated-helpers.h
GENERATED_SOURCES += trace/generated-helpers.c
ifeq ($(findstring ust,$(TRACE_BACKENDS)),ust)
GENERATED_HEADERS += trace/generated-ust-provider.h
GENERATED_SOURCES += trace/generated-ust.c
GENERATED_HEADERS = config-host.h trace.h qemu-options.def
ifeq ($(TRACE_BACKEND),dtrace)
GENERATED_HEADERS += trace-dtrace.h
endif
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c trace.c
# Don't try to regenerate Makefile or configure
# We don't generate any of them
@@ -83,10 +53,7 @@ LIBS+=-lz $(LIBS_TOOLS)
HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
ifdef BUILD_DOCS
DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 qmp-commands.txt
ifdef CONFIG_LINUX
DOCS+=kvm_stat.1
endif
DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 QMP/qmp-commands.txt
ifdef CONFIG_VIRTFS
DOCS+=fsdev/virtfs-proxy-helper.1
endif
@@ -96,25 +63,21 @@ endif
SUBDIR_MAKEFLAGS=$(if $(V),,--no-print-directory) BUILD_DIR=$(BUILD_DIR)
SUBDIR_DEVICES_MAK=$(patsubst %, %/config-devices.mak, $(TARGET_DIRS))
SUBDIR_DEVICES_MAK_DEP=$(patsubst %, %-config-devices.mak.d, $(TARGET_DIRS))
SUBDIR_DEVICES_MAK_DEP=$(patsubst %, %/config-devices.mak.d, $(TARGET_DIRS))
ifeq ($(SUBDIR_DEVICES_MAK),)
config-all-devices.mak:
$(call quiet-command,echo '# no devices' > $@," GEN $@")
else
config-all-devices.mak: $(SUBDIR_DEVICES_MAK)
$(call quiet-command, sed -n \
's|^\([^=]*\)=\(.*\)$$|\1:=$$(findstring y,$$(\1)\2)|p' \
$(SUBDIR_DEVICES_MAK) | sort -u > $@, \
" GEN $@")
$(call quiet-command,cat $(SUBDIR_DEVICES_MAK) | grep =y | sort -u > $@," GEN $@")
endif
-include $(SUBDIR_DEVICES_MAK_DEP)
%/config-devices.mak: default-configs/%.mak
$(call quiet-command, \
$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $< $*-config-devices.mak.d $@ > $@.tmp, " GEN $@.tmp")
$(call quiet-command, if test -f $@; then \
$(call quiet-command,$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $@ $<, " GEN $@")
@if test -f $@; then \
if cmp -s $@.old $@; then \
mv $@.tmp $@; \
cp -p $@ $@.old; \
@@ -130,33 +93,14 @@ endif
else \
mv $@.tmp $@; \
cp -p $@ $@.old; \
fi, " GEN $@");
fi
defconfig:
rm -f config-all-devices.mak $(SUBDIR_DEVICES_MAK)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/Makefile.objs
endif
-include config-all-devices.mak
dummy := $(call unnest-vars,, \
stub-obj-y \
util-obj-y \
qga-obj-y \
qga-vss-dll-obj-y \
block-obj-y \
block-obj-m \
common-obj-y \
common-obj-m)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/tests/Makefile
endif
ifeq ($(CONFIG_SMARTCARD_NSS),y)
include $(SRC_PATH)/libcacard/Makefile
endif
all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all
config-host.h: config-host.h-timestamp
config-host.h-timestamp: config-host.mak
@@ -164,34 +108,30 @@ qemu-options.def: $(SRC_PATH)/qemu-options.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@")
SUBDIR_RULES=$(patsubst %,subdir-%, $(TARGET_DIRS))
SOFTMMU_SUBDIR_RULES=$(filter %-softmmu,$(SUBDIR_RULES))
$(SOFTMMU_SUBDIR_RULES): $(block-obj-y)
$(SOFTMMU_SUBDIR_RULES): config-all-devices.mak
subdir-%:
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C $* V="$(V)" TARGET_DIR="$*/" all,)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/Makefile.objs
endif
subdir-libcacard: $(oslib-obj-y) $(trace-obj-y) qemu-timer-common.o
subdir-pixman: pixman/Makefile
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C pixman V="$(V)" all,)
pixman/Makefile: $(SRC_PATH)/pixman/configure
(cd pixman; CFLAGS="$(CFLAGS) -fPIC $(extra_cflags) $(extra_ldflags)" $(SRC_PATH)/pixman/configure $(AUTOCONF_HOST) --disable-gtk --disable-shared --enable-static)
(cd pixman; CFLAGS="$(CFLAGS) -fPIC" $(SRC_PATH)/pixman/configure $(AUTOCONF_HOST) --disable-gtk --disable-shared --enable-static)
$(SRC_PATH)/pixman/configure:
(cd $(SRC_PATH)/pixman; autoreconf -v --install)
DTC_MAKE_ARGS=-I$(SRC_PATH)/dtc VPATH=$(SRC_PATH)/dtc -C dtc V="$(V)" LIBFDT_srcdir=$(SRC_PATH)/dtc/libfdt
DTC_CFLAGS=$(CFLAGS) $(QEMU_CFLAGS)
DTC_CPPFLAGS=-I$(BUILD_DIR)/dtc -I$(SRC_PATH)/dtc -I$(SRC_PATH)/dtc/libfdt
$(SUBDIR_RULES): libqemustub.a
subdir-dtc:dtc/libfdt dtc/tests
$(call quiet-command,$(MAKE) $(DTC_MAKE_ARGS) CPPFLAGS="$(DTC_CPPFLAGS)" CFLAGS="$(DTC_CFLAGS)" LDFLAGS="$(LDFLAGS)" ARFLAGS="$(ARFLAGS)" CC="$(CC)" AR="$(AR)" LD="$(LD)" $(SUBDIR_MAKEFLAGS) libfdt/libfdt.a,)
$(filter %-softmmu,$(SUBDIR_RULES)): $(universal-obj-y) $(trace-obj-y) $(common-obj-y) $(extra-obj-y) subdir-libdis
dtc/%:
mkdir -p $@
$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y)
$(filter %-user,$(SUBDIR_RULES)): $(universal-obj-y) $(trace-obj-y) subdir-libdis-user subdir-libuser
ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
romsubdir-%:
@@ -201,33 +141,66 @@ ALL_SUBDIRS=$(TARGET_DIRS) $(patsubst %,pc-bios/%, $(ROMS))
recurse-all: $(SUBDIR_RULES) $(ROMSUBDIR_RULES)
$(BUILD_DIR)/version.o: $(SRC_PATH)/version.rc config-host.h | $(BUILD_DIR)/version.lo
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.o")
$(BUILD_DIR)/version.lo: $(SRC_PATH)/version.rc config-host.h
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.lo")
audio/audio.o audio/fmodaudio.o: QEMU_CFLAGS += $(FMOD_CFLAGS)
Makefile: $(version-obj-y) $(version-lobj-y)
QEMU_CFLAGS+=$(CURL_CFLAGS)
QEMU_CFLAGS += -I$(SRC_PATH)/include
ui/cocoa.o: ui/cocoa.m
ui/sdl.o audio/sdlaudio.o ui/sdl_zoom.o hw/baum.o: QEMU_CFLAGS += $(SDL_CFLAGS)
ui/vnc.o: QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
version.o: $(SRC_PATH)/version.rc config-host.h
$(call quiet-command,$(WINDRES) -I. -o $@ $<," RC $(TARGET_DIR)$@")
version-obj-$(CONFIG_WIN32) += version.o
######################################################################
# Build libraries
# Build library with stubs
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y)
block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL
util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)'
######################################################################
# Support building shared library libcacard
.PHONY: libcacard.la install-libcacard
ifeq ($(LIBTOOL),)
libcacard.la:
@echo "libtool is missing, please install and rerun configure"; exit 1
install-libcacard:
@echo "libtool is missing, please install and rerun configure"; exit 1
else
libcacard.la: $(oslib-obj-y) qemu-timer-common.o $(addsuffix .lo, $(basename $(trace-obj-y)))
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C libcacard V="$(V)" TARGET_DIR="$*/" libcacard.la,)
install-libcacard: libcacard.la
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C libcacard V="$(V)" TARGET_DIR="$*/" install-libcacard,)
endif
######################################################################
qemu-img.o: qemu-img-cmds.h
qemu-img$(EXESUF): qemu-img.o $(block-obj-y) libqemuutil.a libqemustub.a
qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) libqemuutil.a libqemustub.a
qemu-io$(EXESUF): qemu-io.o $(block-obj-y) libqemuutil.a libqemustub.a
tools-obj-y = $(oslib-obj-y) $(trace-obj-y) qemu-tool.o qemu-timer.o \
main-loop.o iohandler.o error.o
tools-obj-$(CONFIG_POSIX) += compatfd.o
qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y) libqemustub.a
qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y) $(block-obj-y) libqemustub.a
qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y) $(block-obj-y) libqemustub.a
qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o libqemuutil.a libqemustub.a
vscclient$(EXESUF): $(libcacard-y) $(oslib-obj-y) $(trace-obj-y) libcacard/vscclient.o libqemustub.a
$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $^ $(libcacard_libs) $(LIBS)," LINK $@")
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o oslib-posix.o $(trace-obj-y)
fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
@@ -238,73 +211,56 @@ qemu-ga$(EXESUF): QEMU_CFLAGS += -I qga/qapi-generated
gen-out-type = $(subst .,-,$(suffix $@))
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/tests/Makefile
endif
qapi-py = $(SRC_PATH)/scripts/qapi.py $(SRC_PATH)/scripts/ordereddict.py
qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py $(gen-out-type) -o qga/qapi-generated -p "qga-" < $<, " GEN $@")
qga/qapi-generated/qga-qapi-visit.c qga/qapi-generated/qga-qapi-visit.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py $(gen-out-type) -o qga/qapi-generated -p "qga-" < $<, " GEN $@")
qga/qapi-generated/qga-qmp-commands.h qga/qapi-generated/qga-qmp-marshal.c :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
" GEN $@")
qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
$(SRC_PATH)/qapi/event.json
$(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -o qga/qapi-generated -p "qga-" < $<, " GEN $@")
qapi-types.c qapi-types.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o "." -b $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py $(gen-out-type) -o "." < $<, " GEN $@")
qapi-visit.c qapi-visit.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o "." -b $<, \
" GEN $@")
qapi-event.c qapi-event.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-event.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-event.py \
$(gen-out-type) -o "." $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py $(gen-out-type) -o "." < $<, " GEN $@")
qmp-commands.h qmp-marshal.c :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o "." -m $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -m -o "." < $<, " GEN $@")
QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qmp-commands.h)
$(qga-obj-y) qemu-ga.o: $(QGALIB_GEN)
qemu-ga$(EXESUF): $(qga-obj-y) libqemuutil.a libqemustub.a
$(call LINK, $^)
qemu-ga$(EXESUF): qemu-ga.o $(qga-obj-y) $(oslib-obj-y) $(trace-obj-y) $(qapi-obj-y) $(qobject-obj-y) $(version-obj-y) libqemustub.a
QEMULIBS=libuser libdis libdis-user
clean:
# avoid old build problems by removing potentially incorrect old files
rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h
rm -f qemu-options.def
find . \( -name '*.l[oa]' -o -name '*.so' -o -name '*.dll' -o -name '*.mo' -o -name '*.[oda]' \) -type f -exec rm {} +
rm -f $(filter-out %.tlb,$(TOOLS)) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -f fsdev/*.pod
rm -rf .libs */.libs
find . -name '*.[od]' -exec rm -f {} +
rm -f *.a *.lo $(TOOLS) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -Rf .libs
rm -f qemu-img-cmds.h
rm -f ui/shader/*-vert.h ui/shader/*-frag.h
rm -f trace-dtrace.dtrace trace-dtrace.dtrace-timestamp
@# May not be present in GENERATED_HEADERS
rm -f trace/generated-tracers-dtrace.dtrace*
rm -f trace/generated-tracers-dtrace.h*
rm -f trace-dtrace.h trace-dtrace.h-timestamp
rm -f $(foreach f,$(GENERATED_HEADERS),$(f) $(f)-timestamp)
rm -f $(foreach f,$(GENERATED_SOURCES),$(f) $(f)-timestamp)
rm -rf qapi-generated
rm -rf qga/qapi-generated
for d in $(ALL_SUBDIRS); do \
$(MAKE) -C tests/tcg clean
for d in $(ALL_SUBDIRS) $(QEMULIBS) libcacard; do \
if test -d $$d; then $(MAKE) -C $$d $@ || exit 1; fi; \
rm -f $$d/qemu-options.def; \
done
@@ -318,8 +274,7 @@ qemu-%.tar.bz2:
distclean: clean
rm -f config-host.mak config-host.h* config-host.ld $(DOCS) qemu-options.texi qemu-img-cmds.texi qemu-monitor.texi
rm -f config-all-devices.mak config-all-disas.mak config.status
rm -f po/*.mo tests/qemu-iotests/common.env
rm -f config-all-devices.mak
rm -f roms/seabios/config.mak roms/vgabios/config.mak
rm -f qemu-doc.info qemu-doc.aux qemu-doc.cp qemu-doc.cps qemu-doc.dvi
rm -f qemu-doc.fn qemu-doc.fns qemu-doc.info qemu-doc.ky qemu-doc.kys
@@ -328,35 +283,28 @@ distclean: clean
rm -f config.log
rm -f linux-headers/asm
rm -f qemu-tech.info qemu-tech.aux qemu-tech.cp qemu-tech.dvi qemu-tech.fn qemu-tech.info qemu-tech.ky qemu-tech.log qemu-tech.pdf qemu-tech.pg qemu-tech.toc qemu-tech.tp qemu-tech.vr
for d in $(TARGET_DIRS); do \
for d in $(TARGET_DIRS) $(QEMULIBS); do \
rm -rf $$d || exit 1 ; \
done
rm -Rf .sdk
if test -f pixman/config.log; then $(MAKE) -C pixman distclean; fi
if test -f dtc/version_gen.h; then $(MAKE) $(DTC_MAKE_ARGS) clean; fi
if test -f pixman/config.log; then make -C pixman distclean; fi
KEYMAPS=da en-gb et fr fr-ch is lt modifiers no pt-br sv \
ar de en-us fi fr-be hr it lv nl pl ru th \
common de-ch es fo fr-ca hu ja mk nl-be pt sl tr \
bepo cz
bepo
ifdef INSTALL_BLOBS
BLOBS=bios.bin bios-256k.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
BLOBS=bios.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
vgabios-stdvga.bin vgabios-vmware.bin vgabios-qxl.bin \
acpi-dsdt.aml q35-acpi-dsdt.aml \
ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin QEMU,cgthree.bin \
ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc \
pxe-e1000.rom pxe-eepro100.rom pxe-ne2k_pci.rom \
pxe-pcnet.rom pxe-rtl8139.rom pxe-virtio.rom \
efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
efi-pcnet.rom efi-rtl8139.rom efi-virtio.rom \
qemu-icon.bmp qemu_logo_no_text.svg \
qemu-icon.bmp \
bamboo.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
multiboot.bin linuxboot.bin kvmvapic.bin \
s390-zipl.rom \
s390-ccw.img \
spapr-rtas.bin slof.bin \
palcode-clipper \
u-boot.e500
palcode-clipper
else
BLOBS=
endif
@@ -364,16 +312,13 @@ endif
install-doc: $(DOCS)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.html qemu-tech.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qmp-commands.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) QMP/qmp-commands.txt "$(DESTDIR)$(qemu_docdir)"
ifdef CONFIG_POSIX
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
ifneq ($(TOOLS),)
$(INSTALL_DATA) qemu-img.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) qemu.1 qemu-img.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man8"
$(INSTALL_DATA) qemu-nbd.8 "$(DESTDIR)$(mandir)/man8"
endif
endif
ifdef CONFIG_VIRTFS
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) fsdev/virtfs-proxy-helper.1 "$(DESTDIR)$(mandir)/man1"
@@ -382,50 +327,32 @@ endif
install-datadir:
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)"
install-localstatedir:
ifdef CONFIG_POSIX
ifneq (,$(findstring qemu-ga,$(TOOLS)))
$(INSTALL_DIR) "$(DESTDIR)$(qemu_localstatedir)"/run
endif
endif
install-confdir:
$(INSTALL_DIR) "$(DESTDIR)$(qemu_confdir)"
install-sysconfig: install-datadir install-confdir
$(INSTALL_DATA) $(SRC_PATH)/sysconfigs/target/target-x86_64.conf "$(DESTDIR)$(qemu_confdir)"
install: all $(if $(BUILD_DOCS),install-doc) install-sysconfig \
install-datadir install-localstatedir
install: all $(if $(BUILD_DOCS),install-doc) install-sysconfig install-datadir
$(INSTALL_DIR) "$(DESTDIR)$(bindir)"
ifneq ($(TOOLS),)
$(call install-prog,$(TOOLS),$(DESTDIR)$(bindir))
endif
ifneq ($(CONFIG_MODULES),)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_moddir)"
for s in $(modules-m:.mo=$(DSOSUF)); do \
t="$(DESTDIR)$(qemu_moddir)/$$(echo $$s | tr / -)"; \
$(INSTALL_LIB) $$s "$$t"; \
test -z "$(STRIP)" || $(STRIP) "$$t"; \
done
$(INSTALL_PROG) $(STRIP_OPT) $(TOOLS) "$(DESTDIR)$(bindir)"
endif
ifneq ($(HELPERS-y),)
$(call install-prog,$(HELPERS-y),$(DESTDIR)$(libexecdir))
$(INSTALL_DIR) "$(DESTDIR)$(libexecdir)"
$(INSTALL_PROG) $(STRIP_OPT) $(HELPERS-y) "$(DESTDIR)$(libexecdir)"
endif
ifneq ($(BLOBS),)
set -e; for x in $(BLOBS); do \
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/$$x "$(DESTDIR)$(qemu_datadir)"; \
done
endif
ifeq ($(CONFIG_GTK),y)
$(MAKE) -C po $@
endif
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/keymaps"
set -e; for x in $(KEYMAPS); do \
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/keymaps/$$x "$(DESTDIR)$(qemu_datadir)/keymaps"; \
done
$(INSTALL_DATA) $(SRC_PATH)/trace-events "$(DESTDIR)$(qemu_datadir)/trace-events"
for d in $(TARGET_DIRS); do \
$(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \
$(MAKE) -C $$d $@ || exit 1 ; \
done
# various test targets
@@ -434,30 +361,13 @@ test speed: all
.PHONY: TAGS
TAGS:
rm -f $@
find "$(SRC_PATH)" -name '*.[hc]' -exec etags --append {} +
find "$(SRC_PATH)" -name '*.[hc]' -print0 | xargs -0 etags
cscope:
rm -f ./cscope.*
find "$(SRC_PATH)" -name "*.[chsS]" -print | sed 's,^\./,,' > ./cscope.files
cscope -b
# opengl shader programs
ui/shader/%-vert.h: $(SRC_PATH)/ui/shader/%.vert $(SRC_PATH)/scripts/shaderinclude.pl
@mkdir -p $(dir $@)
$(call quiet-command,\
perl $(SRC_PATH)/scripts/shaderinclude.pl $< > $@,\
" VERT $@")
ui/shader/%-frag.h: $(SRC_PATH)/ui/shader/%.frag $(SRC_PATH)/scripts/shaderinclude.pl
@mkdir -p $(dir $@)
$(call quiet-command,\
perl $(SRC_PATH)/scripts/shaderinclude.pl $< > $@,\
" FRAG $@")
ui/console-gl.o: $(SRC_PATH)/ui/console-gl.c \
ui/shader/texture-blit-vert.h ui/shader/texture-blit-frag.h
# documentation
MAKEINFO=makeinfo
MAKEINFOFLAGS=--no-headers --no-split --number-sections
@@ -481,7 +391,7 @@ qemu-options.texi: $(SRC_PATH)/qemu-options.hx
qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
QMP/qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -q < $< > $@," GEN $@")
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx
@@ -511,12 +421,6 @@ qemu-nbd.8: qemu-nbd.texi
$(POD2MAN) --section=8 --center=" " --release=" " qemu-nbd.pod > $@, \
" GEN $@")
kvm_stat.1: scripts/kvm/kvm_stat.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< kvm_stat.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " kvm_stat.pod > $@, \
" GEN $@")
dvi: qemu-doc.dvi qemu-tech.dvi
html: qemu-doc.html qemu-tech.html
info: qemu-doc.info qemu-tech.info
@@ -526,61 +430,6 @@ qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
qemu-img.texi qemu-nbd.texi qemu-options.texi \
qemu-monitor.texi qemu-img-cmds.texi
ifdef CONFIG_WIN32
INSTALLER = qemu-setup-$(VERSION)$(EXESUF)
nsisflags = -V2 -NOCD
ifneq ($(wildcard $(SRC_PATH)/dll),)
ifeq ($(ARCH),x86_64)
# 64 bit executables
DLL_PATH = $(SRC_PATH)/dll/w64
nsisflags += -DW64
else
# 32 bit executables
DLL_PATH = $(SRC_PATH)/dll/w32
endif
endif
.PHONY: installer
installer: $(INSTALLER)
INSTDIR=/tmp/qemu-nsis
$(INSTALLER): $(SRC_PATH)/qemu.nsi
$(MAKE) install prefix=${INSTDIR}
ifdef SIGNCODE
(cd ${INSTDIR}; \
for i in *.exe; do \
$(SIGNCODE) $${i}; \
done \
)
endif # SIGNCODE
(cd ${INSTDIR}; \
for i in qemu-system-*.exe; do \
arch=$${i%.exe}; \
arch=$${arch#qemu-system-}; \
echo Section \"$$arch\" Section_$$arch; \
echo SetOutPath \"\$$INSTDIR\"; \
echo File \"\$${BINDIR}\\$$i\"; \
echo SectionEnd; \
done \
) >${INSTDIR}/system-emulations.nsh
makensis $(nsisflags) \
$(if $(BUILD_DOCS),-DCONFIG_DOCUMENTATION="y") \
$(if $(CONFIG_GTK),-DCONFIG_GTK="y") \
-DBINDIR="${INSTDIR}" \
$(if $(DLL_PATH),-DDLLDIR="$(DLL_PATH)") \
-DSRCDIR="$(SRC_PATH)" \
-DOUTFILE="$(INSTALLER)" \
$(SRC_PATH)/qemu.nsi
rm -r ${INSTDIR}
ifdef SIGNCODE
$(SIGNCODE) $(INSTALLER)
endif # SIGNCODE
endif # CONFIG_WIN
# Add a dependency on the generated files, so that they are always
# rebuilt before other object files
ifneq ($(filter-out %clean,$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))

Makefile.dis (new file, 20 lines)

@@ -0,0 +1,20 @@
# Makefile for disassemblers.
include ../config-host.mak
include config.mak
include $(SRC_PATH)/rules.mak
.PHONY: all
$(call set-vpath, $(SRC_PATH))
QEMU_CFLAGS+=-I..
include $(SRC_PATH)/Makefile.objs
all: $(libdis-y)
# Dummy command so that make thinks it has done something
@true
clean:
rm -f *.o *.d *.a *~


@@ -1,25 +1,212 @@
#######################################################################
# Common libraries for tools and emulators
# Stub library, linked in tools
stub-obj-y = stubs/
util-obj-y = util/ qobject/ qapi/ qapi-types.o qapi-visit.o qapi-event.o
#######################################################################
# Target-independent parts used in system and user emulation
universal-obj-y =
universal-obj-y += qemu-log.o
#######################################################################
# QObject
qobject-obj-y = qint.o qstring.o qdict.o qlist.o qfloat.o qbool.o
qobject-obj-y += qjson.o json-lexer.o json-streamer.o json-parser.o
qobject-obj-y += qerror.o error.o qemu-error.o
universal-obj-y += $(qobject-obj-y)
#######################################################################
# QOM
qom-obj-y = qom/
universal-obj-y += $(qom-obj-y)
#######################################################################
# oslib-obj-y is code depending on the OS (win32 vs posix)
oslib-obj-y = osdep.o cutils.o qemu-timer-common.o
oslib-obj-$(CONFIG_WIN32) += oslib-win32.o qemu-thread-win32.o
oslib-obj-$(CONFIG_POSIX) += oslib-posix.o qemu-thread-posix.o
#######################################################################
# coroutines
coroutine-obj-y = qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
coroutine-obj-y += qemu-coroutine-sleep.o
ifeq ($(CONFIG_UCONTEXT_COROUTINE),y)
coroutine-obj-$(CONFIG_POSIX) += coroutine-ucontext.o
else
ifeq ($(CONFIG_SIGALTSTACK_COROUTINE),y)
coroutine-obj-$(CONFIG_POSIX) += coroutine-sigaltstack.o
else
coroutine-obj-$(CONFIG_POSIX) += coroutine-gthread.o
endif
endif
coroutine-obj-$(CONFIG_WIN32) += coroutine-win32.o
#######################################################################
# block-obj-y is code used by both qemu system emulation and qemu-img
block-obj-y = async.o thread-pool.o
block-obj-y += nbd.o block.o blockjob.o
block-obj-y += main-loop.o iohandler.o qemu-timer.o
block-obj-$(CONFIG_POSIX) += aio-posix.o
block-obj-$(CONFIG_WIN32) += aio-win32.o
block-obj-y = iov.o cache-utils.o qemu-option.o module.o async.o
block-obj-y += nbd.o block.o blockjob.o aes.o qemu-config.o
block-obj-y += thread-pool.o qemu-progress.o qemu-sockets.o uri.o notify.o
block-obj-y += $(coroutine-obj-y) $(qobject-obj-y) $(version-obj-y)
block-obj-$(CONFIG_POSIX) += event_notifier-posix.o aio-posix.o
block-obj-$(CONFIG_WIN32) += event_notifier-win32.o aio-win32.o
block-obj-y += block/
block-obj-y += qemu-io-cmds.o
block-obj-y += $(qapi-obj-y) qapi-types.o qapi-visit.o
block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
block-obj-y += qemu-coroutine-sleep.o
block-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
# Lots of the fsdev/9p code is pulled in by vl.c via qemu_fsdev_add.
# Only pull in the actual virtio-9p device if we also enabled virtio.
CONFIG_REALLY_VIRTFS=y
endif
block-obj-m = block/
######################################################################
# Target independent part of system emulation. The long term path is to
# suppress *all* target specific code in case of system emulation, i.e. a
# single QEMU executable should support all CPUs and machines.
common-obj-y = $(block-obj-y) blockdev.o blockdev-nbd.o block/
common-obj-y += net.o net/
common-obj-y += qom/
common-obj-y += readline.o console.o cursor.o
common-obj-y += qemu-pixman.o
common-obj-y += $(oslib-obj-y)
common-obj-$(CONFIG_WIN32) += os-win32.o
common-obj-$(CONFIG_POSIX) += os-posix.o
common-obj-$(CONFIG_LINUX) += fsdev/
extra-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += tcg-runtime.o host-utils.o main-loop.o
common-obj-y += input.o
common-obj-y += buffered_file.o migration.o migration-tcp.o
common-obj-y += qemu-char.o #aio.o
common-obj-y += block-migration.o iohandler.o
common-obj-y += bitmap.o bitops.o
common-obj-y += page_cache.o
common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
common-obj-$(CONFIG_WIN32) += version.o
common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-y += hw/
common-obj-y += ui/
common-obj-y += bt-host.o bt-vhci.o
common-obj-y += dma-helpers.o
common-obj-y += acl.o
common-obj-$(CONFIG_POSIX) += compatfd.o
common-obj-y += qemu-timer.o qemu-timer-common.o
common-obj-y += qtest.o
common-obj-y += vl.o
common-obj-$(CONFIG_SLIRP) += slirp/
common-obj-y += backends/
######################################################################
# libseccomp
ifeq ($(CONFIG_SECCOMP),y)
common-obj-y += qemu-seccomp.o
endif
######################################################################
# libuser
user-obj-y =
user-obj-y += envlist.o path.o
user-obj-y += tcg-runtime.o host-utils.o
user-obj-y += cache-utils.o
user-obj-y += module.o
user-obj-y += qemu-user.o
user-obj-y += $(trace-obj-y)
user-obj-y += qom/
######################################################################
# libdis
# NOTE: the disassembler code is only needed for debugging
libdis-y =
libdis-$(CONFIG_ALPHA_DIS) += alpha-dis.o
libdis-$(CONFIG_ARM_DIS) += arm-dis.o
libdis-$(CONFIG_CRIS_DIS) += cris-dis.o
libdis-$(CONFIG_HPPA_DIS) += hppa-dis.o
libdis-$(CONFIG_I386_DIS) += i386-dis.o
libdis-$(CONFIG_IA64_DIS) += ia64-dis.o
libdis-$(CONFIG_M68K_DIS) += m68k-dis.o
libdis-$(CONFIG_MICROBLAZE_DIS) += microblaze-dis.o
libdis-$(CONFIG_MIPS_DIS) += mips-dis.o
libdis-$(CONFIG_PPC_DIS) += ppc-dis.o
libdis-$(CONFIG_S390_DIS) += s390-dis.o
libdis-$(CONFIG_SH4_DIS) += sh4-dis.o
libdis-$(CONFIG_SPARC_DIS) += sparc-dis.o
libdis-$(CONFIG_LM32_DIS) += lm32-dis.o
######################################################################
# trace
ifeq ($(TRACE_BACKEND),dtrace)
TRACE_H_EXTRA_DEPS=trace-dtrace.h
endif
trace.h: trace.h-timestamp $(TRACE_H_EXTRA_DEPS)
trace.h-timestamp: $(SRC_PATH)/trace-events $(BUILD_DIR)/config-host.mak
$(call quiet-command,$(TRACETOOL) \
--format=h \
--backend=$(TRACE_BACKEND) \
< $< > $@," GEN trace.h")
@cmp -s $@ trace.h || cp $@ trace.h
trace.c: trace.c-timestamp
trace.c-timestamp: $(SRC_PATH)/trace-events $(BUILD_DIR)/config-host.mak
$(call quiet-command,$(TRACETOOL) \
--format=c \
--backend=$(TRACE_BACKEND) \
< $< > $@," GEN trace.c")
@cmp -s $@ trace.c || cp $@ trace.c
trace.o: trace.c $(GENERATED_HEADERS)
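The *-timestamp rules implement a regenerate-then-compare idiom: the generator always rewrites the timestamp file, but the real header is refreshed only when its content changed, so objects depending on trace.h are not rebuilt needlessly. A minimal sketch of the idiom (the generator command line is abbreviated and hypothetical):

import filecmp, os, shutil, subprocess

def regen(cmd, stamp, target):
    # Always regenerate into the timestamp file ...
    with open(stamp, 'w') as out:
        subprocess.check_call(cmd, stdout=out)
    # ... but touch the real target only on content change, mirroring
    # '@cmp -s $@ trace.h || cp $@ trace.h' above.
    if not (os.path.exists(target)
            and filecmp.cmp(stamp, target, shallow=False)):
        shutil.copyfile(stamp, target)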
trace-dtrace.h: trace-dtrace.dtrace
$(call quiet-command,dtrace -o $@ -h -s $<, " GEN trace-dtrace.h")
# Normal practice is to name DTrace probe file with a '.d' extension
# but that gets picked up by QEMU's Makefile as an external dependency
# rule file. So we use '.dtrace' instead
trace-dtrace.dtrace: trace-dtrace.dtrace-timestamp
trace-dtrace.dtrace-timestamp: $(SRC_PATH)/trace-events $(BUILD_DIR)/config-host.mak
$(call quiet-command,$(TRACETOOL) \
--format=d \
--backend=$(TRACE_BACKEND) \
< $< > $@," GEN trace-dtrace.dtrace")
@cmp -s $@ trace-dtrace.dtrace || cp $@ trace-dtrace.dtrace
trace-dtrace.o: trace-dtrace.dtrace $(GENERATED_HEADERS)
$(call quiet-command,dtrace -o $@ -G -s $<, " GEN trace-dtrace.o")
ifeq ($(LIBTOOL),)
trace-dtrace.lo: trace-dtrace.dtrace
@echo "missing libtool. please install and rerun configure."; exit 1
else
trace-dtrace.lo: trace-dtrace.dtrace
$(call quiet-command,$(LIBTOOL) --mode=compile --tag=CC dtrace -o $@ -G -s $<, " lt GEN trace-dtrace.o")
endif
trace/simple.o: trace/simple.c $(GENERATED_HEADERS)
trace-obj-$(CONFIG_TRACE_DTRACE) += trace-dtrace.o
ifneq ($(TRACE_BACKEND),dtrace)
trace-obj-y = trace.o
endif
trace-obj-$(CONFIG_TRACE_DEFAULT) += trace/default.o
trace-obj-$(CONFIG_TRACE_SIMPLE) += trace/simple.o
trace-obj-$(CONFIG_TRACE_SIMPLE) += qemu-timer-common.o
trace-obj-$(CONFIG_TRACE_STDERR) += trace/stderr.o
trace-obj-y += trace/control.o
$(trace-obj-y): $(GENERATED_HEADERS)
######################################################################
# smartcard
@@ -29,82 +216,39 @@ libcacard-y += libcacard/vcard.o libcacard/vreader.o
libcacard-y += libcacard/vcard_emul_nss.o
libcacard-y += libcacard/vcard_emul_type.o
libcacard-y += libcacard/card_7816.o
libcacard-y += libcacard/vcardt.o
libcacard/vcard_emul_nss.o-cflags := $(NSS_CFLAGS)
libcacard/vcard_emul_nss.o-libs := $(NSS_LIBS)
######################################################################
# Target independent part of system emulation. The long term path is to
# suppress *all* target specific code in case of system emulation, i.e. a
# single QEMU executable should support all CPUs and machines.
ifeq ($(CONFIG_SOFTMMU),y)
common-obj-y = blockdev.o blockdev-nbd.o block/
common-obj-y += iothread.o
common-obj-y += net/
common-obj-y += qdev-monitor.o device-hotplug.o
common-obj-$(CONFIG_WIN32) += os-win32.o
common-obj-$(CONFIG_POSIX) += os-posix.o
common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration/
common-obj-y += qemu-char.o #aio.o
common-obj-y += page_cache.o
common-obj-y += qjson.o
common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-y += hw/
common-obj-y += accel.o
common-obj-y += ui/
common-obj-y += bt-host.o bt-vhci.o
bt-host.o-cflags := $(BLUEZ_CFLAGS)
common-obj-y += dma-helpers.o
common-obj-y += vl.o
vl.o-cflags := $(GPROF_CFLAGS) $(SDL_CFLAGS)
common-obj-y += tpm.o
common-obj-$(CONFIG_SLIRP) += slirp/
common-obj-y += backends/
common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
common-obj-$(CONFIG_SMARTCARD_NSS) += $(libcacard-y)
######################################################################
# qapi
common-obj-y += qmp-marshal.o
qapi-obj-y = qapi/
qapi-obj-y += qapi-types.o qapi-visit.o
common-obj-y += qmp-marshal.o qapi-visit.o qapi-types.o
common-obj-y += qmp.o hmp.o
endif
#######################################################################
# Target-independent parts used in system and user emulation
common-obj-y += qemu-log.o
common-obj-y += tcg-runtime.o
common-obj-y += hw/
common-obj-y += qom/
common-obj-y += disas/
######################################################################
# Resource file for Windows executables
version-obj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.o
version-lobj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.lo
######################################################################
# tracing
util-obj-y += trace/
target-obj-y += trace/
universal-obj-y += $(qapi-obj-y)
######################################################################
# guest agent
# FIXME: a few definitions from qapi-types.o/qapi-visit.o are needed
# by libqemuutil.a. These should be moved to a separate .json schema.
qga-obj-y = qga/
qga-vss-dll-obj-y = qga/
qga-obj-y = qga/ qemu-ga.o module.o qemu-tool.o
qga-obj-$(CONFIG_POSIX) += qemu-sockets.o qemu-option.o
vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
vl.o: QEMU_CFLAGS+=$(SDL_CFLAGS)
QEMU_CFLAGS+=$(GLIB_CFLAGS)
nested-vars += \
stub-obj-y \
qga-obj-y \
qom-obj-y \
qapi-obj-y \
block-obj-y \
user-obj-y \
common-obj-y \
extra-obj-y
dummy := $(call unnest-vars)
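unnest-vars is what turns the 'dir/' entries used throughout this file into real object lists: each trailing-slash entry stands for that subdirectory's own obj-y contributions, recursively flattened and path-prefixed. Conceptually (a model of the behaviour, not the make implementation):

def unnest(objs, subdirs):
    # 'dir/' entries expand to that directory's own (possibly nested)
    # object list, with paths prefixed; plain objects pass through.
    out = []
    for o in objs:
        if o.endswith('/'):
            out += [o + x for x in unnest(subdirs.get(o, []), subdirs)]
        else:
            out.append(o)
    return out

print(unnest(['qemu-log.o', 'qom/'],
             {'qom/': ['object.o', 'container.o']}))
# ['qemu-log.o', 'qom/object.o', 'qom/container.o']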


@@ -1,8 +1,8 @@
# -*- Mode: makefile -*-
include ../config-host.mak
include config-target.mak
include config-devices.mak
include config-target.mak
include $(SRC_PATH)/rules.mak
$(call set-vpath, $(SRC_PATH))
@@ -15,30 +15,31 @@ QEMU_CFLAGS+=-I$(SRC_PATH)/include
ifdef CONFIG_USER_ONLY
# user emulator name
QEMU_PROG=qemu-$(TARGET_NAME)
QEMU_PROG_BUILD = $(QEMU_PROG)
QEMU_PROG=qemu-$(TARGET_ARCH2)
else
# system emulator name
QEMU_PROG=qemu-system-$(TARGET_NAME)$(EXESUF)
ifneq (,$(findstring -mwindows,$(libs_softmmu)))
ifneq (,$(findstring -mwindows,$(LIBS)))
# Terminate program name with a 'w' because the linker builds a windows executable.
QEMU_PROGW=qemu-system-$(TARGET_NAME)w$(EXESUF)
$(QEMU_PROG): $(QEMU_PROGW)
$(call quiet-command,$(OBJCOPY) --subsystem console $(QEMU_PROGW) $(QEMU_PROG)," GEN $(TARGET_DIR)$(QEMU_PROG)")
QEMU_PROG_BUILD = $(QEMU_PROGW)
else
QEMU_PROG_BUILD = $(QEMU_PROG)
endif
QEMU_PROGW=qemu-system-$(TARGET_ARCH2)w$(EXESUF)
endif # windows executable
QEMU_PROG=qemu-system-$(TARGET_ARCH2)$(EXESUF)
endif
PROGS=$(QEMU_PROG) $(QEMU_PROGW)
PROGS=$(QEMU_PROG)
ifdef QEMU_PROGW
PROGS+=$(QEMU_PROGW)
endif
STPFILES=
ifndef CONFIG_HAIKU
LIBS+=-lm
endif
config-target.h: config-target.h-timestamp
config-target.h-timestamp: config-target.mak
ifdef CONFIG_TRACE_SYSTEMTAP
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp $(QEMU_PROG)-simpletrace.stp
stap: $(QEMU_PROG).stp
ifdef CONFIG_USER_ONLY
TARGET_TYPE=user
@@ -46,31 +47,14 @@ else
TARGET_TYPE=system
endif
$(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backends=$(TRACE_BACKENDS) \
--binary=$(bindir)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp-installed")
$(QEMU_PROG).stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backends=$(TRACE_BACKENDS) \
--binary=$(realpath .)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--backend=$(TRACE_BACKEND) \
--binary=$(bindir)/$(QEMU_PROG) \
--target-arch=$(TARGET_ARCH) \
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp")
$(QEMU_PROG)-simpletrace.stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=simpletrace-stap \
--backends=$(TRACE_BACKENDS) \
--probe-prefix=qemu.$(TARGET_TYPE).$(TARGET_NAME) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG)-simpletrace.stp")
< $< > $@," GEN $(QEMU_PROG).stp")
else
stap:
endif
@@ -83,20 +67,15 @@ all: $(PROGS) stap
#########################################################
# cpu emulator library
obj-y = exec.o translate-all.o cpu-exec.o
obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
obj-y += tcg/tcg.o tcg/optimize.o
obj-$(CONFIG_TCG_INTERPRETER) += tci.o
obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
obj-y += fpu/softfloat.o
obj-y += target-$(TARGET_BASE_ARCH)/
obj-y += disas.o
obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
obj-$(CONFIG_TCI_DIS) += tci-dis.o
obj-y += target-$(TARGET_BASE_ARCH)/
obj-$(CONFIG_GDBSTUB_XML) += gdbstub-xml.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decContext.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decNumber.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal32.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal64.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal128.o
tci-dis.o: QEMU_CFLAGS += -I$(SRC_PATH)/tcg -I$(SRC_PATH)/tcg/tci
#########################################################
# Linux user emulator target
@@ -106,7 +85,7 @@ ifdef CONFIG_LINUX_USER
QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) -I$(SRC_PATH)/linux-user
obj-y += linux-user/
obj-y += gdbstub.o thunk.o user-exec.o
obj-y += gdbstub.o thunk.o user-exec.o $(oslib-obj-y)
endif #CONFIG_LINUX_USER
@@ -115,40 +94,51 @@ endif #CONFIG_LINUX_USER
ifdef CONFIG_BSD_USER
QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ABI_DIR) \
-I$(SRC_PATH)/bsd-user/$(HOST_VARIANT_DIR)
QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ARCH)
obj-y += bsd-user/
obj-y += gdbstub.o user-exec.o
obj-y += gdbstub.o user-exec.o $(oslib-obj-y)
endif #CONFIG_BSD_USER
#########################################################
# System emulator target
ifdef CONFIG_SOFTMMU
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
obj-y += qtest.o bootdevice.o
CONFIG_NO_PCI = $(if $(subst n,,$(CONFIG_PCI)),n,y)
CONFIG_NO_KVM = $(if $(subst n,,$(CONFIG_KVM)),n,y)
CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y)
CONFIG_NO_GET_MEMORY_MAPPING = $(if $(subst n,,$(CONFIG_HAVE_GET_MEMORY_MAPPING)),n,y)
CONFIG_NO_CORE_DUMP = $(if $(subst n,,$(CONFIG_HAVE_CORE_DUMP)),n,y)
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
obj-y += hw/
obj-$(CONFIG_FDT) += device_tree.o
obj-$(CONFIG_KVM) += kvm-all.o
obj-$(CONFIG_NO_KVM) += kvm-stub.o
obj-y += memory.o savevm.o cputlb.o
obj-y += memory_mapping.o
obj-y += dump.o
LIBS := $(libs_softmmu) $(LIBS)
obj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += memory_mapping.o
obj-$(CONFIG_HAVE_CORE_DUMP) += dump.o
obj-$(CONFIG_NO_GET_MEMORY_MAPPING) += memory_mapping-stub.o
obj-$(CONFIG_NO_CORE_DUMP) += dump-stub.o
LIBS+=-lz
QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
QEMU_CFLAGS += $(VNC_JPEG_CFLAGS)
QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
# xen support
obj-$(CONFIG_XEN) += xen-common.o
obj-$(CONFIG_XEN_I386) += xen-hvm.o xen-mapcache.o
obj-$(call lnot,$(CONFIG_XEN)) += xen-common-stub.o
obj-$(call lnot,$(CONFIG_XEN_I386)) += xen-hvm-stub.o
obj-$(CONFIG_XEN) += xen-all.o xen-mapcache.o
obj-$(CONFIG_NO_XEN) += xen-stub.o
# Hardware support
ifeq ($(TARGET_NAME), sparc64)
ifeq ($(TARGET_ARCH), sparc64)
obj-y += hw/sparc64/
else
obj-y += hw/$(TARGET_BASE_ARCH)/
endif
main.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
GENERATED_HEADERS += hmp-commands.h qmp-commands-old.h
endif # CONFIG_SOFTMMU
@@ -156,33 +146,32 @@ endif # CONFIG_SOFTMMU
# Workaround for http://gcc.gnu.org/PR55489, see configure.
%/translate.o: QEMU_CFLAGS += $(TRANSLATE_OPT_CFLAGS)
dummy := $(call unnest-vars,,obj-y)
all-obj-y := $(obj-y)
nested-vars += obj-y
target-obj-y :=
block-obj-y :=
common-obj-y :=
# This resolves all nested paths, so it must come last
include $(SRC_PATH)/Makefile.objs
dummy := $(call unnest-vars,,target-obj-y)
target-obj-y-save := $(target-obj-y)
dummy := $(call unnest-vars,.., \
block-obj-y \
block-obj-m \
common-obj-y \
common-obj-m)
target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
$(QEMU_PROG_BUILD): config-devices.mak
all-obj-y = $(obj-y)
all-obj-y += $(addprefix ../, $(universal-obj-y))
# build either PROG or PROGW
$(QEMU_PROG_BUILD): $(all-obj-y) ../libqemuutil.a ../libqemustub.a
$(call LINK, $(filter-out %.mak, $^))
ifdef CONFIG_DARWIN
$(call quiet-command,Rez -append $(SRC_PATH)/pc-bios/qemu.rsrc -o $@," REZ $(TARGET_DIR)$@")
$(call quiet-command,SetFile -a C $@," SETFILE $(TARGET_DIR)$@")
ifdef CONFIG_SOFTMMU
all-obj-y += $(addprefix ../, $(common-obj-y))
all-obj-y += $(addprefix ../libdis/, $(libdis-y))
all-obj-y += $(addprefix ../, $(trace-obj-y))
else
all-obj-y += $(addprefix ../libuser/, $(user-obj-y))
all-obj-y += $(addprefix ../libdis-user/, $(libdis-y))
endif #CONFIG_SOFTMMU
ifdef QEMU_PROGW
# The linker builds a windows executable. Make also a console executable.
$(QEMU_PROGW): $(all-obj-y) ../libqemustub.a
$(call LINK,$^)
$(QEMU_PROG): $(QEMU_PROGW)
$(call quiet-command,$(OBJCOPY) --subsystem console $(QEMU_PROGW) $(QEMU_PROG)," GEN $(TARGET_DIR)$(QEMU_PROG)")
else
$(QEMU_PROG): $(all-obj-y) ../libqemustub.a
$(call LINK,$^)
endif
gdbstub-xml.c: $(TARGET_XML_FILES) $(SRC_PATH)/scripts/feature_to_c.sh
@@ -204,12 +193,14 @@ endif
install: all
ifneq ($(PROGS),)
$(call install-prog,$(PROGS),$(DESTDIR)$(bindir))
$(INSTALL) -m 755 $(PROGS) "$(DESTDIR)$(bindir)"
ifneq ($(STRIP),)
$(STRIP) $(patsubst %,"$(DESTDIR)$(bindir)/%",$(PROGS))
endif
endif
ifdef CONFIG_TRACE_SYSTEMTAP
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset"
$(INSTALL_DATA) $(QEMU_PROG).stp-installed "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG).stp"
$(INSTALL_DATA) $(QEMU_PROG)-simpletrace.stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG)-simpletrace.stp"
$(INSTALL_DATA) $(QEMU_PROG).stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset"
endif
GENERATED_HEADERS += config-target.h

Makefile.user (new file)

@@ -0,0 +1,24 @@
# Makefile for qemu target independent user files.
include ../config-host.mak
include $(SRC_PATH)/rules.mak
-include config.mak
.PHONY: all
$(call set-vpath, $(SRC_PATH))
QEMU_CFLAGS+=-I..
QEMU_CFLAGS += -I$(SRC_PATH)/include
QEMU_CFLAGS += -DCONFIG_USER_ONLY
include $(SRC_PATH)/Makefile.objs
all: $(user-obj-y)
# Dummy command so that make thinks it has done something
@true
clean:
for d in . trace; do \
rm -f $$d/*.o $$d/*.d $$d/*.a $$d/*~; \
done

QMP/README (new file)

@@ -0,0 +1,88 @@
QEMU Monitor Protocol
=====================
Introduction
-------------
The QEMU Monitor Protocol (QMP) allows applications to communicate with
QEMU's Monitor.
QMP is JSON[1] based and currently has the following features:
- Lightweight, text-based, easy to parse data format
- Asynchronous message support (i.e. events)
- Capabilities Negotiation
For detailed information on QMP's usage, please, refer to the following files:
o qmp-spec.txt QEMU Monitor Protocol current specification
o qmp-commands.txt QMP supported commands (auto-generated at build-time)
o qmp-events.txt List of available asynchronous events
There is also a simple Python script called 'qmp-shell' available.
IMPORTANT: It's strongly recommended to read the 'Stability Considerations'
section in the qmp-commands.txt file before making any serious use of QMP.
[1] http://www.json.org
Usage
-----
To enable QMP, you need a QEMU monitor instance in "control mode". There are
two ways of doing this.
The simplest one is using the '-qmp' command-line option. The following
example makes QMP available on localhost port 4444:
$ qemu [...] -qmp tcp:localhost:4444,server
However, in order to have more complex combinations, like multiple monitors,
the '-mon' command-line option should be used along with the '-chardev' one.
For instance, the following example creates one user monitor on stdio and one
QMP monitor on localhost port 4444.
$ qemu [...] -chardev stdio,id=mon0 -mon chardev=mon0,mode=readline \
-chardev socket,id=mon1,host=localhost,port=4444,server \
-mon chardev=mon1,mode=control
Please, refer to QEMU's manpage for more information.
Simple Testing
--------------
To manually test QMP one can connect with telnet and issue commands by hand:
$ telnet localhost 4444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 50, "minor": 13, "major": 0}, "package": ""}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{ "execute": "query-version" }
{"return": {"qemu": {"micro": 50, "minor": 13, "major": 0}, "package": ""}}
Development Process
-------------------
When changing QMP's interface (by adding new commands, events or modifying
existing ones) it's mandatory to update the relevant documentation, which is
one (or more) of the files listed in the 'Introduction' section*.
Also, it's strongly recommended to send the documentation patch first, before
doing any code change. This is so because:
1. Avoids the code dictating the interface
2. Review can improve your interface. Letting that happen before
you implement it can save you work.
* The qmp-commands.txt file is generated from the qmp-commands.hx one, which
is the file that should be edited.
Homepage
--------
http://wiki.qemu.org/QMP


@@ -33,7 +33,7 @@
# $ qemu-ga-client fsfreeze freeze
# 2 filesystems frozen
#
# See also: http://wiki.qemu-project.org/Features/QAPI/GuestAgent
# See also: http://wiki.qemu.org/Features/QAPI/GuestAgent
#
import base64
@@ -267,9 +267,7 @@ def main(address, cmd, args):
print('Hint: qemu is not running?')
sys.exit(1)
if cmd == 'fsfreeze' and args[0] == 'freeze':
client.sync(60)
elif cmd != 'ping':
if cmd != 'ping':
client.sync()
globals()['_cmd_' + cmd](client, args)


@@ -1,16 +1,6 @@
QEMU Machine Protocol Events
QEMU Monitor Protocol Events
============================
ACPI_DEVICE_OST
---------------
Emitted when guest executes ACPI _OST method.
- data: ACPIOSTInfo type as described in qapi-schema.json
{ "event": "ACPI_DEVICE_OST",
"data": { "device": "d1", "slot": "0", "slot-type": "DIMM", "source": 1, "status": 0 } }
BALLOON_CHANGE
--------------
@@ -28,34 +18,6 @@ Example:
"data": { "actual": 944766976 },
"timestamp": { "seconds": 1267020223, "microseconds": 435656 } }
BLOCK_IMAGE_CORRUPTED
---------------------
Emitted when a disk image is being marked corrupt. The image can be
identified by its device or node name. The 'device' field is always
present for compatibility reasons, but it can be empty ("") if the
image does not have a device name associated.
Data:
- "device": Device name (json-string)
- "node-name": Node name (json-string, optional)
- "msg": Informative message (e.g., reason for the corruption)
(json-string)
- "offset": If the corruption resulted from an image access, this
is the host's access offset into the image
(json-int, optional)
- "size": If the corruption resulted from an image access, this
is the access size (json-int, optional)
Example:
{ "event": "BLOCK_IMAGE_CORRUPTED",
"data": { "device": "ide0-hd0", "node-name": "node0",
"msg": "Prevented active L1 table overwrite", "offset": 196608,
"size": 65536 },
"timestamp": { "seconds": 1378126126, "microseconds": 966463 } }
BLOCK_IO_ERROR
--------------
@@ -68,7 +30,7 @@ Data:
- "action": action that has been taken, it's one of the following (json-string):
"ignore": error has been ignored
"report": error has been reported to the device
"stop": the VM is going to stop because of the error
"stop": error caused VM to be stopped
Example:
@@ -163,43 +125,17 @@ Emitted when a block job is ready to complete.
Data:
- "type": Job type (json-string; "stream" for image streaming
"commit" for block commit)
- "device": Device name (json-string)
- "len": Maximum progress value (json-int)
- "offset": Current progress value (json-int)
On success this is equal to len.
On failure this is less than len.
- "speed": Rate limit, bytes per second (json-int)
- "device": device name (json-string)
Example:
{ "event": "BLOCK_JOB_READY",
"data": { "device": "drive0", "type": "mirror", "speed": 0,
"len": 2097152, "offset": 2097152 }
"data": { "device": "ide0-hd1" },
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
event.
DEVICE_DELETED
--------------
Emitted whenever the device removal completion is acknowledged
by the guest.
At this point, it's safe to reuse the specified device ID.
Device removal can be initiated by the guest or by HMP/QMP commands.
Data:
- "device": device name (json-string, optional)
- "path": device path (json-string)
{ "event": "DEVICE_DELETED",
"data": { "device": "virtio-net-pci-0",
"path": "/machine/peripheral/virtio-net-pci-0" },
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
DEVICE_TRAY_MOVED
-----------------
@@ -218,110 +154,10 @@ Data:
},
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
GUEST_PANICKED
--------------
Emitted when guest OS panic is detected.
Data:
- "action": Action that has been taken (json-string, currently always "pause").
Example:
{ "event": "GUEST_PANICKED",
"data": { "action": "pause" } }
MEM_UNPLUG_ERROR
--------------------
Emitted when memory hot unplug error occurs.
Data:
- "device": device name (json-string)
- "msg": Informative message (e.g., reason for the error) (json-string)
Example:
{ "event": "MEM_UNPLUG_ERROR"
"data": { "device": "dimm1",
"msg": "acpi: device unplug for unsupported device"
},
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
NIC_RX_FILTER_CHANGED
---------------------
The event is emitted only once until the query command is executed;
the first event will always be emitted.
Data:
- "name": net client name (json-string)
- "path": device path (json-string)
{ "event": "NIC_RX_FILTER_CHANGED",
"data": { "name": "vnet0",
"path": "/machine/peripheral/vnet0/virtio-backend" },
"timestamp": { "seconds": 1368697518, "microseconds": 326866 } }
POWERDOWN
---------
Emitted when the Virtual Machine is powered down through the power
control system, such as via ACPI.
Data: None.
Example:
{ "event": "POWERDOWN",
"timestamp": { "seconds": 1267040730, "microseconds": 682951 } }
QUORUM_FAILURE
--------------
Emitted by the Quorum block driver if it fails to establish a quorum.
Data:
- "reference": device name if defined else node name.
- "sector-num": Number of the first sector of the failed read operation.
- "sectors-count": Failed read operation sector count.
Example:
{ "event": "QUORUM_FAILURE",
"data": { "reference": "usr1", "sector-num": 345435, "sectors-count": 5 },
"timestamp": { "seconds": 1344522075, "microseconds": 745528 } }
QUORUM_REPORT_BAD
-----------------
Emitted to report a corruption of a Quorum file.
Data:
- "error": Error message (json-string, optional)
Only present on failure. This field contains a human-readable
error message. There are no semantics other than that the
block layer reported an error and clients should not try to
interpret the error string.
- "node-name": The graph node name of the block driver state.
- "sector-num": Number of the first sector of the failed read operation.
- "sectors-count": Failed read operation sector count.
Example:
{ "event": "QUORUM_REPORT_BAD",
"data": { "node-name": "1.raw", "sector-num": 345435, "sectors-count": 5 },
"timestamp": { "seconds": 1344522075, "microseconds": 745528 } }
RESET
-----
Emitted when the Virtual Machine is reset.
Emitted when the Virtual Machine is reseted.
Data: None.
@@ -349,8 +185,7 @@ Emitted when the guest changes the RTC time.
Data:
- "offset": Offset between base RTC clock (as specified by -rtc base), and
new RTC clock value (json-number)
- "offset": delta against the host UTC in seconds (json-number)
Example:
@@ -361,8 +196,7 @@ Example:
SHUTDOWN
--------
Emitted when the Virtual Machine has shut down, indicating that qemu
is about to exit.
Emitted when the Virtual Machine is powered down.
Data: None.
@@ -374,10 +208,10 @@ Example:
Note: If the command-line option "-no-shutdown" has been specified, a STOP
event will eventually follow the SHUTDOWN event.
SPICE_CONNECTED
---------------
SPICE_CONNECTED, SPICE_DISCONNECTED
-----------------------------------
Emitted when a SPICE client connects.
Emitted when a SPICE client connects or disconnects.
Data:
@@ -399,36 +233,11 @@ Example:
"client": {"port": "52873", "family": "ipv4", "host": "127.0.0.1"}
}}
SPICE_DISCONNECTED
------------------
Emitted when a SPICE client disconnects.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "client": Client information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
Example:
{ "timestamp": {"seconds": 1290688046, "microseconds": 388707},
"event": "SPICE_DISCONNECTED",
"data": {
"server": { "port": "5920", "family": "ipv4", "host": "127.0.0.1"},
"client": {"port": "52873", "family": "ipv4", "host": "127.0.0.1"}
}}
SPICE_INITIALIZED
-----------------
Emitted after initial handshake and authentication takes place (if any)
and the SPICE channel is up and running
and the SPICE channel is up'n'running
Data:
@@ -461,19 +270,6 @@ Example:
"channel-id": 0, "tls": true}
}}
SPICE_MIGRATE_COMPLETED
-----------------------
Emitted when SPICE migration has completed
Data: None.
Example:
{ "timestamp": {"seconds": 1290688046, "microseconds": 417172},
"event": "SPICE_MIGRATE_COMPLETED" }
STOP
----
@@ -602,22 +398,6 @@ Example:
"host": "127.0.0.1", "sasl_username": "luiz" } },
"timestamp": { "seconds": 1263475302, "microseconds": 150772 } }
VSERPORT_CHANGE
---------------
Emitted when the guest opens or closes a virtio-serial port.
Data:
- "id": device identifier of the virtio-serial port (json-string)
- "open": true if the guest has opened the virtio-serial port (json-bool)
Example:
{ "event": "VSERPORT_CHANGE",
"data": { "id": "channel0", "open": true },
"timestamp": { "seconds": 1401385907, "microseconds": 422329 } }
WAKEUP
------
@@ -627,7 +407,7 @@ Data: None.
Example:
{ "event": "WAKEUP",
{ "event": "WATCHDOG",
"timestamp": { "seconds": 1344522075, "microseconds": 745528 } }
WATCHDOG


@@ -31,8 +31,6 @@
# (QEMU)
import qmp
import json
import ast
import readline
import sys
import pprint
@@ -52,19 +50,6 @@ class QMPShellError(Exception):
class QMPShellBadPort(QMPShellError):
pass
class FuzzyJSON(ast.NodeTransformer):
'''This extension of ast.NodeTransformer filters literal "true/false/null"
values in an AST and replaces them by proper "True/False/None" values that
Python can properly evaluate.'''
def visit_Name(self, node):
if node.id == 'true':
node.id = 'True'
if node.id == 'false':
node.id = 'False'
if node.id == 'null':
node.id = 'None'
return node
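For reference, the intent of this transformer can be demonstrated standalone; parse with ast, rewrite the bare names, then let literal_eval do the rest:

import ast

class Fuzzy(ast.NodeTransformer):       # same idea as FuzzyJSON above
    NAMES = {'true': 'True', 'false': 'False', 'null': 'None'}
    def visit_Name(self, node):
        node.id = self.NAMES.get(node.id, node.id)
        return node

tree = ast.parse('{"discard": true, "backing": null}', mode='eval')
print(ast.literal_eval(Fuzzy().visit(tree)))
# {'discard': True, 'backing': None}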
# TODO: QMPShell's interface is a bit ugly (eg. _fill_completion() and
# _execute_cmd()). Let's design a better one.
class QMPShell(qmp.QEMUMonitorProtocol):
@@ -73,8 +58,6 @@ class QMPShell(qmp.QEMUMonitorProtocol):
self._greeting = None
self._completer = None
self._pp = pp
self._transmode = False
self._actions = list()
def __get_address(self, arg):
"""
@@ -104,122 +87,40 @@ class QMPShell(qmp.QEMUMonitorProtocol):
# clearing everything as it doesn't seem to matter
readline.set_completer_delims('')
def __parse_value(self, val):
try:
return int(val)
except ValueError:
pass
if val.lower() == 'true':
return True
if val.lower() == 'false':
return False
if val.startswith(('{', '[')):
# Try first as pure JSON:
try:
return json.loads(val)
except ValueError:
pass
# Try once again as FuzzyJSON:
try:
st = ast.parse(val, mode='eval')
return ast.literal_eval(FuzzyJSON().visit(st))
except SyntaxError:
pass
except ValueError:
pass
return val
def __cli_expr(self, tokens, parent):
for arg in tokens:
(key, _, val) = arg.partition('=')
if not val:
raise QMPShellError("Expected a key=value pair, got '%s'" % arg)
value = self.__parse_value(val)
optpath = key.split('.')
curpath = []
for p in optpath[:-1]:
curpath.append(p)
d = parent.get(p, {})
if type(d) is not dict:
raise QMPShellError('Cannot use "%s" as both leaf and non-leaf key' % '.'.join(curpath))
parent[p] = d
parent = d
if optpath[-1] in parent:
if type(parent[optpath[-1]]) is dict:
raise QMPShellError('Cannot use "%s" as both leaf and non-leaf key' % '.'.join(curpath))
else:
raise QMPShellError('Cannot set "%s" multiple times' % key)
parent[optpath[-1]] = value
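# Worked example (hypothetical command line): dotted keys nest, so
#
#   blockdev-add driver=qcow2 file.driver=file file.filename=disk.img
#
# produces arguments of
#
#   {'driver': 'qcow2',
#    'file': {'driver': 'file', 'filename': 'disk.img'}}
#
# while reusing a path as both leaf and non-leaf (file=x file.driver=y)
# raises QMPShellError.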
def __build_cmd(self, cmdline):
"""
Build a QMP input object from a user provided command-line in the
following format:
< command-name > [ arg-name1=arg1 ] ... [ arg-nameN=argN ]
"""
cmdargs = cmdline.split()
# Transactional CLI entry/exit:
if cmdargs[0] == 'transaction(':
self._transmode = True
cmdargs.pop(0)
elif cmdargs[0] == ')' and self._transmode:
self._transmode = False
if len(cmdargs) > 1:
raise QMPShellError("Unexpected input after close of Transaction sub-shell")
qmpcmd = { 'execute': 'transaction',
'arguments': { 'actions': self._actions } }
self._actions = list()
return qmpcmd
# Nothing to process?
if not cmdargs:
return None
# Parse and then cache this Transactional Action
if self._transmode:
finalize = False
action = { 'type': cmdargs[0], 'data': {} }
if cmdargs[-1] == ')':
cmdargs.pop(-1)
finalize = True
self.__cli_expr(cmdargs[1:], action['data'])
self._actions.append(action)
return self.__build_cmd(')') if finalize else None
# Standard command: parse and return it to be executed.
qmpcmd = { 'execute': cmdargs[0], 'arguments': {} }
self.__cli_expr(cmdargs[1:], qmpcmd['arguments'])
for arg in cmdargs[1:]:
opt = arg.split('=')
try:
value = int(opt[1])
except ValueError:
value = opt[1]
qmpcmd['arguments'][opt[0]] = value
return qmpcmd
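# Illustrative transaction-mode session: each cached action becomes one
# element of the 'actions' array when the sub-shell is closed with ')':
#
#   TRANS> drive-backup device=drive0 target=b0.img sync=full
#   TRANS> )
#
# sends:
#
#   { "execute": "transaction",
#     "arguments": { "actions": [
#         { "type": "drive-backup",
#           "data": { "device": "drive0",
#                     "target": "b0.img", "sync": "full" } } ] } }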
def _print(self, qmp):
jsobj = json.dumps(qmp)
if self._pp is not None:
self._pp.pprint(jsobj)
else:
print str(jsobj)
def _execute_cmd(self, cmdline):
try:
qmpcmd = self.__build_cmd(cmdline)
except Exception, e:
print 'Error while parsing command line: %s' % e
except:
print 'command format: <command-name> ',
print '[arg-name1=arg1] ... [arg-nameN=argN]'
return True
# For transaction mode, we may have just cached the action:
if qmpcmd is None:
return True
if self._verbose:
self._print(qmpcmd)
resp = self.cmd_obj(qmpcmd)
if resp is None:
print 'Disconnected'
return False
self._print(resp)
if self._pp is not None:
self._pp.pprint(resp)
else:
print resp
return True
def connect(self):
@@ -231,11 +132,6 @@ class QMPShell(qmp.QEMUMonitorProtocol):
version = self._greeting['QMP']['version']['qemu']
print 'Connected to QEMU %d.%d.%d\n' % (version['major'],version['minor'],version['micro'])
def get_prompt(self):
if self._transmode:
return "TRANS> "
return "(QEMU) "
def read_exec_command(self, prompt):
"""
Read and execute a command.
@@ -255,9 +151,6 @@ class QMPShell(qmp.QEMUMonitorProtocol):
else:
return self._execute_cmd(cmdline)
def set_verbosity(self, verbose):
self._verbose = verbose
class HMPShell(QMPShell):
def __init__(self, address):
QMPShell.__init__(self, address)
@@ -335,7 +228,7 @@ def die(msg):
def fail_cmdline(option=None):
if option:
sys.stderr.write('ERROR: bad command-line option \'%s\'\n' % option)
sys.stderr.write('qemu-shell [ -v ] [ -p ] [ -H ] < UNIX socket path> | < TCP address:port >\n')
sys.stderr.write('qemu-shell [ -p ] [ -H ] < UNIX socket path> | < TCP address:port >\n')
sys.exit(1)
def main():
@@ -343,7 +236,6 @@ def main():
qemu = None
hmp = False
pp = None
verbose = False
try:
for arg in sys.argv[1:]:
@@ -355,8 +247,6 @@ def main():
if pp is not None:
fail_cmdline(arg)
pp = pprint.PrettyPrinter(indent=4)
elif arg == "-v":
verbose = True
else:
if qemu is not None:
fail_cmdline(arg)
@@ -381,8 +271,7 @@ def main():
die('Could not connect to %s' % addr)
qemu.show_banner()
qemu.set_verbosity(verbose)
while qemu.read_exec_command(qemu.get_prompt()):
while qemu.read_exec_command('(QEMU) '):
pass
qemu.close()


@@ -1,55 +1,35 @@
QEMU Machine Protocol Specification
0. About This Document
======================
Copyright (C) 2009-2015 Red Hat, Inc.
This work is licensed under the terms of the GNU GPL, version 2 or
later. See the COPYING file in the top-level directory.
QEMU Monitor Protocol Specification - Version 0.1
1. Introduction
===============
This document specifies the QEMU Machine Protocol (QMP), a JSON-based
protocol which is available for applications to operate QEMU at the
machine-level. It is also in use by the QEMU Guest Agent (QGA), which
is available for host applications to interact with the guest
operating system.
This document specifies the QEMU Monitor Protocol (QMP), a JSON-based protocol
which is available for applications to control QEMU at the machine-level.
To enable QMP support, QEMU has to be run in "control mode". This is done by
starting QEMU with the appropriate command-line options. Please, refer to the
QEMU manual page for more information.
2. Protocol Specification
=========================
This section details the protocol format. For the purpose of this document
"Client" is any application which is using QMP to communicate with QEMU and
"Server" is QEMU itself.
"Client" is any application which is communicating with QEMU in control mode,
and "Server" is QEMU itself.
JSON data structures, when mentioned in this document, are always in the
following format:
json-DATA-STRUCTURE-NAME
Where DATA-STRUCTURE-NAME is any valid JSON data structure, as defined
by the JSON standard:
Where DATA-STRUCTURE-NAME is any valid JSON data structure, as defined by
the JSON standard:
http://www.ietf.org/rfc/rfc7159.txt
http://www.ietf.org/rfc/rfc4627.txt
The protocol is always encoded in UTF-8 except for synchronization
bytes (documented below); although thanks to json-string escape
sequences, the server will reply using only the strict ASCII subset.
For convenience, json-object members mentioned in this document will
be in a certain order. However, in real protocol usage they can be in
ANY order, thus no particular order should be assumed. On the other
hand, use of json-array elements presumes that preserving order is
important unless specifically documented otherwise. Repeating a key
within a json-object gives unpredictable results.
Also for convenience, the server will accept an extension of
'single-quoted' strings in place of the usual "double-quoted"
json-string, and both input forms of strings understand an additional
escape sequence of "\'" for a single quote. The server will only use
double quoting on output.
For convenience, json-object members and json-array elements mentioned in
this document will be in a certain order. However, in real protocol usage
they can be in ANY order, thus no particular order should be assumed.
2.1 General Definitions
-----------------------
@@ -67,25 +47,16 @@ that the connection has been successfully established and that the Server is
ready for capabilities negotiation (for more information refer to section
'4. Capabilities Negotiation').
The greeting message format is:
The format is:
{ "QMP": { "version": json-object, "capabilities": json-array } }
Where,
- The "version" member contains the Server's version information (the format
is the same of the query-version command)
is the same of the 'query-version' command)
- The "capabilities" member specify the availability of features beyond the
baseline specification; the order of elements in this array has no
particular significance, so a client must search the entire array
when looking for a particular capability
2.2.1 Capabilities
------------------
As of the date this document was last revised, no server or client
capability strings have been defined.
baseline specification
2.3 Issuing Commands
--------------------
@@ -98,14 +69,10 @@ The format for command execution is:
- The "execute" member identifies the command to be executed by the Server
- The "arguments" member is used to pass any arguments required for the
execution of the command, it is optional when no arguments are
required. Each command documents what contents will be considered
valid when handling the json-argument
execution of the command, it is optional when no arguments are required
- The "id" member is a transaction identification associated with the
command execution, it is optional and will be part of the response if
provided. The "id" member can be any json-value, although most
clients merely use a json-number incremented for each successive
command
provided
2.4 Commands Responses
----------------------
@@ -116,24 +83,28 @@ of a command execution: success or error.
2.4.1 success
-------------
The format of a success response is:
The success response is issued when the command execution has finished
without errors.
{ "return": json-value, "id": json-value }
The format is:
{ "return": json-object, "id": json-value }
Where,
- The "return" member contains the data returned by the command, which
is defined on a per-command basis (usually a json-object or
json-array of json-objects, but sometimes a json-number, json-string,
or json-array of json-strings); it is an empty json-object if the
command does not return data
- The "return" member contains the command returned data, which is defined
in a per-command basis or an empty json-object if the command does not
return data
- The "id" member contains the transaction identification associated
with the command execution if issued by the Client
with the command execution (if issued by the Client)
2.4.2 error
-----------
The format of an error response is:
The error response is issued when the command execution could not be
completed because of an error condition.
The format is:
{ "error": { "class": json-string, "desc": json-string }, "id": json-value }
@@ -143,7 +114,7 @@ The format of an error response is:
- The "desc" member is a human-readable error message. Clients should
not attempt to parse this message.
- The "id" member contains the transaction identification associated with
the command execution if issued by the Client
the command execution (if issued by the Client)
NOTE: Some errors can occur before the Server is able to read the "id" member,
in these cases the "id" member will not be part of the error response, even
@@ -153,10 +124,9 @@ if provided by the client.
-----------------------
As a result of state changes, the Server may send messages unilaterally
to the Client at any time, when not in the middle of any other
response. They are called "asynchronous events".
to the Client at any time. They are called 'asynchronous events'.
The format of asynchronous events is:
The format is:
{ "event": json-string, "data": json-object,
"timestamp": { "seconds": json-number, "microseconds": json-number } }
@@ -166,89 +136,69 @@ The format of asynchronous events is:
- The "event" member contains the event's name
- The "data" member contains event specific data, which is defined in a
per-event basis, it is optional
- The "timestamp" member contains the exact time of when the event
occurred in the Server. It is a fixed json-object with time in
seconds and microseconds relative to the Unix Epoch (1 Jan 1970); if
there is a failure to retrieve host time, both members of the
timestamp will be set to -1.
- The "timestamp" member contains the exact time of when the event occurred
in the Server. It is a fixed json-object with time in seconds and
microseconds
For a listing of supported asynchronous events, please, refer to the
qmp-events.txt file.
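Because events share the wire with command responses, a client's reader has to route on message shape. A minimal sketch (any message carrying an "event" member is asynchronous; everything else answers a command):

import json

def route(line, replies, events):
    msg = json.loads(line)
    # Events carry an 'event' member; replies carry 'return' or 'error'.
    (events if 'event' in msg else replies).append(msg)
    return msg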
2.5 QGA Synchronization
-----------------------
When using QGA, an additional synchronization feature is built into
the protocol. If the Client sends a raw 0xFF sentinel byte (not valid
JSON), then the Server will reset its state and discard all pending
data prior to the sentinel. Conversely, if the Client makes use of
the 'guest-sync-delimited' command, the Server will send a raw 0xFF
sentinel byte prior to its response, to aid the Client in discarding
any data prior to the sentinel.
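A sketch of the client side of this synchronization, assuming a connected stream socket to the agent and newline-delimited replies (the id value is arbitrary and only used to match the reply):

import json, random

def qga_sync(sock):
    token = random.randint(1, 2 ** 31)
    req = {'execute': 'guest-sync-delimited', 'arguments': {'id': token}}
    sock.sendall(b'\xff' + json.dumps(req).encode() + b'\n')
    byte = sock.recv(1)
    while byte != b'\xff':        # drop everything before the sentinel
        byte = sock.recv(1)
    reply = b''
    while not reply.endswith(b'\n'):
        reply += sock.recv(1)
    assert json.loads(reply.decode())['return'] == token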
3. QMP Examples
===============
This section provides some examples of real QMP usage, in all of them
"C" stands for "Client" and "S" stands for "Server".
'C' stands for 'Client' and 'S' stands for 'Server'.
3.1 Server greeting
-------------------
S: { "QMP": { "version": { "qemu": { "micro": 50, "minor": 6, "major": 1 },
"package": ""}, "capabilities": []}}
S: {"QMP": {"version": {"qemu": "0.12.50", "package": ""}, "capabilities": []}}
3.2 Client QMP negotiation
--------------------------
C: { "execute": "qmp_capabilities" }
S: { "return": {}}
3.3 Simple 'stop' execution
3.2 Simple 'stop' execution
---------------------------
C: { "execute": "stop" }
S: { "return": {} }
S: {"return": {}}
3.4 KVM information
3.3 KVM information
-------------------
C: { "execute": "query-kvm", "id": "example" }
S: { "return": { "enabled": true, "present": true }, "id": "example"}
S: {"return": {"enabled": true, "present": true}, "id": "example"}
3.5 Parsing error
3.4 Parsing error
------------------
C: { "execute": }
S: { "error": { "class": "GenericError", "desc": "Invalid JSON syntax" } }
S: {"error": {"class": "GenericError", "desc": "Invalid JSON syntax" } }
3.6 Powerdown event
3.5 Powerdown event
-------------------
S: { "timestamp": { "seconds": 1258551470, "microseconds": 802384 },
"event": "POWERDOWN" }
S: {"timestamp": {"seconds": 1258551470, "microseconds": 802384}, "event":
"POWERDOWN"}
4. Capabilities Negotiation
===========================
----------------------------
When a Client successfully establishes a connection, the Server is in
Capabilities Negotiation mode.
In this mode only the qmp_capabilities command is allowed to run, all
other commands will return the CommandNotFound error. Asynchronous
messages are not delivered either.
In this mode only the 'qmp_capabilities' command is allowed to run, all
other commands will return the CommandNotFound error. Asynchronous messages
are not delivered either.
Clients should use the qmp_capabilities command to enable capabilities
Clients should use the 'qmp_capabilities' command to enable capabilities
advertised in the Server's greeting (section '2.2 Server Greeting') they
support.
When the qmp_capabilities command is issued, and if it does not return an
When the 'qmp_capabilities' command is issued, and if it does not return an
error, the Server enters in Command mode where capabilities changes take
effect, all commands (except qmp_capabilities) are allowed and asynchronous
effect, all commands (except 'qmp_capabilities') are allowed and asynchronous
messages are delivered.
5 Compatibility Considerations
==============================
------------------------------
All protocol changes or new features which modify the protocol format in an
incompatible way are disabled by default and will be advertised by the
@@ -272,16 +222,12 @@ However, Clients must not assume any particular:
- Amount of errors generated by a command, that is, new errors can be added
to any existing command in newer versions of the Server
Any command or field name beginning with "x-" is deemed experimental,
and may be withdrawn or changed in an incompatible manner in a future
release.
Of course, the Server does guarantee to send valid JSON. But apart from
this, a Client should be "conservative in what they send, and liberal in
what they accept".
6. Downstream extension of QMP
==============================
------------------------------
We recommend that downstream consumers of QEMU do *not* modify QMP.
Management tools should be able to support both upstream and downstream
@@ -299,7 +245,7 @@ arguments, errors, asynchronous events, and so forth.
Any new names downstream wishes to add must begin with '__'. To
ensure compatibility with other downstreams, it is strongly
recommended that you prefix your downstream names with '__RFQDN_' where
recommended that you prefix your downstram names with '__RFQDN_' where
RFQDN is a valid, reverse fully qualified domain name which you
control. For example, a qemu-kvm specific monitor command would be:


@@ -1,5 +1,5 @@
# QEMU Monitor Protocol Python class
#
#
# Copyright (C) 2009, 2010 Red Hat Inc.
#
# Authors:
@@ -21,9 +21,6 @@ class QMPConnectError(QMPError):
class QMPCapabilitiesError(QMPError):
pass
class QMPTimeoutError(QMPError):
pass
class QEMUMonitorProtocol:
def __init__(self, address, server=False):
"""
@@ -75,44 +72,6 @@ class QEMUMonitorProtocol:
error = socket.error
def __get_events(self, wait=False):
"""
Check for new events in the stream and cache them in __events.
@param wait (bool): block until an event is available.
@param wait (float): If wait is a float, treat it as a timeout value.
@raise QMPTimeoutError: If a timeout float is provided and the timeout
period elapses.
@raise QMPConnectError: If wait is True but no events could be retrieved
or if some other error occurred.
"""
# Check for new events regardless and pull them into the cache:
self.__sock.setblocking(0)
try:
self.__json_read()
except socket.error, err:
if err[0] == errno.EAGAIN:
# No data available
pass
self.__sock.setblocking(1)
# Wait for new events, if needed.
# if wait is 0.0, this means "no wait" and is also implicitly false.
if not self.__events and wait:
if isinstance(wait, float):
self.__sock.settimeout(wait)
try:
ret = self.__json_read(only_event=True)
except socket.timeout:
raise QMPTimeoutError("Timeout waiting for event")
except:
raise QMPConnectError("Error while reading from socket")
if ret is None:
raise QMPConnectError("Error while reading from socket")
self.__sock.settimeout(None)
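# Usage sketch: the same 'wait' argument selects all three behaviours.
#
#   mon.get_events()            # drain the cache, never block
#   mon.get_events(wait=True)   # block until at least one event
#   mon.get_events(wait=5.0)    # QMPTimeoutError after 5 seconds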
def connect(self, negotiate=True):
"""
Connect to the QMP Monitor and perform capabilities negotiation.
@@ -181,37 +140,38 @@ class QEMUMonitorProtocol:
"""
Get and delete the first available QMP event.
@param wait (bool): block until an event is available.
@param wait (float): If wait is a float, treat it as a timeout value.
@raise QMPTimeoutError: If a timeout float is provided and the timeout
period elapses.
@raise QMPConnectError: If wait is True but no events could be retrieved
or if some other error occurred.
@return The first available QMP event, or None.
@param wait: block until an event is available (bool)
"""
self.__get_events(wait)
if self.__events:
return self.__events.pop(0)
return None
self.__sock.setblocking(0)
try:
self.__json_read()
except socket.error, err:
if err[0] == errno.EAGAIN:
# No data available
pass
self.__sock.setblocking(1)
if not self.__events and wait:
self.__json_read(only_event=True)
event = self.__events[0]
del self.__events[0]
return event
def get_events(self, wait=False):
"""
Get a list of available QMP events.
@param wait (bool): block until an event is available.
@param wait (float): If wait is a float, treat it as a timeout value.
@raise QMPTimeoutError: If a timeout float is provided and the timeout
period elapses.
@raise QMPConnectError: If wait is True but no events could be retrieved
or if some other error occurred.
@return The list of available QMP events.
@param wait: block until an event is available (bool)
"""
self.__get_events(wait)
self.__sock.setblocking(0)
try:
self.__json_read()
except socket.error, err:
if err[0] == errno.EAGAIN:
# No data available
pass
self.__sock.setblocking(1)
if not self.__events and wait:
self.__json_read(only_event=True)
return self.__events
def clear_events(self):
@@ -228,9 +188,3 @@ class QEMUMonitorProtocol:
def settimeout(self, timeout):
self.__sock.settimeout(timeout)
def get_sock_fd(self):
return self.__sock.fileno()
def is_scm_available(self):
return self.__sock.family == socket.AF_UNIX

README

@@ -1,3 +1,3 @@
Read the documentation in qemu-doc.html or on http://wiki.qemu-project.org
Read the documentation in qemu-doc.html or on http://wiki.qemu.org
- QEMU team

TODO (new file)

@@ -0,0 +1,37 @@
General:
-------
- cycle counter for all archs
- cpu_interrupt() win32/SMP fix
- merge PIC spurious interrupt patch
- warning for OS/2: must not use 128 MB memory (merge bochs cmos patch ?)
- config file (at least for windows/Mac OS X)
- update doc: PCI infos.
- basic VGA optimizations
- better code fetch
- do not resize vga if invalid size.
- TLB code protection support for PPC
- disable SMC handling for ARM/SPARC/PPC (not finished)
- see undefined flags for BTx insn
- keyboard output buffer filling timing emulation
- tests for each target CPU
- fix all remaining thread lock issues (must put TBs in a specific invalid
state, find a solution for tb_flush()).
ppc specific:
------------
- TLB invalidate not needed if msr_pr changes
- enable shift optimizations ?
linux-user specific:
-------------------
- remove threading support as it cannot work at this point
- improve IPC syscalls
- more syscalls (in particular all 64 bit ones, IPCs, fix 64 bit
issues, fix 16 bit uid issues)
- use kernel traps for unaligned accesses on ARM ?
lower priority:
--------------
- int15 ah=86: use better timing
- use -msoft-float on ARM


@@ -1 +1 @@
2.3.50
1.3.1

a.out.h (new file)

@@ -0,0 +1,430 @@
/* a.out.h
Copyright 1997, 1998, 1999, 2001 Red Hat, Inc.
This file is part of Cygwin.
This software is a copyrighted work licensed under the terms of the
Cygwin license. Please consult the file "CYGWIN_LICENSE" for
details. */
#ifndef _A_OUT_H_
#define _A_OUT_H_
#ifdef __cplusplus
extern "C" {
#endif
#define COFF_IMAGE_WITH_PE
#define COFF_LONG_SECTION_NAMES
/*** coff information for Intel 386/486. */
/********************** FILE HEADER **********************/
struct external_filehdr {
short f_magic; /* magic number */
short f_nscns; /* number of sections */
host_ulong f_timdat; /* time & date stamp */
host_ulong f_symptr; /* file pointer to symtab */
host_ulong f_nsyms; /* number of symtab entries */
short f_opthdr; /* sizeof(optional hdr) */
short f_flags; /* flags */
};
/* Bits for f_flags:
* F_RELFLG relocation info stripped from file
* F_EXEC file is executable (no unresolved external references)
* F_LNNO line numbers stripped from file
* F_LSYMS local symbols stripped from file
* F_AR32WR file has byte ordering of an AR32WR machine (e.g. vax)
*/
#define F_RELFLG (0x0001)
#define F_EXEC (0x0002)
#define F_LNNO (0x0004)
#define F_LSYMS (0x0008)
#define I386MAGIC 0x14c
#define I386PTXMAGIC 0x154
#define I386AIXMAGIC 0x175
/* This is Lynx's all-platform magic number for executables. */
#define LYNXCOFFMAGIC 0415
#define I386BADMAG(x) (((x).f_magic != I386MAGIC) \
&& (x).f_magic != I386AIXMAGIC \
&& (x).f_magic != I386PTXMAGIC \
&& (x).f_magic != LYNXCOFFMAGIC)
#define FILHDR struct external_filehdr
#define FILHSZ 20
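With FILHSZ fixed at 20 bytes, the file header can be decoded with a plain struct format. A sketch, assuming a little-endian image and a 32-bit host_ulong:

import struct

FILHDR = struct.Struct('<2h3L2h')   # matches FILHSZ == 20

def read_filehdr(data):
    magic, nscns, timdat, symptr, nsyms, opthdr, flags = \
        FILHDR.unpack_from(data)
    return {'f_magic': hex(magic), 'f_nscns': nscns, 'f_nsyms': nsyms,
            'f_opthdr': opthdr, 'f_flags': hex(flags)}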
/********************** AOUT "OPTIONAL HEADER" **********************/
typedef struct
{
unsigned short magic; /* type of file */
unsigned short vstamp; /* version stamp */
host_ulong tsize; /* text size in bytes, padded to FW bdry*/
host_ulong dsize; /* initialized data " " */
host_ulong bsize; /* uninitialized data " " */
host_ulong entry; /* entry pt. */
host_ulong text_start; /* base of text used for this file */
host_ulong data_start; /* base of data used for this file */
}
AOUTHDR;
#define AOUTSZ 28
#define AOUTHDRSZ 28
#define OMAGIC 0404 /* object files, eg as output */
#define ZMAGIC 0413 /* demand load format, eg normal ld output */
#define STMAGIC 0401 /* target shlib */
#define SHMAGIC 0443 /* host shlib */
/* define some NT default values */
/* #define NT_IMAGE_BASE 0x400000 moved to internal.h */
#define NT_SECTION_ALIGNMENT 0x1000
#define NT_FILE_ALIGNMENT 0x200
#define NT_DEF_RESERVE 0x100000
#define NT_DEF_COMMIT 0x1000
/********************** SECTION HEADER **********************/
struct external_scnhdr {
char s_name[8]; /* section name */
host_ulong s_paddr; /* physical address, offset
of last addr in scn */
host_ulong s_vaddr; /* virtual address */
host_ulong s_size; /* section size */
host_ulong s_scnptr; /* file ptr to raw data for section */
host_ulong s_relptr; /* file ptr to relocation */
host_ulong s_lnnoptr; /* file ptr to line numbers */
unsigned short s_nreloc; /* number of relocation entries */
unsigned short s_nlnno; /* number of line number entries*/
host_ulong s_flags; /* flags */
};
#define SCNHDR struct external_scnhdr
#define SCNHSZ 40
/*
* names of "special" sections
*/
#define _TEXT ".text"
#define _DATA ".data"
#define _BSS ".bss"
#define _COMMENT ".comment"
#define _LIB ".lib"
/********************** LINE NUMBERS **********************/
/* 1 line number entry for every "breakpointable" source line in a section.
* Line numbers are grouped on a per function basis; first entry in a function
* grouping will have l_lnno = 0 and in place of physical address will be the
* symbol table index of the function name.
*/
struct external_lineno {
union {
host_ulong l_symndx; /* function name symbol index, iff l_lnno 0 */
host_ulong l_paddr; /* (physical) address of line number */
} l_addr;
unsigned short l_lnno; /* line number */
};
#define LINENO struct external_lineno
#define LINESZ 6
/********************** SYMBOLS **********************/
#define E_SYMNMLEN 8 /* # characters in a symbol name */
#define E_FILNMLEN 14 /* # characters in a file name */
#define E_DIMNUM 4 /* # array dimensions in auxiliary entry */
struct QEMU_PACKED external_syment
{
union {
char e_name[E_SYMNMLEN];
struct {
host_ulong e_zeroes;
host_ulong e_offset;
} e;
} e;
host_ulong e_value;
unsigned short e_scnum;
unsigned short e_type;
char e_sclass[1];
char e_numaux[1];
};
#define N_BTMASK (0xf)
#define N_TMASK (0x30)
#define N_BTSHFT (4)
#define N_TSHIFT (2)
union external_auxent {
struct {
host_ulong x_tagndx; /* str, un, or enum tag indx */
union {
struct {
unsigned short x_lnno; /* declaration line number */
unsigned short x_size; /* str/union/array size */
} x_lnsz;
host_ulong x_fsize; /* size of function */
} x_misc;
union {
struct { /* if ISFCN, tag, or .bb */
host_ulong x_lnnoptr;/* ptr to fcn line # */
host_ulong x_endndx; /* entry ndx past block end */
} x_fcn;
struct { /* if ISARY, up to 4 dimen. */
char x_dimen[E_DIMNUM][2];
} x_ary;
} x_fcnary;
unsigned short x_tvndx; /* tv index */
} x_sym;
union {
char x_fname[E_FILNMLEN];
struct {
host_ulong x_zeroes;
host_ulong x_offset;
} x_n;
} x_file;
struct {
host_ulong x_scnlen; /* section length */
unsigned short x_nreloc; /* # relocation entries */
unsigned short x_nlinno; /* # line numbers */
host_ulong x_checksum; /* section COMDAT checksum */
unsigned short x_associated;/* COMDAT associated section index */
char x_comdat[1]; /* COMDAT selection number */
} x_scn;
struct {
host_ulong x_tvfill; /* tv fill value */
unsigned short x_tvlen; /* length of .tv */
char x_tvran[2][2]; /* tv range */
} x_tv; /* info about .tv section (in auxent of symbol .tv)) */
};
#define SYMENT struct external_syment
#define SYMESZ 18
#define AUXENT union external_auxent
#define AUXESZ 18
#define _ETEXT "etext"
/********************** RELOCATION DIRECTIVES **********************/
struct external_reloc {
char r_vaddr[4];
char r_symndx[4];
char r_type[2];
};
#define RELOC struct external_reloc
#define RELSZ 10
/* end of coff/i386.h */
/* PE COFF header information */
#ifndef _PE_H
#define _PE_H
/* NT specific file attributes */
#define IMAGE_FILE_RELOCS_STRIPPED 0x0001
#define IMAGE_FILE_EXECUTABLE_IMAGE 0x0002
#define IMAGE_FILE_LINE_NUMS_STRIPPED 0x0004
#define IMAGE_FILE_LOCAL_SYMS_STRIPPED 0x0008
#define IMAGE_FILE_BYTES_REVERSED_LO 0x0080
#define IMAGE_FILE_32BIT_MACHINE 0x0100
#define IMAGE_FILE_DEBUG_STRIPPED 0x0200
#define IMAGE_FILE_SYSTEM 0x1000
#define IMAGE_FILE_DLL 0x2000
#define IMAGE_FILE_BYTES_REVERSED_HI 0x8000
/* additional flags to be set for section headers to allow the NT loader to
read and write to the section data (to replace the addresses of data in
dlls for one thing); also to execute the section in .text's case */
#define IMAGE_SCN_MEM_DISCARDABLE 0x02000000
#define IMAGE_SCN_MEM_EXECUTE 0x20000000
#define IMAGE_SCN_MEM_READ 0x40000000
#define IMAGE_SCN_MEM_WRITE 0x80000000
/*
* Section characteristics added for ppc-nt
*/
#define IMAGE_SCN_TYPE_NO_PAD 0x00000008 /* Reserved. */
#define IMAGE_SCN_CNT_CODE 0x00000020 /* Section contains code. */
#define IMAGE_SCN_CNT_INITIALIZED_DATA 0x00000040 /* Section contains initialized data. */
#define IMAGE_SCN_CNT_UNINITIALIZED_DATA 0x00000080 /* Section contains uninitialized data. */
#define IMAGE_SCN_LNK_OTHER 0x00000100 /* Reserved. */
#define IMAGE_SCN_LNK_INFO 0x00000200 /* Section contains comments or some other type of information. */
#define IMAGE_SCN_LNK_REMOVE 0x00000800 /* Section contents will not become part of image. */
#define IMAGE_SCN_LNK_COMDAT 0x00001000 /* Section contents comdat. */
#define IMAGE_SCN_MEM_FARDATA 0x00008000
#define IMAGE_SCN_MEM_PURGEABLE 0x00020000
#define IMAGE_SCN_MEM_16BIT 0x00020000
#define IMAGE_SCN_MEM_LOCKED 0x00040000
#define IMAGE_SCN_MEM_PRELOAD 0x00080000
#define IMAGE_SCN_ALIGN_1BYTES 0x00100000
#define IMAGE_SCN_ALIGN_2BYTES 0x00200000
#define IMAGE_SCN_ALIGN_4BYTES 0x00300000
#define IMAGE_SCN_ALIGN_8BYTES 0x00400000
#define IMAGE_SCN_ALIGN_16BYTES 0x00500000 /* Default alignment if no others are specified. */
#define IMAGE_SCN_ALIGN_32BYTES 0x00600000
#define IMAGE_SCN_ALIGN_64BYTES 0x00700000
#define IMAGE_SCN_LNK_NRELOC_OVFL 0x01000000 /* Section contains extended relocations. */
#define IMAGE_SCN_MEM_NOT_CACHED 0x04000000 /* Section is not cachable. */
#define IMAGE_SCN_MEM_NOT_PAGED 0x08000000 /* Section is not pageable. */
#define IMAGE_SCN_MEM_SHARED 0x10000000 /* Section is shareable. */
/* COMDAT selection codes. */
#define IMAGE_COMDAT_SELECT_NODUPLICATES (1) /* Warn if duplicates. */
#define IMAGE_COMDAT_SELECT_ANY (2) /* No warning. */
#define IMAGE_COMDAT_SELECT_SAME_SIZE (3) /* Warn if different size. */
#define IMAGE_COMDAT_SELECT_EXACT_MATCH (4) /* Warn if different. */
#define IMAGE_COMDAT_SELECT_ASSOCIATIVE (5) /* Base on other section. */
/* Magic values that are true for all dos/nt implementations */
#define DOSMAGIC 0x5a4d
#define NT_SIGNATURE 0x00004550
/* NT allows long filenames, we want to accommodate this. This may break
some of the bfd functions */
#undef FILNMLEN
#define FILNMLEN 18 /* # characters in a file name */
#ifdef COFF_IMAGE_WITH_PE
/* The filehdr is only weird in images */
#undef FILHDR
struct external_PE_filehdr
{
/* DOS header fields */
unsigned short e_magic; /* Magic number, 0x5a4d */
unsigned short e_cblp; /* Bytes on last page of file, 0x90 */
unsigned short e_cp; /* Pages in file, 0x3 */
unsigned short e_crlc; /* Relocations, 0x0 */
unsigned short e_cparhdr; /* Size of header in paragraphs, 0x4 */
unsigned short e_minalloc; /* Minimum extra paragraphs needed, 0x0 */
unsigned short e_maxalloc; /* Maximum extra paragraphs needed, 0xFFFF */
unsigned short e_ss; /* Initial (relative) SS value, 0x0 */
unsigned short e_sp; /* Initial SP value, 0xb8 */
unsigned short e_csum; /* Checksum, 0x0 */
unsigned short e_ip; /* Initial IP value, 0x0 */
unsigned short e_cs; /* Initial (relative) CS value, 0x0 */
unsigned short e_lfarlc; /* File address of relocation table, 0x40 */
unsigned short e_ovno; /* Overlay number, 0x0 */
char e_res[4][2]; /* Reserved words, all 0x0 */
unsigned short e_oemid; /* OEM identifier (for e_oeminfo), 0x0 */
unsigned short e_oeminfo; /* OEM information; e_oemid specific, 0x0 */
char e_res2[10][2]; /* Reserved words, all 0x0 */
host_ulong e_lfanew; /* File address of new exe header, 0x80 */
char dos_message[16][4]; /* other stuff, always follow DOS header */
unsigned int nt_signature; /* required NT signature, 0x4550 */
/* From standard header */
unsigned short f_magic; /* magic number */
unsigned short f_nscns; /* number of sections */
host_ulong f_timdat; /* time & date stamp */
host_ulong f_symptr; /* file pointer to symtab */
host_ulong f_nsyms; /* number of symtab entries */
unsigned short f_opthdr; /* sizeof(optional hdr) */
unsigned short f_flags; /* flags */
};
#define FILHDR struct external_PE_filehdr
#undef FILHSZ
#define FILHSZ 152
#endif
typedef struct
{
unsigned short magic; /* type of file */
unsigned short vstamp; /* version stamp */
host_ulong tsize; /* text size in bytes, padded to FW bdry*/
host_ulong dsize; /* initialized data " " */
host_ulong bsize; /* uninitialized data " " */
host_ulong entry; /* entry pt. */
host_ulong text_start; /* base of text used for this file */
host_ulong data_start; /* base of all data used for this file */
/* NT extra fields; see internal.h for descriptions */
host_ulong ImageBase;
host_ulong SectionAlignment;
host_ulong FileAlignment;
unsigned short MajorOperatingSystemVersion;
unsigned short MinorOperatingSystemVersion;
unsigned short MajorImageVersion;
unsigned short MinorImageVersion;
unsigned short MajorSubsystemVersion;
unsigned short MinorSubsystemVersion;
char Reserved1[4];
host_ulong SizeOfImage;
host_ulong SizeOfHeaders;
host_ulong CheckSum;
unsigned short Subsystem;
unsigned short DllCharacteristics;
host_ulong SizeOfStackReserve;
host_ulong SizeOfStackCommit;
host_ulong SizeOfHeapReserve;
host_ulong SizeOfHeapCommit;
host_ulong LoaderFlags;
host_ulong NumberOfRvaAndSizes;
/* IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES]; */
char DataDirectory[16][2][4]; /* 16 entries, 2 elements/entry, 4 chars */
} PEAOUTHDR;
#undef AOUTSZ
#define AOUTSZ (AOUTHDRSZ + 196)
#undef E_FILNMLEN
#define E_FILNMLEN 18 /* # characters in a file name */
#endif
/* end of coff/pe.h */
#define DT_NON (0) /* no derived type */
#define DT_PTR (1) /* pointer */
#define DT_FCN (2) /* function */
#define DT_ARY (3) /* array */
#define ISPTR(x) (((x) & N_TMASK) == (DT_PTR << N_BTSHFT))
#define ISFCN(x) (((x) & N_TMASK) == (DT_FCN << N_BTSHFT))
#define ISARY(x) (((x) & N_TMASK) == (DT_ARY << N_BTSHFT))
#ifdef __cplusplus
}
#endif
#endif /* _A_OUT_H_ */
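The DOS and NT magic numbers defined above are enough to sanity-check a candidate PE image. A minimal sketch, assuming this header is included first and a little-endian host; the helper name and buffer handling are invented for illustration:

#include <stddef.h>
#include <string.h>

/* Return 1 if buf starts with an MZ header whose e_lfanew field points
 * at a PE signature, 0 otherwise. 0x3c is the fixed offset of e_lfanew
 * in the DOS header laid out above. */
static int looks_like_pe_image(const unsigned char *buf, size_t len)
{
    unsigned short e_magic;
    unsigned int e_lfanew, nt_sig;

    if (len < 0x40) {
        return 0;
    }
    memcpy(&e_magic, buf, sizeof(e_magic));
    memcpy(&e_lfanew, buf + 0x3c, sizeof(e_lfanew));
    if (e_magic != DOSMAGIC || e_lfanew > len - sizeof(nt_sig)) {
        return 0;
    }
    memcpy(&nt_sig, buf + e_lfanew, sizeof(nt_sig));
    return nt_sig == NT_SIGNATURE;
}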

accel.c (deleted, 157 lines)

@@ -1,157 +0,0 @@
/*
* QEMU System Emulator, accelerator interfaces
*
* Copyright (c) 2003-2008 Fabrice Bellard
* Copyright (c) 2014 Red Hat Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "sysemu/accel.h"
#include "hw/boards.h"
#include "qemu-common.h"
#include "sysemu/arch_init.h"
#include "sysemu/sysemu.h"
#include "sysemu/kvm.h"
#include "sysemu/qtest.h"
#include "hw/xen/xen.h"
#include "qom/object.h"
#include "hw/boards.h"
int tcg_tb_size;
static bool tcg_allowed = true;
static int tcg_init(MachineState *ms)
{
tcg_exec_init(tcg_tb_size * 1024 * 1024);
return 0;
}
static const TypeInfo accel_type = {
.name = TYPE_ACCEL,
.parent = TYPE_OBJECT,
.class_size = sizeof(AccelClass),
.instance_size = sizeof(AccelState),
};
/* Lookup AccelClass from opt_name. Returns NULL if not found */
static AccelClass *accel_find(const char *opt_name)
{
char *class_name = g_strdup_printf(ACCEL_CLASS_NAME("%s"), opt_name);
AccelClass *ac = ACCEL_CLASS(object_class_by_name(class_name));
g_free(class_name);
return ac;
}
static int accel_init_machine(AccelClass *acc, MachineState *ms)
{
ObjectClass *oc = OBJECT_CLASS(acc);
const char *cname = object_class_get_name(oc);
AccelState *accel = ACCEL(object_new(cname));
int ret;
ms->accelerator = accel;
*(acc->allowed) = true;
ret = acc->init_machine(ms);
if (ret < 0) {
ms->accelerator = NULL;
*(acc->allowed) = false;
object_unref(OBJECT(accel));
}
return ret;
}
int configure_accelerator(MachineState *ms)
{
const char *p;
char buf[10];
int ret;
bool accel_initialised = false;
bool init_failed = false;
AccelClass *acc = NULL;
p = qemu_opt_get(qemu_get_machine_opts(), "accel");
if (p == NULL) {
/* Use the default "accelerator", tcg */
p = "tcg";
}
while (!accel_initialised && *p != '\0') {
if (*p == ':') {
p++;
}
p = get_opt_name(buf, sizeof(buf), p, ':');
acc = accel_find(buf);
if (!acc) {
fprintf(stderr, "\"%s\" accelerator not found.\n", buf);
continue;
}
if (acc->available && !acc->available()) {
printf("%s not supported for this target\n",
acc->name);
continue;
}
ret = accel_init_machine(acc, ms);
if (ret < 0) {
init_failed = true;
fprintf(stderr, "failed to initialize %s: %s\n",
acc->name,
strerror(-ret));
} else {
accel_initialised = true;
}
}
if (!accel_initialised) {
if (!init_failed) {
fprintf(stderr, "No accelerator found!\n");
}
exit(1);
}
if (init_failed) {
fprintf(stderr, "Back to %s accelerator.\n", acc->name);
}
return !accel_initialised;
}
static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
ac->init_machine = tcg_init;
ac->allowed = &tcg_allowed;
}
#define TYPE_TCG_ACCEL ACCEL_CLASS_NAME("tcg")
static const TypeInfo tcg_accel_type = {
.name = TYPE_TCG_ACCEL,
.parent = TYPE_ACCEL,
.class_init = tcg_accel_class_init,
};
static void register_accel_types(void)
{
type_register_static(&accel_type);
type_register_static(&tcg_accel_type);
}
type_init(register_accel_types);
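For illustration only, here is how one more accelerator could be plugged into the pattern above; the "null" name, its init function and its allowed flag are invented, not part of the file, and this would be compiled against the same sysemu/accel.h:

static bool null_allowed;

static int null_init_machine(MachineState *ms)
{
    return 0;                       /* nothing to set up */
}

static void null_accel_class_init(ObjectClass *oc, void *data)
{
    AccelClass *ac = ACCEL_CLASS(oc);
    ac->name = "null";
    ac->init_machine = null_init_machine;
    ac->allowed = &null_allowed;
}

static const TypeInfo null_accel_type = {
    .name = ACCEL_CLASS_NAME("null"),
    .parent = TYPE_ACCEL,
    .class_init = null_accel_class_init,
};

/* After type_register_static(&null_accel_type), "-machine accel=null"
 * would resolve via accel_find("null") and accel_init_machine(). */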


@@ -24,7 +24,7 @@
#include "qemu-common.h"
#include "qemu/acl.h"
#include "acl.h"
#ifdef CONFIG_FNMATCH
#include <fnmatch.h>
@@ -103,8 +103,8 @@ void qemu_acl_reset(qemu_acl *acl)
acl->defaultDeny = 1;
QTAILQ_FOREACH_SAFE(entry, &acl->entries, next, next_entry) {
QTAILQ_REMOVE(&acl->entries, entry, next);
g_free(entry->match);
g_free(entry);
free(entry->match);
free(entry);
}
acl->nentries = 0;
}
@@ -132,23 +132,23 @@ int qemu_acl_insert(qemu_acl *acl,
const char *match,
int index)
{
qemu_acl_entry *entry;
qemu_acl_entry *tmp;
int i = 0;
if (index <= 0)
return -1;
if (index > acl->nentries) {
if (index >= acl->nentries)
return qemu_acl_append(acl, deny, match);
}
entry = g_malloc(sizeof(*entry));
entry->match = g_strdup(match);
entry->deny = deny;
QTAILQ_FOREACH(tmp, &acl->entries, next) {
i++;
if (i == index) {
qemu_acl_entry *entry;
entry = g_malloc(sizeof(*entry));
entry->match = g_strdup(match);
entry->deny = deny;
QTAILQ_INSERT_BEFORE(tmp, entry, next);
acl->nentries++;
break;
@@ -168,9 +168,6 @@ int qemu_acl_remove(qemu_acl *acl,
i++;
if (strcmp(entry->match, match) == 0) {
QTAILQ_REMOVE(&acl->entries, entry, next);
acl->nentries--;
g_free(entry->match);
g_free(entry);
return i;
}
}


@@ -25,7 +25,7 @@
#ifndef __QEMU_ACL_H__
#define __QEMU_ACL_H__
#include "qemu/queue.h"
#include "qemu-queue.h"
typedef struct qemu_acl_entry qemu_acl_entry;
typedef struct qemu_acl qemu_acl;

File diff suppressed because it is too large.

aes.h (new file, 26 lines)

@@ -0,0 +1,26 @@
#ifndef QEMU_AES_H
#define QEMU_AES_H
#define AES_MAXNR 14
#define AES_BLOCK_SIZE 16
struct aes_key_st {
uint32_t rd_key[4 *(AES_MAXNR + 1)];
int rounds;
};
typedef struct aes_key_st AES_KEY;
int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
AES_KEY *key);
int AES_set_decrypt_key(const unsigned char *userKey, const int bits,
AES_KEY *key);
void AES_encrypt(const unsigned char *in, unsigned char *out,
const AES_KEY *key);
void AES_decrypt(const unsigned char *in, unsigned char *out,
const AES_KEY *key);
void AES_cbc_encrypt(const unsigned char *in, unsigned char *out,
const unsigned long length, const AES_KEY *key,
unsigned char *ivec, const int enc);
#endif
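A round-trip usage sketch for the API above. It assumes <stdint.h> is visible before aes.h (the header uses uint32_t without including it) and that the program links against the AES implementation:

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include "aes.h"

int main(void)
{
    static const unsigned char key_bytes[16] = "0123456789abcdef";
    unsigned char in[AES_BLOCK_SIZE] = "one test block.";   /* 15 chars + NUL */
    unsigned char out[AES_BLOCK_SIZE], back[AES_BLOCK_SIZE];
    AES_KEY enc, dec;

    AES_set_encrypt_key(key_bytes, 128, &enc);   /* key size in bits */
    AES_set_decrypt_key(key_bytes, 128, &dec);
    AES_encrypt(in, out, &enc);                  /* one 16-byte block */
    AES_decrypt(out, back, &dec);
    printf("round trip %s\n", memcmp(in, back, sizeof(in)) ? "failed" : "ok");
    return 0;
}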


@@ -14,15 +14,16 @@
*/
#include "qemu-common.h"
#include "block/block.h"
#include "qemu/queue.h"
#include "qemu/sockets.h"
#include "block.h"
#include "qemu-queue.h"
#include "qemu_socket.h"
struct AioHandler
{
GPollFD pfd;
IOHandler *io_read;
IOHandler *io_write;
AioFlushHandler *io_flush;
int deleted;
void *opaque;
QLIST_ENTRY(AioHandler) node;
@@ -45,6 +46,7 @@ void aio_set_fd_handler(AioContext *ctx,
int fd,
IOHandler *io_read,
IOHandler *io_write,
AioFlushHandler *io_flush,
void *opaque)
{
AioHandler *node;
@@ -72,7 +74,7 @@ void aio_set_fd_handler(AioContext *ctx,
} else {
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node = g_malloc0(sizeof(AioHandler));
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
@@ -81,10 +83,11 @@ void aio_set_fd_handler(AioContext *ctx,
/* Update handler with latest information */
node->io_read = io_read;
node->io_write = io_write;
node->io_flush = io_flush;
node->opaque = opaque;
node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0);
node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP : 0);
node->pfd.events |= (io_write ? G_IO_OUT : 0);
}
aio_notify(ctx);
@@ -92,15 +95,12 @@ void aio_set_fd_handler(AioContext *ctx,
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *notifier,
EventNotifierHandler *io_read)
EventNotifierHandler *io_read,
AioFlushEventNotifierHandler *io_flush)
{
aio_set_fd_handler(ctx, event_notifier_get_fd(notifier),
(IOHandler *)io_read, NULL, notifier);
}
bool aio_prepare(AioContext *ctx)
{
return false;
(IOHandler *)io_read, NULL,
(AioFlushHandler *)io_flush, notifier);
}
bool aio_pending(AioContext *ctx)
@@ -110,6 +110,13 @@ bool aio_pending(AioContext *ctx)
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
int revents;
/*
* FIXME: right now we cannot get G_IO_HUP and G_IO_ERR because
* main-loop.c is still select based (due to the slirp legacy).
* If main-loop.c ever switches to poll, G_IO_ERR should be
* tested too. Dispatching G_IO_ERR to both handlers should be
* okay, since handlers need to be ready for spurious wakeups.
*/
revents = node->pfd.revents & node->pfd.events;
if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR) && node->io_read) {
return true;
@@ -122,22 +129,31 @@ bool aio_pending(AioContext *ctx)
return false;
}
bool aio_dispatch(AioContext *ctx)
bool aio_poll(AioContext *ctx, bool blocking)
{
static struct timeval tv0;
AioHandler *node;
bool progress = false;
fd_set rdfds, wrfds;
int max_fd = -1;
int ret;
bool busy, progress;
progress = false;
/*
* If there are callbacks left that have been queued, we need to call them.
* If there are callbacks left that have been queued, we need to call then.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
* does not need a complete flush (as is the case for qemu_aio_wait loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
/*
* We have to walk very carefully in case aio_set_fd_handler is
* Then dispatch any pending callbacks from the GSource.
*
* We have to walk very carefully in case qemu_aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
@@ -150,19 +166,12 @@ bool aio_dispatch(AioContext *ctx)
revents = node->pfd.revents & node->pfd.events;
node->pfd.revents = 0;
if (!node->deleted &&
(revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) &&
node->io_read) {
/* See comment in aio_pending. */
if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR) && node->io_read) {
node->io_read(node->opaque);
/* aio_notify() does not count as progress */
if (node->opaque != &ctx->notifier) {
progress = true;
}
progress = true;
}
if (!node->deleted &&
(revents & (G_IO_OUT | G_IO_ERR)) &&
node->io_write) {
if (revents & (G_IO_OUT | G_IO_ERR) && node->io_write) {
node->io_write(node->opaque);
progress = true;
}
@@ -178,122 +187,83 @@ bool aio_dispatch(AioContext *ctx)
}
}
/* Run our timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
/* These thread-local variables are used only in a small part of aio_poll
* around the call to the poll() system call. In particular they are not
* used while aio_poll is performing callbacks, which makes it much easier
* to think about reentrancy!
*
* Stack-allocated arrays would be perfect but they have size limitations;
* heap allocation is expensive enough that we want to reuse arrays across
* calls to aio_poll(). And because poll() has to be called without holding
* any lock, the arrays cannot be stored in AioContext. Thread-local data
* has none of the disadvantages of these three options.
*/
static __thread GPollFD *pollfds;
static __thread AioHandler **nodes;
static __thread unsigned npfd, nalloc;
static __thread Notifier pollfds_cleanup_notifier;
static void pollfds_cleanup(Notifier *n, void *unused)
{
g_assert(npfd == 0);
g_free(pollfds);
g_free(nodes);
nalloc = 0;
}
static void add_pollfd(AioHandler *node)
{
if (npfd == nalloc) {
if (nalloc == 0) {
pollfds_cleanup_notifier.notify = pollfds_cleanup;
qemu_thread_atexit_add(&pollfds_cleanup_notifier);
nalloc = 8;
} else {
g_assert(nalloc <= INT_MAX);
nalloc *= 2;
}
pollfds = g_renew(GPollFD, pollfds, nalloc);
nodes = g_renew(AioHandler *, nodes, nalloc);
if (progress && !blocking) {
return true;
}
nodes[npfd] = node;
pollfds[npfd] = (GPollFD) {
.fd = node->pfd.fd,
.events = node->pfd.events,
};
npfd++;
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
bool was_dispatching;
int i, ret;
bool progress;
int64_t timeout;
aio_context_acquire(ctx);
was_dispatching = ctx->dispatching;
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
* have to clear it now.
*/
aio_set_dispatching(ctx, !blocking);
ctx->walking_handlers++;
assert(npfd == 0);
FD_ZERO(&rdfds);
FD_ZERO(&wrfds);
/* fill pollfds */
/* fill fd sets */
busy = false;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (!node->deleted && node->pfd.events) {
add_pollfd(node);
/* If there aren't pending AIO operations, don't invoke callbacks.
* Otherwise, if there are no AIO requests, qemu_aio_wait() would
* wait indefinitely.
*/
if (!node->deleted && node->io_flush) {
if (node->io_flush(node->opaque) == 0) {
continue;
}
busy = true;
}
if (!node->deleted && node->io_read) {
FD_SET(node->pfd.fd, &rdfds);
max_fd = MAX(max_fd, node->pfd.fd + 1);
}
if (!node->deleted && node->io_write) {
FD_SET(node->pfd.fd, &wrfds);
max_fd = MAX(max_fd, node->pfd.fd + 1);
}
}
timeout = blocking ? aio_compute_timeout(ctx) : 0;
ctx->walking_handlers--;
/* No AIO operations? Get us out of here */
if (!busy) {
return progress;
}
/* wait until next event */
if (timeout) {
aio_context_release(ctx);
}
ret = qemu_poll_ns((GPollFD *)pollfds, npfd, timeout);
if (timeout) {
aio_context_acquire(ctx);
}
ret = select(max_fd, &rdfds, &wrfds, NULL, blocking ? NULL : &tv0);
/* if we have any readable fds, dispatch event */
if (ret > 0) {
for (i = 0; i < npfd; i++) {
nodes[i]->pfd.revents = pollfds[i].revents;
/* we have to walk very carefully in case
* qemu_aio_set_fd_handler is called while we're walking */
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
ctx->walking_handlers++;
if (!node->deleted &&
FD_ISSET(node->pfd.fd, &rdfds) &&
node->io_read) {
node->io_read(node->opaque);
progress = true;
}
if (!node->deleted &&
FD_ISSET(node->pfd.fd, &wrfds) &&
node->io_write) {
node->io_write(node->opaque);
progress = true;
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
}
npfd = 0;
ctx->walking_handlers--;
/* Run dispatch even if there were no readable fds to run timers */
aio_set_dispatching(ctx, true);
if (aio_dispatch(ctx)) {
progress = true;
}
aio_set_dispatching(ctx, was_dispatching);
aio_context_release(ctx);
return progress;
assert(progress || busy);
return true;
}
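Both versions of this file lean on the same deferred-deletion idiom: while any caller holds walking_handlers, a handler node is only marked deleted, and whichever walker drops the count to zero actually unlinks and frees it. A stripped-down sketch of the idiom with simplified names (not the real QEMU types):

#include <stdlib.h>

typedef struct Node {
    int deleted;
    struct Node *prev, *next;       /* doubly linked, like QLIST */
} Node;

static int walking_handlers;        /* how many walkers are inside the list */

static void walk(Node *head)
{
    Node *node = head;
    while (node) {
        Node *tmp;
        walking_handlers++;
        /* ... run callbacks here; they may set node->deleted ... */
        tmp = node;
        node = node->next;
        walking_handlers--;
        if (!walking_handlers && tmp->deleted) {
            /* no other walker can still reference tmp */
            if (tmp->prev) { tmp->prev->next = tmp->next; }
            if (tmp->next) { tmp->next->prev = tmp->prev; }
            free(tmp);
        }
    }
}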


@@ -16,89 +16,23 @@
*/
#include "qemu-common.h"
#include "block/block.h"
#include "qemu/queue.h"
#include "qemu/sockets.h"
#include "block.h"
#include "qemu-queue.h"
#include "qemu_socket.h"
struct AioHandler {
EventNotifier *e;
IOHandler *io_read;
IOHandler *io_write;
EventNotifierHandler *io_notify;
AioFlushEventNotifierHandler *io_flush;
GPollFD pfd;
int deleted;
void *opaque;
QLIST_ENTRY(AioHandler) node;
};
void aio_set_fd_handler(AioContext *ctx,
int fd,
IOHandler *io_read,
IOHandler *io_write,
void *opaque)
{
/* fd is a SOCKET in our case */
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.fd == fd && !node->deleted) {
break;
}
}
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
}
}
} else {
HANDLE event;
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
}
node->pfd.events = 0;
if (node->io_read) {
node->pfd.events |= G_IO_IN;
}
if (node->io_write) {
node->pfd.events |= G_IO_OUT;
}
node->e = &ctx->notifier;
/* Update handler with latest information */
node->opaque = opaque;
node->io_read = io_read;
node->io_write = io_write;
event = event_notifier_get_handle(&ctx->notifier);
WSAEventSelect(node->pfd.fd, event,
FD_READ | FD_ACCEPT | FD_CLOSE |
FD_CONNECT | FD_WRITE | FD_OOB);
}
aio_notify(ctx);
}
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *e,
EventNotifierHandler *io_notify)
EventNotifierHandler *io_notify,
AioFlushEventNotifierHandler *io_flush)
{
AioHandler *node;
@@ -129,7 +63,7 @@ void aio_set_event_notifier(AioContext *ctx,
} else {
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node = g_malloc0(sizeof(AioHandler));
node->e = e;
node->pfd.fd = (uintptr_t)event_notifier_get_handle(e);
node->pfd.events = G_IO_IN;
@@ -139,48 +73,12 @@ void aio_set_event_notifier(AioContext *ctx,
}
/* Update handler with latest information */
node->io_notify = io_notify;
node->io_flush = io_flush;
}
aio_notify(ctx);
}
bool aio_prepare(AioContext *ctx)
{
static struct timeval tv0;
AioHandler *node;
bool have_select_revents = false;
fd_set rfds, wfds;
/* fill fd sets */
FD_ZERO(&rfds);
FD_ZERO(&wfds);
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->io_read) {
FD_SET ((SOCKET)node->pfd.fd, &rfds);
}
if (node->io_write) {
FD_SET ((SOCKET)node->pfd.fd, &wfds);
}
}
if (select(0, &rfds, &wfds, NULL, &tv0) > 0) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
node->pfd.revents = 0;
if (FD_ISSET(node->pfd.fd, &rfds)) {
node->pfd.revents |= G_IO_IN;
have_select_revents = true;
}
if (FD_ISSET(node->pfd.fd, &wfds)) {
node->pfd.revents |= G_IO_OUT;
have_select_revents = true;
}
}
}
return have_select_revents;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
@@ -189,66 +87,46 @@ bool aio_pending(AioContext *ctx)
if (node->pfd.revents && node->io_notify) {
return true;
}
if ((node->pfd.revents & G_IO_IN) && node->io_read) {
return true;
}
if ((node->pfd.revents & G_IO_OUT) && node->io_write) {
return true;
}
}
return false;
}
static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
bool progress = false;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool busy, progress;
int count;
progress = false;
/*
* We have to walk very carefully in case aio_set_fd_handler is
* If there are callbacks left that have been queued, we need to call then.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for qemu_aio_wait loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
/*
* Then dispatch any pending callbacks from the GSource.
*
* We have to walk very carefully in case qemu_aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
int revents = node->pfd.revents;
ctx->walking_handlers++;
if (!node->deleted &&
(revents || event_notifier_get_handle(node->e) == event) &&
node->io_notify) {
if (node->pfd.revents && node->io_notify) {
node->pfd.revents = 0;
node->io_notify(node->e);
/* aio_notify() does not count as progress */
if (node->e != &ctx->notifier) {
progress = true;
}
}
if (!node->deleted &&
(node->io_read || node->io_write)) {
node->pfd.revents = 0;
if ((revents & G_IO_IN) && node->io_read) {
node->io_read(node->opaque);
progress = true;
}
if ((revents & G_IO_OUT) && node->io_write) {
node->io_write(node->opaque);
progress = true;
}
/* if the next select() will return an event, we have progressed */
if (event == event_notifier_get_handle(&ctx->notifier)) {
WSANETWORKEVENTS ev;
WSAEnumNetworkEvents(node->pfd.fd, event, &ev);
if (ev.lNetworkEvents) {
progress = true;
}
}
progress = true;
}
tmp = node;
@@ -262,100 +140,80 @@ static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
}
}
return progress;
}
bool aio_dispatch(AioContext *ctx)
{
bool progress;
progress = aio_bh_poll(ctx);
progress |= aio_dispatch_handlers(ctx, INVALID_HANDLE_VALUE);
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool was_dispatching, progress, have_select_revents, first;
int count;
int timeout;
aio_context_acquire(ctx);
have_select_revents = aio_prepare(ctx);
if (have_select_revents) {
blocking = false;
if (progress && !blocking) {
return true;
}
was_dispatching = ctx->dispatching;
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
* have to clear it now.
*/
aio_set_dispatching(ctx, !blocking);
ctx->walking_handlers++;
/* fill fd sets */
busy = false;
count = 0;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
/* If there aren't pending AIO operations, don't invoke callbacks.
* Otherwise, if there are no AIO requests, qemu_aio_wait() would
* wait indefinitely.
*/
if (!node->deleted && node->io_flush) {
if (node->io_flush(node->e) == 0) {
continue;
}
busy = true;
}
if (!node->deleted && node->io_notify) {
events[count++] = event_notifier_get_handle(node->e);
}
}
ctx->walking_handlers--;
first = true;
/* No AIO operations? Get us out of here */
if (!busy) {
return progress;
}
/* wait until next event */
while (count > 0) {
HANDLE event;
int ret;
timeout = blocking
? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
if (timeout) {
aio_context_release(ctx);
}
ret = WaitForMultipleObjects(count, events, FALSE, timeout);
if (timeout) {
aio_context_acquire(ctx);
}
aio_set_dispatching(ctx, true);
if (first && aio_bh_poll(ctx)) {
progress = true;
}
first = false;
int timeout = blocking ? INFINITE : 0;
int ret = WaitForMultipleObjects(count, events, FALSE, timeout);
/* if we have any signaled events, dispatch event */
event = NULL;
if ((DWORD) (ret - WAIT_OBJECT_0) < count) {
event = events[ret - WAIT_OBJECT_0];
events[ret - WAIT_OBJECT_0] = events[--count];
} else if (!have_select_revents) {
if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
break;
}
have_select_revents = false;
blocking = false;
progress |= aio_dispatch_handlers(ctx, event);
/* we have to walk very carefully in case
* qemu_aio_set_fd_handler is called while we're walking */
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
ctx->walking_handlers++;
if (!node->deleted &&
event_notifier_get_handle(node->e) == events[ret - WAIT_OBJECT_0] &&
node->io_notify) {
node->io_notify(node->e);
progress = true;
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
/* Try again, but only call each handler once. */
events[ret - WAIT_OBJECT_0] = events[--count];
}
progress |= timerlistgroup_run_timers(&ctx->tlg);
aio_set_dispatching(ctx, was_dispatching);
aio_context_release(ctx);
return progress;
assert(progress || busy);
return true;
}
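The Windows loop above relies on one property of WaitForMultipleObjects: it reports a single signaled handle per call. The code therefore dispatches that handler, overwrites its slot with the last array element and shrinks the count, so each handler runs at most once per poll. A sketch of just that mechanism:

#include <windows.h>

static void drain_signaled_events(HANDLE *events, int count)
{
    while (count > 0) {
        DWORD ret = WaitForMultipleObjects(count, events, FALSE, 0);
        if (ret - WAIT_OBJECT_0 >= (DWORD)count) {
            break;              /* WAIT_TIMEOUT or WAIT_FAILED: nothing left */
        }
        /* ... dispatch the handler tied to events[ret - WAIT_OBJECT_0] ... */
        /* compact the array so this event is not waited on again */
        events[ret - WAIT_OBJECT_0] = events[--count];
    }
}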


@@ -20,7 +20,7 @@ along with this file; see the file COPYING. If not, see
<http://www.gnu.org/licenses/>. */
#include <stdio.h>
#include "disas/bfd.h"
#include "dis-asm.h"
/* MAX is redefined below, so remove any previous definition. */
#undef MAX

alpha.ld (new file, 127 lines)

@@ -0,0 +1,127 @@
OUTPUT_FORMAT("elf64-alpha", "elf64-alpha",
"elf64-alpha")
OUTPUT_ARCH(alpha)
ENTRY(__start)
SECTIONS
{
/* Read-only sections, merged into text segment: */
. = 0x60000000 + SIZEOF_HEADERS;
.interp : { *(.interp) }
.hash : { *(.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
.rel.text :
{ *(.rel.text) *(.rel.gnu.linkonce.t*) }
.rela.text :
{ *(.rela.text) *(.rela.gnu.linkonce.t*) }
.rel.data :
{ *(.rel.data) *(.rel.gnu.linkonce.d*) }
.rela.data :
{ *(.rela.data) *(.rela.gnu.linkonce.d*) }
.rel.rodata :
{ *(.rel.rodata) *(.rel.gnu.linkonce.r*) }
.rela.rodata :
{ *(.rela.rodata) *(.rela.gnu.linkonce.r*) }
.rel.got : { *(.rel.got) }
.rela.got : { *(.rela.got) }
.rel.ctors : { *(.rel.ctors) }
.rela.ctors : { *(.rela.ctors) }
.rel.dtors : { *(.rel.dtors) }
.rela.dtors : { *(.rela.dtors) }
.rel.init : { *(.rel.init) }
.rela.init : { *(.rela.init) }
.rel.fini : { *(.rel.fini) }
.rela.fini : { *(.rela.fini) }
.rel.bss : { *(.rel.bss) }
.rela.bss : { *(.rela.bss) }
.rel.plt : { *(.rel.plt) }
.rela.plt : { *(.rela.plt) }
.init : { *(.init) } =0x47ff041f
.text :
{
*(.text)
/* .gnu.warning sections are handled specially by elf32.em. */
*(.gnu.warning)
*(.gnu.linkonce.t*)
} =0x47ff041f
_etext = .;
PROVIDE (etext = .);
.fini : { *(.fini) } =0x47ff041f
.rodata : { *(.rodata) *(.gnu.linkonce.r*) }
.rodata1 : { *(.rodata1) }
.reginfo : { *(.reginfo) }
/* Adjust the address for the data segment. We want to adjust up to
the same address within the page on the next page up. */
. = ALIGN(0x100000) + (. & (0x100000 - 1));
.data :
{
*(.data)
*(.gnu.linkonce.d*)
CONSTRUCTORS
}
.data1 : { *(.data1) }
.ctors :
{
*(.ctors)
}
.dtors :
{
*(.dtors)
}
.plt : { *(.plt) }
.got : { *(.got.plt) *(.got) }
.dynamic : { *(.dynamic) }
/* We want the small data sections together, so single-instruction offsets
can access them all, and initialized data all before uninitialized, so
we can shorten the on-disk segment size. */
.sdata : { *(.sdata) }
_edata = .;
PROVIDE (edata = .);
__bss_start = .;
.sbss : { *(.sbss) *(.scommon) }
.bss :
{
*(.dynbss)
*(.bss)
*(COMMON)
}
_end = . ;
PROVIDE (end = .);
/* Stabs debugging sections. */
.stab 0 : { *(.stab) }
.stabstr 0 : { *(.stabstr) }
.stab.excl 0 : { *(.stab.excl) }
.stab.exclstr 0 : { *(.stab.exclstr) }
.stab.index 0 : { *(.stab.index) }
.stab.indexstr 0 : { *(.stab.indexstr) }
.comment 0 : { *(.comment) }
/* DWARF debug sections.
Symbols in the DWARF debugging sections are relative to the beginning
of the section so we begin them at 0. */
/* DWARF 1 */
.debug 0 : { *(.debug) }
.line 0 : { *(.line) }
/* GNU DWARF 1 extensions */
.debug_srcinfo 0 : { *(.debug_srcinfo) }
.debug_sfnames 0 : { *(.debug_sfnames) }
/* DWARF 1.1 and DWARF 2 */
.debug_aranges 0 : { *(.debug_aranges) }
.debug_pubnames 0 : { *(.debug_pubnames) }
/* DWARF 2 */
.debug_info 0 : { *(.debug_info) }
.debug_abbrev 0 : { *(.debug_abbrev) }
.debug_line 0 : { *(.debug_line) }
.debug_frame 0 : { *(.debug_frame) }
.debug_str 0 : { *(.debug_str) }
.debug_loc 0 : { *(.debug_loc) }
.debug_macinfo 0 : { *(.debug_macinfo) }
/* SGI/MIPS DWARF 2 extensions */
.debug_weaknames 0 : { *(.debug_weaknames) }
.debug_funcnames 0 : { *(.debug_funcnames) }
.debug_typenames 0 : { *(.debug_typenames) }
.debug_varnames 0 : { *(.debug_varnames) }
/* These must appear regardless of . */
}
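The data-segment adjustment '. = ALIGN(0x100000) + (. & (0x100000 - 1));' rounds the location counter up to the next 1 MiB boundary and then adds back the offset within the old page, so the address stays congruent to the file offset modulo the page size. For example, if . is 0x60012345, ALIGN(0x100000) gives 0x60100000 and (. & 0xfffff) gives 0x12345, leaving the counter at 0x60112345: the next page, same in-page offset.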

File diff suppressed because it is too large.

arch_init.h (new file, 39 lines)

@@ -0,0 +1,39 @@
#ifndef QEMU_ARCH_INIT_H
#define QEMU_ARCH_INIT_H
#include "qmp-commands.h"
enum {
QEMU_ARCH_ALL = -1,
QEMU_ARCH_ALPHA = 1,
QEMU_ARCH_ARM = 2,
QEMU_ARCH_CRIS = 4,
QEMU_ARCH_I386 = 8,
QEMU_ARCH_M68K = 16,
QEMU_ARCH_LM32 = 32,
QEMU_ARCH_MICROBLAZE = 64,
QEMU_ARCH_MIPS = 128,
QEMU_ARCH_PPC = 256,
QEMU_ARCH_S390X = 512,
QEMU_ARCH_SH4 = 1024,
QEMU_ARCH_SPARC = 2048,
QEMU_ARCH_XTENSA = 4096,
QEMU_ARCH_OPENRISC = 8192,
QEMU_ARCH_UNICORE32 = 0x4000,
};
extern const uint32_t arch_type;
void select_soundhw(const char *optarg);
void do_acpitable_option(const char *optarg);
void do_smbios_option(const char *optarg);
void cpudef_init(void);
int audio_available(void);
void audio_init(ISABus *isa_bus, PCIBus *pci_bus);
int tcg_available(void);
int kvm_available(void);
int xen_available(void);
CpuDefinitionInfoList *arch_query_cpu_definitions(Error **errp);
#endif
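Because the QEMU_ARCH_* values above are distinct single bits, target checks reduce to a bitwise AND; a small illustrative helper (not part of the header):

#include <stdbool.h>
#include <stdint.h>

static bool arch_is_any_of(uint32_t arch_type, uint32_t mask)
{
    /* QEMU_ARCH_ALL is -1: every bit set, so it matches any mask */
    return (arch_type & mask) != 0;
}

/* e.g. arch_is_any_of(arch_type, QEMU_ARCH_ARM | QEMU_ARCH_PPC) */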


@@ -22,7 +22,7 @@
/* Start of qemu specific additions. Mostly this is stub definitions
for things we don't care about. */
#include "disas/bfd.h"
#include "dis-asm.h"
#define ATTRIBUTE_UNUSED __attribute__((unused))
#define ISSPACE(x) ((x) == ' ' || (x) == '\t' || (x) == '\n')
@@ -819,10 +819,6 @@ static const struct opcode32 arm_opcodes[] =
{ARM_EXT_V3M, 0x00800090, 0x0fa000f0, "%22?sumull%20's%c\t%12-15r, %16-19r, %0-3r, %8-11r"},
{ARM_EXT_V3M, 0x00a00090, 0x0fa000f0, "%22?sumlal%20's%c\t%12-15r, %16-19r, %0-3r, %8-11r"},
/* IDIV instructions. */
{ARM_EXT_DIV, 0x0710f010, 0x0ff0f0f0, "sdiv%c\t%16-19r, %0-3r, %8-11r"},
{ARM_EXT_DIV, 0x0730f010, 0x0ff0f0f0, "udiv%c\t%16-19r, %0-3r, %8-11r"},
/* V7 instructions. */
{ARM_EXT_V7, 0xf450f000, 0xfd70f000, "pli\t%P"},
{ARM_EXT_V7, 0x0320f0f0, 0x0ffffff0, "dbg%c\t#%0-3d"},
@@ -1549,6 +1545,10 @@ enum map_type {
MAP_DATA
};
enum map_type last_type;
int last_mapping_sym = -1;
bfd_vma last_mapping_addr = 0;
/* Decode a bitfield of the form matching regexp (N(-N)?,)*N(-N)?.
Returns pointer to following character of the format string and
fills in *VALUEP and *WIDTHP with the extracted value and number of
@@ -3874,11 +3874,135 @@ print_insn_arm (bfd_vma pc, struct disassemble_info *info)
int is_data = false;
unsigned int size = 4;
void (*printer) (bfd_vma, struct disassemble_info *, long);
#if 0
bfd_boolean found = false;
if (info->disassembler_options)
{
parse_disassembler_options (info->disassembler_options);
/* To avoid repeated parsing of these options, we remove them here. */
info->disassembler_options = NULL;
}
/* First check the full symtab for a mapping symbol, even if there
are no usable non-mapping symbols for this address. */
if (info->symtab != NULL
&& bfd_asymbol_flavour (*info->symtab) == bfd_target_elf_flavour)
{
bfd_vma addr;
int n;
int last_sym = -1;
enum map_type type = MAP_ARM;
if (pc <= last_mapping_addr)
last_mapping_sym = -1;
is_thumb = (last_type == MAP_THUMB);
found = false;
/* Start scanning at the start of the function, or wherever
we finished last time. */
n = info->symtab_pos + 1;
if (n < last_mapping_sym)
n = last_mapping_sym;
/* Scan up to the location being disassembled. */
for (; n < info->symtab_size; n++)
{
addr = bfd_asymbol_value (info->symtab[n]);
if (addr > pc)
break;
if ((info->section == NULL
|| info->section == info->symtab[n]->section)
&& get_sym_code_type (info, n, &type))
{
last_sym = n;
found = true;
}
}
if (!found)
{
n = info->symtab_pos;
if (n < last_mapping_sym - 1)
n = last_mapping_sym - 1;
/* No mapping symbol found at this address. Look backwards
for a preceding one. */
for (; n >= 0; n--)
{
if (get_sym_code_type (info, n, &type))
{
last_sym = n;
found = true;
break;
}
}
}
last_mapping_sym = last_sym;
last_type = type;
is_thumb = (last_type == MAP_THUMB);
is_data = (last_type == MAP_DATA);
/* Look a little bit ahead to see if we should print out
two or four bytes of data. If there's a symbol,
mapping or otherwise, after two bytes then don't
print more. */
if (is_data)
{
size = 4 - (pc & 3);
for (n = last_sym + 1; n < info->symtab_size; n++)
{
addr = bfd_asymbol_value (info->symtab[n]);
if (addr > pc)
{
if (addr - pc < size)
size = addr - pc;
break;
}
}
/* If the next symbol is after three bytes, we need to
print only part of the data, so that we can use either
.byte or .short. */
if (size == 3)
size = (pc & 1) ? 1 : 2;
}
}
if (info->symbols != NULL)
{
if (bfd_asymbol_flavour (*info->symbols) == bfd_target_coff_flavour)
{
coff_symbol_type * cs;
cs = coffsymbol (*info->symbols);
is_thumb = ( cs->native->u.syment.n_sclass == C_THUMBEXT
|| cs->native->u.syment.n_sclass == C_THUMBSTAT
|| cs->native->u.syment.n_sclass == C_THUMBLABEL
|| cs->native->u.syment.n_sclass == C_THUMBEXTFUNC
|| cs->native->u.syment.n_sclass == C_THUMBSTATFUNC);
}
else if (bfd_asymbol_flavour (*info->symbols) == bfd_target_elf_flavour
&& !found)
{
/* If no mapping symbol has been found then fall back to the type
of the function symbol. */
elf_symbol_type * es;
unsigned int type;
es = *(elf_symbol_type **)(info->symbols);
type = ELF_ST_TYPE (es->internal_elf_sym.st_info);
is_thumb = (type == STT_ARM_TFUNC) || (type == STT_ARM_16BIT);
}
}
#else
int little;
little = (info->endian == BFD_ENDIAN_LITTLE);
is_thumb |= (pc & 1);
pc &= ~(bfd_vma)1;
#endif
if (force_thumb)
is_thumb = true;

arm.ld (new file, 153 lines)

@@ -0,0 +1,153 @@
OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm",
"elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(_start)
SECTIONS
{
/* Read-only sections, merged into text segment: */
. = 0x60000000 + SIZEOF_HEADERS;
.interp : { *(.interp) }
.hash : { *(.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
.rel.text :
{ *(.rel.text) *(.rel.gnu.linkonce.t*) }
.rela.text :
{ *(.rela.text) *(.rela.gnu.linkonce.t*) }
.rel.data :
{ *(.rel.data) *(.rel.gnu.linkonce.d*) }
.rela.data :
{ *(.rela.data) *(.rela.gnu.linkonce.d*) }
.rel.rodata :
{ *(.rel.rodata) *(.rel.gnu.linkonce.r*) }
.rela.rodata :
{ *(.rela.rodata) *(.rela.gnu.linkonce.r*) }
.rel.got : { *(.rel.got) }
.rela.got : { *(.rela.got) }
.rel.ctors : { *(.rel.ctors) }
.rela.ctors : { *(.rela.ctors) }
.rel.dtors : { *(.rel.dtors) }
.rela.dtors : { *(.rela.dtors) }
.rel.init : { *(.rel.init) }
.rela.init : { *(.rela.init) }
.rel.fini : { *(.rel.fini) }
.rela.fini : { *(.rela.fini) }
.rel.bss : { *(.rel.bss) }
.rela.bss : { *(.rela.bss) }
.rel.plt : { *(.rel.plt) }
.rela.plt : { *(.rela.plt) }
.init : { *(.init) } =0x47ff041f
.text :
{
*(.text)
/* .gnu.warning sections are handled specially by elf32.em. */
*(.gnu.warning)
*(.gnu.linkonce.t*)
} =0x47ff041f
_etext = .;
PROVIDE (etext = .);
.fini : { *(.fini) } =0x47ff041f
.rodata : { *(.rodata) *(.gnu.linkonce.r*) }
.rodata1 : { *(.rodata1) }
.ARM.extab : { *(.ARM.extab* .gnu.linkonce.armextab.*) }
__exidx_start = .;
.ARM.exidx : { *(.ARM.exidx* .gnu.linkonce.armexidx.*) }
__exidx_end = .;
.reginfo : { *(.reginfo) }
/* Adjust the address for the data segment. We want to adjust up to
the same address within the page on the next page up. */
. = ALIGN(0x100000) + (. & (0x100000 - 1));
.data :
{
*(.gen_code)
*(.data)
*(.gnu.linkonce.d*)
CONSTRUCTORS
}
.tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) }
.data1 : { *(.data1) }
.preinit_array :
{
PROVIDE (__preinit_array_start = .);
KEEP (*(.preinit_array))
PROVIDE (__preinit_array_end = .);
}
.init_array :
{
PROVIDE (__init_array_start = .);
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array))
PROVIDE (__init_array_end = .);
}
.fini_array :
{
PROVIDE (__fini_array_start = .);
KEEP (*(.fini_array))
KEEP (*(SORT(.fini_array.*)))
PROVIDE (__fini_array_end = .);
}
.ctors :
{
*(.ctors)
}
.dtors :
{
*(.dtors)
}
.plt : { *(.plt) }
.got : { *(.got.plt) *(.got) }
.dynamic : { *(.dynamic) }
/* We want the small data sections together, so single-instruction offsets
can access them all, and initialized data all before uninitialized, so
we can shorten the on-disk segment size. */
.sdata : { *(.sdata) }
_edata = .;
PROVIDE (edata = .);
__bss_start = .;
.sbss : { *(.sbss) *(.scommon) }
.bss :
{
*(.dynbss)
*(.bss)
*(COMMON)
}
_end = . ;
PROVIDE (end = .);
/* Stabs debugging sections. */
.stab 0 : { *(.stab) }
.stabstr 0 : { *(.stabstr) }
.stab.excl 0 : { *(.stab.excl) }
.stab.exclstr 0 : { *(.stab.exclstr) }
.stab.index 0 : { *(.stab.index) }
.stab.indexstr 0 : { *(.stab.indexstr) }
.comment 0 : { *(.comment) }
/* DWARF debug sections.
Symbols in the DWARF debugging sections are relative to the beginning
of the section so we begin them at 0. */
/* DWARF 1 */
.debug 0 : { *(.debug) }
.line 0 : { *(.line) }
/* GNU DWARF 1 extensions */
.debug_srcinfo 0 : { *(.debug_srcinfo) }
.debug_sfnames 0 : { *(.debug_sfnames) }
/* DWARF 1.1 and DWARF 2 */
.debug_aranges 0 : { *(.debug_aranges) }
.debug_pubnames 0 : { *(.debug_pubnames) }
/* DWARF 2 */
.debug_info 0 : { *(.debug_info) }
.debug_abbrev 0 : { *(.debug_abbrev) }
.debug_line 0 : { *(.debug_line) }
.debug_frame 0 : { *(.debug_frame) }
.debug_str 0 : { *(.debug_str) }
.debug_loc 0 : { *(.debug_loc) }
.debug_macinfo 0 : { *(.debug_macinfo) }
/* SGI/MIPS DWARF 2 extensions */
.debug_weaknames 0 : { *(.debug_weaknames) }
.debug_funcnames 0 : { *(.debug_funcnames) }
.debug_typenames 0 : { *(.debug_typenames) }
.debug_varnames 0 : { *(.debug_varnames) }
/* These must appear regardless of . */
}

async.c (166 lines changed)

@@ -23,10 +23,8 @@
*/
#include "qemu-common.h"
#include "block/aio.h"
#include "block/thread-pool.h"
#include "qemu/main-loop.h"
#include "qemu/atomic.h"
#include "qemu-aio.h"
#include "main-loop.h"
/***********************************************************/
/* bottom halves (can be seen as timers which expire ASAP) */
@@ -44,22 +42,15 @@ struct QEMUBH {
QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
{
QEMUBH *bh;
bh = g_new(QEMUBH, 1);
*bh = (QEMUBH){
.ctx = ctx,
.cb = cb,
.opaque = opaque,
};
qemu_mutex_lock(&ctx->bh_lock);
bh = g_malloc0(sizeof(QEMUBH));
bh->ctx = ctx;
bh->cb = cb;
bh->opaque = opaque;
bh->next = ctx->first_bh;
/* Make sure that the members are ready before putting bh into list */
smp_wmb();
ctx->first_bh = bh;
qemu_mutex_unlock(&ctx->bh_lock);
return bh;
}
/* Multiple occurrences of aio_bh_poll cannot be called concurrently */
int aio_bh_poll(AioContext *ctx)
{
QEMUBH *bh, **bhp, *next;
@@ -69,16 +60,9 @@ int aio_bh_poll(AioContext *ctx)
ret = 0;
for (bh = ctx->first_bh; bh; bh = next) {
/* Make sure that fetching bh happens before accessing its members */
smp_read_barrier_depends();
next = bh->next;
/* The atomic_xchg is paired with the one in qemu_bh_schedule. The
* implicit memory barrier ensures that the callback sees all writes
* done by the scheduling thread. It also ensures that the scheduling
* thread sees the zero before bh->cb has run, and thus will call
* aio_notify again if necessary.
*/
if (!bh->deleted && atomic_xchg(&bh->scheduled, 0)) {
if (!bh->deleted && bh->scheduled) {
bh->scheduled = 0;
if (!bh->idle)
ret = 1;
bh->idle = 0;
@@ -90,7 +74,6 @@ int aio_bh_poll(AioContext *ctx)
/* remove deleted bhs */
if (!ctx->walking_bh) {
qemu_mutex_lock(&ctx->bh_lock);
bhp = &ctx->first_bh;
while (*bhp) {
bh = *bhp;
@@ -101,7 +84,6 @@ int aio_bh_poll(AioContext *ctx)
bhp = &bh->next;
}
}
qemu_mutex_unlock(&ctx->bh_lock);
}
return ret;
@@ -109,52 +91,36 @@ int aio_bh_poll(AioContext *ctx)
void qemu_bh_schedule_idle(QEMUBH *bh)
{
if (bh->scheduled)
return;
bh->scheduled = 1;
bh->idle = 1;
/* Make sure that idle & any writes needed by the callback are done
* before the locations are read in the aio_bh_poll.
*/
atomic_mb_set(&bh->scheduled, 1);
}
void qemu_bh_schedule(QEMUBH *bh)
{
AioContext *ctx;
ctx = bh->ctx;
if (bh->scheduled)
return;
bh->scheduled = 1;
bh->idle = 0;
/* The memory barrier implicit in atomic_xchg makes sure that:
* 1. idle & any writes needed by the callback are done before the
* locations are read in the aio_bh_poll.
* 2. ctx is loaded before scheduled is set and the callback has a chance
* to execute.
*/
if (atomic_xchg(&bh->scheduled, 1) == 0) {
aio_notify(ctx);
}
aio_notify(bh->ctx);
}
/* This func is async.
*/
void qemu_bh_cancel(QEMUBH *bh)
{
bh->scheduled = 0;
}
/* This func is async.The bottom half will do the delete action at the finial
* end.
*/
void qemu_bh_delete(QEMUBH *bh)
{
bh->scheduled = 0;
bh->deleted = 1;
}
int64_t
aio_compute_timeout(AioContext *ctx)
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
{
int64_t deadline;
int timeout = -1;
AioContext *ctx = (AioContext *) source;
QEMUBH *bh;
for (bh = ctx->first_bh; bh; bh = bh->next) {
@@ -162,36 +128,17 @@ aio_compute_timeout(AioContext *ctx)
if (bh->idle) {
/* idle bottom halves will be polled at least
* every 10ms */
timeout = 10000000;
*timeout = 10;
} else {
/* non-idle bottom halves will be executed
* immediately */
return 0;
*timeout = 0;
return true;
}
}
}
deadline = timerlistgroup_deadline_ns(&ctx->tlg);
if (deadline == 0) {
return 0;
} else {
return qemu_soonest_timeout(timeout, deadline);
}
}
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
{
AioContext *ctx = (AioContext *) source;
/* We assume there is no timeout already supplied */
*timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
if (aio_prepare(ctx)) {
*timeout = 0;
}
return *timeout == 0;
return false;
}
static gboolean
@@ -205,7 +152,7 @@ aio_ctx_check(GSource *source)
return true;
}
}
return aio_pending(ctx) || (timerlistgroup_deadline_ns(&ctx->tlg) == 0);
return aio_pending(ctx);
}
static gboolean
@@ -216,7 +163,7 @@ aio_ctx_dispatch(GSource *source,
AioContext *ctx = (AioContext *) source;
assert(callback == NULL);
aio_dispatch(ctx);
aio_poll(ctx, false);
return true;
}
@@ -225,12 +172,8 @@ aio_ctx_finalize(GSource *source)
{
AioContext *ctx = (AioContext *) source;
thread_pool_free(ctx->thread_pool);
aio_set_event_notifier(ctx, &ctx->notifier, NULL);
aio_set_event_notifier(ctx, &ctx->notifier, NULL, NULL);
event_notifier_cleanup(&ctx->notifier);
rfifolock_destroy(&ctx->lock);
qemu_mutex_destroy(&ctx->bh_lock);
timerlistgroup_deinit(&ctx->tlg);
}
static GSourceFuncs aio_source_funcs = {
@@ -246,59 +189,19 @@ GSource *aio_get_g_source(AioContext *ctx)
return &ctx->source;
}
ThreadPool *aio_get_thread_pool(AioContext *ctx)
{
if (!ctx->thread_pool) {
ctx->thread_pool = thread_pool_new(ctx);
}
return ctx->thread_pool;
}
void aio_set_dispatching(AioContext *ctx, bool dispatching)
{
ctx->dispatching = dispatching;
if (!dispatching) {
/* Write ctx->dispatching before reading e.g. bh->scheduled.
* Optimization: this is only needed when we're entering the "unsafe"
* phase where other threads must call event_notifier_set.
*/
smp_mb();
}
}
void aio_notify(AioContext *ctx)
{
/* Write e.g. bh->scheduled before reading ctx->dispatching. */
smp_mb();
if (!ctx->dispatching) {
event_notifier_set(&ctx->notifier);
}
event_notifier_set(&ctx->notifier);
}
static void aio_timerlist_notify(void *opaque)
AioContext *aio_context_new(void)
{
aio_notify(opaque);
}
AioContext *aio_context_new(Error **errp)
{
int ret;
AioContext *ctx;
ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
ret = event_notifier_init(&ctx->notifier, false);
if (ret < 0) {
g_source_destroy(&ctx->source);
error_setg_errno(errp, -ret, "Failed to initialize event notifier");
return NULL;
}
g_source_set_can_recurse(&ctx->source, true);
aio_set_event_notifier(ctx, &ctx->notifier,
event_notifier_init(&ctx->notifier, false);
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_test_and_clear);
ctx->thread_pool = NULL;
qemu_mutex_init(&ctx->bh_lock);
rfifolock_init(&ctx->lock, NULL, NULL);
timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
event_notifier_test_and_clear, NULL);
return ctx;
}
@@ -313,12 +216,7 @@ void aio_context_unref(AioContext *ctx)
g_source_unref(&ctx->source);
}
void aio_context_acquire(AioContext *ctx)
void aio_flush(AioContext *ctx)
{
rfifolock_lock(&ctx->lock);
}
void aio_context_release(AioContext *ctx)
{
rfifolock_unlock(&ctx->lock);
while (aio_poll(ctx, true));
}
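A usage sketch against the bottom-half API shown in both versions of this file; the callback and the call site are invented for illustration and would be compiled against QEMU's aio header:

static void my_bh_cb(void *opaque)
{
    /* runs "as soon as possible" from the AioContext's event loop */
}

static void example(AioContext *ctx)
{
    QEMUBH *bh = aio_bh_new(ctx, my_bh_cb, NULL);

    qemu_bh_schedule(bh);       /* mark scheduled, wake the loop via
                                 * aio_notify() */
    /* deletion is deferred: qemu_bh_delete() only sets bh->deleted, and
     * aio_bh_poll() frees the bh once no walker is active */
    qemu_bh_delete(bh);
}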


@@ -12,6 +12,3 @@ common-obj-$(CONFIG_WINWAVE) += winwaveaudio.o
common-obj-$(CONFIG_AUDIO_PT_INT) += audio_pt_int.o
common-obj-$(CONFIG_AUDIO_WIN_INT) += audio_win_int.o
common-obj-y += wavcapture.o
$(obj)/audio.o $(obj)/fmodaudio.o: QEMU_CFLAGS += $(FMOD_CFLAGS)
sdlaudio.o-cflags := $(SDL_CFLAGS)


@@ -23,7 +23,7 @@
*/
#include <alsa/asoundlib.h>
#include "qemu-common.h"
#include "qemu/main-loop.h"
#include "qemu-char.h"
#include "audio.h"
#if QEMU_GNUC_PREREQ(4, 3)
@@ -815,8 +815,10 @@ static void alsa_fini_out (HWVoiceOut *hw)
ldebug ("alsa_fini\n");
alsa_anal_close (&alsa->handle, &alsa->pollhlp);
g_free(alsa->pcm_buf);
alsa->pcm_buf = NULL;
if (alsa->pcm_buf) {
g_free (alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
}
static int alsa_init_out (HWVoiceOut *hw, struct audsettings *as)
@@ -976,8 +978,10 @@ static void alsa_fini_in (HWVoiceIn *hw)
alsa_anal_close (&alsa->handle, &alsa->pollhlp);
g_free(alsa->pcm_buf);
alsa->pcm_buf = NULL;
if (alsa->pcm_buf) {
g_free (alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
}
static int alsa_run_in (HWVoiceIn *hw)


@@ -23,9 +23,9 @@
*/
#include "hw/hw.h"
#include "audio.h"
#include "monitor/monitor.h"
#include "qemu/timer.h"
#include "sysemu/sysemu.h"
#include "monitor.h"
#include "qemu-timer.h"
#include "sysemu.h"
#define AUDIO_CAP "audio"
#include "audio_int.h"
@@ -95,7 +95,7 @@ static struct {
}
},
.period = { .hertz = 100 },
.period = { .hertz = 250 },
.plive = 0,
.log_to_monitor = 0,
.try_poll_in = 1,
@@ -828,9 +828,8 @@ static int audio_attach_capture (HWVoiceOut *hw)
QLIST_INSERT_HEAD (&hw_cap->sw_head, sw, entries);
QLIST_INSERT_HEAD (&hw->cap_head, sc, entries);
#ifdef DEBUG_CAPTURE
sw->name = g_strdup_printf ("for %p %d,%d,%d",
hw, sw->info.freq, sw->info.bits,
sw->info.nchannels);
asprintf (&sw->name, "for %p %d,%d,%d",
hw, sw->info.freq, sw->info.bits, sw->info.nchannels);
dolog ("Added %s active = %d\n", sw->name, sw->active);
#endif
if (sw->active) {
@@ -1124,11 +1123,10 @@ static int audio_is_timer_needed (void)
static void audio_reset_timer (AudioState *s)
{
if (audio_is_timer_needed ()) {
timer_mod (s->ts,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + conf.period.ticks);
qemu_mod_timer (s->ts, qemu_get_clock_ns (vm_clock) + 1);
}
else {
timer_del (s->ts);
qemu_del_timer (s->ts);
}
}
@@ -1812,7 +1810,8 @@ static const VMStateDescription vmstate_audio = {
.name = "audio",
.version_id = 1,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
.minimum_version_id_old = 1,
.fields = (VMStateField []) {
VMSTATE_END_OF_LIST()
}
};
@@ -1834,7 +1833,7 @@ static void audio_init (void)
QLIST_INIT (&s->cap_head);
atexit (audio_atexit);
s->ts = timer_new_ns(QEMU_CLOCK_VIRTUAL, audio_timer, s);
s->ts = qemu_new_timer_ns (vm_clock, audio_timer, s);
if (!s->ts) {
hw_error("Could not create audio timer\n");
}


@@ -25,7 +25,7 @@
#define QEMU_AUDIO_H
#include "config-host.h"
#include "qemu/queue.h"
#include "qemu-queue.h"
typedef void (*audio_callback_fn) (void *opaque, int avail);


@@ -243,13 +243,38 @@ static inline int audio_ring_dist (int dst, int src, int len)
return (dst >= src) ? (dst - src) : (len - src + dst);
}
#define dolog(fmt, ...) AUD_log(AUDIO_CAP, fmt, ## __VA_ARGS__)
static void GCC_ATTR dolog (const char *fmt, ...)
{
va_list ap;
va_start (ap, fmt);
AUD_vlog (AUDIO_CAP, fmt, ap);
va_end (ap);
}
#ifdef DEBUG
#define ldebug(fmt, ...) AUD_log(AUDIO_CAP, fmt, ## __VA_ARGS__)
static void GCC_ATTR ldebug (const char *fmt, ...)
{
va_list ap;
va_start (ap, fmt);
AUD_vlog (AUDIO_CAP, fmt, ap);
va_end (ap);
}
#else
#define ldebug(fmt, ...) (void)0
#if defined NDEBUG && defined __GNUC__
#define ldebug(...)
#elif defined NDEBUG && defined _MSC_VER
#define ldebug __noop
#else
static void GCC_ATTR ldebug (const char *fmt, ...)
{
(void) fmt;
}
#endif
#endif
#undef GCC_ATTR
#define AUDIO_STRINGIFY_(n) #n
#define AUDIO_STRINGIFY(n) AUDIO_STRINGIFY_(n)
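The audio_ring_dist() helper in the hunk above is the generic circular-buffer distance used throughout the audio layer. A minimal standalone sketch of its contract (the main() driver here is illustrative, not QEMU code):

#include <assert.h>

static inline int audio_ring_dist(int dst, int src, int len)
{
    /* distance from read position src to write position dst,
       wrapping around a ring of len slots */
    return (dst >= src) ? (dst - src) : (len - src + dst);
}

int main(void)
{
    assert(audio_ring_dist(7, 2, 16) == 5);   /* no wrap-around */
    assert(audio_ring_dist(1, 14, 16) == 3);  /* wrapped past the end */
    return 0;
}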

View File

@@ -71,7 +71,10 @@ static void glue (audio_init_nb_voices_, TYPE) (struct audio_driver *drv)
static void glue (audio_pcm_hw_free_resources_, TYPE) (HW *hw)
{
g_free (HWBUF);
if (HWBUF) {
g_free (HWBUF);
}
HWBUF = NULL;
}
@@ -89,7 +92,9 @@ static int glue (audio_pcm_hw_alloc_resources_, TYPE) (HW *hw)
static void glue (audio_pcm_sw_free_resources_, TYPE) (SW *sw)
{
g_free (sw->buf);
if (sw->buf) {
g_free (sw->buf);
}
if (sw->rate) {
st_rate_stop (sw->rate);
@@ -167,8 +172,10 @@ static int glue (audio_pcm_sw_init_, TYPE) (
static void glue (audio_pcm_sw_fini_, TYPE) (SW *sw)
{
glue (audio_pcm_sw_free_resources_, TYPE) (sw);
g_free (sw->name);
sw->name = NULL;
if (sw->name) {
g_free (sw->name);
sw->name = NULL;
}
}
static void glue (audio_pcm_hw_add_sw_, TYPE) (HW *hw, SW *sw)
@@ -191,9 +198,9 @@ static void glue (audio_pcm_hw_gc_, TYPE) (HW **hwp)
audio_detach_capture (hw);
#endif
QLIST_REMOVE (hw, entries);
glue (hw->pcm_ops->fini_, TYPE) (hw);
glue (s->nb_hw_voices_, TYPE) += 1;
glue (audio_pcm_hw_free_resources_, TYPE) (hw);
glue (hw->pcm_ops->fini_, TYPE) (hw);
g_free (hw);
*hwp = NULL;
}

View File

@@ -1,6 +1,7 @@
/* public domain */
#include "qemu-common.h"
#include "audio.h"
#define AUDIO_CAP "win-int"
#include <windows.h>

View File

@@ -348,6 +348,7 @@ void mixeng_clear (struct st_sample *buf, int len)
void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol)
{
#ifdef CONFIG_MIXEMU
if (vol->mute) {
mixeng_clear (buf, len);
return;
@@ -363,4 +364,9 @@ void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol)
#endif
buf += 1;
}
#else
(void) buf;
(void) len;
(void) vol;
#endif
}

View File

@@ -35,7 +35,7 @@
#define IN_T glue (glue (ITYPE, BSIZE), _t)
#ifdef FLOAT_MIXENG
static inline mixeng_real glue (conv_, ET) (IN_T v)
static mixeng_real inline glue (conv_, ET) (IN_T v)
{
IN_T nv = ENDIAN_CONVERT (v);
@@ -54,7 +54,7 @@ static inline mixeng_real glue (conv_, ET) (IN_T v)
#endif
}
static inline IN_T glue (clip_, ET) (mixeng_real v)
static IN_T inline glue (clip_, ET) (mixeng_real v)
{
if (v >= 0.5) {
return IN_MAX;

View File

@@ -23,7 +23,7 @@
*/
#include "qemu-common.h"
#include "audio.h"
#include "qemu/timer.h"
#include "qemu-timer.h"
#define AUDIO_CAP "noaudio"
#include "audio_int.h"
@@ -46,7 +46,7 @@ static int no_run_out (HWVoiceOut *hw, int live)
int64_t ticks;
int64_t bytes;
now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
now = qemu_get_clock_ns (vm_clock);
ticks = now - no->old_ticks;
bytes = muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());
bytes = audio_MIN (bytes, INT_MAX);
@@ -102,7 +102,7 @@ static int no_run_in (HWVoiceIn *hw)
int samples = 0;
if (dead) {
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
int64_t now = qemu_get_clock_ns (vm_clock);
int64_t ticks = now - no->old_ticks;
int64_t bytes =
muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());

View File

@@ -25,10 +25,14 @@
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#ifdef __OpenBSD__
#include <soundcard.h>
#else
#include <sys/soundcard.h>
#endif
#include "qemu-common.h"
#include "qemu/main-loop.h"
#include "qemu/host-utils.h"
#include "host-utils.h"
#include "qemu-char.h"
#include "audio.h"
#define AUDIO_CAP "oss"
@@ -736,8 +740,10 @@ static void oss_fini_in (HWVoiceIn *hw)
oss_anal_close (&oss->fd);
g_free(oss->pcm_buf);
oss->pcm_buf = NULL;
if (oss->pcm_buf) {
g_free (oss->pcm_buf);
oss->pcm_buf = NULL;
}
}
static int oss_run_in (HWVoiceIn *hw)
@@ -847,10 +853,6 @@ static int oss_ctl_in (HWVoiceIn *hw, int cmd, ...)
static void *oss_audio_init (void)
{
if (access(conf.devpath_in, R_OK | W_OK) < 0 ||
access(conf.devpath_out, R_OK | W_OK) < 0) {
return NULL;
}
return &conf;
}

View File

@@ -547,11 +547,11 @@ static int qpa_init_out (HWVoiceOut *hw, struct audsettings *as)
ss.rate = as->freq;
/*
* qemu audio tick runs at 100 Hz (by default), so processing
* data chunks worth 10 ms of sound should be a good fit.
* qemu audio tick runs at 250 Hz (by default), so processing
* data chunks worth 4 ms of sound should be a good fit.
*/
ba.tlength = pa_usec_to_bytes (10 * 1000, &ss);
ba.minreq = pa_usec_to_bytes (5 * 1000, &ss);
ba.tlength = pa_usec_to_bytes (4 * 1000, &ss);
ba.minreq = pa_usec_to_bytes (2 * 1000, &ss);
ba.maxlength = -1;
ba.prebuf = -1;
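For a sense of scale of the tlength/minreq values above: pa_usec_to_bytes() converts a duration into a byte count for the negotiated sample spec. A rough standalone equivalent, with hypothetical stream parameters (S16LE stereo at 44100 Hz, i.e. 4 bytes per frame; the helper name is illustrative):

#include <stdio.h>
#include <stdint.h>

static uint64_t usec_to_bytes(uint64_t usec, unsigned rate,
                              unsigned channels, unsigned bytes_per_sample)
{
    /* frames in the interval, truncated, times frame size */
    return usec * rate / 1000000 * channels * bytes_per_sample;
}

int main(void)
{
    /* 4 ms -- one 250 Hz audio tick's worth of data */
    printf("%llu bytes\n",
           (unsigned long long)usec_to_bytes(4000, 44100, 2, 2)); /* 704 */
    return 0;
}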

View File

@@ -18,24 +18,15 @@
*/
#include "hw/hw.h"
#include "qemu/timer.h"
#include "qemu-timer.h"
#include "ui/qemu-spice.h"
#define AUDIO_CAP "spice"
#include "audio.h"
#include "audio_int.h"
#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
#define LINE_OUT_SAMPLES (480 * 4)
#else
#define LINE_OUT_SAMPLES (256 * 4)
#endif
#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
#define LINE_IN_SAMPLES (480 * 4)
#else
#define LINE_IN_SAMPLES (256 * 4)
#endif
#define LINE_IN_SAMPLES 1024
#define LINE_OUT_SAMPLES 1024
typedef struct SpiceRateCtl {
int64_t start_ticks;
@@ -90,7 +81,7 @@ static void spice_audio_fini (void *opaque)
static void rate_start (SpiceRateCtl *rate)
{
memset (rate, 0, sizeof (*rate));
rate->start_ticks = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
rate->start_ticks = qemu_get_clock_ns (vm_clock);
}
static int rate_get_samples (struct audio_pcm_info *info, SpiceRateCtl *rate)
@@ -100,12 +91,12 @@ static int rate_get_samples (struct audio_pcm_info *info, SpiceRateCtl *rate)
int64_t bytes;
int64_t samples;
now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
now = qemu_get_clock_ns (vm_clock);
ticks = now - rate->start_ticks;
bytes = muldiv64 (ticks, info->bytes_per_second, get_ticks_per_sec ());
samples = (bytes - rate->bytes_sent) >> info->shift;
if (samples < 0 || samples > 65536) {
error_report("Resetting rate control (%" PRId64 " samples)", samples);
fprintf (stderr, "Resetting rate control (%" PRId64 " samples)\n", samples);
rate_start (rate);
samples = 0;
}
@@ -120,11 +111,7 @@ static int line_out_init (HWVoiceOut *hw, struct audsettings *as)
SpiceVoiceOut *out = container_of (hw, SpiceVoiceOut, hw);
struct audsettings settings;
#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
settings.freq = spice_server_get_best_playback_rate(NULL);
#else
settings.freq = SPICE_INTERFACE_PLAYBACK_FREQ;
#endif
settings.nchannels = SPICE_INTERFACE_PLAYBACK_CHAN;
settings.fmt = AUD_FMT_S16;
settings.endianness = AUDIO_HOST_ENDIANNESS;
@@ -135,9 +122,6 @@ static int line_out_init (HWVoiceOut *hw, struct audsettings *as)
out->sin.base.sif = &playback_sif.base;
qemu_spice_add_interface (&out->sin.base);
#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
spice_server_set_playback_rate(&out->sin, settings.freq);
#endif
return 0;
}
@@ -248,11 +232,7 @@ static int line_in_init (HWVoiceIn *hw, struct audsettings *as)
SpiceVoiceIn *in = container_of (hw, SpiceVoiceIn, hw);
struct audsettings settings;
#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
settings.freq = spice_server_get_best_record_rate(NULL);
#else
settings.freq = SPICE_INTERFACE_RECORD_FREQ;
#endif
settings.nchannels = SPICE_INTERFACE_RECORD_CHAN;
settings.fmt = AUD_FMT_S16;
settings.endianness = AUDIO_HOST_ENDIANNESS;
@@ -263,9 +243,6 @@ static int line_in_init (HWVoiceIn *hw, struct audsettings *as)
in->sin.base.sif = &record_sif.base;
qemu_spice_add_interface (&in->sin.base);
#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
spice_server_set_record_rate(&in->sin, settings.freq);
#endif
return 0;
}

View File

@@ -22,7 +22,7 @@
* THE SOFTWARE.
*/
#include "hw/hw.h"
#include "qemu/timer.h"
#include "qemu-timer.h"
#include "audio.h"
#define AUDIO_CAP "wav"
@@ -52,7 +52,7 @@ static int wav_run_out (HWVoiceOut *hw, int live)
int rpos, decr, samples;
uint8_t *dst;
struct st_sample *src;
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
int64_t now = qemu_get_clock_ns (vm_clock);
int64_t ticks = now - wav->old_ticks;
int64_t bytes =
muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());

View File

@@ -1,5 +1,5 @@
#include "hw/hw.h"
#include "monitor/monitor.h"
#include "monitor.h"
#include "audio.h"
typedef struct {
@@ -63,7 +63,8 @@ static void wav_destroy (void *opaque)
}
doclose:
if (fclose (wav->f)) {
error_report("wav_destroy: fclose failed: %s", strerror(errno));
fprintf (stderr, "wav_destroy: fclose failed: %s",
strerror (errno));
}
}

View File

@@ -1,7 +1,7 @@
/* public domain */
#include "qemu-common.h"
#include "sysemu/sysemu.h"
#include "sysemu.h"
#include "audio.h"
#define AUDIO_CAP "winwave"

View File

@@ -1,11 +1,2 @@
common-obj-y += rng.o rng-egd.o
common-obj-$(CONFIG_POSIX) += rng-random.o
common-obj-y += msmouse.o testdev.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)
common-obj-$(CONFIG_TPM) += tpm.o
common-obj-y += hostmem.o hostmem-ram.o
common-obj-$(CONFIG_LINUX) += hostmem-file.o

View File

@@ -1,134 +0,0 @@
/*
* QEMU Host Memory Backend for hugetlbfs
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Paolo Bonzini <pbonzini@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu-common.h"
#include "sysemu/hostmem.h"
#include "sysemu/sysemu.h"
#include "qom/object_interfaces.h"
/* hostmem-file.c */
/**
* @TYPE_MEMORY_BACKEND_FILE:
* name of backend that uses mmap on a file descriptor
*/
#define TYPE_MEMORY_BACKEND_FILE "memory-backend-file"
#define MEMORY_BACKEND_FILE(obj) \
OBJECT_CHECK(HostMemoryBackendFile, (obj), TYPE_MEMORY_BACKEND_FILE)
typedef struct HostMemoryBackendFile HostMemoryBackendFile;
struct HostMemoryBackendFile {
HostMemoryBackend parent_obj;
bool share;
char *mem_path;
};
static void
file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
if (!fb->mem_path) {
error_setg(errp, "mem-path property not set");
return;
}
#ifndef CONFIG_LINUX
error_setg(errp, "-mem-path not supported on this host");
#else
if (!memory_region_size(&backend->mr)) {
backend->force_prealloc = mem_prealloc;
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
object_get_canonical_path(OBJECT(backend)),
backend->size, fb->share,
fb->mem_path, errp);
}
#endif
}
static void
file_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = file_backend_memory_alloc;
}
static char *get_mem_path(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
return g_strdup(fb->mem_path);
}
static void set_mem_path(Object *o, const char *str, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
if (fb->mem_path) {
g_free(fb->mem_path);
}
fb->mem_path = g_strdup(str);
}
static bool file_memory_backend_get_share(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
return fb->share;
}
static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
fb->share = value;
}
static void
file_backend_instance_init(Object *o)
{
object_property_add_bool(o, "share",
file_memory_backend_get_share,
file_memory_backend_set_share, NULL);
object_property_add_str(o, "mem-path", get_mem_path,
set_mem_path, NULL);
}
static const TypeInfo file_backend_info = {
.name = TYPE_MEMORY_BACKEND_FILE,
.parent = TYPE_MEMORY_BACKEND,
.class_init = file_backend_class_init,
.instance_init = file_backend_instance_init,
.instance_size = sizeof(HostMemoryBackendFile),
};
static void register_types(void)
{
type_register_static(&file_backend_info);
}
type_init(register_types);

View File

@@ -1,53 +0,0 @@
/*
* QEMU Host Memory Backend
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Igor Mammedov <imammedo@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/hostmem.h"
#include "qom/object_interfaces.h"
#define TYPE_MEMORY_BACKEND_RAM "memory-backend-ram"
static void
ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
char *path;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
path = object_get_canonical_path_component(OBJECT(backend));
memory_region_init_ram(&backend->mr, OBJECT(backend), path,
backend->size, errp);
g_free(path);
}
static void
ram_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = ram_backend_memory_alloc;
}
static const TypeInfo ram_backend_info = {
.name = TYPE_MEMORY_BACKEND_RAM,
.parent = TYPE_MEMORY_BACKEND,
.class_init = ram_backend_class_init,
};
static void register_types(void)
{
type_register_static(&ram_backend_info);
}
type_init(register_types);

View File

@@ -1,379 +0,0 @@
/*
* QEMU Host Memory Backend
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Igor Mammedov <imammedo@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/hostmem.h"
#include "qapi/visitor.h"
#include "qapi-types.h"
#include "qapi-visit.h"
#include "qapi/qmp/qerror.h"
#include "qemu/config-file.h"
#include "qom/object_interfaces.h"
#ifdef CONFIG_NUMA
#include <numaif.h>
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_DEFAULT != MPOL_DEFAULT);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_PREFERRED != MPOL_PREFERRED);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_BIND != MPOL_BIND);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != MPOL_INTERLEAVE);
#endif
static void
host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint64_t value = backend->size;
visit_type_size(v, &value, name, errp);
}
static void
host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
Error *local_err = NULL;
uint64_t value;
if (memory_region_size(&backend->mr)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
visit_type_size(v, &value, name, &local_err);
if (local_err) {
goto out;
}
if (!value) {
error_setg(&local_err, "Property '%s.%s' doesn't take value '%"
PRIu64 "'", object_get_typename(obj), name, value);
goto out;
}
backend->size = value;
out:
error_propagate(errp, local_err);
}
static void
host_memory_backend_get_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *host_nodes = NULL;
uint16List **node = &host_nodes;
unsigned long value;
value = find_first_bit(backend->host_nodes, MAX_NODES);
if (value == MAX_NODES) {
return;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
do {
value = find_next_bit(backend->host_nodes, MAX_NODES, value + 1);
if (value == MAX_NODES) {
break;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
} while (true);
visit_type_uint16List(v, &host_nodes, name, errp);
}
static void
host_memory_backend_set_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
#ifdef CONFIG_NUMA
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *l = NULL;
visit_type_uint16List(v, &l, name, errp);
while (l) {
bitmap_set(backend->host_nodes, l->value, 1);
l = l->next;
}
#else
error_setg(errp, "NUMA node binding are not supported by this QEMU");
#endif
}
static void
host_memory_backend_get_policy(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
int policy = backend->policy;
visit_type_enum(v, &policy, HostMemPolicy_lookup, NULL, name, errp);
}
static void
host_memory_backend_set_policy(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
int policy;
visit_type_enum(v, &policy, HostMemPolicy_lookup, NULL, name, errp);
backend->policy = policy;
#ifndef CONFIG_NUMA
if (policy != HOST_MEM_POLICY_DEFAULT) {
error_setg(errp, "NUMA policies are not supported by this QEMU");
}
#endif
}
static bool host_memory_backend_get_merge(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->merge;
}
static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
backend->merge = value;
return;
}
if (value != backend->merge) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz,
value ? QEMU_MADV_MERGEABLE : QEMU_MADV_UNMERGEABLE);
backend->merge = value;
}
}
static bool host_memory_backend_get_dump(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->dump;
}
static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
backend->dump = value;
return;
}
if (value != backend->dump) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz,
value ? QEMU_MADV_DODUMP : QEMU_MADV_DONTDUMP);
backend->dump = value;
}
}
static bool host_memory_backend_get_prealloc(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->prealloc || backend->force_prealloc;
}
static void host_memory_backend_set_prealloc(Object *obj, bool value,
Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (backend->force_prealloc) {
if (value) {
error_setg(errp,
"remove -mem-prealloc to use the prealloc property");
return;
}
}
if (!memory_region_size(&backend->mr)) {
backend->prealloc = value;
return;
}
if (value && !backend->prealloc) {
int fd = memory_region_get_fd(&backend->mr);
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
os_mem_prealloc(fd, ptr, sz);
backend->prealloc = true;
}
}
static void host_memory_backend_init(Object *obj)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
backend->merge = qemu_opt_get_bool(qemu_get_machine_opts(),
"mem-merge", true);
backend->dump = qemu_opt_get_bool(qemu_get_machine_opts(),
"dump-guest-core", true);
backend->prealloc = mem_prealloc;
object_property_add_bool(obj, "merge",
host_memory_backend_get_merge,
host_memory_backend_set_merge, NULL);
object_property_add_bool(obj, "dump",
host_memory_backend_get_dump,
host_memory_backend_set_dump, NULL);
object_property_add_bool(obj, "prealloc",
host_memory_backend_get_prealloc,
host_memory_backend_set_prealloc, NULL);
object_property_add(obj, "size", "int",
host_memory_backend_get_size,
host_memory_backend_set_size, NULL, NULL, NULL);
object_property_add(obj, "host-nodes", "int",
host_memory_backend_get_host_nodes,
host_memory_backend_set_host_nodes, NULL, NULL, NULL);
object_property_add(obj, "policy", "str",
host_memory_backend_get_policy,
host_memory_backend_set_policy, NULL, NULL, NULL);
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
return memory_region_size(&backend->mr) ? &backend->mr : NULL;
}
static void
host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(uc);
HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
Error *local_err = NULL;
void *ptr;
uint64_t sz;
if (bc->alloc) {
bc->alloc(backend, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
ptr = memory_region_get_ram_ptr(&backend->mr);
sz = memory_region_size(&backend->mr);
if (backend->merge) {
qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
}
if (!backend->dump) {
qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
}
#ifdef CONFIG_NUMA
unsigned long lastbit = find_last_bit(backend->host_nodes, MAX_NODES);
/* lastbit == MAX_NODES means maxnode = 0 */
unsigned long maxnode = (lastbit + 1) % (MAX_NODES + 1);
/* ensure policy won't be ignored in case memory is preallocated
* before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
* this doesn't catch the hugepage case. */
unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
/* check for invalid host-nodes and policies and give more verbose
* error messages than mbind(). */
if (maxnode && backend->policy == MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be empty for policy default,"
" or you should explicitly specify a policy other"
" than default");
return;
} else if (maxnode == 0 && backend->policy != MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be set for policy %s",
HostMemPolicy_lookup[backend->policy]);
return;
}
/* We can have up to MAX_NODES nodes, but we need to pass maxnode+1
* as argument to mbind() due to an old Linux bug (feature?) which
* cuts off the last specified node. This means backend->host_nodes
* must have MAX_NODES+1 bits available.
*/
assert(sizeof(backend->host_nodes) >=
BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
assert(maxnode <= MAX_NODES);
if (mbind(ptr, sz, backend->policy,
maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
error_setg_errno(errp, errno,
"cannot bind memory to host NUMA nodes");
return;
}
#endif
/* Preallocate memory after the NUMA policy has been instantiated.
* This is necessary to guarantee memory is allocated with
* specified NUMA policy in place.
*/
if (backend->prealloc) {
os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz);
}
}
}
static bool
host_memory_backend_can_be_deleted(UserCreatable *uc, Error **errp)
{
MemoryRegion *mr;
mr = host_memory_backend_get_memory(MEMORY_BACKEND(uc), errp);
if (memory_region_is_mapped(mr)) {
return false;
} else {
return true;
}
}
static void
host_memory_backend_class_init(ObjectClass *oc, void *data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
ucc->complete = host_memory_backend_memory_complete;
ucc->can_be_deleted = host_memory_backend_can_be_deleted;
}
static const TypeInfo host_memory_backend_info = {
.name = TYPE_MEMORY_BACKEND,
.parent = TYPE_OBJECT,
.abstract = true,
.class_size = sizeof(HostMemoryBackendClass),
.class_init = host_memory_backend_class_init,
.instance_size = sizeof(HostMemoryBackend),
.instance_init = host_memory_backend_init,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }
}
};
static void register_types(void)
{
type_register_static(&host_memory_backend_info);
}
type_init(register_types);
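For context on how the properties registered above are exercised: a host-memory backend is created from the command line with something like -object memory-backend-ram,id=m0,size=1G,host-nodes=0,policy=bind -numa node,memdev=m0 (an illustrative invocation). The validation in host_memory_backend_memory_complete() then rejects inconsistent combinations, e.g. a non-empty host-nodes list together with policy=default, before mbind() is ever called.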

View File

@@ -10,9 +10,9 @@
* See the COPYING file in the top-level directory.
*/
#include "sysemu/rng.h"
#include "sysemu/char.h"
#include "qapi/qmp/qerror.h"
#include "qemu/rng.h"
#include "qemu-char.h"
#include "qerror.h"
#include "hw/qdev.h" /* just for DEFINE_PROP_CHR */
#define TYPE_RNG_EGD "rng-egd"
@@ -91,14 +91,12 @@ static int rng_egd_chr_can_read(void *opaque)
static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
{
RngEgd *s = RNG_EGD(opaque);
size_t buf_offset = 0;
while (size > 0 && s->requests) {
RngRequest *req = s->requests->data;
int len = MIN(size, req->size - req->offset);
memcpy(req->data + req->offset, buf + buf_offset, len);
buf_offset += len;
memcpy(req->data + req->offset, buf, len);
req->offset += len;
size -= len;
@@ -151,11 +149,6 @@ static void rng_egd_opened(RngBackend *b, Error **errp)
return;
}
if (qemu_chr_fe_claim(s->chr) != 0) {
error_set(errp, QERR_DEVICE_IN_USE, s->chr_name);
return;
}
/* FIXME we should resubmit pending requests when the CDS reconnects. */
qemu_chr_add_handlers(s->chr, rng_egd_chr_can_read, rng_egd_chr_read,
NULL, s);
@@ -198,7 +191,6 @@ static void rng_egd_finalize(Object *obj)
if (s->chr) {
qemu_chr_add_handlers(s->chr, NULL, NULL, NULL, NULL);
qemu_chr_fe_release(s->chr);
}
g_free(s->chr_name);
@@ -215,7 +207,7 @@ static void rng_egd_class_init(ObjectClass *klass, void *data)
rbc->opened = rng_egd_opened;
}
static const TypeInfo rng_egd_info = {
static TypeInfo rng_egd_info = {
.name = TYPE_RNG_EGD,
.parent = TYPE_RNG_BACKEND,
.instance_size = sizeof(RngEgd),

View File

@@ -10,10 +10,10 @@
* See the COPYING file in the top-level directory.
*/
#include "sysemu/rng-random.h"
#include "sysemu/rng.h"
#include "qapi/qmp/qerror.h"
#include "qemu/main-loop.h"
#include "qemu/rng-random.h"
#include "qemu/rng.h"
#include "qerror.h"
#include "main-loop.h"
struct RndRandom
{
@@ -41,9 +41,6 @@ static void entropy_available(void *opaque)
ssize_t len;
len = read(s->fd, buffer, s->size);
if (len < 0 && errno == EAGAIN) {
return;
}
g_assert(len != -1);
s->receive_func(s->opaque, buffer, len);
@@ -77,9 +74,10 @@ static void rng_random_opened(RngBackend *b, Error **errp)
error_set(errp, QERR_INVALID_PARAMETER_VALUE,
"filename", "a valid filename");
} else {
s->fd = qemu_open(s->filename, O_RDONLY | O_NONBLOCK);
s->fd = open(s->filename, O_RDONLY | O_NONBLOCK);
if (s->fd == -1) {
error_setg_file_open(errp, errno, s->filename);
error_set(errp, QERR_OPEN_FILE_FAILED, s->filename);
}
}
}
@@ -88,7 +86,11 @@ static char *rng_random_get_filename(Object *obj, Error **errp)
{
RndRandom *s = RNG_RANDOM(obj);
return g_strdup(s->filename);
if (s->filename) {
return g_strdup(s->filename);
}
return NULL;
}
static void rng_random_set_filename(Object *obj, const char *filename,
@@ -102,7 +104,10 @@ static void rng_random_set_filename(Object *obj, const char *filename,
return;
}
g_free(s->filename);
if (s->filename) {
g_free(s->filename);
}
s->filename = g_strdup(filename);
}
@@ -116,16 +121,16 @@ static void rng_random_init(Object *obj)
NULL);
s->filename = g_strdup("/dev/random");
s->fd = -1;
}
static void rng_random_finalize(Object *obj)
{
RndRandom *s = RNG_RANDOM(obj);
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
if (s->fd != -1) {
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
qemu_close(s->fd);
close(s->fd);
}
g_free(s->filename);
@@ -139,7 +144,7 @@ static void rng_random_class_init(ObjectClass *klass, void *data)
rbc->opened = rng_random_opened;
}
static const TypeInfo rng_random_info = {
static TypeInfo rng_random_info = {
.name = TYPE_RNG_RANDOM,
.parent = TYPE_RNG_BACKEND,
.instance_size = sizeof(RndRandom),

View File

@@ -10,9 +10,8 @@
* See the COPYING file in the top-level directory.
*/
#include "sysemu/rng.h"
#include "qapi/qmp/qerror.h"
#include "qom/object_interfaces.h"
#include "qemu/rng.h"
#include "qerror.h"
void rng_backend_request_entropy(RngBackend *s, size_t size,
EntropyReceiveFunc *receive_entropy,
@@ -41,16 +40,15 @@ static bool rng_backend_prop_get_opened(Object *obj, Error **errp)
return s->opened;
}
static void rng_backend_complete(UserCreatable *uc, Error **errp)
void rng_backend_open(RngBackend *s, Error **errp)
{
object_property_set_bool(OBJECT(uc), true, "opened", errp);
object_property_set_bool(OBJECT(s), true, "opened", errp);
}
static void rng_backend_prop_set_opened(Object *obj, bool value, Error **errp)
{
RngBackend *s = RNG_BACKEND(obj);
RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
Error *local_err = NULL;
if (value == s->opened) {
return;
@@ -62,14 +60,12 @@ static void rng_backend_prop_set_opened(Object *obj, bool value, Error **errp)
}
if (k->opened) {
k->opened(s, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
k->opened(s, errp);
}
s->opened = true;
if (!error_is_set(errp)) {
s->opened = value;
}
}
static void rng_backend_init(Object *obj)
@@ -80,25 +76,13 @@ static void rng_backend_init(Object *obj)
NULL);
}
static void rng_backend_class_init(ObjectClass *oc, void *data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
ucc->complete = rng_backend_complete;
}
static const TypeInfo rng_backend_info = {
static TypeInfo rng_backend_info = {
.name = TYPE_RNG_BACKEND,
.parent = TYPE_OBJECT,
.instance_size = sizeof(RngBackend),
.instance_init = rng_backend_init,
.class_size = sizeof(RngBackendClass),
.class_init = rng_backend_class_init,
.abstract = true,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }
}
};
static void register_types(void)

View File

@@ -1,131 +0,0 @@
/*
* QEMU Char Device for testsuite control
*
* Copyright (c) 2014 Red Hat, Inc.
*
* Author: Paolo Bonzini <pbonzini@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "sysemu/char.h"
#define BUF_SIZE 32
typedef struct {
CharDriverState *chr;
uint8_t in_buf[32];
int in_buf_used;
} TestdevCharState;
/* Try to interpret a whole incoming packet */
static int testdev_eat_packet(TestdevCharState *testdev)
{
const uint8_t *cur = testdev->in_buf;
int len = testdev->in_buf_used;
uint8_t c;
int arg;
#define EAT(c) do { \
if (!len--) { \
return 0; \
} \
c = *cur++; \
} while (0)
EAT(c);
while (isspace(c)) {
EAT(c);
}
arg = 0;
while (isdigit(c)) {
arg = arg * 10 + c - '0';
EAT(c);
}
while (isspace(c)) {
EAT(c);
}
switch (c) {
case 'q':
exit((arg << 1) | 1);
break;
default:
break;
}
return cur - testdev->in_buf;
}
/* The other end is writing some data. Store it and try to interpret */
static int testdev_write(CharDriverState *chr, const uint8_t *buf, int len)
{
TestdevCharState *testdev = chr->opaque;
int tocopy, eaten, orig_len = len;
while (len) {
/* Complete our buffer as much as possible */
tocopy = MIN(len, BUF_SIZE - testdev->in_buf_used);
memcpy(testdev->in_buf + testdev->in_buf_used, buf, tocopy);
testdev->in_buf_used += tocopy;
buf += tocopy;
len -= tocopy;
/* Interpret it as much as possible */
while (testdev->in_buf_used > 0 &&
(eaten = testdev_eat_packet(testdev)) > 0) {
memmove(testdev->in_buf, testdev->in_buf + eaten,
testdev->in_buf_used - eaten);
testdev->in_buf_used -= eaten;
}
}
return orig_len;
}
static void testdev_close(struct CharDriverState *chr)
{
TestdevCharState *testdev = chr->opaque;
g_free(testdev);
}
CharDriverState *chr_testdev_init(void)
{
TestdevCharState *testdev;
CharDriverState *chr;
testdev = g_malloc0(sizeof(TestdevCharState));
testdev->chr = chr = g_malloc0(sizeof(CharDriverState));
chr->opaque = testdev;
chr->chr_write = testdev_write;
chr->chr_close = testdev_close;
return chr;
}
static void register_types(void)
{
register_char_driver("testdev", CHARDEV_BACKEND_KIND_TESTDEV, NULL);
}
type_init(register_types);
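The protocol parsed by testdev_eat_packet() above is: optional whitespace, a decimal argument, optional whitespace, then a single command character; 'q' terminates QEMU with exit status (arg << 1) | 1. A tiny illustrative check of that encoding (not QEMU code):

#include <stdio.h>

int main(void)
{
    int arg = 5;                               /* guest side writes "5q" */
    printf("exit((%d << 1) | 1) == %d\n", arg, (arg << 1) | 1); /* 11 */
    return 0;
}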

View File

@@ -1,182 +0,0 @@
/*
* QEMU TPM Backend
*
* Copyright IBM, Corp. 2013
*
* Authors:
* Stefan Berger <stefanb@us.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
* Based on backends/rng.c by Anthony Liguori
*/
#include "sysemu/tpm_backend.h"
#include "qapi/qmp/qerror.h"
#include "sysemu/tpm.h"
#include "qemu/thread.h"
#include "sysemu/tpm_backend_int.h"
enum TpmType tpm_backend_get_type(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->type;
}
const char *tpm_backend_get_desc(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->desc();
}
void tpm_backend_destroy(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->destroy(s);
}
int tpm_backend_init(TPMBackend *s, TPMState *state,
TPMRecvDataCB *datacb)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->init(s, state, datacb);
}
int tpm_backend_startup_tpm(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->startup_tpm(s);
}
bool tpm_backend_had_startup_error(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->had_startup_error(s);
}
size_t tpm_backend_realloc_buffer(TPMBackend *s, TPMSizedBuffer *sb)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->realloc_buffer(sb);
}
void tpm_backend_deliver_request(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->deliver_request(s);
}
void tpm_backend_reset(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->reset(s);
}
void tpm_backend_cancel_cmd(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->cancel_cmd(s);
}
bool tpm_backend_get_tpm_established_flag(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->get_tpm_established_flag(s);
}
static bool tpm_backend_prop_get_opened(Object *obj, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
return s->opened;
}
void tpm_backend_open(TPMBackend *s, Error **errp)
{
object_property_set_bool(OBJECT(s), true, "opened", errp);
}
static void tpm_backend_prop_set_opened(Object *obj, bool value, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
Error *local_err = NULL;
if (value == s->opened) {
return;
}
if (!value && s->opened) {
error_set(errp, QERR_PERMISSION_DENIED);
return;
}
if (k->opened) {
k->opened(s, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
s->opened = true;
}
static void tpm_backend_instance_init(Object *obj)
{
object_property_add_bool(obj, "opened",
tpm_backend_prop_get_opened,
tpm_backend_prop_set_opened,
NULL);
}
void tpm_backend_thread_deliver_request(TPMBackendThread *tbt)
{
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_PROCESS_CMD, NULL);
}
void tpm_backend_thread_create(TPMBackendThread *tbt,
GFunc func, gpointer user_data)
{
if (!tbt->pool) {
tbt->pool = g_thread_pool_new(func, user_data, 1, TRUE, NULL);
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_INIT, NULL);
}
}
void tpm_backend_thread_end(TPMBackendThread *tbt)
{
if (tbt->pool) {
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_END, NULL);
g_thread_pool_free(tbt->pool, FALSE, TRUE);
tbt->pool = NULL;
}
}
static const TypeInfo tpm_backend_info = {
.name = TYPE_TPM_BACKEND,
.parent = TYPE_OBJECT,
.instance_size = sizeof(TPMBackend),
.instance_init = tpm_backend_instance_init,
.class_size = sizeof(TPMBackendClass),
.abstract = true,
};
static void register_types(void)
{
type_register_static(&tpm_backend_info);
}
type_init(register_types);

View File

@@ -24,33 +24,18 @@
* THE SOFTWARE.
*/
#include "monitor/monitor.h"
#include "exec/cpu-common.h"
#include "sysemu/kvm.h"
#include "sysemu/balloon.h"
#include "monitor.h"
#include "cpu-common.h"
#include "kvm.h"
#include "balloon.h"
#include "trace.h"
#include "qmp-commands.h"
#include "qapi/qmp/qjson.h"
#include "qjson.h"
static QEMUBalloonEvent *balloon_event_fn;
static QEMUBalloonStatus *balloon_stat_fn;
static void *balloon_opaque;
static bool have_balloon(Error **errp)
{
if (kvm_enabled() && !kvm_has_sync_mmu()) {
error_set(errp, ERROR_CLASS_KVM_MISSING_CAP,
"Using KVM without synchronous MMU, balloon unavailable");
return false;
}
if (!balloon_event_fn) {
error_set(errp, ERROR_CLASS_DEVICE_NOT_ACTIVE,
"No balloon device has been activated");
return false;
}
return true;
}
int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
QEMUBalloonStatus *stat_func, void *opaque)
{
@@ -58,6 +43,7 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
/* We've already registered one balloon handler. How many can
* a guest really have?
*/
error_report("Another balloon device already registered");
return -1;
}
balloon_event_fn = event_func;
@@ -76,30 +62,71 @@ void qemu_remove_balloon_handler(void *opaque)
balloon_opaque = NULL;
}
static int qemu_balloon(ram_addr_t target)
{
if (!balloon_event_fn) {
return 0;
}
trace_balloon_event(balloon_opaque, target);
balloon_event_fn(balloon_opaque, target);
return 1;
}
static int qemu_balloon_status(BalloonInfo *info)
{
if (!balloon_stat_fn) {
return 0;
}
balloon_stat_fn(balloon_opaque, info);
return 1;
}
void qemu_balloon_changed(int64_t actual)
{
QObject *data;
data = qobject_from_jsonf("{ 'actual': %" PRId64 " }",
actual);
monitor_protocol_event(QEVENT_BALLOON_CHANGE, data);
qobject_decref(data);
}
BalloonInfo *qmp_query_balloon(Error **errp)
{
BalloonInfo *info;
if (!have_balloon(errp)) {
if (kvm_enabled() && !kvm_has_sync_mmu()) {
error_set(errp, QERR_KVM_MISSING_CAP, "synchronous MMU", "balloon");
return NULL;
}
info = g_malloc0(sizeof(*info));
balloon_stat_fn(balloon_opaque, info);
if (qemu_balloon_status(info) == 0) {
error_set(errp, QERR_DEVICE_NOT_ACTIVE, "balloon");
qapi_free_BalloonInfo(info);
return NULL;
}
return info;
}
void qmp_balloon(int64_t target, Error **errp)
void qmp_balloon(int64_t value, Error **errp)
{
if (!have_balloon(errp)) {
if (kvm_enabled() && !kvm_has_sync_mmu()) {
error_set(errp, QERR_KVM_MISSING_CAP, "synchronous MMU", "balloon");
return;
}
if (target <= 0) {
if (value <= 0) {
error_set(errp, QERR_INVALID_PARAMETER_VALUE, "target", "a size");
return;
}
trace_balloon_event(balloon_opaque, target);
balloon_event_fn(balloon_opaque, target);
if (qemu_balloon(value) == 0) {
error_set(errp, QERR_DEVICE_NOT_ACTIVE, "balloon");
}
}
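The target handled by qmp_balloon() above corresponds to the QMP balloon command; a typical request shrinking the guest to 512 MiB would look like { "execute": "balloon", "arguments": { "value": 536870912 } }, with value given in bytes (example values illustrative).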

View File

@@ -14,7 +14,7 @@
#ifndef _QEMU_BALLOON_H
#define _QEMU_BALLOON_H
#include "monitor/monitor.h"
#include "monitor.h"
#include "qapi-types.h"
typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
@@ -24,4 +24,6 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
QEMUBalloonStatus *stat_func, void *opaque);
void qemu_remove_balloon_handler(void *opaque);
void qemu_balloon_changed(int64_t actual);
#endif

View File

@@ -9,8 +9,8 @@
* Version 2.
*/
#include "qemu/bitops.h"
#include "qemu/bitmap.h"
#include "bitops.h"
#include "bitmap.h"
/*
* bitmaps provide an array of bits, implemented using an
@@ -36,9 +36,9 @@
* endian architectures.
*/
int slow_bitmap_empty(const unsigned long *bitmap, long bits)
int slow_bitmap_empty(const unsigned long *bitmap, int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
if (bitmap[k]) {
@@ -54,9 +54,9 @@ int slow_bitmap_empty(const unsigned long *bitmap, long bits)
return 1;
}
int slow_bitmap_full(const unsigned long *bitmap, long bits)
int slow_bitmap_full(const unsigned long *bitmap, int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
if (~bitmap[k]) {
@@ -74,9 +74,9 @@ int slow_bitmap_full(const unsigned long *bitmap, long bits)
}
int slow_bitmap_equal(const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
if (bitmap1[k] != bitmap2[k]) {
@@ -94,9 +94,9 @@ int slow_bitmap_equal(const unsigned long *bitmap1,
}
void slow_bitmap_complement(unsigned long *dst, const unsigned long *src,
long bits)
int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
dst[k] = ~src[k];
@@ -108,10 +108,10 @@ void slow_bitmap_complement(unsigned long *dst, const unsigned long *src,
}
int slow_bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k;
long nr = BITS_TO_LONGS(bits);
int k;
int nr = BITS_TO_LONGS(bits);
unsigned long result = 0;
for (k = 0; k < nr; k++) {
@@ -121,10 +121,10 @@ int slow_bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
}
void slow_bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k;
long nr = BITS_TO_LONGS(bits);
int k;
int nr = BITS_TO_LONGS(bits);
for (k = 0; k < nr; k++) {
dst[k] = bitmap1[k] | bitmap2[k];
@@ -132,10 +132,10 @@ void slow_bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
}
void slow_bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k;
long nr = BITS_TO_LONGS(bits);
int k;
int nr = BITS_TO_LONGS(bits);
for (k = 0; k < nr; k++) {
dst[k] = bitmap1[k] ^ bitmap2[k];
@@ -143,10 +143,10 @@ void slow_bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
}
int slow_bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k;
long nr = BITS_TO_LONGS(bits);
int k;
int nr = BITS_TO_LONGS(bits);
unsigned long result = 0;
for (k = 0; k < nr; k++) {
@@ -157,10 +157,10 @@ int slow_bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))
void bitmap_set(unsigned long *map, long start, long nr)
void bitmap_set(unsigned long *map, int start, int nr)
{
unsigned long *p = map + BIT_WORD(start);
const long size = start + nr;
const int size = start + nr;
int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
@@ -177,10 +177,10 @@ void bitmap_set(unsigned long *map, long start, long nr)
}
}
void bitmap_clear(unsigned long *map, long start, long nr)
void bitmap_clear(unsigned long *map, int start, int nr)
{
unsigned long *p = map + BIT_WORD(start);
const long size = start + nr;
const int size = start + nr;
int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
@@ -212,10 +212,10 @@ void bitmap_clear(unsigned long *map, long start, long nr)
* power of 2. A @align_mask of 0 means no alignment is required.
*/
unsigned long bitmap_find_next_zero_area(unsigned long *map,
unsigned long size,
unsigned long start,
unsigned long nr,
unsigned long align_mask)
unsigned long size,
unsigned long start,
unsigned int nr,
unsigned long align_mask)
{
unsigned long index, end, i;
again:
@@ -237,9 +237,9 @@ again:
}
int slow_bitmap_intersects(const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
if (bitmap1[k] & bitmap2[k]) {

View File

@@ -12,12 +12,8 @@
#ifndef BITMAP_H
#define BITMAP_H
#include <glib.h>
#include <string.h>
#include <stdlib.h>
#include "qemu/osdep.h"
#include "qemu/bitops.h"
#include "qemu-common.h"
#include "bitops.h"
/*
* The available bitmap operations and their rough meaning in the
@@ -35,7 +31,7 @@
* bitmap_andnot(dst, src1, src2, nbits) *dst = *src1 & ~(*src2)
* bitmap_complement(dst, src, nbits) *dst = ~(*src)
* bitmap_equal(src1, src2, nbits) Are *src1 and *src2 equal?
* bitmap_intersects(src1, src2, nbits) Do *src1 and *src2 overlap?
* bitmap_intersects(src1, src2, nbits) Do *src1 and *src2 overlap?
* bitmap_empty(src, nbits) Are all bits zero in *src?
* bitmap_full(src, nbits) Are all bits set in *src?
* bitmap_set(dst, pos, nbits) Set specified bit area
@@ -66,80 +62,71 @@
)
#define DECLARE_BITMAP(name,bits) \
unsigned long name[BITS_TO_LONGS(bits)]
unsigned long name[BITS_TO_LONGS(bits)]
#define small_nbits(nbits) \
((nbits) <= BITS_PER_LONG)
((nbits) <= BITS_PER_LONG)
int slow_bitmap_empty(const unsigned long *bitmap, long bits);
int slow_bitmap_full(const unsigned long *bitmap, long bits);
int slow_bitmap_empty(const unsigned long *bitmap, int bits);
int slow_bitmap_full(const unsigned long *bitmap, int bits);
int slow_bitmap_equal(const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits);
const unsigned long *bitmap2, int bits);
void slow_bitmap_complement(unsigned long *dst, const unsigned long *src,
long bits);
int bits);
void slow_bitmap_shift_right(unsigned long *dst,
const unsigned long *src, int shift, long bits);
const unsigned long *src, int shift, int bits);
void slow_bitmap_shift_left(unsigned long *dst,
const unsigned long *src, int shift, long bits);
const unsigned long *src, int shift, int bits);
int slow_bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits);
const unsigned long *bitmap2, int bits);
void slow_bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits);
const unsigned long *bitmap2, int bits);
void slow_bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits);
const unsigned long *bitmap2, int bits);
int slow_bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits);
const unsigned long *bitmap2, int bits);
int slow_bitmap_intersects(const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits);
const unsigned long *bitmap2, int bits);
static inline unsigned long *bitmap_try_new(long nbits)
static inline unsigned long *bitmap_new(int nbits)
{
long len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
return g_try_malloc0(len);
int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
return g_malloc0(len);
}
static inline unsigned long *bitmap_new(long nbits)
{
unsigned long *ptr = bitmap_try_new(nbits);
if (ptr == NULL) {
abort();
}
return ptr;
}
static inline void bitmap_zero(unsigned long *dst, long nbits)
static inline void bitmap_zero(unsigned long *dst, int nbits)
{
if (small_nbits(nbits)) {
*dst = 0UL;
} else {
long len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
memset(dst, 0, len);
}
}
static inline void bitmap_fill(unsigned long *dst, long nbits)
static inline void bitmap_fill(unsigned long *dst, int nbits)
{
size_t nlongs = BITS_TO_LONGS(nbits);
if (!small_nbits(nbits)) {
long len = (nlongs - 1) * sizeof(unsigned long);
int len = (nlongs - 1) * sizeof(unsigned long);
memset(dst, 0xff, len);
}
dst[nlongs - 1] = BITMAP_LAST_WORD_MASK(nbits);
}
static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
long nbits)
int nbits)
{
if (small_nbits(nbits)) {
*dst = *src;
} else {
long len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
memcpy(dst, src, len);
}
}
static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, long nbits)
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
return (*dst = *src1 & *src2) != 0;
@@ -148,7 +135,7 @@ static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
}
static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, long nbits)
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
*dst = *src1 | *src2;
@@ -158,7 +145,7 @@ static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
}
static inline void bitmap_xor(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, long nbits)
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
*dst = *src1 ^ *src2;
@@ -168,7 +155,7 @@ static inline void bitmap_xor(unsigned long *dst, const unsigned long *src1,
}
static inline int bitmap_andnot(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, long nbits)
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
return (*dst = *src1 & ~(*src2)) != 0;
@@ -176,9 +163,8 @@ static inline int bitmap_andnot(unsigned long *dst, const unsigned long *src1,
return slow_bitmap_andnot(dst, src1, src2, nbits);
}
static inline void bitmap_complement(unsigned long *dst,
const unsigned long *src,
long nbits)
static inline void bitmap_complement(unsigned long *dst, const unsigned long *src,
int nbits)
{
if (small_nbits(nbits)) {
*dst = ~(*src) & BITMAP_LAST_WORD_MASK(nbits);
@@ -188,7 +174,7 @@ static inline void bitmap_complement(unsigned long *dst,
}
static inline int bitmap_equal(const unsigned long *src1,
const unsigned long *src2, long nbits)
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
@@ -197,7 +183,7 @@ static inline int bitmap_equal(const unsigned long *src1,
}
}
static inline int bitmap_empty(const unsigned long *src, long nbits)
static inline int bitmap_empty(const unsigned long *src, int nbits)
{
if (small_nbits(nbits)) {
return ! (*src & BITMAP_LAST_WORD_MASK(nbits));
@@ -206,7 +192,7 @@ static inline int bitmap_empty(const unsigned long *src, long nbits)
}
}
static inline int bitmap_full(const unsigned long *src, long nbits)
static inline int bitmap_full(const unsigned long *src, int nbits)
{
if (small_nbits(nbits)) {
return ! (~(*src) & BITMAP_LAST_WORD_MASK(nbits));
@@ -216,7 +202,7 @@ static inline int bitmap_full(const unsigned long *src, long nbits)
}
static inline int bitmap_intersects(const unsigned long *src1,
const unsigned long *src2, long nbits)
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
return ((*src1 & *src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0;
@@ -225,21 +211,12 @@ static inline int bitmap_intersects(const unsigned long *src1,
}
}
void bitmap_set(unsigned long *map, long i, long len);
void bitmap_clear(unsigned long *map, long start, long nr);
void bitmap_set(unsigned long *map, int i, int len);
void bitmap_clear(unsigned long *map, int start, int nr);
unsigned long bitmap_find_next_zero_area(unsigned long *map,
unsigned long size,
unsigned long start,
unsigned long nr,
unsigned long align_mask);
static inline unsigned long *bitmap_zero_extend(unsigned long *old,
long old_nbits, long new_nbits)
{
long new_len = BITS_TO_LONGS(new_nbits) * sizeof(unsigned long);
unsigned long *new = g_realloc(old, new_len);
bitmap_clear(new, old_nbits, new_nbits - old_nbits);
return new;
}
unsigned long size,
unsigned long start,
unsigned int nr,
unsigned long align_mask);
#endif /* BITMAP_H */
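A short standalone sketch of how the declaration macros above are used (the minimal pieces are redefined here so the example compiles on its own; in-tree code simply includes this header):

#include <assert.h>
#include <limits.h>
#include <string.h>

#define BITS_PER_LONG      (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(nr)  (((nr) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define DECLARE_BITMAP(name, bits) unsigned long name[BITS_TO_LONGS(bits)]
#define BIT_WORD(nr)       ((nr) / BITS_PER_LONG)

int main(void)
{
    DECLARE_BITMAP(map, 128);                 /* 128 bits, 2 longs on LP64 */
    memset(map, 0, sizeof(map));              /* bitmap_zero(map, 128) */
    map[BIT_WORD(70)] |= 1UL << (70 % BITS_PER_LONG);        /* set_bit(70, map) */
    assert((map[BIT_WORD(70)] >> (70 % BITS_PER_LONG)) & 1); /* test_bit(70, map) */
    return 0;
}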

View File

@@ -11,7 +11,7 @@
* 2 of the License, or (at your option) any later version.
*/
#include "qemu/bitops.h"
#include "bitops.h"
#define BITOP_WORD(nr) ((nr) / BITS_PER_LONG)
@@ -42,23 +42,7 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
size -= BITS_PER_LONG;
result += BITS_PER_LONG;
}
while (size >= 4*BITS_PER_LONG) {
unsigned long d1, d2, d3;
tmp = *p;
d1 = *(p+1);
d2 = *(p+2);
d3 = *(p+3);
if (tmp) {
goto found_middle;
}
if (d1 | d2 | d3) {
break;
}
p += 4;
result += 4*BITS_PER_LONG;
size -= 4*BITS_PER_LONG;
}
while (size >= BITS_PER_LONG) {
while (size & ~(BITS_PER_LONG-1)) {
if ((tmp = *(p++))) {
goto found_middle;
}
@@ -76,7 +60,7 @@ found_first:
return result + size; /* Nope. */
}
found_middle:
return result + ctzl(tmp);
return result + bitops_ffsl(tmp);
}
/*
@@ -125,7 +109,7 @@ found_first:
return result + size; /* Nope. */
}
found_middle:
return result + ctzl(~tmp);
return result + ffz(tmp);
}
unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
@@ -149,7 +133,7 @@ unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
tmp = addr[--words];
if (tmp) {
found:
return words * BITS_PER_LONG + BITS_PER_LONG - 1 - clzl(tmp);
return words * BITS_PER_LONG + bitops_flsl(tmp);
}
}

View File

@@ -12,30 +12,114 @@
#ifndef BITOPS_H
#define BITOPS_H
#include <stdint.h>
#include <assert.h>
#include "host-utils.h"
#include "qemu-common.h"
#define BITS_PER_BYTE CHAR_BIT
#define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)
#define BIT(nr) (1UL << (nr))
#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG))
#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
#define BIT(nr) (1UL << (nr))
#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG))
#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
/**
* bitops_ffsl - find first set bit in a word.
* @word: The word to search
*
* Undefined if no bit exists, so code should check against 0 first.
*/
static unsigned long bitops_ffsl(unsigned long word)
{
int num = 0;
#if LONG_MAX > 0x7FFFFFFF
if ((word & 0xffffffff) == 0) {
num += 32;
word >>= 32;
}
#endif
if ((word & 0xffff) == 0) {
num += 16;
word >>= 16;
}
if ((word & 0xff) == 0) {
num += 8;
word >>= 8;
}
if ((word & 0xf) == 0) {
num += 4;
word >>= 4;
}
if ((word & 0x3) == 0) {
num += 2;
word >>= 2;
}
if ((word & 0x1) == 0) {
num += 1;
}
return num;
}
/**
* bitops_flsl - find last (most-significant) set bit in a long word
* @word: the word to search
*
* Undefined if no set bit exists, so code should check against 0 first.
*/
static inline unsigned long bitops_flsl(unsigned long word)
{
int num = BITS_PER_LONG - 1;
#if LONG_MAX > 0x7FFFFFFF
if (!(word & (~0ul << 32))) {
num -= 32;
word <<= 32;
}
#endif
if (!(word & (~0ul << (BITS_PER_LONG-16)))) {
num -= 16;
word <<= 16;
}
if (!(word & (~0ul << (BITS_PER_LONG-8)))) {
num -= 8;
word <<= 8;
}
if (!(word & (~0ul << (BITS_PER_LONG-4)))) {
num -= 4;
word <<= 4;
}
if (!(word & (~0ul << (BITS_PER_LONG-2)))) {
num -= 2;
word <<= 2;
}
if (!(word & (~0ul << (BITS_PER_LONG-1))))
num -= 1;
return num;
}
/**
* ffz - find first zero in word.
* @word: The word to search
*
* Undefined if no zero exists, so code should check against ~0UL first.
*/
static inline unsigned long ffz(unsigned long word)
{
return bitops_ffsl(~word);
}
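Despite the ffs-style names, all three helpers above return 0-based bit indices (unlike ffs(3), which is 1-based). An illustrative standalone check of their contracts, using GCC builtins as reference implementations (assumes a GCC-compatible compiler; not QEMU code):

#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

int main(void)
{
    unsigned long w = 0x58UL;                            /* binary 0101 1000 */
    assert(__builtin_ctzl(w) == 3);                      /* bitops_ffsl(w) */
    assert(BITS_PER_LONG - 1 - __builtin_clzl(w) == 6);  /* bitops_flsl(w) */
    assert(__builtin_ctzl(~0x7UL) == 3);                 /* ffz(0x7): first zero */
    return 0;
}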
/**
* set_bit - Set a bit in memory
* @nr: the bit to set
* @addr: the address to start counting from
*/
static inline void set_bit(long nr, unsigned long *addr)
static inline void set_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
*p |= mask;
*p |= mask;
}
/**
@@ -43,12 +127,12 @@ static inline void set_bit(long nr, unsigned long *addr)
* @nr: Bit to clear
* @addr: Address to start counting from
*/
static inline void clear_bit(long nr, unsigned long *addr)
static inline void clear_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
*p &= ~mask;
*p &= ~mask;
}
/**
@@ -56,12 +140,12 @@ static inline void clear_bit(long nr, unsigned long *addr)
* @nr: Bit to change
* @addr: Address to start counting from
*/
static inline void change_bit(long nr, unsigned long *addr)
static inline void change_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
*p ^= mask;
*p ^= mask;
}
/**
@@ -69,14 +153,14 @@ static inline void change_bit(long nr, unsigned long *addr)
* @nr: Bit to set
* @addr: Address to count from
*/
static inline int test_and_set_bit(long nr, unsigned long *addr)
static inline int test_and_set_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long old = *p;
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long old = *p;
*p = old | mask;
return (old & mask) != 0;
*p = old | mask;
return (old & mask) != 0;
}
/**
@@ -84,14 +168,14 @@ static inline int test_and_set_bit(long nr, unsigned long *addr)
* @nr: Bit to clear
* @addr: Address to count from
*/
static inline int test_and_clear_bit(long nr, unsigned long *addr)
static inline int test_and_clear_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long old = *p;
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long old = *p;
*p = old & ~mask;
return (old & mask) != 0;
*p = old & ~mask;
return (old & mask) != 0;
}
/**
@@ -99,14 +183,14 @@ static inline int test_and_clear_bit(long nr, unsigned long *addr)
* @nr: Bit to change
* @addr: Address to count from
*/
-static inline int test_and_change_bit(long nr, unsigned long *addr)
+static inline int test_and_change_bit(int nr, unsigned long *addr)
{
    unsigned long mask = BIT_MASK(nr);
    unsigned long *p = addr + BIT_WORD(nr);
    unsigned long old = *p;

    *p = old ^ mask;
    return (old & mask) != 0;
}
/**
@@ -114,9 +198,9 @@ static inline int test_and_change_bit(long nr, unsigned long *addr)
* @nr: bit number to test
* @addr: Address to start counting from
*/
-static inline int test_bit(long nr, const unsigned long *addr)
+static inline int test_bit(int nr, const unsigned long *addr)
{
    return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
}
/**
@@ -136,8 +220,7 @@ unsigned long find_last_bit(const unsigned long *addr,
* @size: The bitmap size in bits
*/
unsigned long find_next_bit(const unsigned long *addr,
-                            unsigned long size,
-                            unsigned long offset);
+                            unsigned long size, unsigned long offset);
/**
* find_next_zero_bit - find the next cleared bit in a memory region
@@ -160,17 +243,7 @@ unsigned long find_next_zero_bit(const unsigned long *addr,
static inline unsigned long find_first_bit(const unsigned long *addr,
unsigned long size)
{
-    unsigned long result, tmp;
-
-    for (result = 0; result < size; result += BITS_PER_LONG) {
-        tmp = *addr++;
-        if (tmp) {
-            result += ctzl(tmp);
-            return result < size ? result : size;
-        }
-    }
-    /* Not found */
-    return size;
+    return find_next_bit(addr, size, 0);
}
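The usual scan idiom built on these two functions (illustrative only):

unsigned long i, map[2] = { 0x10, 0x1 };            /* bits 4 and 64 set */

for (i = find_first_bit(map, 128); i < 128;
     i = find_next_bit(map, 128, i + 1)) {
    printf("bit %lu is set\n", i);                  /* prints 4, then 64 */
}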
/**
@@ -196,86 +269,6 @@ static inline unsigned long hweight_long(unsigned long w)
return count;
}
/**
* rol8 - rotate an 8-bit value left
* @word: value to rotate
* @shift: bits to roll
*/
static inline uint8_t rol8(uint8_t word, unsigned int shift)
{
return (word << shift) | (word >> (8 - shift));
}
/**
* ror8 - rotate an 8-bit value right
* @word: value to rotate
* @shift: bits to roll
*/
static inline uint8_t ror8(uint8_t word, unsigned int shift)
{
return (word >> shift) | (word << (8 - shift));
}
/**
* rol16 - rotate a 16-bit value left
* @word: value to rotate
* @shift: bits to roll
*/
static inline uint16_t rol16(uint16_t word, unsigned int shift)
{
return (word << shift) | (word >> (16 - shift));
}
/**
* ror16 - rotate a 16-bit value right
* @word: value to rotate
* @shift: bits to roll
*/
static inline uint16_t ror16(uint16_t word, unsigned int shift)
{
return (word >> shift) | (word << (16 - shift));
}
/**
* rol32 - rotate a 32-bit value left
* @word: value to rotate
* @shift: bits to roll
*/
static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
return (word << shift) | (word >> (32 - shift));
}
/**
* ror32 - rotate a 32-bit value right
* @word: value to rotate
* @shift: bits to roll
*/
static inline uint32_t ror32(uint32_t word, unsigned int shift)
{
return (word >> shift) | (word << (32 - shift));
}
/**
* rol64 - rotate a 64-bit value left
* @word: value to rotate
* @shift: bits to roll
*/
static inline uint64_t rol64(uint64_t word, unsigned int shift)
{
return (word << shift) | (word >> (64 - shift));
}
/**
* ror64 - rotate a 64-bit value right
* @word: value to rotate
* @shift: bits to roll
*/
static inline uint64_t ror64(uint64_t word, unsigned int shift)
{
return (word >> shift) | (word << (64 - shift));
}
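A worked value for concreteness; note that all eight helpers assume 0 < shift < width, since shift == 0 would evaluate e.g. word >> 32, which is undefined behaviour in C:

uint32_t a = rol32(0x80000001, 4);   /* 0x00000010 | 0x00000008 = 0x00000018 */
uint32_t b = ror32(a, 4);            /* rotates back: b == 0x80000001 */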
/**
* extract32:
* @value: the value to extract the bit field from
@@ -314,56 +307,6 @@ static inline uint64_t extract64(uint64_t value, int start, int length)
return (value >> start) & (~0ULL >> (64 - length));
}
/**
* sextract32:
* @value: the value to extract the bit field from
* @start: the lowest bit in the bit field (numbered from 0)
* @length: the length of the bit field
*
* Extract from the 32 bit input @value the bit field specified by the
* @start and @length parameters, and return it, sign extended to
* an int32_t (ie with the most significant bit of the field propagated
* to all the upper bits of the return value). The bit field must lie
* entirely within the 32 bit word. It is valid to request that
* all 32 bits are returned (ie @length 32 and @start 0).
*
* Returns: the sign extended value of the bit field extracted from the
* input value.
*/
static inline int32_t sextract32(uint32_t value, int start, int length)
{
assert(start >= 0 && length > 0 && length <= 32 - start);
/* Note that this implementation relies on right shift of signed
* integers being an arithmetic shift.
*/
return ((int32_t)(value << (32 - length - start))) >> (32 - length);
}
/**
* sextract64:
* @value: the value to extract the bit field from
* @start: the lowest bit in the bit field (numbered from 0)
* @length: the length of the bit field
*
* Extract from the 64 bit input @value the bit field specified by the
* @start and @length parameters, and return it, sign extended to
* an int64_t (ie with the most significant bit of the field propagated
* to all the upper bits of the return value). The bit field must lie
* entirely within the 64 bit word. It is valid to request that
* all 64 bits are returned (ie @length 64 and @start 0).
*
* Returns: the sign extended value of the bit field extracted from the
* input value.
*/
static inline int64_t sextract64(uint64_t value, int start, int length)
{
assert(start >= 0 && length > 0 && length <= 64 - start);
/* Note that this implementation relies on right shift of signed
* integers being an arithmetic shift.
*/
return ((int64_t)(value << (64 - length - start))) >> (64 - length);
}
/**
* deposit32:
* @value: initial value to insert bit field into


@@ -14,25 +14,20 @@
*/
#include "qemu-common.h"
#include "block/block.h"
#include "qemu/error-report.h"
#include "qemu/main-loop.h"
#include "block_int.h"
#include "hw/hw.h"
#include "qemu/queue.h"
#include "qemu/timer.h"
#include "migration/block.h"
#include "migration/migration.h"
#include "sysemu/blockdev.h"
#include "sysemu/block-backend.h"
#include "qemu-queue.h"
#include "qemu-timer.h"
#include "block-migration.h"
#include "migration.h"
#include "blockdev.h"
#include <assert.h>
-#define BLOCK_SIZE (1 << 20)
-#define BDRV_SECTORS_PER_DIRTY_CHUNK (BLOCK_SIZE >> BDRV_SECTOR_BITS)
+#define BLOCK_SIZE (BDRV_SECTORS_PER_DIRTY_CHUNK << BDRV_SECTOR_BITS)

#define BLK_MIG_FLAG_DEVICE_BLOCK 0x01
#define BLK_MIG_FLAG_EOS 0x02
#define BLK_MIG_FLAG_PROGRESS 0x04
-#define BLK_MIG_FLAG_ZERO_BLOCK 0x08

#define MAX_IS_ALLOCATED_SEARCH 65536
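Both BLOCK_SIZE definitions describe the same geometry: with BDRV_SECTOR_BITS 9 (512-byte sectors), a 1 MiB migration block is (1 << 20) >> 9 = 2048 sectors, matching the BDRV_SECTORS_PER_DIRTY_CHUNK 2048 constant that the old block.h (further down in this diff) supplies.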
@@ -47,103 +42,60 @@
#endif
typedef struct BlkMigDevState {
-    /* Written during setup phase.  Can be read without a lock. */
    BlockDriverState *bs;
-    int shared_base;
-    int64_t total_sectors;
-    QSIMPLEQ_ENTRY(BlkMigDevState) entry;
-
-    /* Only used by migration thread.  Does not need a lock. */
    int bulk_completed;
+    int shared_base;
    int64_t cur_sector;
    int64_t cur_dirty;
-
-    /* Protected by block migration lock. */
-    unsigned long *aio_bitmap;
    int64_t completed_sectors;
-    BdrvDirtyBitmap *dirty_bitmap;
-    Error *blocker;
+    int64_t total_sectors;
+    int64_t dirty;
+    QSIMPLEQ_ENTRY(BlkMigDevState) entry;
+    unsigned long *aio_bitmap;
} BlkMigDevState;
typedef struct BlkMigBlock {
-    /* Only used by migration thread. */
    uint8_t *buf;
    BlkMigDevState *bmds;
    int64_t sector;
    int nr_sectors;
    struct iovec iov;
    QEMUIOVector qiov;
-    BlockAIOCB *aiocb;
-
-    /* Protected by block migration lock. */
+    BlockDriverAIOCB *aiocb;
    int ret;
    QSIMPLEQ_ENTRY(BlkMigBlock) entry;
} BlkMigBlock;
typedef struct BlkMigState {
-    /* Written during setup phase.  Can be read without a lock. */
    int blk_enable;
    int shared_base;
    QSIMPLEQ_HEAD(bmds_list, BlkMigDevState) bmds_list;
-    int64_t total_sector_sum;
-    bool zero_blocks;
-
-    /* Protected by lock. */
    QSIMPLEQ_HEAD(blk_list, BlkMigBlock) blk_list;
    int submitted;
    int read_done;
-
-    /* Only used by migration thread.  Does not need a lock. */
    int transferred;
+    int64_t total_sector_sum;
    int prev_progress;
    int bulk_completed;
-
-    /* Lock must be taken _inside_ the iothread lock. */
-    QemuMutex lock;
+    long double total_time;
+    long double prev_time_offset;
+    int reads;
} BlkMigState;
static void blk_mig_lock(void)
{
qemu_mutex_lock(&block_mig_state.lock);
}
static void blk_mig_unlock(void)
{
qemu_mutex_unlock(&block_mig_state.lock);
}
/* Must run outside of the iothread lock during the bulk phase,
* or the VM will stall.
*/
static void blk_send(QEMUFile *f, BlkMigBlock * blk)
{
    int len;
-    uint64_t flags = BLK_MIG_FLAG_DEVICE_BLOCK;
-
-    if (block_mig_state.zero_blocks &&
-        buffer_is_zero(blk->buf, BLOCK_SIZE)) {
-        flags |= BLK_MIG_FLAG_ZERO_BLOCK;
-    }

    /* sector number and flags */
-    qemu_put_be64(f, (blk->sector << BDRV_SECTOR_BITS)
-                     | flags);
+    qemu_put_be64(f, (blk->sector << BDRV_SECTOR_BITS)
+                     | BLK_MIG_FLAG_DEVICE_BLOCK);

    /* device name */
-    len = strlen(bdrv_get_device_name(blk->bmds->bs));
+    len = strlen(blk->bmds->bs->device_name);
    qemu_put_byte(f, len);
-    qemu_put_buffer(f, (uint8_t *)bdrv_get_device_name(blk->bmds->bs), len);
-
-    /* if a block is zero we need to flush here since the network
-     * bandwidth is now a lot higher than the storage device bandwidth.
-     * thus if we queue zero blocks we slow down the migration */
-    if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
-        qemu_fflush(f);
-        return;
-    }
+    qemu_put_buffer(f, (uint8_t *)blk->bmds->bs->device_name, len);

    qemu_put_buffer(f, blk->buf, BLOCK_SIZE);
}
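Read from the code above, both sides of blk_send() emit each chunk in the same basic wire layout (a sketch inferred from this function, not a separately documented format):

be64   (sector << BDRV_SECTOR_BITS) | flags   -- flags fit below bit 9
u8     len                                    -- device name length
bytes  device name (len bytes, no NUL)
bytes  BLOCK_SIZE of chunk data               -- skipped on the newer side when ZERO_BLOCK is set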
@@ -158,11 +110,9 @@ uint64_t blk_mig_bytes_transferred(void)
BlkMigDevState *bmds;
uint64_t sum = 0;
blk_mig_lock();
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
sum += bmds->completed_sectors;
}
blk_mig_unlock();
return sum << BDRV_SECTOR_BITS;
}
@@ -182,14 +132,17 @@ uint64_t blk_mig_bytes_total(void)
return sum << BDRV_SECTOR_BITS;
}
/* Called with migration lock held. */
static inline long double compute_read_bwidth(void)
{
assert(block_mig_state.total_time != 0);
return (block_mig_state.reads / block_mig_state.total_time) * BLOCK_SIZE;
}
static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
{
int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
-    if (sector < bdrv_nb_sectors(bmds->bs)) {
+    if ((sector << BDRV_SECTOR_BITS) < bdrv_getlength(bmds->bs)) {
return !!(bmds->aio_bitmap[chunk / (sizeof(unsigned long) * 8)] &
(1UL << (chunk % (sizeof(unsigned long) * 8))));
} else {
@@ -197,8 +150,6 @@ static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
}
}
/* Called with migration lock held. */
static void bmds_set_aio_inflight(BlkMigDevState *bmds, int64_t sector_num,
int nb_sectors, int set)
{
@@ -226,32 +177,32 @@ static void alloc_aio_bitmap(BlkMigDevState *bmds)
BlockDriverState *bs = bmds->bs;
int64_t bitmap_size;
-    bitmap_size = bdrv_nb_sectors(bs) + BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
+    bitmap_size = (bdrv_getlength(bs) >> BDRV_SECTOR_BITS) +
+            BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
bmds->aio_bitmap = g_malloc0(bitmap_size);
}
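Both variants size the bitmap at one bit per dirty chunk, rounded up to whole bytes: BDRV_SECTORS_PER_DIRTY_CHUNK * 8 = 16384 sectors are covered per bitmap byte. As a worked (hypothetical) example, a 10 GiB disk has 20971520 sectors, so after the rounding term g_malloc0() receives 1280 bytes.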
/* Never hold migration lock when yielding to the main loop! */
static void blk_mig_read_cb(void *opaque, int ret)
{
long double curr_time = qemu_get_clock_ns(rt_clock);
BlkMigBlock *blk = opaque;
blk_mig_lock();
blk->ret = ret;
block_mig_state.reads++;
block_mig_state.total_time += (curr_time - block_mig_state.prev_time_offset);
block_mig_state.prev_time_offset = curr_time;
QSIMPLEQ_INSERT_TAIL(&block_mig_state.blk_list, blk, entry);
bmds_set_aio_inflight(blk->bmds, blk->sector, blk->nr_sectors, 0);
block_mig_state.submitted--;
block_mig_state.read_done++;
assert(block_mig_state.submitted >= 0);
blk_mig_unlock();
}
/* Called with no lock taken. */
static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
{
int64_t total_sectors = bmds->total_sectors;
@@ -261,13 +212,11 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
int nr_sectors;
if (bmds->shared_base) {
qemu_mutex_lock_iothread();
while (cur_sector < total_sectors &&
!bdrv_is_allocated(bs, cur_sector, MAX_IS_ALLOCATED_SEARCH,
&nr_sectors)) {
cur_sector += nr_sectors;
}
qemu_mutex_unlock_iothread();
}
if (cur_sector >= total_sectors) {
@@ -286,7 +235,7 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
nr_sectors = total_sectors - cur_sector;
}
-    blk = g_new(BlkMigBlock, 1);
+    blk = g_malloc(sizeof(BlkMigBlock));
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = cur_sector;
@@ -296,105 +245,76 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
blk->iov.iov_len = nr_sectors * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&blk->qiov, &blk->iov, 1);
blk_mig_lock();
block_mig_state.submitted++;
blk_mig_unlock();
if (block_mig_state.submitted == 0) {
block_mig_state.prev_time_offset = qemu_get_clock_ns(rt_clock);
}
qemu_mutex_lock_iothread();
blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
nr_sectors, blk_mig_read_cb, blk);
block_mig_state.submitted++;
-    bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, cur_sector, nr_sectors);
-    qemu_mutex_unlock_iothread();
+    bdrv_reset_dirty(bs, cur_sector, nr_sectors);
bmds->cur_sector = cur_sector + nr_sectors;
return (bmds->cur_sector >= total_sectors);
}
/* Called with iothread lock taken. */
static int set_dirty_tracking(void)
{
BlkMigDevState *bmds;
int ret;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE,
NULL, NULL);
if (!bmds->dirty_bitmap) {
ret = -errno;
goto fail;
}
}
return 0;
fail:
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
if (bmds->dirty_bitmap) {
bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
}
}
return ret;
}
-static void unset_dirty_tracking(void)
+static void set_dirty_tracking(int enable)
{
BlkMigDevState *bmds;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
-        bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
+        bdrv_set_dirty_tracking(bmds->bs, enable);
}
}
-static void init_blk_migration(QEMUFile *f)
+static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
{
BlockDriverState *bs;
BlkMigDevState *bmds;
int64_t sectors;
block_mig_state.submitted = 0;
block_mig_state.read_done = 0;
block_mig_state.transferred = 0;
block_mig_state.total_sector_sum = 0;
block_mig_state.prev_progress = -1;
block_mig_state.bulk_completed = 0;
block_mig_state.zero_blocks = migrate_zero_blocks();
for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
if (bdrv_is_read_only(bs)) {
continue;
}
-    sectors = bdrv_nb_sectors(bs);
+    if (!bdrv_is_read_only(bs)) {
+        sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
if (sectors <= 0) {
return;
}
-        bmds = g_new0(BlkMigDevState, 1);
+        bmds = g_malloc0(sizeof(BlkMigDevState));
bmds->bs = bs;
bmds->bulk_completed = 0;
bmds->total_sectors = sectors;
bmds->completed_sectors = 0;
bmds->shared_base = block_mig_state.shared_base;
alloc_aio_bitmap(bmds);
error_setg(&bmds->blocker, "block device is in use by migration");
bdrv_op_block_all(bs, bmds->blocker);
bdrv_ref(bs);
drive_get_ref(drive_get_by_blockdev(bs));
bdrv_set_in_use(bs, 1);
block_mig_state.total_sector_sum += sectors;
if (bmds->shared_base) {
DPRINTF("Start migration for %s with shared base image\n",
bdrv_get_device_name(bs));
bs->device_name);
} else {
DPRINTF("Start full migration for %s\n", bdrv_get_device_name(bs));
DPRINTF("Start full migration for %s\n", bs->device_name);
}
QSIMPLEQ_INSERT_TAIL(&block_mig_state.bmds_list, bmds, entry);
}
}
/* Called with no lock taken. */
static void init_blk_migration(QEMUFile *f)
{
block_mig_state.submitted = 0;
block_mig_state.read_done = 0;
block_mig_state.transferred = 0;
block_mig_state.total_sector_sum = 0;
block_mig_state.prev_progress = -1;
block_mig_state.bulk_completed = 0;
block_mig_state.total_time = 0;
block_mig_state.reads = 0;
bdrv_iterate(init_blk_migration_it, NULL);
}
static int blk_mig_save_bulked_block(QEMUFile *f)
{
@@ -442,8 +362,6 @@ static void blk_mig_reset_dirty_cursor(void)
}
}
/* Called with iothread lock taken. */
static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
int is_async)
{
@@ -454,21 +372,17 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
int ret = -EIO;
for (sector = bmds->cur_dirty; sector < bmds->total_sectors;) {
blk_mig_lock();
if (bmds_aio_inflight(bmds, sector)) {
blk_mig_unlock();
bdrv_drain_all();
} else {
blk_mig_unlock();
}
-        if (bdrv_get_dirty(bmds->bs, bmds->dirty_bitmap, sector)) {
+        if (bdrv_get_dirty(bmds->bs, sector)) {
if (total_sectors - sector < BDRV_SECTORS_PER_DIRTY_CHUNK) {
nr_sectors = total_sectors - sector;
} else {
nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
}
-            blk = g_new(BlkMigBlock, 1);
+            blk = g_malloc(sizeof(BlkMigBlock));
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = sector;
@@ -479,13 +393,14 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
blk->iov.iov_len = nr_sectors * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&blk->qiov, &blk->iov, 1);
if (block_mig_state.submitted == 0) {
block_mig_state.prev_time_offset = qemu_get_clock_ns(rt_clock);
}
blk->aiocb = bdrv_aio_readv(bmds->bs, sector, &blk->qiov,
nr_sectors, blk_mig_read_cb, blk);
blk_mig_lock();
block_mig_state.submitted++;
bmds_set_aio_inflight(bmds, sector, nr_sectors, 1);
blk_mig_unlock();
} else {
ret = bdrv_read(bmds->bs, sector, blk->buf, nr_sectors);
if (ret < 0) {
@@ -497,7 +412,7 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
g_free(blk);
}
-            bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, sector, nr_sectors);
+            bdrv_reset_dirty(bmds->bs, sector, nr_sectors);
break;
}
sector += BDRV_SECTORS_PER_DIRTY_CHUNK;
@@ -513,9 +428,7 @@ error:
return ret;
}
-/* Called with iothread lock taken.
- *
- * return value:
+/* return value:
* 0: too much data for max_downtime
* 1: few enough data for max_downtime
*/
@@ -534,8 +447,6 @@ static int blk_mig_save_dirty_block(QEMUFile *f, int is_async)
return ret;
}
/* Called with no locks taken. */
static int flush_blks(QEMUFile *f)
{
BlkMigBlock *blk;
@@ -545,7 +456,6 @@ static int flush_blks(QEMUFile *f)
__FUNCTION__, block_mig_state.submitted, block_mig_state.read_done,
block_mig_state.transferred);
blk_mig_lock();
while ((blk = QSIMPLEQ_FIRST(&block_mig_state.blk_list)) != NULL) {
if (qemu_file_rate_limit(f)) {
break;
@@ -554,12 +464,9 @@ static int flush_blks(QEMUFile *f)
ret = blk->ret;
break;
}
+        blk_send(f, blk);
        QSIMPLEQ_REMOVE_HEAD(&block_mig_state.blk_list, entry);
-        blk_mig_unlock();
-        blk_send(f, blk);
-        blk_mig_lock();
g_free(blk->buf);
g_free(blk);
@@ -567,7 +474,6 @@ static int flush_blks(QEMUFile *f)
block_mig_state.transferred++;
assert(block_mig_state.read_done >= 0);
}
blk_mig_unlock();
DPRINTF("%s Exit submitted %d read_done %d transferred %d\n", __FUNCTION__,
block_mig_state.submitted, block_mig_state.read_done,
@@ -575,21 +481,43 @@ static int flush_blks(QEMUFile *f)
return ret;
}
/* Called with iothread lock taken. */
static int64_t get_remaining_dirty(void)
{
BlkMigDevState *bmds;
int64_t dirty = 0;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
-        dirty += bdrv_get_dirty_count(bmds->dirty_bitmap);
+        dirty += bdrv_get_dirty_count(bmds->bs);
}
-    return dirty << BDRV_SECTOR_BITS;
+    return dirty * BLOCK_SIZE;
}
/* Called with iothread lock taken. */
static int is_stage2_completed(void)
{
int64_t remaining_dirty;
long double bwidth;
if (block_mig_state.bulk_completed == 1) {
remaining_dirty = get_remaining_dirty();
if (remaining_dirty == 0) {
return 1;
}
bwidth = compute_read_bwidth();
if ((remaining_dirty / bwidth) <=
migrate_max_downtime()) {
/* finish stage2 because we think that we can finish remaining work
below max_downtime */
return 1;
}
}
return 0;
}
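The units work out because total_time is accumulated in nanoseconds: compute_read_bwidth() yields bytes per nanosecond, so remaining_dirty / bwidth is an estimated transfer time in nanoseconds, directly comparable with migrate_max_downtime(). For example, 128 MiB of remaining dirty data at a measured ~1 GiB/s (about 1.07 bytes/ns) estimates to roughly 125 ms, so stage 2 is declared complete only if max_downtime allows at least that.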
static void blk_mig_cleanup(void)
{
@@ -598,14 +526,12 @@ static void blk_mig_cleanup(void)
bdrv_drain_all();
-    unset_dirty_tracking();
+    set_dirty_tracking(0);
blk_mig_lock();
while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
-        bdrv_op_unblock_all(bmds->bs, bmds->blocker);
-        error_free(bmds->blocker);
-        bdrv_unref(bmds->bs);
+        bdrv_set_in_use(bmds->bs, 0);
+        drive_put_ref(drive_get_by_blockdev(bmds->bs));
g_free(bmds->aio_bitmap);
g_free(bmds);
}
@@ -615,7 +541,6 @@ static void blk_mig_cleanup(void)
g_free(blk->buf);
g_free(blk);
}
blk_mig_unlock();
}
static void block_migration_cancel(void *opaque)
@@ -630,91 +555,72 @@ static int block_save_setup(QEMUFile *f, void *opaque)
DPRINTF("Enter save live setup submitted %d transferred %d\n",
block_mig_state.submitted, block_mig_state.transferred);
qemu_mutex_lock_iothread();
init_blk_migration(f);
/* start track dirty blocks */
-    ret = set_dirty_tracking();
+    set_dirty_tracking(1);
ret = flush_blks(f);
if (ret) {
qemu_mutex_unlock_iothread();
blk_mig_cleanup();
return ret;
}
qemu_mutex_unlock_iothread();
ret = flush_blks(f);
blk_mig_reset_dirty_cursor();
qemu_put_be64(f, BLK_MIG_FLAG_EOS);
return ret;
return 0;
}
static int block_save_iterate(QEMUFile *f, void *opaque)
{
int ret;
int64_t last_ftell = qemu_ftell(f);
int64_t delta_ftell;
DPRINTF("Enter save live iterate submitted %d transferred %d\n",
block_mig_state.submitted, block_mig_state.transferred);
ret = flush_blks(f);
if (ret) {
blk_mig_cleanup();
return ret;
}
blk_mig_reset_dirty_cursor();
/* control the rate of transfer */
blk_mig_lock();
while ((block_mig_state.submitted +
block_mig_state.read_done) * BLOCK_SIZE <
qemu_file_get_rate_limit(f)) {
blk_mig_unlock();
if (block_mig_state.bulk_completed == 0) {
/* first finish the bulk phase */
if (blk_mig_save_bulked_block(f) == 0) {
/* finished saving bulk on all devices */
block_mig_state.bulk_completed = 1;
}
ret = 0;
} else {
/* Always called with iothread lock taken for
* simplicity, block_save_complete also calls it.
*/
qemu_mutex_lock_iothread();
ret = blk_mig_save_dirty_block(f, 1);
qemu_mutex_unlock_iothread();
}
if (ret < 0) {
return ret;
}
blk_mig_lock();
            if (ret != 0) {
                /* no more dirty blocks */
                break;
            }
}
}
blk_mig_unlock();
if (ret) {
blk_mig_cleanup();
return ret;
}
ret = flush_blks(f);
if (ret) {
blk_mig_cleanup();
return ret;
}
qemu_put_be64(f, BLK_MIG_FLAG_EOS);
delta_ftell = qemu_ftell(f) - last_ftell;
if (delta_ftell > 0) {
return 1;
} else if (delta_ftell < 0) {
return -1;
} else {
return 0;
}
}
/* Called with iothread lock taken. */
return is_stage2_completed();
}
static int block_save_complete(QEMUFile *f, void *opaque)
{
@@ -725,6 +631,7 @@ static int block_save_complete(QEMUFile *f, void *opaque)
ret = flush_blks(f);
if (ret) {
blk_mig_cleanup();
return ret;
}
@@ -732,17 +639,16 @@ static int block_save_complete(QEMUFile *f, void *opaque)
/* we know for sure that save bulk is completed and
all async read completed */
blk_mig_lock();
assert(block_mig_state.submitted == 0);
blk_mig_unlock();
do {
ret = blk_mig_save_dirty_block(f, 0);
if (ret < 0) {
return ret;
}
} while (ret == 0);
blk_mig_cleanup();
if (ret) {
return ret;
}
/* report completion */
qemu_put_be64(f, (100 << BDRV_SECTOR_BITS) | BLK_MIG_FLAG_PROGRESS);
@@ -750,32 +656,9 @@ static int block_save_complete(QEMUFile *f, void *opaque)
qemu_put_be64(f, BLK_MIG_FLAG_EOS);
blk_mig_cleanup();
return 0;
}
static uint64_t block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size)
{
/* Estimate pending number of bytes to send */
uint64_t pending;
qemu_mutex_lock_iothread();
blk_mig_lock();
pending = get_remaining_dirty() +
block_mig_state.submitted * BLOCK_SIZE +
block_mig_state.read_done * BLOCK_SIZE;
/* Report at least one block pending during bulk phase */
if (pending <= max_size && !block_mig_state.bulk_completed) {
pending = max_size + BLOCK_SIZE;
}
blk_mig_unlock();
qemu_mutex_unlock_iothread();
DPRINTF("Enter save live pending %" PRIu64 "\n", pending);
return pending;
}
static int block_load(QEMUFile *f, void *opaque, int version_id)
{
static int banner_printed;
@@ -783,7 +666,6 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
char device_name[256];
int64_t addr;
BlockDriverState *bs, *bs_prev = NULL;
BlockBackend *blk;
uint8_t *buf;
int64_t total_sectors = 0;
int nr_sectors;
@@ -801,17 +683,16 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
qemu_get_buffer(f, (uint8_t *)device_name, len);
device_name[len] = '\0';
-        blk = blk_by_name(device_name);
-        if (!blk) {
+        bs = bdrv_find(device_name);
+        if (!bs) {
fprintf(stderr, "Error unknown block device %s\n",
device_name);
return -EINVAL;
}
-        bs = blk_bs(blk);
if (bs != bs_prev) {
bs_prev = bs;
-            total_sectors = bdrv_nb_sectors(bs);
+            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
if (total_sectors <= 0) {
error_report("Error getting length of block device %s",
device_name);
@@ -825,16 +706,12 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
}
-            if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
-                ret = bdrv_write_zeroes(bs, addr, nr_sectors,
-                                        BDRV_REQ_MAY_UNMAP);
-            } else {
-                buf = g_malloc(BLOCK_SIZE);
-                qemu_get_buffer(f, buf, BLOCK_SIZE);
-                ret = bdrv_write(bs, addr, buf, nr_sectors);
-                g_free(buf);
-            }
+            buf = g_malloc(BLOCK_SIZE);
+            qemu_get_buffer(f, buf, BLOCK_SIZE);
+            ret = bdrv_write(bs, addr, buf, nr_sectors);
+            g_free(buf);
if (ret < 0) {
return ret;
}
@@ -847,7 +724,7 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
(addr == 100) ? '\n' : '\r');
fflush(stdout);
} else if (!(flags & BLK_MIG_FLAG_EOS)) {
fprintf(stderr, "Unknown block migration flags: %#x\n", flags);
fprintf(stderr, "Unknown flags\n");
return -EINVAL;
}
ret = qemu_file_get_error(f);
@@ -873,12 +750,11 @@ static bool block_is_active(void *opaque)
return block_mig_state.blk_enable == 1;
}
-static SaveVMHandlers savevm_block_handlers = {
+SaveVMHandlers savevm_block_handlers = {
.set_params = block_set_params,
.save_live_setup = block_save_setup,
.save_live_iterate = block_save_iterate,
.save_live_complete = block_save_complete,
.save_live_pending = block_save_pending,
.load_state = block_load,
.cancel = block_migration_cancel,
.is_active = block_is_active,
@@ -888,7 +764,6 @@ void blk_mig_init(void)
{
QSIMPLEQ_INIT(&block_mig_state.bmds_list);
QSIMPLEQ_INIT(&block_mig_state.blk_list);
qemu_mutex_init(&block_mig_state.lock);
register_savevm_live(NULL, "block", 0, 1, &savevm_block_handlers,
&block_mig_state);

block.c (5076 changed lines): diff suppressed because it is too large.

block.h (new file, 434 lines):

@@ -0,0 +1,434 @@
#ifndef BLOCK_H
#define BLOCK_H
#include "qemu-aio.h"
#include "qemu-common.h"
#include "qemu-option.h"
#include "qemu-coroutine.h"
#include "qobject.h"
#include "qapi-types.h"
/* block.c */
typedef struct BlockDriver BlockDriver;
typedef struct BlockJob BlockJob;
typedef struct BlockDriverInfo {
/* in bytes, 0 if irrelevant */
int cluster_size;
/* offset at which the VM state can be saved (0 if not possible) */
int64_t vm_state_offset;
bool is_dirty;
} BlockDriverInfo;
typedef struct BlockFragInfo {
uint64_t allocated_clusters;
uint64_t total_clusters;
uint64_t fragmented_clusters;
} BlockFragInfo;
typedef struct QEMUSnapshotInfo {
char id_str[128]; /* unique snapshot id */
/* the following fields are informative. They are not needed for
the consistency of the snapshot */
char name[256]; /* user chosen name */
uint64_t vm_state_size; /* VM state info size */
uint32_t date_sec; /* UTC date of the snapshot */
uint32_t date_nsec;
uint64_t vm_clock_nsec; /* VM clock relative to boot */
} QEMUSnapshotInfo;
/* Callbacks for block device models */
typedef struct BlockDevOps {
/*
* Runs when virtual media changed (monitor commands eject, change)
* Argument load is true on load and false on eject.
* Beware: doesn't run when a host device's physical media
* changes. Sure would be useful if it did.
* Device models with removable media must implement this callback.
*/
void (*change_media_cb)(void *opaque, bool load);
/*
* Runs when an eject request is issued from the monitor, the tray
* is closed, and the medium is locked.
* Device models that do not implement is_medium_locked will not need
* this callback. Device models that can lock the medium or tray might
* want to implement the callback and unlock the tray when "force" is
* true, even if they do not support eject requests.
*/
void (*eject_request_cb)(void *opaque, bool force);
/*
* Is the virtual tray open?
* Device models implement this only when the device has a tray.
*/
bool (*is_tray_open)(void *opaque);
/*
* Is the virtual medium locked into the device?
* Device models implement this only when device has such a lock.
*/
bool (*is_medium_locked)(void *opaque);
/*
* Runs when the size changed (e.g. monitor command block_resize)
*/
void (*resize_cb)(void *opaque);
} BlockDevOps;
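A minimal sketch of a device model wiring up one of these callbacks (MyDevState and the callback body are hypothetical, not from this tree):

static void mydev_change_media_cb(void *opaque, bool load)
{
    MyDevState *s = opaque;      /* hypothetical device state */
    s->media_present = load;     /* remember whether media is inserted */
}

static const BlockDevOps mydev_block_ops = {
    .change_media_cb = mydev_change_media_cb,
};

/* hooked up with: bdrv_set_dev_ops(bs, &mydev_block_ops, s); */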
#define BDRV_O_RDWR 0x0002
#define BDRV_O_SNAPSHOT 0x0008 /* open the file read only and save writes in a snapshot */
#define BDRV_O_NOCACHE 0x0020 /* do not use the host page cache */
#define BDRV_O_CACHE_WB 0x0040 /* use write-back caching */
#define BDRV_O_NATIVE_AIO 0x0080 /* use native AIO instead of the thread pool */
#define BDRV_O_NO_BACKING 0x0100 /* don't open the backing file */
#define BDRV_O_NO_FLUSH 0x0200 /* disable flushing on this disk */
#define BDRV_O_COPY_ON_READ 0x0400 /* copy read backing sectors into image */
#define BDRV_O_INCOMING 0x0800 /* consistency hint for incoming migration */
#define BDRV_O_CHECK 0x1000 /* open solely for consistency check */
#define BDRV_O_ALLOW_RDWR 0x2000 /* allow reopen to change from r/o to r/w */
#define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)
#define BDRV_SECTOR_BITS 9
#define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS)
#define BDRV_SECTOR_MASK ~(BDRV_SECTOR_SIZE - 1)
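These three derive from one another: BDRV_SECTOR_SIZE is 1ULL << 9 = 512 bytes, and offset & BDRV_SECTOR_MASK rounds a byte offset down to a 512-byte boundary (e.g. 1000 & ~511 = 512).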
typedef enum {
BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
} BlockErrorAction;
typedef QSIMPLEQ_HEAD(BlockReopenQueue, BlockReopenQueueEntry) BlockReopenQueue;
typedef struct BDRVReopenState {
BlockDriverState *bs;
int flags;
void *opaque;
} BDRVReopenState;
void bdrv_iostatus_enable(BlockDriverState *bs);
void bdrv_iostatus_reset(BlockDriverState *bs);
void bdrv_iostatus_disable(BlockDriverState *bs);
bool bdrv_iostatus_is_enabled(const BlockDriverState *bs);
void bdrv_iostatus_set_err(BlockDriverState *bs, int error);
void bdrv_info_print(Monitor *mon, const QObject *data);
void bdrv_info(Monitor *mon, QObject **ret_data);
void bdrv_stats_print(Monitor *mon, const QObject *data);
void bdrv_info_stats(Monitor *mon, QObject **ret_data);
/* disk I/O throttling */
void bdrv_io_limits_enable(BlockDriverState *bs);
void bdrv_io_limits_disable(BlockDriverState *bs);
bool bdrv_io_limits_enabled(BlockDriverState *bs);
void bdrv_init(void);
void bdrv_init_with_whitelist(void);
BlockDriver *bdrv_find_protocol(const char *filename);
BlockDriver *bdrv_find_format(const char *format_name);
BlockDriver *bdrv_find_whitelisted_format(const char *format_name);
int bdrv_create(BlockDriver *drv, const char* filename,
QEMUOptionParameter *options);
int bdrv_create_file(const char* filename, QEMUOptionParameter *options);
BlockDriverState *bdrv_new(const char *device_name);
void bdrv_make_anon(BlockDriverState *bs);
void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old);
void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top);
void bdrv_delete(BlockDriverState *bs);
int bdrv_parse_cache_flags(const char *mode, int *flags);
int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags);
int bdrv_open_backing_file(BlockDriverState *bs);
int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
BlockDriver *drv);
BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
BlockDriverState *bs, int flags);
int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp);
int bdrv_reopen(BlockDriverState *bs, int bdrv_flags, Error **errp);
int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
BlockReopenQueue *queue, Error **errp);
void bdrv_reopen_commit(BDRVReopenState *reopen_state);
void bdrv_reopen_abort(BDRVReopenState *reopen_state);
void bdrv_close(BlockDriverState *bs);
void bdrv_add_close_notifier(BlockDriverState *bs, Notifier *notify);
int bdrv_attach_dev(BlockDriverState *bs, void *dev);
void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev);
void bdrv_detach_dev(BlockDriverState *bs, void *dev);
void *bdrv_get_attached_dev(BlockDriverState *bs);
void bdrv_set_dev_ops(BlockDriverState *bs, const BlockDevOps *ops,
void *opaque);
void bdrv_dev_eject_request(BlockDriverState *bs, bool force);
bool bdrv_dev_has_removable_media(BlockDriverState *bs);
bool bdrv_dev_is_tray_open(BlockDriverState *bs);
bool bdrv_dev_is_medium_locked(BlockDriverState *bs);
int bdrv_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors);
int bdrv_read_unthrottled(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors);
int bdrv_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors);
int bdrv_pread(BlockDriverState *bs, int64_t offset,
void *buf, int count);
int bdrv_pwrite(BlockDriverState *bs, int64_t offset,
const void *buf, int count);
int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
const void *buf, int count);
int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
/*
* Efficiently zero a region of the disk image. Note that this is a regular
* I/O request like read or write and should have a reasonable size. This
* function is not suitable for zeroing the entire image in a single request
* because it may allocate memory for the entire region.
*/
int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
int nb_sectors);
int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, int *pnum);
int coroutine_fn bdrv_co_is_allocated_above(BlockDriverState *top,
BlockDriverState *base,
int64_t sector_num,
int nb_sectors, int *pnum);
BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
const char *backing_file);
int bdrv_get_backing_file_depth(BlockDriverState *bs);
int bdrv_truncate(BlockDriverState *bs, int64_t offset);
int64_t bdrv_getlength(BlockDriverState *bs);
int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
int bdrv_commit(BlockDriverState *bs);
int bdrv_commit_all(void);
int bdrv_change_backing_file(BlockDriverState *bs,
const char *backing_file, const char *backing_fmt);
void bdrv_register(BlockDriver *bdrv);
int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
BlockDriverState *base);
BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
BlockDriverState *bs);
BlockDriverState *bdrv_find_base(BlockDriverState *bs);
typedef struct BdrvCheckResult {
int corruptions;
int leaks;
int check_errors;
int corruptions_fixed;
int leaks_fixed;
BlockFragInfo bfi;
} BdrvCheckResult;
typedef enum {
BDRV_FIX_LEAKS = 1,
BDRV_FIX_ERRORS = 2,
} BdrvCheckMode;
int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);
/* async block I/O */
typedef void BlockDriverDirtyHandler(BlockDriverState *bs, int64_t sector,
int sector_num);
BlockDriverAIOCB *bdrv_aio_readv(BlockDriverState *bs, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque);
BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque);
BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque);
BlockDriverAIOCB *bdrv_aio_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque);
void bdrv_aio_cancel(BlockDriverAIOCB *acb);
typedef struct BlockRequest {
/* Fields to be filled by multiwrite caller */
int64_t sector;
int nb_sectors;
QEMUIOVector *qiov;
BlockDriverCompletionFunc *cb;
void *opaque;
/* Filled by multiwrite implementation */
int error;
} BlockRequest;
int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs,
int num_reqs);
/* sg packet commands */
int bdrv_ioctl(BlockDriverState *bs, unsigned long int req, void *buf);
BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
unsigned long int req, void *buf,
BlockDriverCompletionFunc *cb, void *opaque);
/* Invalidate any cached metadata used by image formats */
void bdrv_invalidate_cache(BlockDriverState *bs);
void bdrv_invalidate_cache_all(void);
void bdrv_clear_incoming_migration_all(void);
/* Ensure contents are flushed to disk. */
int bdrv_flush(BlockDriverState *bs);
int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
void bdrv_flush_all(void);
void bdrv_close_all(void);
void bdrv_drain_all(void);
int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
int bdrv_has_zero_init(BlockDriverState *bs);
int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
int *pnum);
void bdrv_set_on_error(BlockDriverState *bs, BlockdevOnError on_read_error,
BlockdevOnError on_write_error);
BlockdevOnError bdrv_get_on_error(BlockDriverState *bs, bool is_read);
BlockErrorAction bdrv_get_error_action(BlockDriverState *bs, bool is_read, int error);
void bdrv_error_action(BlockDriverState *bs, BlockErrorAction action,
bool is_read, int error);
int bdrv_is_read_only(BlockDriverState *bs);
int bdrv_is_sg(BlockDriverState *bs);
int bdrv_enable_write_cache(BlockDriverState *bs);
void bdrv_set_enable_write_cache(BlockDriverState *bs, bool wce);
int bdrv_is_inserted(BlockDriverState *bs);
int bdrv_media_changed(BlockDriverState *bs);
void bdrv_lock_medium(BlockDriverState *bs, bool locked);
void bdrv_eject(BlockDriverState *bs, bool eject_flag);
const char *bdrv_get_format_name(BlockDriverState *bs);
BlockDriverState *bdrv_find(const char *name);
BlockDriverState *bdrv_next(BlockDriverState *bs);
void bdrv_iterate(void (*it)(void *opaque, BlockDriverState *bs),
void *opaque);
int bdrv_is_encrypted(BlockDriverState *bs);
int bdrv_key_required(BlockDriverState *bs);
int bdrv_set_key(BlockDriverState *bs, const char *key);
int bdrv_query_missing_keys(void);
void bdrv_iterate_format(void (*it)(void *opaque, const char *name),
void *opaque);
const char *bdrv_get_device_name(BlockDriverState *bs);
int bdrv_get_flags(BlockDriverState *bs);
int bdrv_write_compressed(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors);
int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);
const char *bdrv_get_encrypted_filename(BlockDriverState *bs);
void bdrv_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);
void bdrv_get_full_backing_filename(BlockDriverState *bs,
char *dest, size_t sz);
BlockInfo *bdrv_query_info(BlockDriverState *s);
BlockStats *bdrv_query_stats(const BlockDriverState *bs);
int bdrv_can_snapshot(BlockDriverState *bs);
int bdrv_is_snapshot(BlockDriverState *bs);
BlockDriverState *bdrv_snapshots(void);
int bdrv_snapshot_create(BlockDriverState *bs,
QEMUSnapshotInfo *sn_info);
int bdrv_snapshot_goto(BlockDriverState *bs,
const char *snapshot_id);
int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id);
int bdrv_snapshot_list(BlockDriverState *bs,
QEMUSnapshotInfo **psn_info);
int bdrv_snapshot_load_tmp(BlockDriverState *bs,
const char *snapshot_name);
char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn);
char *get_human_readable_size(char *buf, int buf_size, int64_t size);
int path_is_absolute(const char *path);
void path_combine(char *dest, int dest_size,
const char *base_path,
const char *filename);
int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
int64_t pos, int size);
int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
int64_t pos, int size);
int bdrv_img_create(const char *filename, const char *fmt,
const char *base_filename, const char *base_fmt,
char *options, uint64_t img_size, int flags);
void bdrv_set_buffer_alignment(BlockDriverState *bs, int align);
void *qemu_blockalign(BlockDriverState *bs, size_t size);
#define BDRV_SECTORS_PER_DIRTY_CHUNK 2048
void bdrv_set_dirty_tracking(BlockDriverState *bs, int enable);
int bdrv_get_dirty(BlockDriverState *bs, int64_t sector);
void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
int64_t bdrv_get_next_dirty(BlockDriverState *bs, int64_t sector);
int64_t bdrv_get_dirty_count(BlockDriverState *bs);
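A sketch of how block migration (above) drives this dirty-tracking API (condensed; an open BlockDriverState *bs and a current sector are assumed):

int64_t remaining;

bdrv_set_dirty_tracking(bs, 1);                 /* start tracking writes */
if (bdrv_get_dirty(bs, sector)) {               /* chunk touched since last send? */
    /* ... retransmit the chunk, then clear its bits ... */
    bdrv_reset_dirty(bs, sector, BDRV_SECTORS_PER_DIRTY_CHUNK);
}
remaining = bdrv_get_dirty_count(bs);           /* dirty chunks still pending */
bdrv_set_dirty_tracking(bs, 0);                 /* stop tracking on cleanup */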
void bdrv_enable_copy_on_read(BlockDriverState *bs);
void bdrv_disable_copy_on_read(BlockDriverState *bs);
void bdrv_set_in_use(BlockDriverState *bs, int in_use);
int bdrv_in_use(BlockDriverState *bs);
enum BlockAcctType {
BDRV_ACCT_READ,
BDRV_ACCT_WRITE,
BDRV_ACCT_FLUSH,
BDRV_MAX_IOTYPE,
};
typedef struct BlockAcctCookie {
int64_t bytes;
int64_t start_time_ns;
enum BlockAcctType type;
} BlockAcctCookie;
void bdrv_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie,
int64_t bytes, enum BlockAcctType type);
void bdrv_acct_done(BlockDriverState *bs, BlockAcctCookie *cookie);
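Typical use in a device model's I/O path (a sketch; bs, buf and the request geometry are assumed):

BlockAcctCookie cookie;
int ret;

bdrv_acct_start(bs, &cookie, nb_sectors * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
ret = bdrv_read(bs, sector_num, buf, nb_sectors);
bdrv_acct_done(bs, &cookie);   /* credits bytes, op count and latency to bs */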
typedef enum {
BLKDBG_L1_UPDATE,
BLKDBG_L1_GROW_ALLOC_TABLE,
BLKDBG_L1_GROW_WRITE_TABLE,
BLKDBG_L1_GROW_ACTIVATE_TABLE,
BLKDBG_L2_LOAD,
BLKDBG_L2_UPDATE,
BLKDBG_L2_UPDATE_COMPRESSED,
BLKDBG_L2_ALLOC_COW_READ,
BLKDBG_L2_ALLOC_WRITE,
BLKDBG_READ_AIO,
BLKDBG_READ_BACKING_AIO,
BLKDBG_READ_COMPRESSED,
BLKDBG_WRITE_AIO,
BLKDBG_WRITE_COMPRESSED,
BLKDBG_VMSTATE_LOAD,
BLKDBG_VMSTATE_SAVE,
BLKDBG_COW_READ,
BLKDBG_COW_WRITE,
BLKDBG_REFTABLE_LOAD,
BLKDBG_REFTABLE_GROW,
BLKDBG_REFBLOCK_LOAD,
BLKDBG_REFBLOCK_UPDATE,
BLKDBG_REFBLOCK_UPDATE_PART,
BLKDBG_REFBLOCK_ALLOC,
BLKDBG_REFBLOCK_ALLOC_HOOKUP,
BLKDBG_REFBLOCK_ALLOC_WRITE,
BLKDBG_REFBLOCK_ALLOC_WRITE_BLOCKS,
BLKDBG_REFBLOCK_ALLOC_WRITE_TABLE,
BLKDBG_REFBLOCK_ALLOC_SWITCH_TABLE,
BLKDBG_CLUSTER_ALLOC,
BLKDBG_CLUSTER_ALLOC_BYTES,
BLKDBG_CLUSTER_FREE,
BLKDBG_EVENT_MAX,
} BlkDebugEvent;
#define BLKDBG_EVENT(bs, evt) bdrv_debug_event(bs, evt)
void bdrv_debug_event(BlockDriverState *bs, BlkDebugEvent event);
#endif


@@ -1,43 +1,20 @@
-block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o
+block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
block-obj-$(CONFIG_QUORUM) += quorum.o
block-obj-y += parallels.o blkdebug.o blkverify.o
block-obj-y += block-backend.o snapshot.o qapi.o
block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += raw-posix.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
-block-obj-y += null.o mirror.o io.o
-block-obj-y += nbd.o nbd-client.o sheepdog.o
+ifeq ($(CONFIG_POSIX),y)
+block-obj-y += nbd.o sheepdog.o
block-obj-$(CONFIG_LIBISCSI) += iscsi.o
block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
block-obj-y += accounting.o
block-obj-y += write-threshold.o
endif
common-obj-y += stream.o
common-obj-y += commit.o
common-obj-y += backup.o
iscsi.o-cflags := $(LIBISCSI_CFLAGS)
iscsi.o-libs := $(LIBISCSI_LIBS)
curl.o-cflags := $(CURL_CFLAGS)
curl.o-libs := $(CURL_LIBS)
rbd.o-cflags := $(RBD_CFLAGS)
rbd.o-libs := $(RBD_LIBS)
gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
archipelago.o-libs := $(ARCHIPELAGO_LIBS)
block-obj-m += dmg.o
dmg.o-libs := $(BZIP2_LIBS)
qcow.o-libs := -lz
linux-aio.o-libs := -laio
common-obj-y += mirror.o


@@ -1,63 +0,0 @@
/*
* QEMU System Emulator block accounting
*
* Copyright (c) 2011 Christoph Hellwig
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "block/accounting.h"
#include "block/block_int.h"
#include "qemu/timer.h"
void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
int64_t bytes, enum BlockAcctType type)
{
assert(type < BLOCK_MAX_IOTYPE);
cookie->bytes = bytes;
cookie->start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
cookie->type = type;
}
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
assert(cookie->type < BLOCK_MAX_IOTYPE);
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
stats->total_time_ns[cookie->type] +=
qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - cookie->start_time_ns;
}
void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num,
unsigned int nb_sectors)
{
if (stats->wr_highest_sector < sector_num + nb_sectors - 1) {
stats->wr_highest_sector = sector_num + nb_sectors - 1;
}
}
void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
int num_requests)
{
assert(type < BLOCK_MAX_IOTYPE);
stats->merged[type] += num_requests;
}


@@ -1,548 +0,0 @@
/*
* QEMU backup
*
* Copyright (C) 2013 Proxmox Server Solutions
*
* Authors:
* Dietmar Maurer (dietmar@proxmox.com)
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include "trace.h"
#include "block/block.h"
#include "block/block_int.h"
#include "block/blockjob.h"
#include "qemu/ratelimit.h"
#define BACKUP_CLUSTER_BITS 16
#define BACKUP_CLUSTER_SIZE (1 << BACKUP_CLUSTER_BITS)
#define BACKUP_SECTORS_PER_CLUSTER (BACKUP_CLUSTER_SIZE / BDRV_SECTOR_SIZE)
#define SLICE_TIME 100000000ULL /* ns */
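In concrete numbers: a backup cluster is 1 << 16 = 64 KiB, i.e. 65536 / 512 = 128 sectors per cluster, and SLICE_TIME is 100 ms expressed in nanoseconds.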
typedef struct CowRequest {
int64_t start;
int64_t end;
QLIST_ENTRY(CowRequest) list;
CoQueue wait_queue; /* coroutines blocked on this request */
} CowRequest;
typedef struct BackupBlockJob {
BlockJob common;
BlockDriverState *target;
/* bitmap for sync=dirty-bitmap */
BdrvDirtyBitmap *sync_bitmap;
MirrorSyncMode sync_mode;
RateLimit limit;
BlockdevOnError on_source_error;
BlockdevOnError on_target_error;
CoRwlock flush_rwlock;
uint64_t sectors_read;
HBitmap *bitmap;
QLIST_HEAD(, CowRequest) inflight_reqs;
} BackupBlockJob;
/* See if in-flight requests overlap and wait for them to complete */
static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
int64_t start,
int64_t end)
{
CowRequest *req;
bool retry;
do {
retry = false;
QLIST_FOREACH(req, &job->inflight_reqs, list) {
if (end > req->start && start < req->end) {
qemu_co_queue_wait(&req->wait_queue);
retry = true;
break;
}
}
} while (retry);
}
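The test end > req->start && start < req->end is the standard half-open interval overlap check: a request covering clusters [4, 8) conflicts with an in-flight [6, 10) (since 8 > 6 and 4 < 10) but not with [8, 12), so only true intersections block.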
/* Keep track of an in-flight request */
static void cow_request_begin(CowRequest *req, BackupBlockJob *job,
int64_t start, int64_t end)
{
req->start = start;
req->end = end;
qemu_co_queue_init(&req->wait_queue);
QLIST_INSERT_HEAD(&job->inflight_reqs, req, list);
}
/* Forget about a completed request */
static void cow_request_end(CowRequest *req)
{
QLIST_REMOVE(req, list);
qemu_co_queue_restart_all(&req->wait_queue);
}
static int coroutine_fn backup_do_cow(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
bool *error_is_read)
{
BackupBlockJob *job = (BackupBlockJob *)bs->job;
CowRequest cow_request;
struct iovec iov;
QEMUIOVector bounce_qiov;
void *bounce_buffer = NULL;
int ret = 0;
int64_t start, end;
int n;
qemu_co_rwlock_rdlock(&job->flush_rwlock);
start = sector_num / BACKUP_SECTORS_PER_CLUSTER;
end = DIV_ROUND_UP(sector_num + nb_sectors, BACKUP_SECTORS_PER_CLUSTER);
trace_backup_do_cow_enter(job, start, sector_num, nb_sectors);
wait_for_overlapping_requests(job, start, end);
cow_request_begin(&cow_request, job, start, end);
for (; start < end; start++) {
if (hbitmap_get(job->bitmap, start)) {
trace_backup_do_cow_skip(job, start);
continue; /* already copied */
}
trace_backup_do_cow_process(job, start);
n = MIN(BACKUP_SECTORS_PER_CLUSTER,
job->common.len / BDRV_SECTOR_SIZE -
start * BACKUP_SECTORS_PER_CLUSTER);
if (!bounce_buffer) {
bounce_buffer = qemu_blockalign(bs, BACKUP_CLUSTER_SIZE);
}
iov.iov_base = bounce_buffer;
iov.iov_len = n * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&bounce_qiov, &iov, 1);
ret = bdrv_co_readv(bs, start * BACKUP_SECTORS_PER_CLUSTER, n,
&bounce_qiov);
if (ret < 0) {
trace_backup_do_cow_read_fail(job, start, ret);
if (error_is_read) {
*error_is_read = true;
}
goto out;
}
if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
ret = bdrv_co_write_zeroes(job->target,
start * BACKUP_SECTORS_PER_CLUSTER,
n, BDRV_REQ_MAY_UNMAP);
} else {
ret = bdrv_co_writev(job->target,
start * BACKUP_SECTORS_PER_CLUSTER, n,
&bounce_qiov);
}
if (ret < 0) {
trace_backup_do_cow_write_fail(job, start, ret);
if (error_is_read) {
*error_is_read = false;
}
goto out;
}
hbitmap_set(job->bitmap, start, 1);
/* Publish progress, guest I/O counts as progress too. Note that the
* offset field is an opaque progress value, it is not a disk offset.
*/
job->sectors_read += n;
job->common.offset += n * BDRV_SECTOR_SIZE;
}
out:
if (bounce_buffer) {
qemu_vfree(bounce_buffer);
}
cow_request_end(&cow_request);
trace_backup_do_cow_return(job, sector_num, nb_sectors, ret);
qemu_co_rwlock_unlock(&job->flush_rwlock);
return ret;
}
static int coroutine_fn backup_before_write_notify(
NotifierWithReturn *notifier,
void *opaque)
{
BdrvTrackedRequest *req = opaque;
int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
return backup_do_cow(req->bs, sector_num, nb_sectors, NULL);
}
static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
if (speed < 0) {
error_set(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static void backup_iostatus_reset(BlockJob *job)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
bdrv_iostatus_reset(s->target);
}
static const BlockJobDriver backup_job_driver = {
.instance_size = sizeof(BackupBlockJob),
.job_type = BLOCK_JOB_TYPE_BACKUP,
.set_speed = backup_set_speed,
.iostatus_reset = backup_iostatus_reset,
};
static BlockErrorAction backup_error_action(BackupBlockJob *job,
bool read, int error)
{
if (read) {
return block_job_error_action(&job->common, job->common.bs,
job->on_source_error, true, error);
} else {
return block_job_error_action(&job->common, job->target,
job->on_target_error, false, error);
}
}
typedef struct {
int ret;
} BackupCompleteData;
static void backup_complete(BlockJob *job, void *opaque)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
BackupCompleteData *data = opaque;
bdrv_unref(s->target);
block_job_completed(job, data->ret);
g_free(data);
}
static bool coroutine_fn yield_and_check(BackupBlockJob *job)
{
if (block_job_is_cancelled(&job->common)) {
return true;
}
    /* We need to yield so that bdrv_drain_all() returns.
     * (Without it, the VM does not reboot.)
     */
if (job->common.speed) {
uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
job->sectors_read);
job->sectors_read = 0;
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
} else {
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
}
if (block_job_is_cancelled(&job->common)) {
return true;
}
return false;
}
static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
{
bool error_is_read;
int ret = 0;
int clusters_per_iter;
uint32_t granularity;
int64_t sector;
int64_t cluster;
int64_t end;
int64_t last_cluster = -1;
BlockDriverState *bs = job->common.bs;
HBitmapIter hbi;
granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
/* Find the next dirty sector(s) */
while ((sector = hbitmap_iter_next(&hbi)) != -1) {
cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
/* Fake progress updates for any clusters we skipped */
if (cluster != last_cluster + 1) {
job->common.offset += ((cluster - last_cluster - 1) *
BACKUP_CLUSTER_SIZE);
}
for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
do {
if (yield_and_check(job)) {
return ret;
}
ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
if ((ret < 0) &&
backup_error_action(job, error_is_read, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
return ret;
}
} while (ret < 0);
}
/* If the bitmap granularity is smaller than the backup granularity,
* we need to advance the iterator pointer to the next cluster. */
if (granularity < BACKUP_CLUSTER_SIZE) {
bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
}
last_cluster = cluster - 1;
}
/* Play some final catchup with the progress meter */
end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
if (last_cluster + 1 < end) {
job->common.offset += ((end - last_cluster - 1) * BACKUP_CLUSTER_SIZE);
}
return ret;
}
static void coroutine_fn backup_run(void *opaque)
{
BackupBlockJob *job = opaque;
BackupCompleteData *data;
BlockDriverState *bs = job->common.bs;
BlockDriverState *target = job->target;
BlockdevOnError on_target_error = job->on_target_error;
NotifierWithReturn before_write = {
.notify = backup_before_write_notify,
};
int64_t start, end;
int ret = 0;
QLIST_INIT(&job->inflight_reqs);
qemu_co_rwlock_init(&job->flush_rwlock);
start = 0;
end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
job->bitmap = hbitmap_alloc(end, 0);
bdrv_set_enable_write_cache(target, true);
bdrv_set_on_error(target, on_target_error, on_target_error);
bdrv_iostatus_enable(target);
bdrv_add_before_write_notifier(bs, &before_write);
if (job->sync_mode == MIRROR_SYNC_MODE_NONE) {
while (!block_job_is_cancelled(&job->common)) {
/* Yield until the job is cancelled. We just let our before_write
* notify callback service CoW requests. */
job->common.busy = false;
qemu_coroutine_yield();
job->common.busy = true;
}
} else if (job->sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
ret = backup_run_incremental(job);
} else {
        /* Both FULL and TOP sync modes require copying. */
for (; start < end; start++) {
bool error_is_read;
if (yield_and_check(job)) {
break;
}
if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
int i, n;
int alloced = 0;
/* Check to see if these blocks are already in the
* backing file. */
for (i = 0; i < BACKUP_SECTORS_PER_CLUSTER;) {
/* bdrv_is_allocated() only returns true/false based
* on the first set of sectors it comes across that
* are all in the same state.
* For that reason we must verify each sector in the
* backup cluster length. We end up copying more than
* needed but at some point that is always the case. */
alloced =
bdrv_is_allocated(bs,
start * BACKUP_SECTORS_PER_CLUSTER + i,
BACKUP_SECTORS_PER_CLUSTER - i, &n);
i += n;
if (alloced == 1 || n == 0) {
break;
}
}
/* If the above loop never found any sectors that are in
* the topmost image, skip this backup. */
if (alloced == 0) {
continue;
}
}
            /* In FULL sync mode we copy the whole drive. */
ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER,
BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
if (ret < 0) {
/* Depending on error action, fail now or retry cluster */
BlockErrorAction action =
backup_error_action(job, error_is_read, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT) {
break;
} else {
start--;
continue;
}
}
}
}
notifier_with_return_remove(&before_write);
/* wait until pending backup_do_cow() calls have completed */
qemu_co_rwlock_wrlock(&job->flush_rwlock);
qemu_co_rwlock_unlock(&job->flush_rwlock);
if (job->sync_bitmap) {
BdrvDirtyBitmap *bm;
if (ret < 0) {
/* Merge the successor back into the parent, delete nothing. */
bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
assert(bm);
} else {
/* Everything is fine, delete this bitmap and install the backup. */
bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
assert(bm);
}
}
hbitmap_free(job->bitmap);
bdrv_iostatus_disable(target);
bdrv_op_unblock_all(target, job->common.blocker);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&job->common, backup_complete, data);
}
void backup_start(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, MirrorSyncMode sync_mode,
BdrvDirtyBitmap *sync_bitmap,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockCompletionFunc *cb, void *opaque,
Error **errp)
{
int64_t len;
assert(bs);
assert(target);
assert(cb);
if (bs == target) {
error_setg(errp, "Source and target cannot be the same");
return;
}
if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
error_set(errp, QERR_INVALID_PARAMETER, "on-source-error");
return;
}
if (!bdrv_is_inserted(bs)) {
error_setg(errp, "Device is not inserted: %s",
bdrv_get_device_name(bs));
return;
}
if (!bdrv_is_inserted(target)) {
error_setg(errp, "Device is not inserted: %s",
bdrv_get_device_name(target));
return;
}
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
return;
}
if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) {
return;
}
if (sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
if (!sync_bitmap) {
error_setg(errp, "must provide a valid bitmap name for "
"\"dirty-bitmap\" sync mode");
return;
}
/* Create a new bitmap, and freeze/disable this one. */
if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
return;
}
} else if (sync_bitmap) {
error_setg(errp,
"a sync_bitmap was provided to backup_run, "
"but received an incompatible sync_mode (%s)",
MirrorSyncMode_lookup[sync_mode]);
return;
}
len = bdrv_getlength(bs);
if (len < 0) {
error_setg_errno(errp, -len, "unable to get length for '%s'",
bdrv_get_device_name(bs));
goto error;
}
BackupBlockJob *job = block_job_create(&backup_job_driver, bs, speed,
cb, opaque, errp);
if (!job) {
goto error;
}
bdrv_op_block_all(target, job->common.blocker);
job->on_source_error = on_source_error;
job->on_target_error = on_target_error;
job->target = target;
job->sync_mode = sync_mode;
job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
sync_bitmap : NULL;
job->common.len = len;
job->common.co = qemu_coroutine_create(backup_run);
qemu_coroutine_enter(job->common.co, job);
return;
error:
if (sync_bitmap) {
bdrv_reclaim_dirty_bitmap(bs, sync_bitmap, NULL);
}
}
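For reference, a minimal caller sketch (hypothetical, not part of this series; error handling trimmed, assuming error_report_err() from qemu/error-report.h) showing how backup_start() might be invoked for a full backup:

/* Hypothetical caller: start a FULL backup of bs into target.
 * my_backup_done and start_full_backup are illustrative names only. */
static void my_backup_done(void *opaque, int ret)
{
    fprintf(stderr, "backup finished: ret=%d\n", ret);
}

static void start_full_backup(BlockDriverState *bs, BlockDriverState *target)
{
    Error *err = NULL;
    backup_start(bs, target, 0 /* speed: unlimited */, MIRROR_SYNC_MODE_FULL,
                 NULL /* no sync bitmap */, BLOCKDEV_ON_ERROR_REPORT,
                 BLOCKDEV_ON_ERROR_REPORT, my_backup_done, NULL, &err);
    if (err) {
        error_report_err(err);
    }
}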


@@ -23,43 +23,32 @@
*/
#include "qemu-common.h"
#include "qemu/config-file.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include "block_int.h"
#include "module.h"
typedef struct BDRVBlkdebugState {
int state;
int new_state;
QLIST_HEAD(, BlkdebugRule) rules[BLKDBG_EVENT_MAX];
QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
QLIST_HEAD(, BlkdebugSuspendedReq) suspended_reqs;
} BDRVBlkdebugState;
typedef struct BlkdebugAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
int ret;
} BlkdebugAIOCB;
typedef struct BlkdebugSuspendedReq {
Coroutine *co;
char *tag;
QLIST_ENTRY(BlkdebugSuspendedReq) next;
} BlkdebugSuspendedReq;
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb);
static const AIOCBInfo blkdebug_aiocb_info = {
.aiocb_size = sizeof(BlkdebugAIOCB),
.aiocb_size = sizeof(BlkdebugAIOCB),
.cancel = blkdebug_aio_cancel,
};
enum {
ACTION_INJECT_ERROR,
ACTION_SET_STATE,
ACTION_SUSPEND,
};
typedef struct BlkdebugRule {
@@ -76,9 +65,6 @@ typedef struct BlkdebugRule {
struct {
int new_state;
} set_state;
struct {
char *tag;
} suspend;
} options;
QLIST_ENTRY(BlkdebugRule) next;
QSIMPLEQ_ENTRY(BlkdebugRule) active_next;
@@ -169,7 +155,6 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
[BLKDBG_REFTABLE_LOAD] = "reftable_load",
[BLKDBG_REFTABLE_GROW] = "reftable_grow",
[BLKDBG_REFTABLE_UPDATE] = "reftable_update",
[BLKDBG_REFBLOCK_LOAD] = "refblock_load",
[BLKDBG_REFBLOCK_UPDATE] = "refblock_update",
@@ -184,19 +169,6 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
[BLKDBG_CLUSTER_ALLOC] = "cluster_alloc",
[BLKDBG_CLUSTER_ALLOC_BYTES] = "cluster_alloc_bytes",
[BLKDBG_CLUSTER_FREE] = "cluster_free",
[BLKDBG_FLUSH_TO_OS] = "flush_to_os",
[BLKDBG_FLUSH_TO_DISK] = "flush_to_disk",
[BLKDBG_PWRITEV_RMW_HEAD] = "pwritev_rmw.head",
[BLKDBG_PWRITEV_RMW_AFTER_HEAD] = "pwritev_rmw.after_head",
[BLKDBG_PWRITEV_RMW_TAIL] = "pwritev_rmw.tail",
[BLKDBG_PWRITEV_RMW_AFTER_TAIL] = "pwritev_rmw.after_tail",
[BLKDBG_PWRITEV] = "pwritev",
[BLKDBG_PWRITEV_ZERO] = "pwritev_zero",
[BLKDBG_PWRITEV_DONE] = "pwritev_done",
[BLKDBG_EMPTY_IMAGE_PREPARE] = "empty_image_prepare",
};
static int get_event_by_name(const char *name, BlkDebugEvent *event)
@@ -216,7 +188,6 @@ static int get_event_by_name(const char *name, BlkDebugEvent *event)
struct add_rule_data {
BDRVBlkdebugState *s;
int action;
Error **errp;
};
static int add_rule(QemuOpts *opts, void *opaque)
@@ -229,11 +200,7 @@ static int add_rule(QemuOpts *opts, void *opaque)
/* Find the right event for the rule */
event_name = qemu_opt_get(opts, "event");
if (!event_name) {
error_setg(d->errp, "Missing event name for rule");
return -1;
} else if (get_event_by_name(event_name, &event) < 0) {
error_setg(d->errp, "Invalid event name \"%s\"", event_name);
if (!event_name || get_event_by_name(event_name, &event) < 0) {
return -1;
}
@@ -259,11 +226,6 @@ static int add_rule(QemuOpts *opts, void *opaque)
rule->options.set_state.new_state =
qemu_opt_get_number(opts, "new_state", 0);
break;
case ACTION_SUSPEND:
rule->options.suspend.tag =
g_strdup(qemu_opt_get(opts, "tag"));
break;
};
/* Add the rule */
@@ -272,189 +234,75 @@ static int add_rule(QemuOpts *opts, void *opaque)
return 0;
}
static void remove_rule(BlkdebugRule *rule)
static int read_config(BDRVBlkdebugState *s, const char *filename)
{
switch (rule->action) {
case ACTION_INJECT_ERROR:
case ACTION_SET_STATE:
break;
case ACTION_SUSPEND:
g_free(rule->options.suspend.tag);
break;
}
QLIST_REMOVE(rule, next);
g_free(rule);
}
static int read_config(BDRVBlkdebugState *s, const char *filename,
QDict *options, Error **errp)
{
FILE *f = NULL;
FILE *f;
int ret;
struct add_rule_data d;
Error *local_err = NULL;
if (filename) {
f = fopen(filename, "r");
if (f == NULL) {
error_setg_errno(errp, errno, "Could not read blkdebug config file");
return -errno;
}
ret = qemu_config_parse(f, config_groups, filename);
if (ret < 0) {
error_setg(errp, "Could not parse blkdebug config file");
ret = -EINVAL;
goto fail;
}
f = fopen(filename, "r");
if (f == NULL) {
return -errno;
}
qemu_config_parse_qdict(options, config_groups, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
ret = qemu_config_parse(f, config_groups, filename);
if (ret < 0) {
goto fail;
}
d.s = s;
d.action = ACTION_INJECT_ERROR;
d.errp = &local_err;
qemu_opts_foreach(&inject_error_opts, add_rule, &d, 1);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
qemu_opts_foreach(&inject_error_opts, add_rule, &d, 0);
d.action = ACTION_SET_STATE;
qemu_opts_foreach(&set_state_opts, add_rule, &d, 1);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
qemu_opts_foreach(&set_state_opts, add_rule, &d, 0);
ret = 0;
fail:
qemu_opts_reset(&inject_error_opts);
qemu_opts_reset(&set_state_opts);
if (f) {
fclose(f);
}
fclose(f);
return ret;
}
/* Valid blkdebug filenames look like blkdebug:path/to/config:path/to/image */
static void blkdebug_parse_filename(const char *filename, QDict *options,
Error **errp)
{
const char *c;
/* Parse the blkdebug: prefix */
if (!strstart(filename, "blkdebug:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put(options, "x-image", qstring_from_str(filename));
return;
}
/* Parse config file path */
c = strchr(filename, ':');
if (c == NULL) {
error_setg(errp, "blkdebug requires both config file and image path");
return;
}
if (c != filename) {
QString *config_path;
config_path = qstring_from_substr(filename, 0, c - filename - 1);
qdict_put(options, "config", config_path);
}
/* TODO Allow multi-level nesting and set file.filename here */
filename = c + 1;
qdict_put(options, "x-image", qstring_from_str(filename));
}
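/* Example rule file for the "config" path parsed above (assumed contents,
 * shown for illustration; syntax per docs/blkdebug.txt):
 *
 *   [inject-error]
 *   event = "l2_load"
 *   errno = "5"
 *   once = "on"
 *
 * With a matching command line such as
 *   -drive file=blkdebug:/path/to/rules.conf:test.qcow2
 * the first qcow2 L2 table load fails with EIO. */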
static QemuOptsList runtime_opts = {
.name = "blkdebug",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "config",
.type = QEMU_OPT_STRING,
.help = "Path to the configuration file",
},
{
.name = "x-image",
.type = QEMU_OPT_STRING,
.help = "[internal use only, will be removed]",
},
{
.name = "align",
.type = QEMU_OPT_SIZE,
.help = "Required alignment in bytes",
},
{ /* end of list */ }
},
};
static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static int blkdebug_open(BlockDriverState *bs, const char *filename, int flags)
{
BDRVBlkdebugState *s = bs->opaque;
QemuOpts *opts;
Error *local_err = NULL;
const char *config;
uint64_t align;
int ret;
char *config, *c;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
/* Parse the blkdebug: prefix */
if (strncmp(filename, "blkdebug:", strlen("blkdebug:"))) {
return -EINVAL;
}
filename += strlen("blkdebug:");
/* Read rules from config file */
c = strchr(filename, ':');
if (c == NULL) {
return -EINVAL;
}
/* Read rules from config file or command line options */
config = qemu_opt_get(opts, "config");
ret = read_config(s, config, options, errp);
if (ret) {
goto out;
config = g_strdup(filename);
config[c - filename] = '\0';
ret = read_config(s, config);
g_free(config);
if (ret < 0) {
return ret;
}
filename = c + 1;
/* Set initial state */
s->state = 1;
/* Open the backing file */
assert(bs->file == NULL);
ret = bdrv_open_image(&bs->file, qemu_opt_get(opts, "x-image"), options, "image",
flags | BDRV_O_PROTOCOL, false, &local_err);
ret = bdrv_file_open(&bs->file, filename, flags);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
return ret;
}
/* Set request alignment */
align = qemu_opt_get_size(opts, "align", bs->request_alignment);
if (align > 0 && align < INT_MAX && !(align & (align - 1))) {
bs->request_alignment = align;
} else {
error_setg(errp, "Invalid alignment");
ret = -EINVAL;
goto fail_unref;
}
ret = 0;
goto out;
fail_unref:
bdrv_unref(bs->file);
out:
qemu_opts_del(opts);
return ret;
return 0;
}
static void error_callback_bh(void *opaque)
@@ -462,40 +310,44 @@ static void error_callback_bh(void *opaque)
struct BlkdebugAIOCB *acb = opaque;
qemu_bh_delete(acb->bh);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
qemu_aio_release(acb);
}
static BlockAIOCB *inject_error(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkdebugAIOCB *acb = container_of(blockacb, BlkdebugAIOCB, common);
qemu_aio_release(acb);
}
static BlockDriverAIOCB *inject_error(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
{
BDRVBlkdebugState *s = bs->opaque;
int error = rule->options.inject.error;
struct BlkdebugAIOCB *acb;
QEMUBH *bh;
bool immediately = rule->options.inject.immediately;
if (rule->options.inject.once) {
QSIMPLEQ_REMOVE(&s->active_rules, rule, BlkdebugRule, active_next);
remove_rule(rule);
QSIMPLEQ_INIT(&s->active_rules);
}
if (immediately) {
if (rule->options.inject.immediately) {
return NULL;
}
acb = qemu_aio_get(&blkdebug_aiocb_info, bs, cb, opaque);
acb->ret = -error;
bh = aio_bh_new(bdrv_get_aio_context(bs), error_callback_bh, acb);
bh = qemu_bh_new(error_callback_bh, acb);
acb->bh = bh;
qemu_bh_schedule(bh);
return &acb->common;
}
static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
@@ -515,9 +367,9 @@ static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
return bdrv_aio_readv(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
static BlockDriverAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
@@ -537,26 +389,6 @@ static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.sector == -1) {
break;
}
}
if (rule && rule->options.inject.error) {
return inject_error(bs, cb, opaque, rule);
}
return bdrv_aio_flush(bs->file, cb, opaque);
}
static void blkdebug_close(BlockDriverState *bs)
{
BDRVBlkdebugState *s = bs->opaque;
@@ -565,32 +397,12 @@ static void blkdebug_close(BlockDriverState *bs)
for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
remove_rule(rule);
QLIST_REMOVE(rule, next);
g_free(rule);
}
}
}
static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq r;
r = (BlkdebugSuspendedReq) {
.co = qemu_coroutine_self(),
.tag = g_strdup(rule->options.suspend.tag),
};
remove_rule(rule);
QLIST_INSERT_HEAD(&s->suspended_reqs, &r, next);
printf("blkdebug: Suspended request '%s'\n", r.tag);
qemu_coroutine_yield();
printf("blkdebug: Resuming request '%s'\n", r.tag);
QLIST_REMOVE(&r, next);
g_free(r.tag);
}
static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
bool injected)
{
@@ -614,10 +426,6 @@ static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
case ACTION_SET_STATE:
s->new_state = rule->options.set_state.new_state;
break;
case ACTION_SUSPEND:
suspend_request(bs, rule);
break;
}
return injected;
}
@@ -625,178 +433,38 @@ static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
static void blkdebug_debug_event(BlockDriverState *bs, BlkDebugEvent event)
{
BDRVBlkdebugState *s = bs->opaque;
struct BlkdebugRule *rule, *next;
struct BlkdebugRule *rule;
bool injected;
assert((int)event >= 0 && event < BLKDBG_EVENT_MAX);
injected = false;
s->new_state = s->state;
QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
QLIST_FOREACH(rule, &s->rules[event], next) {
injected = process_rule(bs, rule, injected);
}
s->state = s->new_state;
}
static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
struct BlkdebugRule *rule;
BlkDebugEvent blkdebug_event;
if (get_event_by_name(event, &blkdebug_event) < 0) {
return -ENOENT;
}
rule = g_malloc(sizeof(*rule));
*rule = (struct BlkdebugRule) {
.event = blkdebug_event,
.action = ACTION_SUSPEND,
.state = 0,
.options.suspend.tag = g_strdup(tag),
};
QLIST_INSERT_HEAD(&s->rules[blkdebug_event], rule, next);
return 0;
}
static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq *r, *next;
QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, next) {
if (!strcmp(r->tag, tag)) {
qemu_coroutine_enter(r->co, NULL);
return 0;
}
}
return -ENOENT;
}
static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq *r, *r_next;
BlkdebugRule *rule, *next;
int i, ret = -ENOENT;
for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
if (rule->action == ACTION_SUSPEND &&
!strcmp(rule->options.suspend.tag, tag)) {
remove_rule(rule);
ret = 0;
}
}
}
QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, r_next) {
if (!strcmp(r->tag, tag)) {
qemu_coroutine_enter(r->co, NULL);
ret = 0;
}
}
return ret;
}
static bool blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq *r;
QLIST_FOREACH(r, &s->suspended_reqs, next) {
if (!strcmp(r->tag, tag)) {
return true;
}
}
return false;
}
static int64_t blkdebug_getlength(BlockDriverState *bs)
{
return bdrv_getlength(bs->file);
}
static int blkdebug_truncate(BlockDriverState *bs, int64_t offset)
{
return bdrv_truncate(bs->file, offset);
}
static void blkdebug_refresh_filename(BlockDriverState *bs)
{
QDict *opts;
const QDictEntry *e;
bool force_json = false;
for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
if (strcmp(qdict_entry_key(e), "config") &&
strcmp(qdict_entry_key(e), "x-image") &&
strcmp(qdict_entry_key(e), "image") &&
strncmp(qdict_entry_key(e), "image.", strlen("image.")))
{
force_json = true;
break;
}
}
if (force_json && !bs->file->full_open_options) {
/* The config file cannot be recreated, so creating a plain filename
* is impossible */
return;
}
if (!force_json && bs->file->exact_filename[0]) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkdebug:%s:%s",
qdict_get_try_str(bs->options, "config") ?: "",
bs->file->exact_filename);
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "image", QOBJECT(bs->file->full_open_options));
for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
if (strcmp(qdict_entry_key(e), "x-image") &&
strcmp(qdict_entry_key(e), "image") &&
strncmp(qdict_entry_key(e), "image.", strlen("image.")))
{
qobject_incref(qdict_entry_value(e));
qdict_put_obj(opts, qdict_entry_key(e), qdict_entry_value(e));
}
}
bs->full_open_options = opts;
}
static BlockDriver bdrv_blkdebug = {
.format_name = "blkdebug",
.protocol_name = "blkdebug",
.instance_size = sizeof(BDRVBlkdebugState),
.format_name = "blkdebug",
.protocol_name = "blkdebug",
.bdrv_parse_filename = blkdebug_parse_filename,
.bdrv_file_open = blkdebug_open,
.bdrv_close = blkdebug_close,
.bdrv_getlength = blkdebug_getlength,
.bdrv_truncate = blkdebug_truncate,
.bdrv_refresh_filename = blkdebug_refresh_filename,
.instance_size = sizeof(BDRVBlkdebugState),
.bdrv_aio_readv = blkdebug_aio_readv,
.bdrv_aio_writev = blkdebug_aio_writev,
.bdrv_aio_flush = blkdebug_aio_flush,
.bdrv_file_open = blkdebug_open,
.bdrv_close = blkdebug_close,
.bdrv_getlength = blkdebug_getlength,
.bdrv_debug_event = blkdebug_debug_event,
.bdrv_debug_breakpoint = blkdebug_debug_breakpoint,
.bdrv_debug_remove_breakpoint
= blkdebug_debug_remove_breakpoint,
.bdrv_debug_resume = blkdebug_debug_resume,
.bdrv_debug_is_suspended = blkdebug_debug_is_suspended,
.bdrv_aio_readv = blkdebug_aio_readv,
.bdrv_aio_writev = blkdebug_aio_writev,
.bdrv_debug_event = blkdebug_debug_event,
};
static void bdrv_blkdebug_init(void)


@@ -8,10 +8,8 @@
*/
#include <stdarg.h>
#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
#include "block/block_int.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "qemu_socket.h" /* for EINPROGRESS on Windows */
#include "block_int.h"
typedef struct {
BlockDriverState *test_file;
@@ -19,7 +17,7 @@ typedef struct {
typedef struct BlkverifyAIOCB BlkverifyAIOCB;
struct BlkverifyAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
/* Request metadata */
@@ -29,6 +27,7 @@ struct BlkverifyAIOCB {
int ret; /* first completed request's result */
unsigned int done; /* completion counter */
bool *finished; /* completion signal for cancel */
QEMUIOVector *qiov; /* user I/O vector */
QEMUIOVector raw_qiov; /* cloned I/O vector for raw file */
@@ -37,8 +36,21 @@ struct BlkverifyAIOCB {
void (*verify)(BlkverifyAIOCB *acb);
};
static void blkverify_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkverifyAIOCB *acb = (BlkverifyAIOCB *)blockacb;
bool finished = false;
/* Wait until request completes, invokes its callback, and frees itself */
acb->finished = &finished;
while (!finished) {
qemu_aio_wait();
}
}
static const AIOCBInfo blkverify_aiocb_info = {
.aiocb_size = sizeof(BlkverifyAIOCB),
.cancel = blkverify_aio_cancel,
};
static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
@@ -57,101 +69,50 @@ static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
}
/* Valid blkverify filenames look like blkverify:path/to/raw_image:path/to/image */
static void blkverify_parse_filename(const char *filename, QDict *options,
Error **errp)
static int blkverify_open(BlockDriverState *bs, const char *filename, int flags)
{
const char *c;
QString *raw_path;
BDRVBlkverifyState *s = bs->opaque;
int ret;
char *raw, *c;
/* Parse the blkverify: prefix */
if (!strstart(filename, "blkverify:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put(options, "x-image", qstring_from_str(filename));
return;
if (strncmp(filename, "blkverify:", strlen("blkverify:"))) {
return -EINVAL;
}
filename += strlen("blkverify:");
/* Parse the raw image filename */
c = strchr(filename, ':');
if (c == NULL) {
error_setg(errp, "blkverify requires raw copy and original image path");
return;
return -EINVAL;
}
/* TODO Implement option pass-through and set raw.filename here */
raw_path = qstring_from_substr(filename, 0, c - filename - 1);
qdict_put(options, "x-raw", raw_path);
/* TODO Allow multi-level nesting and set file.filename here */
filename = c + 1;
qdict_put(options, "x-image", qstring_from_str(filename));
}
static QemuOptsList runtime_opts = {
.name = "blkverify",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "x-raw",
.type = QEMU_OPT_STRING,
.help = "[internal use only, will be removed]",
},
{
.name = "x-image",
.type = QEMU_OPT_STRING,
.help = "[internal use only, will be removed]",
},
{ /* end of list */ }
},
};
static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVBlkverifyState *s = bs->opaque;
QemuOpts *opts;
Error *local_err = NULL;
int ret;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
/* Open the raw file */
assert(bs->file == NULL);
ret = bdrv_open_image(&bs->file, qemu_opt_get(opts, "x-raw"), options,
"raw", flags | BDRV_O_PROTOCOL, false, &local_err);
raw = g_strdup(filename);
raw[c - filename] = '\0';
ret = bdrv_file_open(&bs->file, raw, flags);
g_free(raw);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
return ret;
}
filename = c + 1;
/* Open the test file */
assert(s->test_file == NULL);
ret = bdrv_open_image(&s->test_file, qemu_opt_get(opts, "x-image"), options,
"test", flags, false, &local_err);
s->test_file = bdrv_new("");
ret = bdrv_open(s->test_file, filename, flags, NULL);
if (ret < 0) {
error_propagate(errp, local_err);
bdrv_delete(s->test_file);
s->test_file = NULL;
goto fail;
return ret;
}
ret = 0;
fail:
qemu_opts_del(opts);
return ret;
return 0;
}
static void blkverify_close(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_unref(s->test_file);
bdrv_delete(s->test_file);
s->test_file = NULL;
}
@@ -162,10 +123,114 @@ static int64_t blkverify_getlength(BlockDriverState *bs)
return bdrv_getlength(s->test_file);
}
/**
* Check that I/O vector contents are identical
*
* @a: I/O vector
* @b: I/O vector
* Returns: offset of the first mismatching byte, or -1 if the contents match
*/
static ssize_t blkverify_iovec_compare(QEMUIOVector *a, QEMUIOVector *b)
{
int i;
ssize_t offset = 0;
assert(a->niov == b->niov);
for (i = 0; i < a->niov; i++) {
size_t len = 0;
uint8_t *p = (uint8_t *)a->iov[i].iov_base;
uint8_t *q = (uint8_t *)b->iov[i].iov_base;
assert(a->iov[i].iov_len == b->iov[i].iov_len);
while (len < a->iov[i].iov_len && *p++ == *q++) {
len++;
}
offset += len;
if (len != a->iov[i].iov_len) {
return offset;
}
}
return -1;
}
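/* Worked example (hypothetical buffers): comparing single-element vectors
 * holding "abcd" and "abXd" returns 2, the offset of the first mismatching
 * byte; fully identical vectors return -1. */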
typedef struct {
int src_index;
struct iovec *src_iov;
void *dest_base;
} IOVectorSortElem;
static int sortelem_cmp_src_base(const void *a, const void *b)
{
const IOVectorSortElem *elem_a = a;
const IOVectorSortElem *elem_b = b;
/* Don't overflow */
if (elem_a->src_iov->iov_base < elem_b->src_iov->iov_base) {
return -1;
} else if (elem_a->src_iov->iov_base > elem_b->src_iov->iov_base) {
return 1;
} else {
return 0;
}
}
static int sortelem_cmp_src_index(const void *a, const void *b)
{
const IOVectorSortElem *elem_a = a;
const IOVectorSortElem *elem_b = b;
return elem_a->src_index - elem_b->src_index;
}
/**
* Copy contents of I/O vector
*
* The relative relationships of overlapping iovecs are preserved. This is
* necessary to ensure identical semantics in the cloned I/O vector.
*/
static void blkverify_iovec_clone(QEMUIOVector *dest, const QEMUIOVector *src,
void *buf)
{
IOVectorSortElem sortelems[src->niov];
void *last_end;
int i;
/* Sort source iovecs by base address */
for (i = 0; i < src->niov; i++) {
sortelems[i].src_index = i;
sortelems[i].src_iov = &src->iov[i];
}
qsort(sortelems, src->niov, sizeof(sortelems[0]), sortelem_cmp_src_base);
/* Allocate buffer space taking into account overlapping iovecs */
last_end = NULL;
for (i = 0; i < src->niov; i++) {
struct iovec *cur = sortelems[i].src_iov;
ptrdiff_t rewind = 0;
/* Detect overlap */
if (last_end && last_end > cur->iov_base) {
rewind = last_end - cur->iov_base;
}
sortelems[i].dest_base = buf - rewind;
buf += cur->iov_len - MIN(rewind, cur->iov_len);
last_end = MAX(cur->iov_base + cur->iov_len, last_end);
}
/* Sort by source iovec index and build destination iovec */
qsort(sortelems, src->niov, sizeof(sortelems[0]), sortelem_cmp_src_index);
for (i = 0; i < src->niov; i++) {
qemu_iovec_add(dest, sortelems[i].dest_base, src->iov[i].iov_len);
}
}
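/* Overlap example (hypothetical values): if src holds iov[0] = {base, 512}
 * and iov[1] = {base + 256, 512}, the iovecs overlap by 256 bytes.  The
 * clone consumes 768 bytes of buf, placing dest iov[0] at buf and dest
 * iov[1] at buf + 256, so the cloned iovecs alias each other exactly as
 * the originals do. */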
static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
BlockDriverCompletionFunc *cb,
void *opaque)
{
BlkverifyAIOCB *acb = qemu_aio_get(&blkverify_aiocb_info, bs, cb, opaque);
@@ -179,6 +244,7 @@ static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
acb->qiov = qiov;
acb->buf = NULL;
acb->verify = NULL;
acb->finished = NULL;
return acb;
}
@@ -192,7 +258,10 @@ static void blkverify_aio_bh(void *opaque)
qemu_vfree(acb->buf);
}
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
if (acb->finished) {
*acb->finished = true;
}
qemu_aio_release(acb);
}
static void blkverify_aio_cb(void *opaque, int ret)
@@ -213,8 +282,7 @@ static void blkverify_aio_cb(void *opaque, int ret)
acb->verify(acb);
}
acb->bh = aio_bh_new(bdrv_get_aio_context(acb->common.bs),
blkverify_aio_bh, acb);
acb->bh = qemu_bh_new(blkverify_aio_bh, acb);
qemu_bh_schedule(acb->bh);
break;
}
@@ -222,16 +290,16 @@ static void blkverify_aio_cb(void *opaque, int ret)
static void blkverify_verify_readv(BlkverifyAIOCB *acb)
{
ssize_t offset = qemu_iovec_compare(acb->qiov, &acb->raw_qiov);
ssize_t offset = blkverify_iovec_compare(acb->qiov, &acb->raw_qiov);
if (offset != -1) {
blkverify_err(acb, "contents mismatch in sector %" PRId64,
acb->sector_num + (int64_t)(offset / BDRV_SECTOR_SIZE));
}
}
static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *blkverify_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, false, sector_num, qiov,
@@ -240,7 +308,7 @@ static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
acb->verify = blkverify_verify_readv;
acb->buf = qemu_blockalign(bs->file, qiov->size);
qemu_iovec_init(&acb->raw_qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
blkverify_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
bdrv_aio_readv(s->test_file, sector_num, qiov, nb_sectors,
blkverify_aio_cb, acb);
@@ -249,9 +317,9 @@ static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
return &acb->common;
}
static BlockAIOCB *blkverify_aio_writev(BlockDriverState *bs,
static BlockDriverAIOCB *blkverify_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, true, sector_num, qiov,
@@ -264,9 +332,9 @@ static BlockAIOCB *blkverify_aio_writev(BlockDriverState *bs,
return &acb->common;
}
static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB *blkverify_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
@@ -274,82 +342,20 @@ static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
return bdrv_aio_flush(s->test_file, cb, opaque);
}
static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
BlockDriverState *candidate)
{
BDRVBlkverifyState *s = bs->opaque;
bool perm = bdrv_recurse_is_first_non_filter(bs->file, candidate);
if (perm) {
return true;
}
return bdrv_recurse_is_first_non_filter(s->test_file, candidate);
}
/* Propagate AioContext changes to ->test_file */
static void blkverify_detach_aio_context(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_detach_aio_context(s->test_file);
}
static void blkverify_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_attach_aio_context(s->test_file, new_context);
}
static void blkverify_refresh_filename(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
/* bs->file has already been refreshed */
bdrv_refresh_filename(s->test_file);
if (bs->file->full_open_options && s->test_file->full_open_options) {
QDict *opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "raw", QOBJECT(bs->file->full_open_options));
QINCREF(s->test_file->full_open_options);
qdict_put_obj(opts, "test", QOBJECT(s->test_file->full_open_options));
bs->full_open_options = opts;
}
if (bs->file->exact_filename[0] && s->test_file->exact_filename[0]) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkverify:%s:%s",
bs->file->exact_filename, s->test_file->exact_filename);
}
}
static BlockDriver bdrv_blkverify = {
.format_name = "blkverify",
.protocol_name = "blkverify",
.instance_size = sizeof(BDRVBlkverifyState),
.format_name = "blkverify",
.protocol_name = "blkverify",
.bdrv_parse_filename = blkverify_parse_filename,
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.bdrv_getlength = blkverify_getlength,
.bdrv_refresh_filename = blkverify_refresh_filename,
.instance_size = sizeof(BDRVBlkverifyState),
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
.bdrv_getlength = blkverify_getlength,
.bdrv_attach_aio_context = blkverify_attach_aio_context,
.bdrv_detach_aio_context = blkverify_detach_aio_context,
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.is_filter = true,
.bdrv_recurse_is_first_non_filter = blkverify_recurse_is_first_non_filter,
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
};
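/* Example invocation (assumed image names):
 *   qemu-system-x86_64 -drive file=blkverify:raw.img:test.qcow2
 * Each read is issued to both raw.img and test.qcow2 and the returned
 * contents are compared; a mismatch is reported with the failing sector. */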
static void bdrv_blkverify_init(void)


@@ -1,915 +0,0 @@
/*
* QEMU Block backends
*
* Copyright (C) 2014 Red Hat, Inc.
*
* Authors:
* Markus Armbruster <armbru@redhat.com>,
*
* This work is licensed under the terms of the GNU LGPL, version 2.1
* or later. See the COPYING.LIB file in the top-level directory.
*/
#include "sysemu/block-backend.h"
#include "block/block_int.h"
#include "sysemu/blockdev.h"
#include "qapi-event.h"
/* Number of coroutines to reserve per attached device model */
#define COROUTINE_POOL_RESERVATION 64
struct BlockBackend {
char *name;
int refcnt;
BlockDriverState *bs;
DriveInfo *legacy_dinfo; /* null unless created by drive_new() */
QTAILQ_ENTRY(BlockBackend) link; /* for blk_backends */
void *dev; /* attached device model, if any */
/* TODO change to DeviceState when all users are qdevified */
const BlockDevOps *dev_ops;
void *dev_opaque;
};
typedef struct BlockBackendAIOCB {
BlockAIOCB common;
QEMUBH *bh;
int ret;
} BlockBackendAIOCB;
static const AIOCBInfo block_backend_aiocb_info = {
.aiocb_size = sizeof(BlockBackendAIOCB),
};
static void drive_info_del(DriveInfo *dinfo);
/* All the BlockBackends (except for hidden ones) */
static QTAILQ_HEAD(, BlockBackend) blk_backends =
QTAILQ_HEAD_INITIALIZER(blk_backends);
/*
* Create a new BlockBackend with @name, with a reference count of one.
* @name must not be null or empty.
* Fail if a BlockBackend with this name already exists.
* Store an error through @errp on failure, unless it's null.
* Return the new BlockBackend on success, null on failure.
*/
BlockBackend *blk_new(const char *name, Error **errp)
{
BlockBackend *blk;
assert(name && name[0]);
if (!id_wellformed(name)) {
error_setg(errp, "Invalid device name");
return NULL;
}
if (blk_by_name(name)) {
error_setg(errp, "Device with id '%s' already exists", name);
return NULL;
}
if (bdrv_find_node(name)) {
error_setg(errp,
"Device name '%s' conflicts with an existing node name",
name);
return NULL;
}
blk = g_new0(BlockBackend, 1);
blk->name = g_strdup(name);
blk->refcnt = 1;
QTAILQ_INSERT_TAIL(&blk_backends, blk, link);
return blk;
}
/*
* Create a new BlockBackend with a new BlockDriverState attached.
* Otherwise just like blk_new(), which see.
*/
BlockBackend *blk_new_with_bs(const char *name, Error **errp)
{
BlockBackend *blk;
BlockDriverState *bs;
blk = blk_new(name, errp);
if (!blk) {
return NULL;
}
bs = bdrv_new_root();
blk->bs = bs;
bs->blk = blk;
return blk;
}
/*
* Calls blk_new_with_bs() and then calls bdrv_open() on the BlockDriverState.
*
* Just as with bdrv_open(), after having called this function the reference to
* @options belongs to the block layer (even on failure).
*
* TODO: Remove @filename and @flags; it should be possible to specify a whole
* BDS tree just by specifying the @options QDict (or @reference,
* alternatively). At the time of adding this function, this is not possible,
* though, so callers of this function have to be able to specify @filename and
* @flags.
*/
BlockBackend *blk_new_open(const char *name, const char *filename,
const char *reference, QDict *options, int flags,
Error **errp)
{
BlockBackend *blk;
int ret;
blk = blk_new_with_bs(name, errp);
if (!blk) {
QDECREF(options);
return NULL;
}
ret = bdrv_open(&blk->bs, filename, reference, options, flags, NULL, errp);
if (ret < 0) {
blk_unref(blk);
return NULL;
}
return blk;
}
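/* Hypothetical usage of blk_new_open() (illustrative name and flags):
 *
 *   Error *err = NULL;
 *   BlockBackend *blk = blk_new_open("drive0", "test.qcow2", NULL,
 *                                    NULL, BDRV_O_RDWR, &err);
 *   if (!blk) {
 *       error_report_err(err);
 *   }
 *
 * On success the backend holds one reference; release it with blk_unref(). */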
static void blk_delete(BlockBackend *blk)
{
assert(!blk->refcnt);
assert(!blk->dev);
if (blk->bs) {
assert(blk->bs->blk == blk);
blk->bs->blk = NULL;
bdrv_unref(blk->bs);
blk->bs = NULL;
}
/* Avoid double-remove after blk_hide_on_behalf_of_hmp_drive_del() */
if (blk->name[0]) {
QTAILQ_REMOVE(&blk_backends, blk, link);
}
g_free(blk->name);
drive_info_del(blk->legacy_dinfo);
g_free(blk);
}
static void drive_info_del(DriveInfo *dinfo)
{
if (!dinfo) {
return;
}
qemu_opts_del(dinfo->opts);
g_free(dinfo->serial);
g_free(dinfo);
}
/*
* Increment @blk's reference count.
* @blk must not be null.
*/
void blk_ref(BlockBackend *blk)
{
blk->refcnt++;
}
/*
* Decrement @blk's reference count.
* If this drops it to zero, destroy @blk.
* For convenience, do nothing if @blk is null.
*/
void blk_unref(BlockBackend *blk)
{
if (blk) {
assert(blk->refcnt > 0);
if (!--blk->refcnt) {
blk_delete(blk);
}
}
}
/*
* Return the BlockBackend after @blk.
* If @blk is null, return the first one.
* Else, return @blk's next sibling, which may be null.
*
* To iterate over all BlockBackends, do
* for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
* ...
* }
*/
BlockBackend *blk_next(BlockBackend *blk)
{
return blk ? QTAILQ_NEXT(blk, link) : QTAILQ_FIRST(&blk_backends);
}
/*
* Return @blk's name, a non-null string.
* Wart: the name is empty iff @blk has been hidden with
* blk_hide_on_behalf_of_hmp_drive_del().
*/
const char *blk_name(BlockBackend *blk)
{
return blk->name;
}
/*
* Return the BlockBackend with name @name if it exists, else null.
* @name must not be null.
*/
BlockBackend *blk_by_name(const char *name)
{
BlockBackend *blk;
assert(name);
QTAILQ_FOREACH(blk, &blk_backends, link) {
if (!strcmp(name, blk->name)) {
return blk;
}
}
return NULL;
}
/*
* Return the BlockDriverState attached to @blk if any, else null.
*/
BlockDriverState *blk_bs(BlockBackend *blk)
{
return blk->bs;
}
/*
* Return @blk's DriveInfo if any, else null.
*/
DriveInfo *blk_legacy_dinfo(BlockBackend *blk)
{
return blk->legacy_dinfo;
}
/*
* Set @blk's DriveInfo to @dinfo, and return it.
* @blk must not have a DriveInfo set already.
* No other BlockBackend may have the same DriveInfo set.
*/
DriveInfo *blk_set_legacy_dinfo(BlockBackend *blk, DriveInfo *dinfo)
{
assert(!blk->legacy_dinfo);
return blk->legacy_dinfo = dinfo;
}
/*
* Return the BlockBackend with DriveInfo @dinfo.
* It must exist.
*/
BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
{
BlockBackend *blk;
QTAILQ_FOREACH(blk, &blk_backends, link) {
if (blk->legacy_dinfo == dinfo) {
return blk;
}
}
abort();
}
/*
* Hide @blk.
* @blk must not have been hidden already.
* Make attached BlockDriverState, if any, anonymous.
* Once hidden, @blk is invisible to all functions that don't receive
* it as argument. For example, blk_by_name() won't return it.
* Strictly for use by do_drive_del().
* TODO get rid of it!
*/
void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend *blk)
{
QTAILQ_REMOVE(&blk_backends, blk, link);
blk->name[0] = 0;
if (blk->bs) {
bdrv_make_anon(blk->bs);
}
}
/*
* Attach device model @dev to @blk.
* Return 0 on success, -EBUSY when a device model is attached already.
*/
int blk_attach_dev(BlockBackend *blk, void *dev)
/* TODO change to DeviceState *dev when all users are qdevified */
{
if (blk->dev) {
return -EBUSY;
}
blk_ref(blk);
blk->dev = dev;
bdrv_iostatus_reset(blk->bs);
return 0;
}
/*
* Attach device model @dev to @blk.
* @blk must not have a device model attached already.
* TODO qdevified devices don't use this, remove when devices are qdevified
*/
void blk_attach_dev_nofail(BlockBackend *blk, void *dev)
{
if (blk_attach_dev(blk, dev) < 0) {
abort();
}
}
/*
* Detach device model @dev from @blk.
* @dev must be currently attached to @blk.
*/
void blk_detach_dev(BlockBackend *blk, void *dev)
/* TODO change to DeviceState *dev when all users are qdevified */
{
assert(blk->dev == dev);
blk->dev = NULL;
blk->dev_ops = NULL;
blk->dev_opaque = NULL;
bdrv_set_guest_block_size(blk->bs, 512);
blk_unref(blk);
}
/*
* Return the device model attached to @blk if any, else null.
*/
void *blk_get_attached_dev(BlockBackend *blk)
/* TODO change to return DeviceState * when all users are qdevified */
{
return blk->dev;
}
/*
* Set @blk's device model callbacks to @ops.
* @opaque is the opaque argument to pass to the callbacks.
* This is for use by device models.
*/
void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
void *opaque)
{
blk->dev_ops = ops;
blk->dev_opaque = opaque;
}
/*
* Notify @blk's attached device model of media change.
* If @load is true, notify of media load.
* Else, notify of media eject.
* Also send DEVICE_TRAY_MOVED events as appropriate.
*/
void blk_dev_change_media_cb(BlockBackend *blk, bool load)
{
if (blk->dev_ops && blk->dev_ops->change_media_cb) {
bool tray_was_closed = !blk_dev_is_tray_open(blk);
blk->dev_ops->change_media_cb(blk->dev_opaque, load);
if (tray_was_closed) {
/* tray open */
qapi_event_send_device_tray_moved(blk_name(blk),
true, &error_abort);
}
if (load) {
/* tray close */
qapi_event_send_device_tray_moved(blk_name(blk),
false, &error_abort);
}
}
}
/*
* Does @blk's attached device model have removable media?
* %true if no device model is attached.
*/
bool blk_dev_has_removable_media(BlockBackend *blk)
{
return !blk->dev || (blk->dev_ops && blk->dev_ops->change_media_cb);
}
/*
* Notify @blk's attached device model of a media eject request.
* If @force is true, the medium is about to be yanked out forcefully.
*/
void blk_dev_eject_request(BlockBackend *blk, bool force)
{
if (blk->dev_ops && blk->dev_ops->eject_request_cb) {
blk->dev_ops->eject_request_cb(blk->dev_opaque, force);
}
}
/*
* Does @blk's attached device model have a tray, and is it open?
*/
bool blk_dev_is_tray_open(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->is_tray_open) {
return blk->dev_ops->is_tray_open(blk->dev_opaque);
}
return false;
}
/*
* Does @blk's attached device model have the medium locked?
* %false if the device model has no such lock.
*/
bool blk_dev_is_medium_locked(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->is_medium_locked) {
return blk->dev_ops->is_medium_locked(blk->dev_opaque);
}
return false;
}
/*
* Notify @blk's attached device model of a backend size change.
*/
void blk_dev_resize_cb(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->resize_cb) {
blk->dev_ops->resize_cb(blk->dev_opaque);
}
}
void blk_iostatus_enable(BlockBackend *blk)
{
bdrv_iostatus_enable(blk->bs);
}
static int blk_check_byte_request(BlockBackend *blk, int64_t offset,
size_t size)
{
int64_t len;
if (size > INT_MAX) {
return -EIO;
}
if (!blk_is_inserted(blk)) {
return -ENOMEDIUM;
}
len = blk_getlength(blk);
if (len < 0) {
return len;
}
if (offset < 0) {
return -EIO;
}
if (offset > len || len - offset < size) {
return -EIO;
}
return 0;
}
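/* Note on the check above: it is phrased to avoid integer overflow.
 * Testing "offset + size > len" could wrap for large offsets, whereas
 * "offset > len || len - offset < size" never overflows: once offset <= len
 * holds, len - offset is non-negative.  E.g. (assumed values) len = 100,
 * offset = 90, size = 20 gives 100 - 90 = 10 < 20, hence -EIO. */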
static int blk_check_request(BlockBackend *blk, int64_t sector_num,
int nb_sectors)
{
if (sector_num < 0 || sector_num > INT64_MAX / BDRV_SECTOR_SIZE) {
return -EIO;
}
if (nb_sectors < 0 || nb_sectors > INT_MAX / BDRV_SECTOR_SIZE) {
return -EIO;
}
return blk_check_byte_request(blk, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE);
}
int blk_read(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_read(blk->bs, sector_num, buf, nb_sectors);
}
int blk_read_unthrottled(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_read_unthrottled(blk->bs, sector_num, buf, nb_sectors);
}
int blk_write(BlockBackend *blk, int64_t sector_num, const uint8_t *buf,
int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_write(blk->bs, sector_num, buf, nb_sectors);
}
int blk_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_write_zeroes(blk->bs, sector_num, nb_sectors, flags);
}
static void error_callback_bh(void *opaque)
{
struct BlockBackendAIOCB *acb = opaque;
qemu_bh_delete(acb->bh);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
}
static BlockAIOCB *abort_aio_request(BlockBackend *blk, BlockCompletionFunc *cb,
void *opaque, int ret)
{
struct BlockBackendAIOCB *acb;
QEMUBH *bh;
acb = blk_aio_get(&block_backend_aiocb_info, blk, cb, opaque);
acb->ret = ret;
bh = aio_bh_new(blk_get_aio_context(blk), error_callback_bh, acb);
acb->bh = bh;
qemu_bh_schedule(bh);
return &acb->common;
}
BlockAIOCB *blk_aio_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
cb, opaque);
}
int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int count)
{
int ret = blk_check_byte_request(blk, offset, count);
if (ret < 0) {
return ret;
}
return bdrv_pread(blk->bs, offset, buf, count);
}
int blk_pwrite(BlockBackend *blk, int64_t offset, const void *buf, int count)
{
int ret = blk_check_byte_request(blk, offset, count);
if (ret < 0) {
return ret;
}
return bdrv_pwrite(blk->bs, offset, buf, count);
}
int64_t blk_getlength(BlockBackend *blk)
{
return bdrv_getlength(blk->bs);
}
void blk_get_geometry(BlockBackend *blk, uint64_t *nb_sectors_ptr)
{
bdrv_get_geometry(blk->bs, nb_sectors_ptr);
}
int64_t blk_nb_sectors(BlockBackend *blk)
{
return bdrv_nb_sectors(blk->bs);
}
BlockAIOCB *blk_aio_readv(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
}
BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
}
BlockAIOCB *blk_aio_flush(BlockBackend *blk,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_flush(blk->bs, cb, opaque);
}
BlockAIOCB *blk_aio_discard(BlockBackend *blk,
int64_t sector_num, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
}
void blk_aio_cancel(BlockAIOCB *acb)
{
bdrv_aio_cancel(acb);
}
void blk_aio_cancel_async(BlockAIOCB *acb)
{
bdrv_aio_cancel_async(acb);
}
int blk_aio_multiwrite(BlockBackend *blk, BlockRequest *reqs, int num_reqs)
{
int i, ret;
for (i = 0; i < num_reqs; i++) {
ret = blk_check_request(blk, reqs[i].sector, reqs[i].nb_sectors);
if (ret < 0) {
return ret;
}
}
return bdrv_aio_multiwrite(blk->bs, reqs, num_reqs);
}
int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
{
return bdrv_ioctl(blk->bs, req, buf);
}
BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_ioctl(blk->bs, req, buf, cb, opaque);
}
int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_co_discard(blk->bs, sector_num, nb_sectors);
}
int blk_co_flush(BlockBackend *blk)
{
return bdrv_co_flush(blk->bs);
}
int blk_flush(BlockBackend *blk)
{
return bdrv_flush(blk->bs);
}
int blk_flush_all(void)
{
return bdrv_flush_all();
}
void blk_drain_all(void)
{
bdrv_drain_all();
}
BlockdevOnError blk_get_on_error(BlockBackend *blk, bool is_read)
{
return bdrv_get_on_error(blk->bs, is_read);
}
BlockErrorAction blk_get_error_action(BlockBackend *blk, bool is_read,
int error)
{
return bdrv_get_error_action(blk->bs, is_read, error);
}
void blk_error_action(BlockBackend *blk, BlockErrorAction action,
bool is_read, int error)
{
bdrv_error_action(blk->bs, action, is_read, error);
}
int blk_is_read_only(BlockBackend *blk)
{
return bdrv_is_read_only(blk->bs);
}
int blk_is_sg(BlockBackend *blk)
{
return bdrv_is_sg(blk->bs);
}
int blk_enable_write_cache(BlockBackend *blk)
{
return bdrv_enable_write_cache(blk->bs);
}
void blk_set_enable_write_cache(BlockBackend *blk, bool wce)
{
bdrv_set_enable_write_cache(blk->bs, wce);
}
void blk_invalidate_cache(BlockBackend *blk, Error **errp)
{
bdrv_invalidate_cache(blk->bs, errp);
}
int blk_is_inserted(BlockBackend *blk)
{
return bdrv_is_inserted(blk->bs);
}
void blk_lock_medium(BlockBackend *blk, bool locked)
{
bdrv_lock_medium(blk->bs, locked);
}
void blk_eject(BlockBackend *blk, bool eject_flag)
{
bdrv_eject(blk->bs, eject_flag);
}
int blk_get_flags(BlockBackend *blk)
{
return bdrv_get_flags(blk->bs);
}
int blk_get_max_transfer_length(BlockBackend *blk)
{
return blk->bs->bl.max_transfer_length;
}
void blk_set_guest_block_size(BlockBackend *blk, int align)
{
bdrv_set_guest_block_size(blk->bs, align);
}
void *blk_blockalign(BlockBackend *blk, size_t size)
{
return qemu_blockalign(blk ? blk->bs : NULL, size);
}
bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp)
{
return bdrv_op_is_blocked(blk->bs, op, errp);
}
void blk_op_unblock(BlockBackend *blk, BlockOpType op, Error *reason)
{
bdrv_op_unblock(blk->bs, op, reason);
}
void blk_op_block_all(BlockBackend *blk, Error *reason)
{
bdrv_op_block_all(blk->bs, reason);
}
void blk_op_unblock_all(BlockBackend *blk, Error *reason)
{
bdrv_op_unblock_all(blk->bs, reason);
}
AioContext *blk_get_aio_context(BlockBackend *blk)
{
return bdrv_get_aio_context(blk->bs);
}
void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
{
bdrv_set_aio_context(blk->bs, new_context);
}
void blk_add_aio_context_notifier(BlockBackend *blk,
void (*attached_aio_context)(AioContext *new_context, void *opaque),
void (*detach_aio_context)(void *opaque), void *opaque)
{
bdrv_add_aio_context_notifier(blk->bs, attached_aio_context,
detach_aio_context, opaque);
}
void blk_remove_aio_context_notifier(BlockBackend *blk,
void (*attached_aio_context)(AioContext *,
void *),
void (*detach_aio_context)(void *),
void *opaque)
{
bdrv_remove_aio_context_notifier(blk->bs, attached_aio_context,
detach_aio_context, opaque);
}
void blk_add_close_notifier(BlockBackend *blk, Notifier *notify)
{
bdrv_add_close_notifier(blk->bs, notify);
}
void blk_io_plug(BlockBackend *blk)
{
bdrv_io_plug(blk->bs);
}
void blk_io_unplug(BlockBackend *blk)
{
bdrv_io_unplug(blk->bs);
}
BlockAcctStats *blk_get_stats(BlockBackend *blk)
{
return bdrv_get_stats(blk->bs);
}
void *blk_aio_get(const AIOCBInfo *aiocb_info, BlockBackend *blk,
BlockCompletionFunc *cb, void *opaque)
{
return qemu_aio_get(aiocb_info, blk_bs(blk), cb, opaque);
}
int coroutine_fn blk_co_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_co_write_zeroes(blk->bs, sector_num, nb_sectors, flags);
}
int blk_write_compressed(BlockBackend *blk, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_write_compressed(blk->bs, sector_num, buf, nb_sectors);
}
int blk_truncate(BlockBackend *blk, int64_t offset)
{
return bdrv_truncate(blk->bs, offset);
}
int blk_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_discard(blk->bs, sector_num, nb_sectors);
}
int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
int64_t pos, int size)
{
return bdrv_save_vmstate(blk->bs, buf, pos, size);
}
int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size)
{
return bdrv_load_vmstate(blk->bs, buf, pos, size);
}
int blk_probe_blocksizes(BlockBackend *blk, BlockSizes *bsz)
{
return bdrv_probe_blocksizes(blk->bs, bsz);
}
int blk_probe_geometry(BlockBackend *blk, HDGeometry *geo)
{
return bdrv_probe_geometry(blk->bs, geo);
}


@@ -23,8 +23,8 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "block_int.h"
#include "module.h"
/**************************************************************/
@@ -39,41 +39,56 @@
// not allocated: 0xffffffff
// always little-endian
struct bochs_header {
char magic[32]; /* "Bochs Virtual HD Image" */
char type[16]; /* "Redolog" */
char subtype[16]; /* "Undoable" / "Volatile" / "Growing" */
struct bochs_header_v1 {
char magic[32]; // "Bochs Virtual HD Image"
char type[16]; // "Redolog"
char subtype[16]; // "Undoable" / "Volatile" / "Growing"
uint32_t version;
uint32_t header; /* size of header */
uint32_t catalog; /* num of entries */
uint32_t bitmap; /* bitmap size */
uint32_t extent; /* extent size */
uint32_t header; // size of header
union {
struct {
uint32_t reserved; /* for ??? */
uint64_t disk; /* disk size */
char padding[HEADER_SIZE - 64 - 20 - 12];
} QEMU_PACKED redolog;
struct {
uint64_t disk; /* disk size */
char padding[HEADER_SIZE - 64 - 20 - 8];
} QEMU_PACKED redolog_v1;
char padding[HEADER_SIZE - 64 - 20];
struct {
uint32_t catalog; // num of entries
uint32_t bitmap; // bitmap size
uint32_t extent; // extent size
uint64_t disk; // disk size
char padding[HEADER_SIZE - 64 - 8 - 20];
} redolog;
char padding[HEADER_SIZE - 64 - 8];
} extra;
} QEMU_PACKED;
};
// always little-endian
struct bochs_header {
char magic[32]; // "Bochs Virtual HD Image"
char type[16]; // "Redolog"
char subtype[16]; // "Undoable" / "Volatile" / "Growing"
uint32_t version;
uint32_t header; // size of header
union {
struct {
uint32_t catalog; // num of entries
uint32_t bitmap; // bitmap size
uint32_t extent; // extent size
uint32_t reserved; // for ???
uint64_t disk; // disk size
char padding[HEADER_SIZE - 64 - 8 - 24];
} redolog;
char padding[HEADER_SIZE - 64 - 8];
} extra;
};
typedef struct BDRVBochsState {
CoMutex lock;
uint32_t *catalog_bitmap;
uint32_t catalog_size;
int catalog_size;
uint32_t data_offset;
int data_offset;
uint32_t bitmap_blocks;
uint32_t extent_blocks;
uint32_t extent_size;
int bitmap_blocks;
int extent_blocks;
int extent_size;
} BDRVBochsState;
static int bochs_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -93,19 +108,17 @@ static int bochs_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static int bochs_open(BlockDriverState *bs, int flags)
{
BDRVBochsState *s = bs->opaque;
uint32_t i;
int i;
struct bochs_header bochs;
int ret;
struct bochs_header_v1 header_v1;
bs->read_only = 1; // no write support yet
ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
if (ret < 0) {
return ret;
if (bdrv_pread(bs->file, 0, &bochs, sizeof(bochs)) != sizeof(bochs)) {
goto fail;
}
if (strcmp(bochs.magic, HEADER_MAGIC) ||
@@ -113,107 +126,63 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
strcmp(bochs.subtype, GROWING_TYPE) ||
((le32_to_cpu(bochs.version) != HEADER_VERSION) &&
(le32_to_cpu(bochs.version) != HEADER_V1))) {
error_setg(errp, "Image not in Bochs format");
return -EINVAL;
}
if (le32_to_cpu(bochs.version) == HEADER_V1) {
bs->total_sectors = le64_to_cpu(bochs.extra.redolog_v1.disk) / 512;
} else {
bs->total_sectors = le64_to_cpu(bochs.extra.redolog.disk) / 512;
}
/* Limit to 1M entries to avoid unbounded allocation. This is what is
* needed for the largest image that bximage can create (~8 TB). */
s->catalog_size = le32_to_cpu(bochs.catalog);
if (s->catalog_size > 0x100000) {
error_setg(errp, "Catalog size is too large");
return -EFBIG;
}
s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
if (s->catalog_size && s->catalog_bitmap == NULL) {
error_setg(errp, "Could not allocate memory for catalog");
return -ENOMEM;
}
ret = bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
s->catalog_size * 4);
if (ret < 0) {
goto fail;
}
if (le32_to_cpu(bochs.version) == HEADER_V1) {
memcpy(&header_v1, &bochs, sizeof(bochs));
bs->total_sectors = le64_to_cpu(header_v1.extra.redolog.disk) / 512;
} else {
bs->total_sectors = le64_to_cpu(bochs.extra.redolog.disk) / 512;
}
s->catalog_size = le32_to_cpu(bochs.extra.redolog.catalog);
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
if (bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
s->catalog_size * 4) != s->catalog_size * 4)
goto fail;
for (i = 0; i < s->catalog_size; i++)
le32_to_cpus(&s->catalog_bitmap[i]);
s->data_offset = le32_to_cpu(bochs.header) + (s->catalog_size * 4);
s->bitmap_blocks = 1 + (le32_to_cpu(bochs.bitmap) - 1) / 512;
s->extent_blocks = 1 + (le32_to_cpu(bochs.extent) - 1) / 512;
s->bitmap_blocks = 1 + (le32_to_cpu(bochs.extra.redolog.bitmap) - 1) / 512;
s->extent_blocks = 1 + (le32_to_cpu(bochs.extra.redolog.extent) - 1) / 512;
s->extent_size = le32_to_cpu(bochs.extent);
if (s->extent_size < BDRV_SECTOR_SIZE) {
/* bximage actually never creates extents smaller than 4k */
error_setg(errp, "Extent size must be at least 512");
ret = -EINVAL;
goto fail;
} else if (!is_power_of_2(s->extent_size)) {
error_setg(errp, "Extent size %" PRIu32 " is not a power of two",
s->extent_size);
ret = -EINVAL;
goto fail;
} else if (s->extent_size > 0x800000) {
error_setg(errp, "Extent size %" PRIu32 " is too large",
s->extent_size);
ret = -EINVAL;
goto fail;
}
if (s->catalog_size < DIV_ROUND_UP(bs->total_sectors,
s->extent_size / BDRV_SECTOR_SIZE))
{
error_setg(errp, "Catalog size is too small for this disk size");
ret = -EINVAL;
goto fail;
}
s->extent_size = le32_to_cpu(bochs.extra.redolog.extent);
qemu_co_mutex_init(&s->lock);
return 0;
fail:
g_free(s->catalog_bitmap);
return ret;
fail:
return -1;
}
static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
{
BDRVBochsState *s = bs->opaque;
uint64_t offset = sector_num * 512;
uint64_t extent_index, extent_offset, bitmap_offset;
int64_t offset = sector_num * 512;
int64_t extent_index, extent_offset, bitmap_offset;
char bitmap_entry;
int ret;
// seek to sector
extent_index = offset / s->extent_size;
extent_offset = (offset % s->extent_size) / 512;
if (s->catalog_bitmap[extent_index] == 0xffffffff) {
return 0; /* not allocated */
return -1; /* not allocated */
}
bitmap_offset = s->data_offset +
(512 * (uint64_t) s->catalog_bitmap[extent_index] *
(s->extent_blocks + s->bitmap_blocks));
bitmap_offset = s->data_offset + (512 * s->catalog_bitmap[extent_index] *
(s->extent_blocks + s->bitmap_blocks));
/* read in bitmap for current extent */
ret = bdrv_pread(bs->file, bitmap_offset + (extent_offset / 8),
&bitmap_entry, 1);
if (ret < 0) {
return ret;
if (bdrv_pread(bs->file, bitmap_offset + (extent_offset / 8),
&bitmap_entry, 1) != 1) {
return -1;
}
if (!((bitmap_entry >> (extent_offset % 8)) & 1)) {
return 0; /* not allocated */
return -1; /* not allocated */
}
return bitmap_offset + (512 * (s->bitmap_blocks + extent_offset));
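
seek_to_sector tests one bit per sector in the extent's allocation bitmap, exactly as in the `(bitmap_entry >> (extent_offset % 8)) & 1` expression above. A stand-alone sketch of that bit test; sector_is_allocated is a hypothetical helper, not part of the driver.

#include <stdint.h>
#include <stdio.h>

/* One bit per 512-byte sector: a sector is backed by data in the image
 * only if its bit in the extent bitmap is set. */
static int sector_is_allocated(uint8_t bitmap_byte, uint64_t extent_offset)
{
    return (bitmap_byte >> (extent_offset % 8)) & 1;
}

int main(void)
{
    uint8_t b = 0x09; /* bits 0 and 3 set */
    printf("%d %d %d\n",
           sector_is_allocated(b, 0),  /* 1 */
           sector_is_allocated(b, 1),  /* 0 */
           sector_is_allocated(b, 3)); /* 1 */
    return 0;
}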
@@ -226,16 +195,13 @@ static int bochs_read(BlockDriverState *bs, int64_t sector_num,
while (nb_sectors > 0) {
int64_t block_offset = seek_to_sector(bs, sector_num);
if (block_offset < 0) {
return block_offset;
} else if (block_offset > 0) {
if (block_offset >= 0) {
ret = bdrv_pread(bs->file, block_offset, buf, 512);
if (ret < 0) {
return ret;
if (ret != 512) {
return -1;
}
} else {
} else
memset(buf, 0, 512);
}
nb_sectors--;
sector_num++;
buf += 512;


@@ -22,13 +22,10 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "block_int.h"
#include "module.h"
#include <zlib.h>
/* Maximum compressed block size */
#define MAX_BLOCK_SIZE (64 * 1024 * 1024)
typedef struct BDRVCloopState {
CoMutex lock;
uint32_t block_size;
@@ -56,130 +53,46 @@ static int cloop_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static int cloop_open(BlockDriverState *bs, int flags)
{
BDRVCloopState *s = bs->opaque;
uint32_t offsets_size, max_compressed_block_size = 1, i;
int ret;
bs->read_only = 1;
/* read header */
ret = bdrv_pread(bs->file, 128, &s->block_size, 4);
if (ret < 0) {
return ret;
if (bdrv_pread(bs->file, 128, &s->block_size, 4) < 4) {
goto cloop_close;
}
s->block_size = be32_to_cpu(s->block_size);
if (s->block_size % 512) {
error_setg(errp, "block_size %" PRIu32 " must be a multiple of 512",
s->block_size);
return -EINVAL;
}
if (s->block_size == 0) {
error_setg(errp, "block_size cannot be zero");
return -EINVAL;
}
/* cloop's create_compressed_fs.c warns about block sizes beyond 256 KB but
* we can accept more. Prevent ridiculous values like 4 GB - 1 since we
* need a buffer this big.
*/
if (s->block_size > MAX_BLOCK_SIZE) {
error_setg(errp, "block_size %" PRIu32 " must be %u MB or less",
s->block_size,
MAX_BLOCK_SIZE / (1024 * 1024));
return -EINVAL;
}
ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
if (ret < 0) {
return ret;
if (bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4) < 4) {
goto cloop_close;
}
s->n_blocks = be32_to_cpu(s->n_blocks);
/* read offsets */
if (s->n_blocks > (UINT32_MAX - 1) / sizeof(uint64_t)) {
/* Prevent integer overflow */
error_setg(errp, "n_blocks %" PRIu32 " must be %zu or less",
s->n_blocks,
(UINT32_MAX - 1) / sizeof(uint64_t));
return -EINVAL;
offsets_size = s->n_blocks * sizeof(uint64_t);
s->offsets = g_malloc(offsets_size);
if (bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size) <
offsets_size) {
goto cloop_close;
}
offsets_size = (s->n_blocks + 1) * sizeof(uint64_t);
if (offsets_size > 512 * 1024 * 1024) {
/* Prevent ridiculous offsets_size which causes memory allocation to
* fail or overflows bdrv_pread() size. In practice the 512 MB
* offsets[] limit supports 16 TB images at 256 KB block size.
*/
error_setg(errp, "image requires too many offsets, "
"try increasing block size");
return -EINVAL;
}
s->offsets = g_try_malloc(offsets_size);
if (s->offsets == NULL) {
error_setg(errp, "Could not allocate offsets table");
return -ENOMEM;
}
ret = bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size);
if (ret < 0) {
goto fail;
}
for (i = 0; i < s->n_blocks + 1; i++) {
uint64_t size;
for(i=0;i<s->n_blocks;i++) {
s->offsets[i] = be64_to_cpu(s->offsets[i]);
if (i == 0) {
continue;
}
if (s->offsets[i] < s->offsets[i - 1]) {
error_setg(errp, "offsets not monotonically increasing at "
"index %" PRIu32 ", image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
size = s->offsets[i] - s->offsets[i - 1];
/* Compressed blocks should be smaller than the uncompressed block size
* but maybe compression performed poorly so the compressed block is
* actually bigger. Clamp down on unrealistic values to prevent
* ridiculous s->compressed_block allocation.
*/
if (size > 2 * MAX_BLOCK_SIZE) {
error_setg(errp, "invalid compressed block size at index %" PRIu32
", image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
if (size > max_compressed_block_size) {
max_compressed_block_size = size;
if (i > 0) {
uint32_t size = s->offsets[i] - s->offsets[i - 1];
if (size > max_compressed_block_size) {
max_compressed_block_size = size;
}
}
}
/* initialize zlib engine */
s->compressed_block = g_try_malloc(max_compressed_block_size + 1);
if (s->compressed_block == NULL) {
error_setg(errp, "Could not allocate compressed_block");
ret = -ENOMEM;
goto fail;
}
s->uncompressed_block = g_try_malloc(s->block_size);
if (s->uncompressed_block == NULL) {
error_setg(errp, "Could not allocate uncompressed_block");
ret = -ENOMEM;
goto fail;
}
s->compressed_block = g_malloc(max_compressed_block_size + 1);
s->uncompressed_block = g_malloc(s->block_size);
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
goto cloop_close;
}
s->current_block = s->n_blocks;
@@ -188,11 +101,8 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
qemu_co_mutex_init(&s->lock);
return 0;
fail:
g_free(s->offsets);
g_free(s->compressed_block);
g_free(s->uncompressed_block);
return ret;
cloop_close:
return -1;
}
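
The new cloop open path rejects offset tables that are not monotonically increasing and clamps implausible compressed block sizes, as the loop above shows. A stand-alone sketch of that validation; validate_offsets is hypothetical, but the bounds mirror the checks above.

#include <stdint.h>
#include <stdio.h>

#define MAX_BLOCK_SIZE (64 * 1024 * 1024)

/* Entries must be monotonically increasing, and the implied compressed
 * block size (difference of neighbours) must stay within
 * 2 * MAX_BLOCK_SIZE. */
static int validate_offsets(const uint64_t *offsets, uint32_t n,
                            uint64_t *max_size)
{
    uint32_t i;
    *max_size = 1;
    for (i = 1; i < n; i++) {
        uint64_t size;
        if (offsets[i] < offsets[i - 1]) {
            return -1;          /* not monotonic: corrupt image */
        }
        size = offsets[i] - offsets[i - 1];
        if (size > 2 * (uint64_t)MAX_BLOCK_SIZE) {
            return -1;          /* unrealistic compressed block size */
        }
        if (size > *max_size) {
            *max_size = size;
        }
    }
    return 0;
}

int main(void)
{
    uint64_t offs[] = { 0, 4096, 12288, 12800 };
    uint64_t max;
    printf("%d max=%llu\n",
           validate_offsets(offs, 4, &max), (unsigned long long)max);
    return 0;
}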
static inline int cloop_read_block(BlockDriverState *bs, int block_num)
@@ -260,7 +170,9 @@ static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
static void cloop_close(BlockDriverState *bs)
{
BDRVCloopState *s = bs->opaque;
g_free(s->offsets);
if (s->n_blocks > 0) {
g_free(s->offsets);
}
g_free(s->compressed_block);
g_free(s->uncompressed_block);
inflateEnd(&s->zstream);


@@ -13,8 +13,8 @@
*/
#include "trace.h"
#include "block/block_int.h"
#include "block/blockjob.h"
#include "block_int.h"
#include "blockjob.h"
#include "qemu/ratelimit.h"
enum {
@@ -37,7 +37,6 @@ typedef struct CommitBlockJob {
BlockdevOnError on_error;
int base_flags;
int orig_overlay_flags;
char *backing_file_str;
} CommitBlockJob;
static int coroutine_fn commit_populate(BlockDriverState *bs,
@@ -60,50 +59,17 @@ static int coroutine_fn commit_populate(BlockDriverState *bs,
return 0;
}
typedef struct {
int ret;
} CommitCompleteData;
static void commit_complete(BlockJob *job, void *opaque)
{
CommitBlockJob *s = container_of(job, CommitBlockJob, common);
CommitCompleteData *data = opaque;
BlockDriverState *active = s->active;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
BlockDriverState *overlay_bs;
int ret = data->ret;
if (!block_job_is_cancelled(&s->common) && ret == 0) {
/* success */
ret = bdrv_drop_intermediate(active, top, base, s->backing_file_str);
}
/* restore base open flags here if appropriate (e.g., change the base back
* to r/o). These reopens do not need to be atomic, since we won't abort
* even on failure here */
if (s->base_flags != bdrv_get_flags(base)) {
bdrv_reopen(base, s->base_flags, NULL);
}
overlay_bs = bdrv_find_overlay(active, top);
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
g_free(data);
}
static void coroutine_fn commit_run(void *opaque)
{
CommitBlockJob *s = opaque;
CommitCompleteData *data;
BlockDriverState *active = s->active;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
BlockDriverState *overlay_bs = NULL;
int64_t sector_num, end;
int ret = 0;
int n = 0;
void *buf = NULL;
void *buf;
int bytes_written = 0;
int64_t base_len;
@@ -111,21 +77,23 @@ static void coroutine_fn commit_run(void *opaque)
if (s->common.len < 0) {
goto out;
goto exit_restore_reopen;
}
ret = base_len = bdrv_getlength(base);
if (base_len < 0) {
goto out;
goto exit_restore_reopen;
}
if (base_len < s->common.len) {
ret = bdrv_truncate(base, s->common.len);
if (ret) {
goto out;
goto exit_restore_reopen;
}
}
overlay_bs = bdrv_find_overlay(active, top);
end = s->common.len >> BDRV_SECTOR_BITS;
buf = qemu_blockalign(top, COMMIT_BUFFER_SIZE);
@@ -135,16 +103,16 @@ static void coroutine_fn commit_run(void *opaque)
wait:
/* Note that even when no rate limit is applied we need to yield
* with no pending I/O here so that bdrv_drain_all() returns.
* with no pending I/O here so that qemu_aio_flush() returns.
*/
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
block_job_sleep_ns(&s->common, rt_clock, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
/* Copy if allocated above the base */
ret = bdrv_is_allocated_above(top, base, sector_num,
COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
&n);
ret = bdrv_co_is_allocated_above(top, base, sector_num,
COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
&n);
copy = (ret == 1);
trace_commit_one_iteration(s, sector_num, n, ret);
if (copy) {
@@ -161,7 +129,7 @@ wait:
if (s->on_error == BLOCKDEV_ON_ERROR_STOP ||
s->on_error == BLOCKDEV_ON_ERROR_REPORT||
(s->on_error == BLOCKDEV_ON_ERROR_ENOSPC && ret == -ENOSPC)) {
goto out;
goto exit_free_buf;
} else {
n = 0;
continue;
@@ -173,12 +141,26 @@ wait:
ret = 0;
out:
if (!block_job_is_cancelled(&s->common) && sector_num == end) {
/* success */
ret = bdrv_drop_intermediate(active, top, base);
}
exit_free_buf:
qemu_vfree(buf);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&s->common, commit_complete, data);
exit_restore_reopen:
/* restore base open flags here if appropriate (e.g., change the base back
* to r/o). These reopens do not need to be atomic, since we won't abort
* even on failure here */
if (s->base_flags != bdrv_get_flags(base)) {
bdrv_reopen(base, s->base_flags, NULL);
}
if (s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
block_job_completed(&s->common, ret);
}
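
commit_run copies a sector range down into the base only when bdrv_is_allocated_above reports it allocated somewhere between base and top. A toy model of that decision, assuming per-image allocation maps as plain arrays; is_allocated_above here is a simplified stand-in, not QEMU's function.

#include <stdint.h>
#include <stdio.h>

/* A sector must be committed if any image above the base allocates it. */
static int is_allocated_above(const int *alloc_top, const int *alloc_mid,
                              int64_t sector)
{
    return alloc_top[sector] || alloc_mid[sector];
}

int main(void)
{
    int top[] = { 1, 0, 0, 1 };
    int mid[] = { 0, 0, 1, 0 };
    int64_t s;
    for (s = 0; s < 4; s++) {
        printf("sector %lld: %s\n", (long long)s,
               is_allocated_above(top, mid, s) ? "copy" : "skip");
    }
    return 0;
}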
static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -192,16 +174,16 @@ static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static const BlockJobDriver commit_job_driver = {
static BlockJobType commit_job_type = {
.instance_size = sizeof(CommitBlockJob),
.job_type = BLOCK_JOB_TYPE_COMMIT,
.job_type = "commit",
.set_speed = commit_set_speed,
};
void commit_start(BlockDriverState *bs, BlockDriverState *base,
BlockDriverState *top, int64_t speed,
BlockdevOnError on_error, BlockCompletionFunc *cb,
void *opaque, const char *backing_file_str, Error **errp)
BlockdevOnError on_error, BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
{
CommitBlockJob *s;
BlockReopenQueue *reopen_queue = NULL;
@@ -213,11 +195,17 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
if ((on_error == BLOCKDEV_ON_ERROR_STOP ||
on_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
error_setg(errp, "Invalid parameter combination");
error_set(errp, QERR_INVALID_PARAMETER_COMBINATION);
return;
}
/* Once we support top == active layer, remove this check */
if (top == bs) {
error_setg(errp,
"Top image as the active layer is currently unsupported");
return;
}
assert(top != bs);
if (top == base) {
error_setg(errp, "Invalid files for merge: top and base are the same");
return;
@@ -251,7 +239,7 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
}
s = block_job_create(&commit_job_driver, bs, speed, cb, opaque, errp);
s = block_job_create(&commit_job_type, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
@@ -263,8 +251,6 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
s->base_flags = orig_base_flags;
s->orig_overlay_flags = orig_overlay_flags;
s->backing_file_str = g_strdup(backing_file_str);
s->on_error = on_error;
s->common.co = qemu_coroutine_create(commit_run);

block/cow.c Normal file

@@ -0,0 +1,356 @@
/*
* Block driver for the COW format
*
* Copyright (c) 2004 Fabrice Bellard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block_int.h"
#include "module.h"
/**************************************************************/
/* COW block driver using file system holes */
/* user mode linux compatible COW file */
#define COW_MAGIC 0x4f4f4f4d /* MOOO */
#define COW_VERSION 2
struct cow_header_v2 {
uint32_t magic;
uint32_t version;
char backing_file[1024];
int32_t mtime;
uint64_t size;
uint32_t sectorsize;
};
typedef struct BDRVCowState {
CoMutex lock;
int64_t cow_sectors_offset;
} BDRVCowState;
static int cow_probe(const uint8_t *buf, int buf_size, const char *filename)
{
const struct cow_header_v2 *cow_header = (const void *)buf;
if (buf_size >= sizeof(struct cow_header_v2) &&
be32_to_cpu(cow_header->magic) == COW_MAGIC &&
be32_to_cpu(cow_header->version) == COW_VERSION)
return 100;
else
return 0;
}
static int cow_open(BlockDriverState *bs, int flags)
{
BDRVCowState *s = bs->opaque;
struct cow_header_v2 cow_header;
int bitmap_size;
int64_t size;
int ret;
/* see if it is a cow image */
ret = bdrv_pread(bs->file, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto fail;
}
if (be32_to_cpu(cow_header.magic) != COW_MAGIC) {
ret = -EINVAL;
goto fail;
}
if (be32_to_cpu(cow_header.version) != COW_VERSION) {
char version[64];
snprintf(version, sizeof(version),
"COW version %d", cow_header.version);
qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "cow", version);
ret = -ENOTSUP;
goto fail;
}
/* cow image found */
size = be64_to_cpu(cow_header.size);
bs->total_sectors = size / 512;
pstrcpy(bs->backing_file, sizeof(bs->backing_file),
cow_header.backing_file);
bitmap_size = ((bs->total_sectors + 7) >> 3) + sizeof(cow_header);
s->cow_sectors_offset = (bitmap_size + 511) & ~511;
qemu_co_mutex_init(&s->lock);
return 0;
fail:
return ret;
}
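
cow_open derives the data offset from the header plus a one-bit-per-sector bitmap, rounded up to the next 512-byte boundary. A stand-alone sketch of that arithmetic; the 1048-byte header size is the packed size of cow_header_v2 above (4+4+1024+4+8+4), while the real sizeof may include padding.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t header_size = 4 + 4 + 1024 + 4 + 8 + 4; /* packed size */
    uint64_t total_sectors = 20480;                  /* 10 MB image */
    /* bitmap: one bit per sector, rounded up to whole bytes */
    uint64_t bitmap_size = ((total_sectors + 7) >> 3) + header_size;
    /* COW data starts at the next sector boundary after the bitmap */
    uint64_t cow_sectors_offset = (bitmap_size + 511) & ~511ULL;
    printf("bitmap_size=%llu data_offset=%llu\n",
           (unsigned long long)bitmap_size,
           (unsigned long long)cow_sectors_offset);
    return 0;
}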
/*
* XXX(hch): right now these functions are extremely inefficient.
* We should just read the whole bitmap we'll need in one go instead.
*/
static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
{
uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
uint8_t bitmap;
int ret;
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
bitmap |= (1 << (bitnum % 8));
ret = bdrv_pwrite_sync(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
return 0;
}
static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
{
uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
uint8_t bitmap;
int ret;
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
return !!(bitmap & (1 << (bitnum % 8)));
}
/* Return true if first block has been changed (ie. current version is
* in COW file). Set the number of continuous blocks for which that
* is true. */
static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
int changed;
if (nb_sectors == 0) {
*num_same = nb_sectors;
return 0;
}
changed = is_bit_set(bs, sector_num);
if (changed < 0) {
return 0; /* XXX: how to return I/O errors? */
}
for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
if (is_bit_set(bs, sector_num + *num_same) != changed)
break;
}
return changed;
}
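
cow_co_is_allocated returns the state of the first sector and counts how many following sectors share it. A sketch of that run-length scan, with bits[] standing in for repeated is_bit_set() calls; count_same is hypothetical.

#include <stdio.h>

/* Count how many sectors, starting at the first, share its allocation
 * state; return that state. */
static int count_same(const int *bits, int nb, int *num_same)
{
    int first = bits[0];
    for (*num_same = 1; *num_same < nb; (*num_same)++) {
        if (bits[*num_same] != first) {
            break;
        }
    }
    return first;
}

int main(void)
{
    int bits[] = { 1, 1, 1, 0, 1 };
    int n;
    int allocated = count_same(bits, 5, &n);
    printf("allocated=%d run=%d\n", allocated, n); /* allocated=1 run=3 */
    return 0;
}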
static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
int error = 0;
int i;
for (i = 0; i < nb_sectors; i++) {
error = cow_set_bit(bs, sector_num + i);
if (error) {
break;
}
}
return error;
}
static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret, n;
while (nb_sectors > 0) {
if (bdrv_co_is_allocated(bs, sector_num, nb_sectors, &n)) {
ret = bdrv_pread(bs->file,
s->cow_sectors_offset + sector_num * 512,
buf, n * 512);
if (ret < 0) {
return ret;
}
} else {
if (bs->backing_hd) {
/* read from the base image */
ret = bdrv_read(bs->backing_hd, sector_num, buf, n);
if (ret < 0) {
return ret;
}
} else {
memset(buf, 0, n * 512);
}
}
nb_sectors -= n;
sector_num += n;
buf += n * 512;
}
return 0;
}
static coroutine_fn int cow_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int cow_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret;
ret = bdrv_pwrite(bs->file, s->cow_sectors_offset + sector_num * 512,
buf, nb_sectors * 512);
if (ret < 0) {
return ret;
}
return cow_update_bitmap(bs, sector_num, nb_sectors);
}
static coroutine_fn int cow_co_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_write(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static void cow_close(BlockDriverState *bs)
{
}
static int cow_create(const char *filename, QEMUOptionParameter *options)
{
struct cow_header_v2 cow_header;
struct stat st;
int64_t image_sectors = 0;
const char *image_filename = NULL;
int ret;
BlockDriverState *cow_bs;
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
image_sectors = options->value.n / 512;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
image_filename = options->value.s;
}
options++;
}
ret = bdrv_create_file(filename, options);
if (ret < 0) {
return ret;
}
ret = bdrv_file_open(&cow_bs, filename, BDRV_O_RDWR);
if (ret < 0) {
return ret;
}
memset(&cow_header, 0, sizeof(cow_header));
cow_header.magic = cpu_to_be32(COW_MAGIC);
cow_header.version = cpu_to_be32(COW_VERSION);
if (image_filename) {
/* Note: if no file, we put a dummy mtime */
cow_header.mtime = cpu_to_be32(0);
if (stat(image_filename, &st) != 0) {
goto mtime_fail;
}
cow_header.mtime = cpu_to_be32(st.st_mtime);
mtime_fail:
pstrcpy(cow_header.backing_file, sizeof(cow_header.backing_file),
image_filename);
}
cow_header.sectorsize = cpu_to_be32(512);
cow_header.size = cpu_to_be64(image_sectors * 512);
ret = bdrv_pwrite(cow_bs, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto exit;
}
/* resize to include at least all the bitmap */
ret = bdrv_truncate(cow_bs,
sizeof(cow_header) + ((image_sectors + 7) >> 3));
if (ret < 0) {
goto exit;
}
exit:
bdrv_delete(cow_bs);
return ret;
}
static QEMUOptionParameter cow_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
},
{ NULL }
};
static BlockDriver bdrv_cow = {
.format_name = "cow",
.instance_size = sizeof(BDRVCowState),
.bdrv_probe = cow_probe,
.bdrv_open = cow_open,
.bdrv_close = cow_close,
.bdrv_create = cow_create,
.bdrv_read = cow_co_read,
.bdrv_write = cow_co_write,
.bdrv_co_is_allocated = cow_co_is_allocated,
.create_options = cow_create_options,
};
static void bdrv_cow_init(void)
{
bdrv_register(&bdrv_cow);
}
block_init(bdrv_cow_init);


@@ -22,11 +22,10 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qapi/qmp/qbool.h"
#include "block_int.h"
#include <curl/curl.h>
// #define DEBUG_CURL
// #define DEBUG
// #define DEBUG_VERBOSE
#ifdef DEBUG_CURL
@@ -35,51 +34,19 @@
#define DPRINTF(fmt, ...) do { } while (0)
#endif
#if LIBCURL_VERSION_NUM >= 0x071000
/* The multi interface timer callback was introduced in 7.16.0 */
#define NEED_CURL_TIMER_CALLBACK
#define HAVE_SOCKET_ACTION
#endif
#ifndef HAVE_SOCKET_ACTION
/* If curl_multi_socket_action isn't available, define it statically here in
* terms of curl_multi_socket. Note that ev_bitmask will be ignored, which is
* less efficient but still safe. */
static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
curl_socket_t sockfd,
int ev_bitmask,
int *running_handles)
{
return curl_multi_socket(multi_handle, sockfd, running_handles);
}
#define curl_multi_socket_action __curl_multi_socket_action
#endif
#define PROTOCOLS (CURLPROTO_HTTP | CURLPROTO_HTTPS | \
CURLPROTO_FTP | CURLPROTO_FTPS | \
CURLPROTO_TFTP)
#define CURL_NUM_STATES 8
#define CURL_NUM_ACB 8
#define SECTOR_SIZE 512
#define READ_AHEAD_DEFAULT (256 * 1024)
#define CURL_TIMEOUT_DEFAULT 5
#define CURL_TIMEOUT_MAX 10000
#define READ_AHEAD_SIZE (256 * 1024)
#define FIND_RET_NONE 0
#define FIND_RET_OK 1
#define FIND_RET_WAIT 2
#define CURL_BLOCK_OPT_URL "url"
#define CURL_BLOCK_OPT_READAHEAD "readahead"
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
struct BDRVCURLState;
typedef struct CURLAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
QEMUIOVector *qiov;
@@ -95,7 +62,6 @@ typedef struct CURLState
struct BDRVCURLState *s;
CURLAIOCB *acb[CURL_NUM_ACB];
CURL *curl;
curl_socket_t sock_fd;
char *orig_buf;
size_t buf_start;
size_t buf_off;
@@ -107,78 +73,47 @@ typedef struct CURLState
typedef struct BDRVCURLState {
CURLM *multi;
QEMUTimer timer;
size_t len;
CURLState states[CURL_NUM_STATES];
char *url;
size_t readahead_size;
bool sslverify;
uint64_t timeout;
char *cookie;
bool accept_range;
AioContext *aio_context;
} BDRVCURLState;
static void curl_clean_state(CURLState *s);
static void curl_multi_do(void *arg);
static void curl_multi_read(void *arg);
#ifdef NEED_CURL_TIMER_CALLBACK
static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
{
BDRVCURLState *s = opaque;
DPRINTF("CURL: timer callback timeout_ms %ld\n", timeout_ms);
if (timeout_ms == -1) {
timer_del(&s->timer);
} else {
int64_t timeout_ns = (int64_t)timeout_ms * 1000 * 1000;
timer_mod(&s->timer,
qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ns);
}
return 0;
}
#endif
static int curl_aio_flush(void *opaque);
static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
void *userp, void *sp)
void *s, void *sp)
{
BDRVCURLState *s;
CURLState *state = NULL;
curl_easy_getinfo(curl, CURLINFO_PRIVATE, (char **)&state);
state->sock_fd = fd;
s = state->s;
DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
switch (action) {
case CURL_POLL_IN:
aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
NULL, state);
qemu_aio_set_fd_handler(fd, curl_multi_do, NULL, curl_aio_flush, s);
break;
case CURL_POLL_OUT:
aio_set_fd_handler(s->aio_context, fd, NULL, curl_multi_do, state);
qemu_aio_set_fd_handler(fd, NULL, curl_multi_do, curl_aio_flush, s);
break;
case CURL_POLL_INOUT:
aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
curl_multi_do, state);
qemu_aio_set_fd_handler(fd, curl_multi_do, curl_multi_do,
curl_aio_flush, s);
break;
case CURL_POLL_REMOVE:
aio_set_fd_handler(s->aio_context, fd, NULL, NULL, NULL);
qemu_aio_set_fd_handler(fd, NULL, NULL, NULL, NULL);
break;
}
return 0;
}
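
curl_sock_cb translates libcurl's poll request into a pair of read/write fd handlers, as the switch above shows. A conceptual model of that dispatch, with the CURL_POLL_* values redefined locally so the sketch builds without curl.h; the real code registers curl_multi_read/curl_multi_do via aio_set_fd_handler.

#include <stdio.h>

/* Mirror of libcurl's CURL_POLL_* values, redefined here only so this
 * sketch is self-contained. */
enum { POLL_IN = 1, POLL_OUT = 2, POLL_INOUT = 3, POLL_REMOVE = 4 };

static void describe(int action)
{
    const char *rd = NULL, *wr = NULL;
    switch (action) {
    case POLL_IN:     rd = "read";               break;
    case POLL_OUT:    wr = "write";              break;
    case POLL_INOUT:  rd = "read"; wr = "write"; break;
    case POLL_REMOVE: /* both stay NULL */       break;
    }
    printf("read handler: %s, write handler: %s\n",
           rd ? rd : "(none)", wr ? wr : "(none)");
}

int main(void)
{
    describe(POLL_IN);
    describe(POLL_INOUT);
    describe(POLL_REMOVE);
    return 0;
}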
static size_t curl_header_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
static size_t curl_size_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
{
BDRVCURLState *s = opaque;
CURLState *s = ((CURLState*)opaque);
size_t realsize = size * nmemb;
const char *accept_line = "Accept-Ranges: bytes";
size_t fsize;
if (realsize >= strlen(accept_line)
&& strncmp((char *)ptr, accept_line, strlen(accept_line)) == 0) {
s->accept_range = true;
if(sscanf(ptr, "Content-Length: %zd", &fsize) == 1) {
s->s->len = fsize;
}
return realsize;
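
The header callback above watches for "Accept-Ranges: bytes" (new code), where the old curl_size_cb scraped the size out of "Content-Length:". A stand-alone sketch of both checks; parse_header is hypothetical.

#include <stdio.h>
#include <string.h>

static void parse_header(const char *line, int *accept_range, size_t *len)
{
    const char *accept_line = "Accept-Ranges: bytes";
    /* byte-range support is required for HTTP(S) reads */
    if (strncmp(line, accept_line, strlen(accept_line)) == 0) {
        *accept_range = 1;
    }
    /* the legacy path learned the file size from this header */
    sscanf(line, "Content-Length: %zu", len);
}

int main(void)
{
    int ar = 0;
    size_t len = 0;
    parse_header("Accept-Ranges: bytes", &ar, &len);
    parse_header("Content-Length: 1048576", &ar, &len);
    printf("accept_range=%d len=%zu\n", ar, len);
    return 0;
}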
@@ -193,13 +128,8 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
DPRINTF("CURL: Just reading %zd bytes\n", realsize);
if (!s || !s->orig_buf)
return 0;
goto read_end;
if (s->buf_off >= s->buf_len) {
/* buffer full, read nothing */
return 0;
}
realsize = MIN(realsize, s->buf_len - s->buf_off);
memcpy(s->orig_buf + s->buf_off, ptr, realsize);
s->buf_off += realsize;
@@ -213,11 +143,12 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
acb->end - acb->start);
acb->common.cb(acb->common.opaque, 0);
qemu_aio_unref(acb);
qemu_aio_release(acb);
s->acb[i] = NULL;
}
}
read_end:
return realsize;
}
@@ -252,8 +183,7 @@ static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
}
// Wait for unfinished chunks
if (state->in_use &&
(start >= state->buf_start) &&
if ((start >= state->buf_start) &&
(start <= buf_fend) &&
(end >= state->buf_start) &&
(end <= buf_fend))
@@ -275,90 +205,64 @@ static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
return FIND_RET_NONE;
}
static void curl_multi_check_completion(BDRVCURLState *s)
static void curl_multi_do(void *arg)
{
BDRVCURLState *s = (BDRVCURLState *)arg;
int running;
int r;
int msgs_in_queue;
if (!s->multi)
return;
do {
r = curl_multi_socket_all(s->multi, &running);
} while(r == CURLM_CALL_MULTI_PERFORM);
/* Try to find done transfers, so we can free the easy
* handle again. */
for (;;) {
do {
CURLMsg *msg;
msg = curl_multi_info_read(s->multi, &msgs_in_queue);
/* Quit when there are no more completions */
if (!msg)
break;
if (msg->msg == CURLMSG_DONE) {
CURLState *state = NULL;
curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE,
(char **)&state);
/* ACBs for successful messages get completed in curl_read_cb */
if (msg->data.result != CURLE_OK) {
int i;
for (i = 0; i < CURL_NUM_ACB; i++) {
CURLAIOCB *acb = state->acb[i];
if (acb == NULL) {
continue;
}
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
state->acb[i] = NULL;
}
}
curl_clean_state(state);
if (msg->msg == CURLMSG_NONE)
break;
switch (msg->msg) {
case CURLMSG_DONE:
{
CURLState *state = NULL;
curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE, (char**)&state);
/* ACBs for successful messages get completed in curl_read_cb */
if (msg->data.result != CURLE_OK) {
int i;
for (i = 0; i < CURL_NUM_ACB; i++) {
CURLAIOCB *acb = state->acb[i];
if (acb == NULL) {
continue;
}
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_release(acb);
state->acb[i] = NULL;
}
}
curl_clean_state(state);
break;
}
default:
msgs_in_queue = 0;
break;
}
}
} while(msgs_in_queue);
}
static void curl_multi_do(void *arg)
{
CURLState *s = (CURLState *)arg;
int running;
int r;
if (!s->s->multi) {
return;
}
do {
r = curl_multi_socket_action(s->s->multi, s->sock_fd, 0, &running);
} while(r == CURLM_CALL_MULTI_PERFORM);
}
static void curl_multi_read(void *arg)
{
CURLState *s = (CURLState *)arg;
curl_multi_do(arg);
curl_multi_check_completion(s->s);
}
static void curl_multi_timeout_do(void *arg)
{
#ifdef NEED_CURL_TIMER_CALLBACK
BDRVCURLState *s = (BDRVCURLState *)arg;
int running;
if (!s->multi) {
return;
}
curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
curl_multi_check_completion(s);
#else
abort();
#endif
}
static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
static CURLState *curl_init_state(BDRVCURLState *s)
{
CURLState *state = NULL;
int i, j;
@@ -376,47 +280,33 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
break;
}
if (!state) {
aio_poll(bdrv_get_aio_context(bs), true);
g_usleep(100);
curl_multi_do(s);
}
} while(!state);
if (!state->curl) {
state->curl = curl_easy_init();
if (!state->curl) {
return NULL;
}
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
(long) s->sslverify);
if (s->cookie) {
curl_easy_setopt(state->curl, CURLOPT_COOKIE, s->cookie);
}
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, (long)s->timeout);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION,
(void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_WRITEDATA, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_PRIVATE, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_AUTOREFERER, 1);
curl_easy_setopt(state->curl, CURLOPT_FOLLOWLOCATION, 1);
curl_easy_setopt(state->curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(state->curl, CURLOPT_ERRORBUFFER, state->errmsg);
curl_easy_setopt(state->curl, CURLOPT_FAILONERROR, 1);
if (state->curl)
goto has_curl;
/* Restrict supported protocols to avoid security issues in the more
* obscure protocols. For example, do not allow POP3/SMTP/IMAP; see
* CVE-2013-0249.
*
* Restricting protocols is only supported from 7.19.4 upwards.
*/
#if LIBCURL_VERSION_NUM >= 0x071304
curl_easy_setopt(state->curl, CURLOPT_PROTOCOLS, PROTOCOLS);
curl_easy_setopt(state->curl, CURLOPT_REDIR_PROTOCOLS, PROTOCOLS);
#endif
state->curl = curl_easy_init();
if (!state->curl)
return NULL;
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, 5);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION, (void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_WRITEDATA, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_PRIVATE, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_AUTOREFERER, 1);
curl_easy_setopt(state->curl, CURLOPT_FOLLOWLOCATION, 1);
curl_easy_setopt(state->curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(state->curl, CURLOPT_ERRORBUFFER, state->errmsg);
curl_easy_setopt(state->curl, CURLOPT_FAILONERROR, 1);
#ifdef DEBUG_VERBOSE
curl_easy_setopt(state->curl, CURLOPT_VERBOSE, 1);
curl_easy_setopt(state->curl, CURLOPT_VERBOSE, 1);
#endif
}
has_curl:
state->s = s;
@@ -430,136 +320,52 @@ static void curl_clean_state(CURLState *s)
s->in_use = 0;
}
static void curl_parse_filename(const char *filename, QDict *options,
Error **errp)
{
qdict_put(options, CURL_BLOCK_OPT_URL, qstring_from_str(filename));
}
static void curl_detach_aio_context(BlockDriverState *bs)
{
BDRVCURLState *s = bs->opaque;
int i;
for (i = 0; i < CURL_NUM_STATES; i++) {
if (s->states[i].in_use) {
curl_clean_state(&s->states[i]);
}
if (s->states[i].curl) {
curl_easy_cleanup(s->states[i].curl);
s->states[i].curl = NULL;
}
g_free(s->states[i].orig_buf);
s->states[i].orig_buf = NULL;
}
if (s->multi) {
curl_multi_cleanup(s->multi);
s->multi = NULL;
}
timer_del(&s->timer);
}
static void curl_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVCURLState *s = bs->opaque;
aio_timer_init(new_context, &s->timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
curl_multi_timeout_do, s);
assert(!s->multi);
s->multi = curl_multi_init();
s->aio_context = new_context;
curl_multi_setopt(s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb);
#ifdef NEED_CURL_TIMER_CALLBACK
curl_multi_setopt(s->multi, CURLMOPT_TIMERDATA, s);
curl_multi_setopt(s->multi, CURLMOPT_TIMERFUNCTION, curl_timer_cb);
#endif
}
static QemuOptsList runtime_opts = {
.name = "curl",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = CURL_BLOCK_OPT_URL,
.type = QEMU_OPT_STRING,
.help = "URL to open",
},
{
.name = CURL_BLOCK_OPT_READAHEAD,
.type = QEMU_OPT_SIZE,
.help = "Readahead size",
},
{
.name = CURL_BLOCK_OPT_SSLVERIFY,
.type = QEMU_OPT_BOOL,
.help = "Verify SSL certificate"
},
{
.name = CURL_BLOCK_OPT_TIMEOUT,
.type = QEMU_OPT_NUMBER,
.help = "Curl timeout"
},
{
.name = CURL_BLOCK_OPT_COOKIE,
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{ /* end of list */ }
},
};
static int curl_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static int curl_open(BlockDriverState *bs, const char *filename, int flags)
{
BDRVCURLState *s = bs->opaque;
CURLState *state = NULL;
QemuOpts *opts;
Error *local_err = NULL;
const char *file;
const char *cookie;
double d;
#define RA_OPTSTR ":readahead="
char *file;
char *ra;
const char *ra_val;
int parse_state = 0;
static int inited = 0;
if (flags & BDRV_O_RDWR) {
error_setg(errp, "curl block device does not support writes");
return -EROFS;
file = g_strdup(filename);
s->readahead_size = READ_AHEAD_SIZE;
/* Parse a trailing ":readahead=#:" param, if present. */
ra = file + strlen(file) - 1;
while (ra >= file) {
if (parse_state == 0) {
if (*ra == ':')
parse_state++;
else
break;
} else if (parse_state == 1) {
if (*ra > '9' || *ra < '0') {
char *opt_start = ra - strlen(RA_OPTSTR) + 1;
if (opt_start > file &&
strncmp(opt_start, RA_OPTSTR, strlen(RA_OPTSTR)) == 0) {
ra_val = ra + 1;
ra -= strlen(RA_OPTSTR) - 1;
*ra = '\0';
s->readahead_size = atoi(ra_val);
break;
} else {
break;
}
}
}
ra--;
}
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto out_noclean;
}
s->readahead_size = qemu_opt_get_size(opts, CURL_BLOCK_OPT_READAHEAD,
READ_AHEAD_DEFAULT);
if ((s->readahead_size & 0x1ff) != 0) {
error_setg(errp, "HTTP_READAHEAD_SIZE %zd is not a multiple of 512",
s->readahead_size);
goto out_noclean;
}
s->timeout = qemu_opt_get_number(opts, CURL_BLOCK_OPT_TIMEOUT,
CURL_TIMEOUT_DEFAULT);
if (s->timeout > CURL_TIMEOUT_MAX) {
error_setg(errp, "timeout parameter is too large or negative");
goto out_noclean;
}
s->sslverify = qemu_opt_get_bool(opts, CURL_BLOCK_OPT_SSLVERIFY, true);
cookie = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE);
s->cookie = g_strdup(cookie);
file = qemu_opt_get(opts, CURL_BLOCK_OPT_URL);
if (file == NULL) {
error_setg(errp, "curl block driver requires an 'url' option");
fprintf(stderr, "HTTP_READAHEAD_SIZE %zd is not a multiple of 512\n",
s->readahead_size);
goto out_noclean;
}
@@ -569,64 +375,78 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
}
DPRINTF("CURL: Opening %s\n", file);
s->aio_context = bdrv_get_aio_context(bs);
s->url = g_strdup(file);
state = curl_init_state(bs, s);
s->url = file;
state = curl_init_state(s);
if (!state)
goto out_noclean;
// Get file size
s->accept_range = false;
curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1);
curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION,
curl_header_cb);
curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION, (void *)curl_size_cb);
if (curl_easy_perform(state->curl))
goto out;
curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &d);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION, (void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_NOBODY, 0);
if (d)
s->len = (size_t)d;
else if(!s->len)
goto out;
if ((!strncasecmp(s->url, "http://", strlen("http://"))
|| !strncasecmp(s->url, "https://", strlen("https://")))
&& !s->accept_range) {
pstrcpy(state->errmsg, CURL_ERROR_SIZE,
"Server does not support 'range' (byte ranges).");
goto out;
}
DPRINTF("CURL: Size = %zd\n", s->len);
curl_clean_state(state);
curl_easy_cleanup(state->curl);
state->curl = NULL;
curl_attach_aio_context(bs, bdrv_get_aio_context(bs));
// Now we know the file exists and its size, so let's
// initialize the multi interface!
s->multi = curl_multi_init();
curl_multi_setopt( s->multi, CURLMOPT_SOCKETDATA, s);
curl_multi_setopt( s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb );
curl_multi_do(s);
qemu_opts_del(opts);
return 0;
out:
error_setg(errp, "CURL: Error opening file: %s", state->errmsg);
fprintf(stderr, "CURL: Error opening file: %s\n", state->errmsg);
curl_easy_cleanup(state->curl);
state->curl = NULL;
out_noclean:
g_free(s->cookie);
g_free(s->url);
qemu_opts_del(opts);
g_free(file);
return -EINVAL;
}
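
The open path above validates the parsed options: the readahead size must be a multiple of 512 and the timeout is capped at CURL_TIMEOUT_MAX. A stand-alone sketch of those checks; validate_opts is hypothetical.

#include <stdio.h>

#define READ_AHEAD_DEFAULT (256 * 1024)
#define TIMEOUT_MAX 10000

static int validate_opts(size_t readahead, unsigned long long timeout)
{
    if ((readahead & 0x1ff) != 0) {
        return -1;              /* not a multiple of 512 */
    }
    if (timeout > TIMEOUT_MAX) {
        return -1;              /* too large, or a negative value that
                                 * wrapped around during parsing */
    }
    return 0;
}

int main(void)
{
    printf("%d %d %d\n",
           validate_opts(READ_AHEAD_DEFAULT, 5), /* 0 */
           validate_opts(1000, 5),               /* -1: misaligned */
           validate_opts(512, 20000));           /* -1: timeout */
    return 0;
}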
static int curl_aio_flush(void *opaque)
{
BDRVCURLState *s = opaque;
int i, j;
for (i=0; i < CURL_NUM_STATES; i++) {
for(j=0; j < CURL_NUM_ACB; j++) {
if (s->states[i].acb[j]) {
return 1;
}
}
}
return 0;
}
static void curl_aio_cancel(BlockDriverAIOCB *blockacb)
{
// Do we have to implement canceling? Seems to work without...
}
static const AIOCBInfo curl_aiocb_info = {
.aiocb_size = sizeof(CURLAIOCB),
.cancel = curl_aio_cancel,
};
static void curl_readv_bh_cb(void *p)
{
CURLState *state;
int running;
CURLAIOCB *acb = p;
BDRVCURLState *s = acb->common.bs->opaque;
@@ -641,7 +461,7 @@ static void curl_readv_bh_cb(void *p)
// we can just call the callback and be done.
switch (curl_find_buf(s, start, acb->nb_sectors * SECTOR_SIZE, acb)) {
case FIND_RET_OK:
qemu_aio_unref(acb);
qemu_aio_release(acb);
// fall through
case FIND_RET_WAIT:
return;
@@ -650,10 +470,10 @@ static void curl_readv_bh_cb(void *p)
}
// No cache found, so let's start a new request
state = curl_init_state(acb->common.bs, s);
state = curl_init_state(s);
if (!state) {
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
qemu_aio_release(acb);
return;
}
@@ -661,17 +481,12 @@ static void curl_readv_bh_cb(void *p)
acb->end = (acb->nb_sectors * SECTOR_SIZE);
state->buf_off = 0;
g_free(state->orig_buf);
if (state->orig_buf)
g_free(state->orig_buf);
state->buf_start = start;
state->buf_len = acb->end + s->readahead_size;
end = MIN(start + state->buf_len, s->len) - 1;
state->orig_buf = g_try_malloc(state->buf_len);
if (state->buf_len && state->orig_buf == NULL) {
curl_clean_state(state);
acb->common.cb(acb->common.opaque, -ENOMEM);
qemu_aio_unref(acb);
return;
}
state->orig_buf = g_malloc(state->buf_len);
state->acb[0] = acb;
snprintf(state->range, 127, "%zd-%zd", start, end);
@@ -680,14 +495,13 @@ static void curl_readv_bh_cb(void *p)
curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
curl_multi_add_handle(s->multi, state->curl);
curl_multi_do(s);
/* Tell curl it needs to kick things off */
curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
}
static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *curl_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
CURLAIOCB *acb;
@@ -697,7 +511,13 @@ static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
acb->sector_num = sector_num;
acb->nb_sectors = nb_sectors;
acb->bh = aio_bh_new(bdrv_get_aio_context(bs), curl_readv_bh_cb, acb);
acb->bh = qemu_bh_new(curl_readv_bh_cb, acb);
if (!acb->bh) {
DPRINTF("CURL: qemu_bh_new failed\n");
return NULL;
}
qemu_bh_schedule(acb->bh);
return &acb->common;
}
@@ -705,11 +525,23 @@ static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
static void curl_close(BlockDriverState *bs)
{
BDRVCURLState *s = bs->opaque;
int i;
DPRINTF("CURL: Close\n");
curl_detach_aio_context(bs);
g_free(s->cookie);
for (i=0; i<CURL_NUM_STATES; i++) {
if (s->states[i].in_use)
curl_clean_state(&s->states[i]);
if (s->states[i].curl) {
curl_easy_cleanup(s->states[i].curl);
s->states[i].curl = NULL;
}
if (s->states[i].orig_buf) {
g_free(s->states[i].orig_buf);
s->states[i].orig_buf = NULL;
}
}
if (s->multi)
curl_multi_cleanup(s->multi);
g_free(s->url);
}
@@ -720,83 +552,63 @@ static int64_t curl_getlength(BlockDriverState *bs)
}
static BlockDriver bdrv_http = {
.format_name = "http",
.protocol_name = "http",
.format_name = "http",
.protocol_name = "http",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_https = {
.format_name = "https",
.protocol_name = "https",
.format_name = "https",
.protocol_name = "https",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_ftp = {
.format_name = "ftp",
.protocol_name = "ftp",
.format_name = "ftp",
.protocol_name = "ftp",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_ftps = {
.format_name = "ftps",
.protocol_name = "ftps",
.format_name = "ftps",
.protocol_name = "ftps",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_tftp = {
.format_name = "tftp",
.protocol_name = "tftp",
.format_name = "tftp",
.protocol_name = "tftp",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static void curl_block_init(void)


@@ -22,22 +22,10 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/bswap.h"
#include "qemu/module.h"
#include "block_int.h"
#include "bswap.h"
#include "module.h"
#include <zlib.h>
#ifdef CONFIG_BZIP2
#include <bzlib.h>
#endif
#include <glib.h>
enum {
/* Limit chunk sizes to prevent unreasonable amounts of memory being used
* or truncating when converting to 32-bit types
*/
DMG_LENGTHS_MAX = 64 * 1024 * 1024, /* 64 MB */
DMG_SECTORCOUNTS_MAX = DMG_LENGTHS_MAX / 512,
};
typedef struct BDRVDMGState {
CoMutex lock;
@@ -59,599 +47,221 @@ typedef struct BDRVDMGState {
uint8_t *compressed_chunk;
uint8_t *uncompressed_chunk;
z_stream zstream;
#ifdef CONFIG_BZIP2
bz_stream bzstream;
#endif
} BDRVDMGState;
static int dmg_probe(const uint8_t *buf, int buf_size, const char *filename)
{
int len;
if (!filename) {
return 0;
}
len = strlen(filename);
if (len > 4 && !strcmp(filename + len - 4, ".dmg")) {
return 2;
}
int len=strlen(filename);
if(len>4 && !strcmp(filename+len-4,".dmg"))
return 2;
return 0;
}
static int read_uint64(BlockDriverState *bs, int64_t offset, uint64_t *result)
static off_t read_off(BlockDriverState *bs, int64_t offset)
{
uint64_t buffer;
int ret;
ret = bdrv_pread(bs->file, offset, &buffer, 8);
if (ret < 0) {
return ret;
}
*result = be64_to_cpu(buffer);
return 0;
uint64_t buffer;
if (bdrv_pread(bs->file, offset, &buffer, 8) < 8)
return 0;
return be64_to_cpu(buffer);
}
static int read_uint32(BlockDriverState *bs, int64_t offset, uint32_t *result)
static off_t read_uint32(BlockDriverState *bs, int64_t offset)
{
uint32_t buffer;
int ret;
ret = bdrv_pread(bs->file, offset, &buffer, 4);
if (ret < 0) {
return ret;
}
*result = be32_to_cpu(buffer);
return 0;
uint32_t buffer;
if (bdrv_pread(bs->file, offset, &buffer, 4) < 4)
return 0;
return be32_to_cpu(buffer);
}
static inline uint64_t buff_read_uint64(const uint8_t *buffer, int64_t offset)
{
return be64_to_cpu(*(uint64_t *)&buffer[offset]);
}
static inline uint32_t buff_read_uint32(const uint8_t *buffer, int64_t offset)
{
return be32_to_cpu(*(uint32_t *)&buffer[offset]);
}
/* Increase max chunk sizes, if necessary. This function is used to calculate
* the buffer sizes needed for compressed/uncompressed chunk I/O.
*/
static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
uint32_t *max_compressed_size,
uint32_t *max_sectors_per_chunk)
{
uint32_t compressed_size = 0;
uint32_t uncompressed_sectors = 0;
switch (s->types[chunk]) {
case 0x80000005: /* zlib compressed */
case 0x80000006: /* bzip2 compressed */
compressed_size = s->lengths[chunk];
uncompressed_sectors = s->sectorcounts[chunk];
break;
case 1: /* copy */
uncompressed_sectors = (s->lengths[chunk] + 511) / 512;
break;
case 2: /* zero */
/* as the all-zeroes block may be large, it is treated specially: the
* sector is not copied from a large buffer, a simple memset is used
* instead. Therefore uncompressed_sectors does not need to be set. */
break;
}
if (compressed_size > *max_compressed_size) {
*max_compressed_size = compressed_size;
}
if (uncompressed_sectors > *max_sectors_per_chunk) {
*max_sectors_per_chunk = uncompressed_sectors;
}
}
static int64_t dmg_find_koly_offset(BlockDriverState *file_bs, Error **errp)
{
int64_t length;
int64_t offset = 0;
uint8_t buffer[515];
int i, ret;
/* bdrv_getlength returns a multiple of block size (512), rounded up. Since
* dmg images can have odd sizes, try to look for the "koly" magic which
* marks the beginning of the UDIF trailer (512 bytes). This magic can be found
* in the last 511 bytes of the second-last sector or the first 4 bytes of
* the last sector (search space: 515 bytes) */
length = bdrv_getlength(file_bs);
if (length < 0) {
error_setg_errno(errp, -length,
"Failed to get file size while reading UDIF trailer");
return length;
} else if (length < 512) {
error_setg(errp, "dmg file must be at least 512 bytes long");
return -EINVAL;
}
if (length > 511 + 512) {
offset = length - 511 - 512;
}
length = length < 515 ? length : 515;
ret = bdrv_pread(file_bs, offset, buffer, length);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed while reading UDIF trailer");
return ret;
}
for (i = 0; i < length - 3; i++) {
if (buffer[i] == 'k' && buffer[i+1] == 'o' &&
buffer[i+2] == 'l' && buffer[i+3] == 'y') {
return offset + i;
}
}
error_setg(errp, "Could not locate UDIF trailer in dmg file");
return -EINVAL;
}
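
dmg_find_koly_offset scans the last bytes of the file for the 4-byte "koly" magic that starts the 512-byte UDIF trailer. A stand-alone sketch of that scan over an in-memory buffer; find_koly is hypothetical.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Return the offset of the "koly" magic within buf, or -1 if absent. */
static int find_koly(const uint8_t *buf, int len)
{
    int i;
    for (i = 0; i + 3 < len; i++) {
        if (memcmp(buf + i, "koly", 4) == 0) {
            return i;
        }
    }
    return -1;
}

int main(void)
{
    uint8_t buf[32] = "xxxxkolyyyyy";
    printf("offset=%d\n", find_koly(buf, sizeof(buf))); /* 4 */
    return 0;
}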
/* used when building the sector table */
typedef struct DmgHeaderState {
/* used internally by dmg_read_mish_block to remember offsets of blocks
* across calls */
uint64_t data_fork_offset;
/* exported for dmg_open */
uint32_t max_compressed_size;
uint32_t max_sectors_per_chunk;
} DmgHeaderState;
static bool dmg_is_known_block_type(uint32_t entry_type)
{
switch (entry_type) {
case 0x00000001: /* uncompressed */
case 0x00000002: /* zeroes */
case 0x80000005: /* zlib */
#ifdef CONFIG_BZIP2
case 0x80000006: /* bzip2 */
#endif
return true;
default:
return false;
}
}
static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
uint8_t *buffer, uint32_t count)
{
uint32_t type, i;
int ret;
size_t new_size;
uint32_t chunk_count;
int64_t offset = 0;
uint64_t data_offset;
uint64_t in_offset = ds->data_fork_offset;
uint64_t out_offset;
type = buff_read_uint32(buffer, offset);
/* skip data that is not a valid MISH block (invalid magic or too small) */
if (type != 0x6d697368 || count < 244) {
/* assume success for now */
return 0;
}
/* chunk offsets are relative to this sector number */
out_offset = buff_read_uint64(buffer, offset + 8);
/* location in data fork for (compressed) blob (in bytes) */
data_offset = buff_read_uint64(buffer, offset + 0x18);
in_offset += data_offset;
/* move to the beginning of the chunk entries */
offset += 204;
chunk_count = (count - 204) / 40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size / 2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
for (i = s->n_chunks; i < s->n_chunks + chunk_count; i++) {
s->types[i] = buff_read_uint32(buffer, offset);
if (!dmg_is_known_block_type(s->types[i])) {
chunk_count--;
i--;
offset += 40;
continue;
}
/* sector number */
s->sectors[i] = buff_read_uint64(buffer, offset + 8);
s->sectors[i] += out_offset;
/* sector count */
s->sectorcounts[i] = buff_read_uint64(buffer, offset + 0x10);
/* all-zeroes sector (type 2) does not need to be "uncompressed" and can
* therefore be unbounded. */
if (s->types[i] != 2 && s->sectorcounts[i] > DMG_SECTORCOUNTS_MAX) {
error_report("sector count %" PRIu64 " for chunk %" PRIu32
" is larger than max (%u)",
s->sectorcounts[i], i, DMG_SECTORCOUNTS_MAX);
ret = -EINVAL;
goto fail;
}
/* offset in (compressed) data fork */
s->offsets[i] = buff_read_uint64(buffer, offset + 0x18);
s->offsets[i] += in_offset;
/* length in (compressed) data fork */
s->lengths[i] = buff_read_uint64(buffer, offset + 0x20);
if (s->lengths[i] > DMG_LENGTHS_MAX) {
error_report("length %" PRIu64 " for chunk %" PRIu32
" is larger than max (%u)",
s->lengths[i], i, DMG_LENGTHS_MAX);
ret = -EINVAL;
goto fail;
}
update_max_chunk_size(s, i, &ds->max_compressed_size,
&ds->max_sectors_per_chunk);
offset += 40;
}
s->n_chunks += chunk_count;
return 0;
fail:
return ret;
}
static int dmg_read_resource_fork(BlockDriverState *bs, DmgHeaderState *ds,
uint64_t info_begin, uint64_t info_length)
static int dmg_open(BlockDriverState *bs, int flags)
{
BDRVDMGState *s = bs->opaque;
int ret;
uint32_t count, rsrc_data_offset;
uint8_t *buffer = NULL;
uint64_t info_end;
uint64_t offset;
/* read offset from the beginning of the resource fork (info_begin) to
 * the resource data */
ret = read_uint32(bs, info_begin, &rsrc_data_offset);
if (ret < 0) {
goto fail;
} else if (rsrc_data_offset > info_length) {
ret = -EINVAL;
goto fail;
}
/* read length of resource data */
ret = read_uint32(bs, info_begin + 8, &count);
if (ret < 0) {
goto fail;
} else if (count == 0 || rsrc_data_offset + count > info_length) {
ret = -EINVAL;
goto fail;
}
/* beginning of resource data (consisting of one or more resources) */
offset = info_begin + rsrc_data_offset;
/* end of resource data (there is possibly a following resource map
* which will be ignored). */
info_end = offset + count;
/* read offsets (mish blocks) from one or more resources in resource data */
while (offset < info_end) {
/* size of following resource */
ret = read_uint32(bs, offset, &count);
if (ret < 0) {
goto fail;
} else if (count == 0 || count > info_end - offset) {
ret = -EINVAL;
goto fail;
}
offset += 4;
buffer = g_realloc(buffer, count);
ret = bdrv_pread(bs->file, offset, buffer, count);
if (ret < 0) {
goto fail;
}
ret = dmg_read_mish_block(s, ds, buffer, count);
if (ret < 0) {
goto fail;
}
/* advance offset by size of resource */
offset += count;
}
ret = 0;
fail:
g_free(buffer);
return ret;
}
static int dmg_read_plist_xml(BlockDriverState *bs, DmgHeaderState *ds,
uint64_t info_begin, uint64_t info_length)
{
BDRVDMGState *s = bs->opaque;
int ret;
uint8_t *buffer = NULL;
char *data_begin, *data_end;
/* Have at least some length to avoid NULL for g_malloc. Attempt to set a
* safe upper cap on the data length. A test sample had a XML length of
* about 1 MiB. */
if (info_length == 0 || info_length > 16 * 1024 * 1024) {
ret = -EINVAL;
goto fail;
}
buffer = g_malloc(info_length + 1);
buffer[info_length] = '\0';
ret = bdrv_pread(bs->file, info_begin, buffer, info_length);
if (ret != info_length) {
ret = -EINVAL;
goto fail;
}
/* look for <data>...</data>. The data is 284 (0x11c) bytes after base64
* decode. The actual data element has 431 (0x1af) bytes which includes tabs
* and line feeds. */
data_end = (char *)buffer;
while ((data_begin = strstr(data_end, "<data>")) != NULL) {
guchar *mish;
gsize out_len = 0;
data_begin += 6;
data_end = strstr(data_begin, "</data>");
/* malformed XML? */
if (data_end == NULL) {
ret = -EINVAL;
goto fail;
}
*data_end++ = '\0';
mish = g_base64_decode(data_begin, &out_len);
ret = dmg_read_mish_block(s, ds, mish, (uint32_t)out_len);
g_free(mish);
if (ret < 0) {
goto fail;
}
}
ret = 0;
fail:
g_free(buffer);
return ret;
}
static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVDMGState *s = bs->opaque;
DmgHeaderState ds;
uint64_t rsrc_fork_offset, rsrc_fork_length;
uint64_t plist_xml_offset, plist_xml_length;
off_t info_begin,info_end,last_in_offset,last_out_offset;
uint32_t count;
uint32_t max_compressed_size=1,max_sectors_per_chunk=1,i;
int64_t offset;
int ret;
bs->read_only = 1;
s->n_chunks = 0;
s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
/* used by dmg_read_mish_block to keep track of the current I/O position */
ds.data_fork_offset = 0;
ds.max_compressed_size = 1;
ds.max_sectors_per_chunk = 1;
/* locate the UDIF trailer */
offset = dmg_find_koly_offset(bs->file, errp);
/* read offset of info blocks */
offset = bdrv_getlength(bs->file);
if (offset < 0) {
ret = offset;
goto fail;
}
offset -= 0x1d8;
info_begin = read_off(bs, offset);
if (info_begin == 0) {
goto fail;
}
if (read_uint32(bs, info_begin) != 0x100) {
goto fail;
}
/* offset of data fork (DataForkOffset) */
ret = read_uint64(bs, offset + 0x18, &ds.data_fork_offset);
if (ret < 0) {
goto fail;
} else if (ds.data_fork_offset > offset) {
ret = -EINVAL;
count = read_uint32(bs, info_begin + 4);
if (count == 0) {
goto fail;
}
info_end = info_begin + count;
/* offset of resource fork (RsrcForkOffset) */
ret = read_uint64(bs, offset + 0x28, &rsrc_fork_offset);
if (ret < 0) {
goto fail;
}
ret = read_uint64(bs, offset + 0x30, &rsrc_fork_length);
if (ret < 0) {
goto fail;
}
if (rsrc_fork_offset >= offset ||
rsrc_fork_length > offset - rsrc_fork_offset) {
ret = -EINVAL;
goto fail;
}
/* offset of property list (XMLOffset) */
ret = read_uint64(bs, offset + 0xd8, &plist_xml_offset);
if (ret < 0) {
goto fail;
}
ret = read_uint64(bs, offset + 0xe0, &plist_xml_length);
if (ret < 0) {
goto fail;
}
if (plist_xml_offset >= offset ||
plist_xml_length > offset - plist_xml_offset) {
ret = -EINVAL;
goto fail;
}
ret = read_uint64(bs, offset + 0x1ec, (uint64_t *)&bs->total_sectors);
if (ret < 0) {
goto fail;
}
if (bs->total_sectors < 0) {
ret = -EINVAL;
goto fail;
}
if (rsrc_fork_length != 0) {
ret = dmg_read_resource_fork(bs, &ds,
rsrc_fork_offset, rsrc_fork_length);
if (ret < 0) {
goto fail;
}
} else if (plist_xml_length != 0) {
ret = dmg_read_plist_xml(bs, &ds, plist_xml_offset, plist_xml_length);
if (ret < 0) {
goto fail;
}
} else {
ret = -EINVAL;
goto fail;
offset = info_begin + 0x100;
/* read offsets */
last_in_offset = last_out_offset = 0;
while (offset < info_end) {
uint32_t type;
count = read_uint32(bs, offset);
if(count==0)
goto fail;
offset += 4;
type = read_uint32(bs, offset);
if (type == 0x6d697368 && count >= 244) {
int new_size, chunk_count;
offset += 4;
offset += 200;
chunk_count = (count-204)/40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size/2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
for(i=s->n_chunks;i<s->n_chunks+chunk_count;i++) {
s->types[i] = read_uint32(bs, offset);
offset += 4;
if(s->types[i]!=0x80000005 && s->types[i]!=1 && s->types[i]!=2) {
if(s->types[i]==0xffffffff) {
last_in_offset = s->offsets[i-1]+s->lengths[i-1];
last_out_offset = s->sectors[i-1]+s->sectorcounts[i-1];
}
chunk_count--;
i--;
offset += 36;
continue;
}
offset += 4;
s->sectors[i] = last_out_offset+read_off(bs, offset);
offset += 8;
s->sectorcounts[i] = read_off(bs, offset);
offset += 8;
s->offsets[i] = last_in_offset+read_off(bs, offset);
offset += 8;
s->lengths[i] = read_off(bs, offset);
offset += 8;
if(s->lengths[i]>max_compressed_size)
max_compressed_size = s->lengths[i];
if(s->sectorcounts[i]>max_sectors_per_chunk)
max_sectors_per_chunk = s->sectorcounts[i];
}
s->n_chunks+=chunk_count;
}
}
/* initialize zlib engine */
s->compressed_chunk = qemu_try_blockalign(bs->file,
ds.max_compressed_size + 1);
s->uncompressed_chunk = qemu_try_blockalign(bs->file,
512 * ds.max_sectors_per_chunk);
if (s->compressed_chunk == NULL || s->uncompressed_chunk == NULL) {
ret = -ENOMEM;
goto fail;
}
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
}
s->compressed_chunk = g_malloc(max_compressed_size+1);
s->uncompressed_chunk = g_malloc(512*max_sectors_per_chunk);
if(inflateInit(&s->zstream) != Z_OK)
goto fail;
s->current_chunk = s->n_chunks;
qemu_co_mutex_init(&s->lock);
return 0;
fail:
g_free(s->types);
g_free(s->offsets);
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
return ret;
return -1;
}
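For orientation, the reads above walk fixed offsets in the UDIF trailer. A hedged summary of the fields dmg_open() consumes, with names taken from the comments in the diff (the struct itself is illustrative and not part of QEMU):
#include <stdint.h>
/* "off" is the position passed to read_uint64() in the code above. */
struct dmg_trailer_summary {
    uint64_t data_fork_offset;   /* off + 0x18,  DataForkOffset */
    uint64_t rsrc_fork_offset;   /* off + 0x28,  RsrcForkOffset */
    uint64_t rsrc_fork_length;   /* off + 0x30 */
    uint64_t plist_xml_offset;   /* off + 0xd8,  XMLOffset */
    uint64_t plist_xml_length;   /* off + 0xe0 */
    uint64_t sector_count;       /* off + 0x1ec, image size in 512-byte sectors */
};
Each offset/length pair is validated against the trailer position, so neither the resource fork nor the XML property list can point past the end of the file.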
static inline int is_sector_in_chunk(BDRVDMGState* s,
uint32_t chunk_num, uint64_t sector_num)
uint32_t chunk_num,int sector_num)
{
if (chunk_num >= s->n_chunks || s->sectors[chunk_num] > sector_num ||
s->sectors[chunk_num] + s->sectorcounts[chunk_num] <= sector_num) {
return 0;
} else {
return -1;
}
if(chunk_num>=s->n_chunks || s->sectors[chunk_num]>sector_num ||
s->sectors[chunk_num]+s->sectorcounts[chunk_num]<=sector_num)
return 0;
else
return -1;
}
static inline uint32_t search_chunk(BDRVDMGState *s, uint64_t sector_num)
static inline uint32_t search_chunk(BDRVDMGState* s,int sector_num)
{
/* binary search */
uint32_t chunk1 = 0, chunk2 = s->n_chunks, chunk3;
while (chunk1 != chunk2) {
chunk3 = (chunk1 + chunk2) / 2;
if (s->sectors[chunk3] > sector_num) {
chunk2 = chunk3;
} else if (s->sectors[chunk3] + s->sectorcounts[chunk3] > sector_num) {
return chunk3;
} else {
chunk1 = chunk3;
}
uint32_t chunk1=0,chunk2=s->n_chunks,chunk3;
while(chunk1!=chunk2) {
chunk3 = (chunk1+chunk2)/2;
if(s->sectors[chunk3]>sector_num)
chunk2 = chunk3;
else if(s->sectors[chunk3]+s->sectorcounts[chunk3]>sector_num)
return chunk3;
else
chunk1 = chunk3;
}
return s->n_chunks; /* error */
}
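A minimal standalone sketch of the same lookup, assuming a sorted, non-overlapping chunk table. The sketch advances with lo = mid + 1, so it terminates even when a sector falls in a gap; the in-tree loop instead relies on the table covering every sector:
#include <stdint.h>
#include <stdio.h>
/* Hypothetical standalone search_chunk(): return the index i with
 * start[i] <= sector < start[i] + count[i], or n if no chunk matches. */
static uint32_t find_chunk(const uint64_t *start, const uint64_t *count,
                           uint32_t n, uint64_t sector)
{
    uint32_t lo = 0, hi = n;
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        if (start[mid] > sector) {
            hi = mid;                 /* sector lies below this chunk */
        } else if (sector < start[mid] + count[mid]) {
            return mid;               /* sector falls inside this chunk */
        } else {
            lo = mid + 1;             /* sector lies above this chunk */
        }
    }
    return n;                         /* no chunk covers the sector */
}
int main(void)
{
    uint64_t start[] = { 0, 8, 24 };
    uint64_t count[] = { 8, 16, 8 };
    printf("%u\n", find_chunk(start, count, 3, 10)); /* prints 1 */
    return 0;
}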
static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
static inline int dmg_read_chunk(BlockDriverState *bs, int sector_num)
{
BDRVDMGState *s = bs->opaque;
if (!is_sector_in_chunk(s, s->current_chunk, sector_num)) {
int ret;
uint32_t chunk = search_chunk(s, sector_num);
#ifdef CONFIG_BZIP2
uint64_t total_out;
#endif
if(!is_sector_in_chunk(s,s->current_chunk,sector_num)) {
int ret;
uint32_t chunk = search_chunk(s,sector_num);
if (chunk >= s->n_chunks) {
return -1;
}
if(chunk>=s->n_chunks)
return -1;
s->current_chunk = s->n_chunks;
switch (s->types[chunk]) { /* block entry type */
case 0x80000005: { /* zlib compressed */
/* we need to buffer, because only the chunk as a whole can be
* inflated. */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->compressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
s->current_chunk = s->n_chunks;
switch(s->types[chunk]) {
case 0x80000005: { /* zlib compressed */
int i;
s->zstream.next_in = s->compressed_chunk;
s->zstream.avail_in = s->lengths[chunk];
s->zstream.next_out = s->uncompressed_chunk;
s->zstream.avail_out = 512 * s->sectorcounts[chunk];
ret = inflateReset(&s->zstream);
if (ret != Z_OK) {
return -1;
}
ret = inflate(&s->zstream, Z_FINISH);
if (ret != Z_STREAM_END ||
s->zstream.total_out != 512 * s->sectorcounts[chunk]) {
return -1;
}
break; }
#ifdef CONFIG_BZIP2
case 0x80000006: /* bzip2 compressed */
/* we need to buffer, because only the chunk as a whole can be
* inflated. */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->compressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
/* we need to buffer, because only the chunk as a whole can be
* inflated. */
i=0;
do {
ret = bdrv_pread(bs->file, s->offsets[chunk] + i,
s->compressed_chunk+i, s->lengths[chunk]-i);
if(ret<0 && errno==EINTR)
ret=0;
i+=ret;
} while(ret>=0 && ret+i<s->lengths[chunk]);
ret = BZ2_bzDecompressInit(&s->bzstream, 0, 0);
if (ret != BZ_OK) {
return -1;
}
s->bzstream.next_in = (char *)s->compressed_chunk;
s->bzstream.avail_in = (unsigned int) s->lengths[chunk];
s->bzstream.next_out = (char *)s->uncompressed_chunk;
s->bzstream.avail_out = (unsigned int) 512 * s->sectorcounts[chunk];
ret = BZ2_bzDecompress(&s->bzstream);
total_out = ((uint64_t)s->bzstream.total_out_hi32 << 32) +
s->bzstream.total_out_lo32;
BZ2_bzDecompressEnd(&s->bzstream);
if (ret != BZ_STREAM_END ||
total_out != 512 * s->sectorcounts[chunk]) {
return -1;
}
break;
#endif /* CONFIG_BZIP2 */
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->uncompressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk])
return -1;
s->zstream.next_in = s->compressed_chunk;
s->zstream.avail_in = s->lengths[chunk];
s->zstream.next_out = s->uncompressed_chunk;
s->zstream.avail_out = 512*s->sectorcounts[chunk];
ret = inflateReset(&s->zstream);
if(ret != Z_OK)
return -1;
ret = inflate(&s->zstream, Z_FINISH);
if(ret != Z_STREAM_END || s->zstream.total_out != 512*s->sectorcounts[chunk])
return -1;
break; }
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->uncompressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
break;
case 2: /* zero */
/* see dmg_read; this type is treated specially. No buffer needs to be
* pre-filled, as the zeroes can be set directly. */
break;
}
s->current_chunk = chunk;
if (ret != s->lengths[chunk])
return -1;
break;
case 2: /* zero */
memset(s->uncompressed_chunk, 0, 512*s->sectorcounts[chunk]);
break;
}
s->current_chunk = chunk;
}
return 0;
}
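The "chunk as a whole" comments boil down to the standard one-shot zlib pattern: hand inflate() the complete compressed buffer, require Z_STREAM_END, and check that total_out matches the expected size. A self-contained round trip illustrating it, assuming nothing beyond zlib (build with -lz; sizes are arbitrary):
#include <zlib.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
    unsigned char src[256], comp[512], out[256];
    uLongf clen = sizeof(comp);
    z_stream zs = { 0 };
    memset(src, 'x', sizeof(src));
    if (compress2(comp, &clen, src, sizeof(src), 9) != Z_OK) {
        return 1;
    }
    if (inflateInit(&zs) != Z_OK) {
        return 1;
    }
    zs.next_in = comp;
    zs.avail_in = clen;             /* the complete compressed chunk */
    zs.next_out = out;
    zs.avail_out = sizeof(out);     /* exactly the expected output size */
    /* As in dmg_read_chunk(): one Z_FINISH call must end the stream and
     * produce exactly the expected number of bytes. */
    if (inflate(&zs, Z_FINISH) != Z_STREAM_END || zs.total_out != sizeof(out)) {
        inflateEnd(&zs);
        return 1;
    }
    printf("ok, %lu bytes\n", zs.total_out);
    inflateEnd(&zs);
    return 0;
}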
@@ -662,21 +272,12 @@ static int dmg_read(BlockDriverState *bs, int64_t sector_num,
BDRVDMGState *s = bs->opaque;
int i;
for (i = 0; i < nb_sectors; i++) {
uint32_t sector_offset_in_chunk;
if (dmg_read_chunk(bs, sector_num + i) != 0) {
return -1;
}
/* Special case: current chunk is all zeroes. Do not perform a memcpy as
* s->uncompressed_chunk may be too small to cover the large all-zeroes
* section. dmg_read_chunk is called to find s->current_chunk */
if (s->types[s->current_chunk] == 2) { /* all zeroes block entry */
memset(buf + i * 512, 0, 512);
continue;
}
sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
memcpy(buf + i * 512,
s->uncompressed_chunk + sector_offset_in_chunk * 512, 512);
for(i=0;i<nb_sectors;i++) {
uint32_t sector_offset_in_chunk;
if(dmg_read_chunk(bs, sector_num+i) != 0)
return -1;
sector_offset_in_chunk = sector_num+i-s->sectors[s->current_chunk];
memcpy(buf+i*512,s->uncompressed_chunk+sector_offset_in_chunk*512,512);
}
return 0;
}
@@ -695,25 +296,25 @@ static coroutine_fn int dmg_co_read(BlockDriverState *bs, int64_t sector_num,
static void dmg_close(BlockDriverState *bs)
{
BDRVDMGState *s = bs->opaque;
g_free(s->types);
g_free(s->offsets);
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
if(s->n_chunks>0) {
free(s->types);
free(s->offsets);
free(s->lengths);
free(s->sectors);
free(s->sectorcounts);
}
free(s->compressed_chunk);
free(s->uncompressed_chunk);
inflateEnd(&s->zstream);
}
static BlockDriver bdrv_dmg = {
.format_name = "dmg",
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_read = dmg_co_read,
.bdrv_close = dmg_close,
.format_name = "dmg",
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_read = dmg_co_read,
.bdrv_close = dmg_close,
};
static void bdrv_dmg_init(void)

block/gluster.c

@@ -3,27 +3,43 @@
*
* Copyright (C) 2012 Bharata B Rao <bharata@linux.vnet.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
* Pipe handling mechanism in AIO implementation is derived from
* block/rbd.c. Hence,
*
* Copyright (C) 2010-2011 Christian Brunner <chb@muc.de>,
* Josh Durgin <josh.durgin@dreamhost.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
* Contributions after 2012-01-13 are licensed under the terms of the
* GNU GPL, version 2 or (at your option) any later version.
*/
#include <glusterfs/api/glfs.h>
#include "block/block_int.h"
#include "qemu/uri.h"
#include "block_int.h"
#include "qemu_socket.h"
#include "uri.h"
typedef struct GlusterAIOCB {
BlockDriverAIOCB common;
int64_t size;
int ret;
bool *finished;
QEMUBH *bh;
Coroutine *coroutine;
AioContext *aio_context;
} GlusterAIOCB;
typedef struct BDRVGlusterState {
struct glfs *glfs;
int fds[2];
struct glfs_fd *fd;
int qemu_aio_count;
int event_reader_pos;
GlusterAIOCB *event_acb;
} BDRVGlusterState;
#define GLUSTER_FD_READ 0
#define GLUSTER_FD_WRITE 1
typedef struct GlusterConf {
char *server;
int port;
@@ -34,13 +50,11 @@ typedef struct GlusterConf {
static void qemu_gluster_gconf_free(GlusterConf *gconf)
{
if (gconf) {
g_free(gconf->server);
g_free(gconf->volname);
g_free(gconf->image);
g_free(gconf->transport);
g_free(gconf);
}
g_free(gconf->server);
g_free(gconf->volname);
g_free(gconf->image);
g_free(gconf->transport);
g_free(gconf);
}
static int parse_volume_options(GlusterConf *gconf, char *path)
@@ -81,7 +95,7 @@ static int parse_volume_options(GlusterConf *gconf, char *path)
* 'server' specifies the server where the volume file specification for
* the given volume resides. This can be either hostname, ipv4 address
* or ipv6 address. ipv6 address needs to be within square brackets [ ].
* If transport type is 'unix', then 'server' field should not be specified.
* If transport type is 'unix', then 'server' field should not be specifed.
* The 'socket' field needs to be populated with the path to unix domain
* socket.
*
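* Illustrative URIs matching this scheme (the hosts, ports, volumes and
* paths below are hypothetical):
*
*   file=gluster://1.2.3.4/testvol/a.img
*   file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img
*   file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket
*   file=gluster+rdma://1.2.3.4:24007/testvol/a.img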
@@ -118,7 +132,7 @@ static int qemu_gluster_parseuri(GlusterConf *gconf, const char *filename)
}
/* transport */
if (!uri->scheme || !strcmp(uri->scheme, "gluster")) {
if (!strcmp(uri->scheme, "gluster")) {
gconf->transport = g_strdup("tcp");
} else if (!strcmp(uri->scheme, "gluster+tcp")) {
gconf->transport = g_strdup("tcp");
@@ -154,7 +168,7 @@ static int qemu_gluster_parseuri(GlusterConf *gconf, const char *filename)
}
gconf->server = g_strdup(qp->p[0].value);
} else {
gconf->server = g_strdup(uri->server ? uri->server : "localhost");
gconf->server = g_strdup(uri->server);
gconf->port = uri->port;
}
@@ -166,8 +180,7 @@ out:
return ret;
}
static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename,
Error **errp)
static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename)
{
struct glfs *glfs = NULL;
int ret;
@@ -175,8 +188,8 @@ static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename,
ret = qemu_gluster_parseuri(gconf, filename);
if (ret < 0) {
error_setg(errp, "Usage: file=gluster[+transport]://[server[:port]]/"
"volname/image[?socket=...]");
error_report("Usage: file=gluster[+transport]://[server[:port]]/"
"volname/image[?socket=...]");
errno = -ret;
goto out;
}
@@ -203,16 +216,9 @@ static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename,
ret = glfs_init(glfs);
if (ret) {
error_setg_errno(errp, errno,
"Gluster connection failed for server=%s port=%d "
"volume=%s image=%s transport=%s", gconf->server,
gconf->port, gconf->volname, gconf->image,
gconf->transport);
/* glfs_init sometimes doesn't set errno although docs suggest that */
if (errno == 0)
errno = EINVAL;
error_report("Gluster connection failed for server=%s port=%d "
"volume=%s image=%s transport=%s\n", gconf->server, gconf->port,
gconf->volname, gconf->image, gconf->transport);
goto out;
}
return glfs;
@@ -226,101 +232,96 @@ out:
return NULL;
}
static void qemu_gluster_complete_aio(void *opaque)
static void qemu_gluster_complete_aio(GlusterAIOCB *acb, BDRVGlusterState *s)
{
GlusterAIOCB *acb = (GlusterAIOCB *)opaque;
int ret;
bool *finished = acb->finished;
BlockDriverCompletionFunc *cb = acb->common.cb;
void *opaque = acb->common.opaque;
qemu_bh_delete(acb->bh);
acb->bh = NULL;
qemu_coroutine_enter(acb->coroutine, NULL);
}
/*
* AIO callback routine called from GlusterFS thread.
*/
static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
{
GlusterAIOCB *acb = (GlusterAIOCB *)arg;
if (!ret || ret == acb->size) {
acb->ret = 0; /* Success */
} else if (ret < 0) {
acb->ret = ret; /* Read/Write failed */
if (!acb->ret || acb->ret == acb->size) {
ret = 0; /* Success */
} else if (acb->ret < 0) {
ret = acb->ret; /* Read/Write failed */
} else {
acb->ret = -EIO; /* Partial read/write - fail it */
ret = -EIO; /* Partial read/write - fail it */
}
acb->bh = aio_bh_new(acb->aio_context, qemu_gluster_complete_aio, acb);
qemu_bh_schedule(acb->bh);
s->qemu_aio_count--;
qemu_aio_release(acb);
cb(opaque, ret);
if (finished) {
*finished = true;
}
}
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "gluster",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "URL to the gluster image",
},
{ /* end of list */ }
},
};
static void qemu_gluster_parse_flags(int bdrv_flags, int *open_flags)
static void qemu_gluster_aio_event_reader(void *opaque)
{
assert(open_flags != NULL);
BDRVGlusterState *s = opaque;
ssize_t ret;
*open_flags |= O_BINARY;
do {
char *p = (char *)&s->event_acb;
if (bdrv_flags & BDRV_O_RDWR) {
*open_flags |= O_RDWR;
} else {
*open_flags |= O_RDONLY;
}
if ((bdrv_flags & BDRV_O_NOCACHE)) {
*open_flags |= O_DIRECT;
}
ret = read(s->fds[GLUSTER_FD_READ], p + s->event_reader_pos,
sizeof(s->event_acb) - s->event_reader_pos);
if (ret > 0) {
s->event_reader_pos += ret;
if (s->event_reader_pos == sizeof(s->event_acb)) {
s->event_reader_pos = 0;
qemu_gluster_complete_aio(s->event_acb, s);
}
}
} while (ret < 0 && errno == EINTR);
}
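The reader above reassembles a pointer that the gluster callback thread pushed through a pipe (see gluster_finish_aiocb in the old code further down). A self-contained sketch of that handoff, with a dummy struct standing in for GlusterAIOCB:
#include <stdio.h>
#include <unistd.h>
struct dummy_acb { int ret; };          /* stands in for GlusterAIOCB */
int main(void)
{
    int fds[2];
    struct dummy_acb acb = { .ret = 42 };
    struct dummy_acb *sent = &acb, *received = NULL;
    char *p = (char *)&received;
    size_t pos = 0;
    if (pipe(fds) != 0) {
        return 1;
    }
    /* Writer side (the AIO callback thread): push the pointer itself. */
    if (write(fds[1], &sent, sizeof(sent)) != sizeof(sent)) {
        return 1;
    }
    /* Reader side (the event loop): accumulate bytes until a whole
     * pointer has arrived, exactly like event_reader_pos above. */
    while (pos < sizeof(received)) {
        ssize_t n = read(fds[0], p + pos, sizeof(received) - pos);
        if (n < 0) {
            return 1;
        }
        pos += n;
    }
    printf("ret = %d\n", received->ret);   /* prints 42 */
    close(fds[0]);
    close(fds[1]);
    return 0;
}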
static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
int bdrv_flags, Error **errp)
static int qemu_gluster_aio_flush_cb(void *opaque)
{
BDRVGlusterState *s = opaque;
return (s->qemu_aio_count > 0);
}
static int qemu_gluster_open(BlockDriverState *bs, const char *filename,
int bdrv_flags)
{
BDRVGlusterState *s = bs->opaque;
int open_flags = 0;
int open_flags = O_BINARY;
int ret = 0;
GlusterConf *gconf = g_new0(GlusterConf, 1);
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
}
filename = qemu_opt_get(opts, "filename");
s->glfs = qemu_gluster_init(gconf, filename, errp);
s->glfs = qemu_gluster_init(gconf, filename);
if (!s->glfs) {
ret = -errno;
goto out;
}
qemu_gluster_parse_flags(bdrv_flags, &open_flags);
if (bdrv_flags & BDRV_O_RDWR) {
open_flags |= O_RDWR;
} else {
open_flags |= O_RDONLY;
}
if ((bdrv_flags & BDRV_O_NOCACHE)) {
open_flags |= O_DIRECT;
}
s->fd = glfs_open(s->glfs, gconf->image, open_flags);
if (!s->fd) {
ret = -errno;
goto out;
}
ret = qemu_pipe(s->fds);
if (ret < 0) {
ret = -errno;
goto out;
}
fcntl(s->fds[GLUSTER_FD_READ], F_SETFL, O_NONBLOCK);
qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ],
qemu_gluster_aio_event_reader, NULL, qemu_gluster_aio_flush_cb, s);
out:
qemu_opts_del(opts);
qemu_gluster_gconf_free(gconf);
if (!ret) {
return ret;
@@ -334,181 +335,26 @@ out:
return ret;
}
typedef struct BDRVGlusterReopenState {
struct glfs *glfs;
struct glfs_fd *fd;
} BDRVGlusterReopenState;
static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
{
int ret = 0;
BDRVGlusterReopenState *reop_s;
GlusterConf *gconf = NULL;
int open_flags = 0;
assert(state != NULL);
assert(state->bs != NULL);
state->opaque = g_new0(BDRVGlusterReopenState, 1);
reop_s = state->opaque;
qemu_gluster_parse_flags(state->flags, &open_flags);
gconf = g_new0(GlusterConf, 1);
reop_s->glfs = qemu_gluster_init(gconf, state->bs->filename, errp);
if (reop_s->glfs == NULL) {
ret = -errno;
goto exit;
}
reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
if (reop_s->fd == NULL) {
/* reop_s->glfs will be cleaned up in _abort */
ret = -errno;
goto exit;
}
exit:
/* state->opaque will be freed in either the _abort or _commit */
qemu_gluster_gconf_free(gconf);
return ret;
}
static void qemu_gluster_reopen_commit(BDRVReopenState *state)
{
BDRVGlusterReopenState *reop_s = state->opaque;
BDRVGlusterState *s = state->bs->opaque;
/* close the old */
if (s->fd) {
glfs_close(s->fd);
}
if (s->glfs) {
glfs_fini(s->glfs);
}
/* use the newly opened image / connection */
s->fd = reop_s->fd;
s->glfs = reop_s->glfs;
g_free(state->opaque);
state->opaque = NULL;
return;
}
static void qemu_gluster_reopen_abort(BDRVReopenState *state)
{
BDRVGlusterReopenState *reop_s = state->opaque;
if (reop_s == NULL) {
return;
}
if (reop_s->fd) {
glfs_close(reop_s->fd);
}
if (reop_s->glfs) {
glfs_fini(reop_s->glfs);
}
g_free(state->opaque);
state->opaque = NULL;
return;
}
#ifdef CONFIG_GLUSTERFS_ZEROFILL
static coroutine_fn int qemu_gluster_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
BDRVGlusterState *s = bs->opaque;
off_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
acb->size = size;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
ret = glfs_zerofill_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
}
static inline bool gluster_supports_zerofill(void)
{
return 1;
}
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
return glfs_zerofill(fd, offset, size);
}
#else
static inline bool gluster_supports_zerofill(void)
{
return 0;
}
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
return 0;
}
#endif
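/* All the coroutine paths in the new code (write_zeroes above, rw and
 * flush below) share one shape; sketched here for reference:
 *
 *     acb->coroutine = qemu_coroutine_self();
 *     glfs_..._async(s->fd, ..., gluster_finish_aiocb, acb);
 *     qemu_coroutine_yield();      // resumed via qemu_gluster_complete_aio
 *     ret = acb->ret;              // filled in by the callback thread
 */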
static int qemu_gluster_create(const char *filename,
QemuOpts *opts, Error **errp)
QEMUOptionParameter *options)
{
struct glfs *glfs;
struct glfs_fd *fd;
int ret = 0;
int prealloc = 0;
int64_t total_size = 0;
char *tmp = NULL;
GlusterConf *gconf = g_new0(GlusterConf, 1);
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
glfs = qemu_gluster_init(gconf, filename, errp);
glfs = qemu_gluster_init(gconf, filename);
if (!glfs) {
ret = -errno;
goto out;
}
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!tmp || !strcmp(tmp, "off")) {
prealloc = 0;
} else if (!strcmp(tmp, "full") &&
gluster_supports_zerofill()) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'"
" or GlusterFS doesn't support zerofill API",
tmp);
ret = -EINVAL;
goto out;
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n / BDRV_SECTOR_SIZE;
}
options++;
}
fd = glfs_creat(glfs, gconf->image,
@@ -516,20 +362,14 @@ static int qemu_gluster_create(const char *filename,
if (!fd) {
ret = -errno;
} else {
if (!glfs_ftruncate(fd, total_size)) {
if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
ret = -errno;
}
} else {
if (glfs_ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
ret = -errno;
}
if (glfs_close(fd) != 0) {
ret = -errno;
}
}
out:
g_free(tmp);
qemu_gluster_gconf_free(gconf);
if (glfs) {
glfs_fini(glfs);
@@ -537,19 +377,72 @@ out:
return ret;
}
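With the QemuOpts-based version above, creation is driven through the usual qemu-img options (the server and volume names below are made up; preallocation=full requires the zerofill API):
qemu-img create gluster://server1:24007/testvol/a.img 10G
qemu-img create -o preallocation=full gluster://server1/testvol/b.img 4G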
static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
static void qemu_gluster_aio_cancel(BlockDriverAIOCB *blockacb)
{
GlusterAIOCB *acb = (GlusterAIOCB *)blockacb;
bool finished = false;
acb->finished = &finished;
while (!finished) {
qemu_aio_wait();
}
}
static const AIOCBInfo gluster_aiocb_info = {
.aiocb_size = sizeof(GlusterAIOCB),
.cancel = qemu_gluster_aio_cancel,
};
static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
{
GlusterAIOCB *acb = (GlusterAIOCB *)arg;
BlockDriverState *bs = acb->common.bs;
BDRVGlusterState *s = bs->opaque;
int retval;
acb->ret = ret;
retval = qemu_write_full(s->fds[GLUSTER_FD_WRITE], &acb, sizeof(acb));
if (retval != sizeof(acb)) {
/*
* Gluster AIO callback thread failed to notify the waiting
* QEMU thread about IO completion.
*
* Complete this IO request and make the disk inaccessible for
* subsequent reads and writes.
*/
error_report("Gluster failed to notify QEMU about IO completion");
qemu_mutex_lock_iothread(); /* We are in gluster thread context */
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_release(acb);
s->qemu_aio_count--;
close(s->fds[GLUSTER_FD_READ]);
close(s->fds[GLUSTER_FD_WRITE]);
qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL,
NULL);
bs->drv = NULL; /* Make the disk inaccessible */
qemu_mutex_unlock_iothread();
}
}
static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int write)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
GlusterAIOCB *acb;
BDRVGlusterState *s = bs->opaque;
size_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
size_t size;
off_t offset;
offset = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
s->qemu_aio_count++;
acb = qemu_aio_get(&gluster_aiocb_info, bs, cb, opaque);
acb->size = size;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
acb->finished = NULL;
if (write) {
ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,
@@ -560,98 +453,55 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
}
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
return &acb->common;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
s->qemu_aio_count--;
qemu_aio_release(acb);
return NULL;
}
static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset)
static BlockDriverAIOCB *qemu_gluster_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
return qemu_gluster_aio_rw(bs, sector_num, qiov, nb_sectors, cb, opaque, 0);
}
static BlockDriverAIOCB *qemu_gluster_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
return qemu_gluster_aio_rw(bs, sector_num, qiov, nb_sectors, cb, opaque, 1);
}
static BlockDriverAIOCB *qemu_gluster_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque)
{
int ret;
GlusterAIOCB *acb;
BDRVGlusterState *s = bs->opaque;
ret = glfs_ftruncate(s->fd, offset);
if (ret < 0) {
return -errno;
}
return 0;
}
static coroutine_fn int qemu_gluster_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 0);
}
static coroutine_fn int qemu_gluster_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 1);
}
static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
BDRVGlusterState *s = bs->opaque;
acb = qemu_aio_get(&gluster_aiocb_info, bs, cb, opaque);
acb->size = 0;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
acb->finished = NULL;
s->qemu_aio_count++;
ret = glfs_fsync_async(s->fd, &gluster_finish_aiocb, acb);
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
return &acb->common;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
s->qemu_aio_count--;
qemu_aio_release(acb);
return NULL;
}
#ifdef CONFIG_GLUSTERFS_DISCARD
static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
BDRVGlusterState *s = bs->opaque;
size_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
acb->size = 0;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
}
#endif
static int64_t qemu_gluster_getlength(BlockDriverState *bs)
{
BDRVGlusterState *s = bs->opaque;
@@ -683,6 +533,10 @@ static void qemu_gluster_close(BlockDriverState *bs)
{
BDRVGlusterState *s = bs->opaque;
close(s->fds[GLUSTER_FD_READ]);
close(s->fds[GLUSTER_FD_WRITE]);
qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL, NULL);
if (s->fd) {
glfs_close(s->fd);
s->fd = NULL;
@@ -690,136 +544,73 @@ static void qemu_gluster_close(BlockDriverState *bs)
glfs_fini(s->glfs);
}
static int qemu_gluster_has_zero_init(BlockDriverState *bs)
{
/* GlusterFS volume could be backed by a block device */
return 0;
}
static QemuOptsList qemu_gluster_create_opts = {
.name = "qemu-gluster-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qemu_gluster_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, full)"
},
{ /* end of list */ }
}
static QEMUOptionParameter qemu_gluster_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ NULL }
};
static BlockDriver bdrv_gluster = {
.format_name = "gluster",
.protocol_name = "gluster",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
.bdrv_aio_readv = qemu_gluster_aio_readv,
.bdrv_aio_writev = qemu_gluster_aio_writev,
.bdrv_aio_flush = qemu_gluster_aio_flush,
.create_options = qemu_gluster_create_options,
};
static BlockDriver bdrv_gluster_tcp = {
.format_name = "gluster",
.protocol_name = "gluster+tcp",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
.bdrv_aio_readv = qemu_gluster_aio_readv,
.bdrv_aio_writev = qemu_gluster_aio_writev,
.bdrv_aio_flush = qemu_gluster_aio_flush,
.create_options = qemu_gluster_create_options,
};
static BlockDriver bdrv_gluster_unix = {
.format_name = "gluster",
.protocol_name = "gluster+unix",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
.bdrv_aio_readv = qemu_gluster_aio_readv,
.bdrv_aio_writev = qemu_gluster_aio_writev,
.bdrv_aio_flush = qemu_gluster_aio_flush,
.create_options = qemu_gluster_create_options,
};
static BlockDriver bdrv_gluster_rdma = {
.format_name = "gluster",
.protocol_name = "gluster+rdma",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
.bdrv_aio_readv = qemu_gluster_aio_readv,
.bdrv_aio_writev = qemu_gluster_aio_writev,
.bdrv_aio_flush = qemu_gluster_aio_flush,
.create_options = qemu_gluster_create_options,
};
static void bdrv_gluster_init(void)

block/io.c (2603 changed lines)
File diff suppressed because it is too large
File diff suppressed because it is too large
block/linux-aio.c

@@ -8,10 +8,10 @@
* See the COPYING file in the top-level directory.
*/
#include "qemu-common.h"
#include "block/aio.h"
#include "qemu/queue.h"
#include "qemu-aio.h"
#include "qemu-queue.h"
#include "block/raw-aio.h"
#include "qemu/event_notifier.h"
#include "event_notifier.h"
#include <libaio.h>
@@ -25,42 +25,23 @@
*/
#define MAX_EVENTS 128
#define MAX_QUEUED_IO 128
struct qemu_laiocb {
BlockAIOCB common;
BlockDriverAIOCB common;
struct qemu_laio_state *ctx;
struct iocb iocb;
ssize_t ret;
size_t nbytes;
QEMUIOVector *qiov;
bool is_read;
QSIMPLEQ_ENTRY(qemu_laiocb) next;
QLIST_ENTRY(qemu_laiocb) node;
};
typedef struct {
int plugged;
unsigned int n;
bool blocked;
QSIMPLEQ_HEAD(, qemu_laiocb) pending;
} LaioQueue;
struct qemu_laio_state {
io_context_t ctx;
EventNotifier e;
/* io queue for submit at batch */
LaioQueue io_q;
/* I/O completion processing */
QEMUBH *completion_bh;
struct io_event events[MAX_EVENTS];
int event_idx;
int event_max;
int count;
};
static void ioq_submit(struct qemu_laio_state *s);
static inline ssize_t io_event_ret(struct io_event *ev)
{
return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
@@ -74,6 +55,8 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
{
int ret;
s->count--;
ret = laiocb->ret;
if (ret != -ECANCELED) {
if (ret == laiocb->nbytes) {
@@ -87,159 +70,84 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
ret = -EINVAL;
}
}
}
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
}
/* The completion BH fetches completed I/O requests and invokes their
* callbacks.
*
* The function is somewhat tricky because it supports nested event loops, for
* example when a request callback invokes aio_poll(). In order to do this,
* the completion events array and index are kept in qemu_laio_state. The BH
* reschedules itself as long as there are completions pending so it will
* either be called again in a nested event loop or will be called after all
* events have been completed. When there are no events left to complete, the
* BH returns without rescheduling.
*/
static void qemu_laio_completion_bh(void *opaque)
{
struct qemu_laio_state *s = opaque;
/* Fetch more completion events when empty */
if (s->event_idx == s->event_max) {
do {
struct timespec ts = { 0 };
s->event_max = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS,
s->events, &ts);
} while (s->event_max == -EINTR);
s->event_idx = 0;
if (s->event_max <= 0) {
s->event_max = 0;
return; /* no more events */
}
laiocb->common.cb(laiocb->common.opaque, ret);
}
/* Reschedule so nested event loops see currently pending completions */
qemu_bh_schedule(s->completion_bh);
/* Process completion events */
while (s->event_idx < s->event_max) {
struct iocb *iocb = s->events[s->event_idx].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&s->events[s->event_idx]);
s->event_idx++;
qemu_laio_process_completion(s, laiocb);
}
if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
qemu_aio_release(laiocb);
}
static void qemu_laio_completion_cb(EventNotifier *e)
{
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
if (event_notifier_test_and_clear(&s->e)) {
qemu_bh_schedule(s->completion_bh);
while (event_notifier_test_and_clear(&s->e)) {
struct io_event events[MAX_EVENTS];
struct timespec ts = { 0 };
int nevents, i;
do {
nevents = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS, events, &ts);
} while (nevents == -EINTR);
for (i = 0; i < nevents; i++) {
struct iocb *iocb = events[i].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&events[i]);
qemu_laio_process_completion(s, laiocb);
}
}
}
static void laio_cancel(BlockAIOCB *blockacb)
static int qemu_laio_flush_cb(EventNotifier *e)
{
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
return (s->count > 0) ? 1 : 0;
}
static void laio_cancel(BlockDriverAIOCB *blockacb)
{
struct qemu_laiocb *laiocb = (struct qemu_laiocb *)blockacb;
struct io_event event;
int ret;
if (laiocb->ret != -EINPROGRESS) {
if (laiocb->ret != -EINPROGRESS)
return;
}
/*
* Note that as of Linux 2.6.31 neither the block device code nor any
* filesystem implements cancellation of AIO requests.
* Thus the polling loop below is the normal code path.
*/
ret = io_cancel(laiocb->ctx->ctx, &laiocb->iocb, &event);
laiocb->ret = -ECANCELED;
if (ret != 0) {
/* iocb is not cancelled, cb will be called by the event loop later */
if (ret == 0) {
laiocb->ret = -ECANCELED;
return;
}
laiocb->common.cb(laiocb->common.opaque, laiocb->ret);
/*
* We have to wait for the iocb to finish.
*
* The only way to get the iocb status update is by polling the io context.
* We might be able to do this slightly more optimally by removing the
* O_NONBLOCK flag.
*/
while (laiocb->ret == -EINPROGRESS) {
qemu_laio_completion_cb(&laiocb->ctx->e);
}
}
static const AIOCBInfo laio_aiocb_info = {
.aiocb_size = sizeof(struct qemu_laiocb),
.cancel_async = laio_cancel,
.cancel = laio_cancel,
};
static void ioq_init(LaioQueue *io_q)
{
QSIMPLEQ_INIT(&io_q->pending);
io_q->plugged = 0;
io_q->n = 0;
io_q->blocked = false;
}
static void ioq_submit(struct qemu_laio_state *s)
{
int ret, len;
struct qemu_laiocb *aiocb;
struct iocb *iocbs[MAX_QUEUED_IO];
QSIMPLEQ_HEAD(, qemu_laiocb) completed;
do {
len = 0;
QSIMPLEQ_FOREACH(aiocb, &s->io_q.pending, next) {
iocbs[len++] = &aiocb->iocb;
if (len == MAX_QUEUED_IO) {
break;
}
}
ret = io_submit(s->ctx, len, iocbs);
if (ret == -EAGAIN) {
break;
}
if (ret < 0) {
abort();
}
s->io_q.n -= ret;
aiocb = container_of(iocbs[ret - 1], struct qemu_laiocb, iocb);
QSIMPLEQ_SPLIT_AFTER(&s->io_q.pending, aiocb, next, &completed);
} while (ret == len && !QSIMPLEQ_EMPTY(&s->io_q.pending));
s->io_q.blocked = (s->io_q.n > 0);
}
void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
{
struct qemu_laio_state *s = aio_ctx;
s->io_q.plugged++;
}
void laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
{
struct qemu_laio_state *s = aio_ctx;
assert(s->io_q.plugged > 0 || !unplug);
if (unplug && --s->io_q.plugged > 0) {
return;
}
if (!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
}
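plug/unplug defers io_submit() so that several iocbs go down in a single syscall. A self-contained libaio sketch of that batching, assuming only libaio and a scratch file (build with -laio; the path is arbitrary):
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define BATCH 4
int main(void)
{
    io_context_t ctx = 0;
    struct iocb iocbs[BATCH], *ptrs[BATCH];
    struct io_event events[BATCH];
    int fd, i;
    fd = open("/tmp/laio-demo.img", O_RDWR | O_CREAT, 0600);
    if (fd < 0 || io_setup(BATCH, &ctx) < 0) {
        return 1;
    }
    for (i = 0; i < BATCH; i++) {
        char *buf;
        /* 512-byte alignment, as an O_DIRECT file descriptor would need */
        if (posix_memalign((void **)&buf, 512, 512) != 0) {
            return 1;
        }
        memset(buf, 'a' + i, 512);
        io_prep_pwrite(&iocbs[i], fd, buf, 512, 512L * i);
        ptrs[i] = &iocbs[i];
    }
    /* The "unplug": one io_submit() call flushes the whole queue, as
     * ioq_submit() does with up to MAX_QUEUED_IO pending iocbs. */
    if (io_submit(ctx, BATCH, ptrs) != BATCH) {
        return 1;
    }
    if (io_getevents(ctx, BATCH, BATCH, events, NULL) != BATCH) {
        return 1;
    }
    printf("completed %d batched requests\n", BATCH);
    io_destroy(ctx);
    close(fd);
    return 0;
}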
BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
BlockDriverCompletionFunc *cb, void *opaque, int type)
{
struct qemu_laio_state *s = aio_ctx;
struct qemu_laiocb *laiocb;
@@ -269,36 +177,19 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
goto out_free_aiocb;
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
s->count++;
QSIMPLEQ_INSERT_TAIL(&s->io_q.pending, laiocb, next);
s->io_q.n++;
if (!s->io_q.blocked &&
(!s->io_q.plugged || s->io_q.n >= MAX_QUEUED_IO)) {
ioq_submit(s);
}
if (io_submit(s->ctx, 1, &iocbs) < 0)
goto out_dec_count;
return &laiocb->common;
out_dec_count:
s->count--;
out_free_aiocb:
qemu_aio_unref(laiocb);
qemu_aio_release(laiocb);
return NULL;
}
void laio_detach_aio_context(void *s_, AioContext *old_context)
{
struct qemu_laio_state *s = s_;
aio_set_event_notifier(old_context, &s->e, NULL);
qemu_bh_delete(s->completion_bh);
}
void laio_attach_aio_context(void *s_, AioContext *new_context)
{
struct qemu_laio_state *s = s_;
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
}
void *laio_init(void)
{
struct qemu_laio_state *s;
@@ -312,7 +203,8 @@ void *laio_init(void)
goto out_close_efd;
}
ioq_init(&s->io_q);
qemu_aio_set_event_notifier(&s->e, qemu_laio_completion_cb,
qemu_laio_flush_cb);
return s;
@@ -322,16 +214,3 @@ out_free_state:
g_free(s);
return NULL;
}
void laio_cleanup(void *s_)
{
struct qemu_laio_state *s = s_;
event_notifier_cleanup(&s->e);
if (io_destroy(s->ctx) != 0) {
fprintf(stderr, "%s: destroy AIO context %p failed\n",
__func__, &s->ctx);
}
g_free(s);
}

block/mirror.c

@@ -12,60 +12,33 @@
*/
#include "trace.h"
#include "block/blockjob.h"
#include "block/block_int.h"
#include "blockjob.h"
#include "block_int.h"
#include "qemu/ratelimit.h"
#include "qemu/bitmap.h"
#define SLICE_TIME 100000000ULL /* ns */
#define MAX_IN_FLIGHT 16
enum {
/*
* Size of data buffer for populating the image file. This should be large
* enough to process multiple clusters in a single call, so that populating
* contiguous regions of the image is efficient.
*/
BLOCK_SIZE = 512 * BDRV_SECTORS_PER_DIRTY_CHUNK, /* in bytes */
};
/* The mirroring buffer is a list of granularity-sized chunks.
* Free chunks are organized in a list.
*/
typedef struct MirrorBuffer {
QSIMPLEQ_ENTRY(MirrorBuffer) next;
} MirrorBuffer;
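mirror_free_init() (further down) carves s->buf into granularity-sized pieces whose leading bytes double as the list linkage, exactly as this typedef suggests. A standalone sketch of the trick, using plain next pointers instead of QSIMPLEQ:
#include <stdio.h>
#include <stdlib.h>
/* Each free chunk's first bytes are reused as the list node itself. */
struct buf_node { struct buf_node *next; };
int main(void)
{
    size_t granularity = 4096, buf_size = 4 * granularity;
    unsigned char *buf = malloc(buf_size);
    struct buf_node *free_list = NULL;
    size_t off, n = 0;
    if (!buf) {
        return 1;
    }
    for (off = 0; off < buf_size; off += granularity) {
        struct buf_node *cur = (struct buf_node *)(buf + off);
        cur->next = free_list;      /* push the chunk onto the free list */
        free_list = cur;
        n++;
    }
    printf("%zu free chunks\n", n); /* prints 4 */
    free(buf);
    return 0;
}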
#define SLICE_TIME 100000000ULL /* ns */
typedef struct MirrorBlockJob {
BlockJob common;
RateLimit limit;
BlockDriverState *target;
BlockDriverState *base;
/* The name of the graph node to replace */
char *replaces;
/* The BDS to replace */
BlockDriverState *to_replace;
/* Used to block operations on the drive-mirror-replace target */
Error *replace_blocker;
bool is_none_mode;
MirrorSyncMode mode;
BlockdevOnError on_source_error, on_target_error;
bool synced;
bool should_complete;
int64_t sector_num;
int64_t granularity;
size_t buf_size;
int64_t bdev_length;
unsigned long *cow_bitmap;
BdrvDirtyBitmap *dirty_bitmap;
HBitmapIter hbi;
uint8_t *buf;
QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
int buf_free_count;
unsigned long *in_flight_bitmap;
int in_flight;
int sectors_in_flight;
int ret;
} MirrorBlockJob;
typedef struct MirrorOp {
MirrorBlockJob *s;
QEMUIOVector qiov;
int64_t sector_num;
int nb_sectors;
} MirrorOp;
static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
int error)
{
@@ -79,301 +52,51 @@ static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
}
}
static void mirror_iteration_done(MirrorOp *op, int ret)
{
MirrorBlockJob *s = op->s;
struct iovec *iov;
int64_t chunk_num;
int i, nb_chunks, sectors_per_chunk;
trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors, ret);
s->in_flight--;
s->sectors_in_flight -= op->nb_sectors;
iov = op->qiov.iov;
for (i = 0; i < op->qiov.niov; i++) {
MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base;
QSIMPLEQ_INSERT_TAIL(&s->buf_free, buf, next);
s->buf_free_count++;
}
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
chunk_num = op->sector_num / sectors_per_chunk;
nb_chunks = op->nb_sectors / sectors_per_chunk;
bitmap_clear(s->in_flight_bitmap, chunk_num, nb_chunks);
if (ret >= 0) {
if (s->cow_bitmap) {
bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
}
s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
}
qemu_iovec_destroy(&op->qiov);
g_slice_free(MirrorOp, op);
/* Enter coroutine when it is not sleeping. The coroutine sleeps to
* rate-limit itself. The coroutine will eventually resume since there is
* a sleep timeout so don't wake it early.
*/
if (s->common.busy) {
qemu_coroutine_enter(s->common.co, NULL);
}
}
static void mirror_write_complete(void *opaque, int ret)
{
MirrorOp *op = opaque;
MirrorBlockJob *s = op->s;
if (ret < 0) {
BlockErrorAction action;
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, false, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
}
}
mirror_iteration_done(op, ret);
}
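/* Note the error path above: before picking an action, the failed range
 * is marked dirty again via bdrv_set_dirty_bitmap(), so a later
 * iteration of mirror_run() will retry copying it. */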
static void mirror_read_complete(void *opaque, int ret)
{
MirrorOp *op = opaque;
MirrorBlockJob *s = op->s;
if (ret < 0) {
BlockErrorAction action;
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, true, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
}
mirror_iteration_done(op, ret);
return;
}
bdrv_aio_writev(s->target, op->sector_num, &op->qiov, op->nb_sectors,
mirror_write_complete, op);
}
static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
static int coroutine_fn mirror_iteration(MirrorBlockJob *s,
BlockErrorAction *p_action)
{
BlockDriverState *source = s->common.bs;
int nb_sectors, sectors_per_chunk, nb_chunks;
int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
uint64_t delay_ns = 0;
MirrorOp *op;
BlockDriverState *target = s->target;
QEMUIOVector qiov;
int ret, nb_sectors;
int64_t end;
struct iovec iov;
s->sector_num = hbitmap_iter_next(&s->hbi);
if (s->sector_num < 0) {
bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
s->sector_num = hbitmap_iter_next(&s->hbi);
trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
assert(s->sector_num >= 0);
}
hbitmap_next_sector = s->sector_num;
sector_num = s->sector_num;
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
end = s->bdev_length / BDRV_SECTOR_SIZE;
/* Extend the QEMUIOVector to include all adjacent blocks that will
* be copied in this operation.
*
* We have to do this if we have no backing file yet in the destination,
* and the cluster size is very large. Then we need to do COW ourselves.
* The first time a cluster is copied, copy it entirely. Note that,
* because both the granularity and the cluster size are powers of two,
* the number of sectors to copy cannot exceed one cluster.
*
* We also want to extend the QEMUIOVector to include more adjacent
* dirty blocks if possible, to limit the number of I/O operations and
* run efficiently even with a small granularity.
*/
nb_chunks = 0;
nb_sectors = 0;
next_sector = sector_num;
next_chunk = sector_num / sectors_per_chunk;
/* Wait for I/O to this cluster (from a previous iteration) to be done. */
while (test_bit(next_chunk, s->in_flight_bitmap)) {
trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
qemu_coroutine_yield();
}
do {
int added_sectors, added_chunks;
if (!bdrv_get_dirty(source, s->dirty_bitmap, next_sector) ||
test_bit(next_chunk, s->in_flight_bitmap)) {
assert(nb_sectors > 0);
break;
}
added_sectors = sectors_per_chunk;
if (s->cow_bitmap && !test_bit(next_chunk, s->cow_bitmap)) {
bdrv_round_to_clusters(s->target,
next_sector, added_sectors,
&next_sector, &added_sectors);
/* On the first iteration, the rounding may make us copy
* sectors before the first dirty one.
*/
if (next_sector < sector_num) {
assert(nb_sectors == 0);
sector_num = next_sector;
next_chunk = next_sector / sectors_per_chunk;
}
}
added_sectors = MIN(added_sectors, end - (sector_num + nb_sectors));
added_chunks = (added_sectors + sectors_per_chunk - 1) / sectors_per_chunk;
/* When doing COW, it may happen that there is not enough space for
* a full cluster. Wait if that is the case.
*/
while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
qemu_coroutine_yield();
}
if (s->buf_free_count < nb_chunks + added_chunks) {
trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
break;
}
/* We have enough free space to copy these sectors. */
bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks);
nb_sectors += added_sectors;
nb_chunks += added_chunks;
next_sector += added_sectors;
next_chunk += added_chunks;
if (!s->synced && s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, added_sectors);
}
} while (delay_ns == 0 && next_sector < end);
/* Allocate a MirrorOp that is used as an AIO callback. */
op = g_slice_new(MirrorOp);
op->s = s;
op->sector_num = sector_num;
op->nb_sectors = nb_sectors;
/* Now make a QEMUIOVector taking enough granularity-sized chunks
* from s->buf_free.
*/
qemu_iovec_init(&op->qiov, nb_chunks);
next_sector = sector_num;
while (nb_chunks-- > 0) {
MirrorBuffer *buf = QSIMPLEQ_FIRST(&s->buf_free);
size_t remaining = (nb_sectors * BDRV_SECTOR_SIZE) - op->qiov.size;
QSIMPLEQ_REMOVE_HEAD(&s->buf_free, next);
s->buf_free_count--;
qemu_iovec_add(&op->qiov, buf, MIN(s->granularity, remaining));
/* Advance the HBitmapIter in parallel, so that we do not examine
* the same sector twice.
*/
if (next_sector > hbitmap_next_sector
&& bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
hbitmap_next_sector = hbitmap_iter_next(&s->hbi);
}
next_sector += sectors_per_chunk;
}
bdrv_reset_dirty_bitmap(s->dirty_bitmap, sector_num, nb_sectors);
end = s->common.len >> BDRV_SECTOR_BITS;
s->sector_num = bdrv_get_next_dirty(source, s->sector_num);
nb_sectors = MIN(BDRV_SECTORS_PER_DIRTY_CHUNK, end - s->sector_num);
bdrv_reset_dirty(source, s->sector_num, nb_sectors);
/* Copy the dirty cluster. */
s->in_flight++;
s->sectors_in_flight += nb_sectors;
trace_mirror_one_iteration(s, sector_num, nb_sectors);
bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
mirror_read_complete, op);
return delay_ns;
}
iov.iov_base = s->buf;
iov.iov_len = nb_sectors * 512;
qemu_iovec_init_external(&qiov, &iov, 1);
static void mirror_free_init(MirrorBlockJob *s)
{
int granularity = s->granularity;
size_t buf_size = s->buf_size;
uint8_t *buf = s->buf;
assert(s->buf_free_count == 0);
QSIMPLEQ_INIT(&s->buf_free);
while (buf_size != 0) {
MirrorBuffer *cur = (MirrorBuffer *)buf;
QSIMPLEQ_INSERT_TAIL(&s->buf_free, cur, next);
s->buf_free_count++;
buf_size -= granularity;
buf += granularity;
trace_mirror_one_iteration(s, s->sector_num, nb_sectors);
ret = bdrv_co_readv(source, s->sector_num, nb_sectors, &qiov);
if (ret < 0) {
*p_action = mirror_error_action(s, true, -ret);
goto fail;
}
}
static void mirror_drain(MirrorBlockJob *s)
{
while (s->in_flight > 0) {
qemu_coroutine_yield();
ret = bdrv_co_writev(target, s->sector_num, nb_sectors, &qiov);
if (ret < 0) {
*p_action = mirror_error_action(s, false, -ret);
s->synced = false;
goto fail;
}
}
return 0;
typedef struct {
int ret;
} MirrorExitData;
static void mirror_exit(BlockJob *job, void *opaque)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
MirrorExitData *data = opaque;
AioContext *replace_aio_context = NULL;
if (s->to_replace) {
replace_aio_context = bdrv_get_aio_context(s->to_replace);
aio_context_acquire(replace_aio_context);
}
if (s->should_complete && data->ret == 0) {
BlockDriverState *to_replace = s->common.bs;
if (s->to_replace) {
to_replace = s->to_replace;
}
if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
}
bdrv_swap(s->target, to_replace);
if (s->common.driver->job_type == BLOCK_JOB_TYPE_COMMIT) {
/* drop the bs loop chain formed by the swap: break the loop then
* trigger the unref from the top one */
BlockDriverState *p = s->base->backing_hd;
bdrv_set_backing_hd(s->base, NULL);
bdrv_unref(p);
}
}
if (s->to_replace) {
bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
error_free(s->replace_blocker);
bdrv_unref(s->to_replace);
}
if (replace_aio_context) {
aio_context_release(replace_aio_context);
}
g_free(s->replaces);
bdrv_unref(s->target);
block_job_completed(&s->common, data->ret);
g_free(data);
fail:
/* Try again later. */
bdrv_set_dirty(source, s->sector_num, nb_sectors);
return ret;
}
static void coroutine_fn mirror_run(void *opaque)
{
MirrorBlockJob *s = opaque;
MirrorExitData *data;
BlockDriverState *bs = s->common.bs;
int64_t sector_num, end, sectors_per_chunk, length;
uint64_t last_pause_ns;
BlockDriverInfo bdi;
char backing_filename[2]; /* we only need 2 characters because we are only
checking for an empty string */
int64_t sector_num, end;
int ret = 0;
int n;
@@ -381,58 +104,23 @@ static void coroutine_fn mirror_run(void *opaque)
goto immediate_exit;
}
s->bdev_length = bdrv_getlength(bs);
if (s->bdev_length < 0) {
ret = s->bdev_length;
goto immediate_exit;
} else if (s->bdev_length == 0) {
/* Report BLOCK_JOB_READY and wait for complete. */
block_job_event_ready(&s->common);
s->synced = true;
while (!block_job_is_cancelled(&s->common) && !s->should_complete) {
block_job_yield(&s->common);
}
s->common.cancelled = false;
goto immediate_exit;
s->common.len = bdrv_getlength(bs);
if (s->common.len < 0) {
block_job_completed(&s->common, s->common.len);
return;
}
length = DIV_ROUND_UP(s->bdev_length, s->granularity);
s->in_flight_bitmap = bitmap_new(length);
end = s->common.len >> BDRV_SECTOR_BITS;
s->buf = qemu_blockalign(bs, BLOCK_SIZE);
/* If we have no backing file yet in the destination, we cannot let
* the destination do COW. Instead, we copy sectors around the
* dirty data if needed. We need a bitmap to do that.
*/
bdrv_get_backing_filename(s->target, backing_filename,
sizeof(backing_filename));
if (backing_filename[0] && !s->target->backing_hd) {
ret = bdrv_get_info(s->target, &bdi);
if (ret < 0) {
goto immediate_exit;
}
if (s->granularity < bdi.cluster_size) {
s->buf_size = MAX(s->buf_size, bdi.cluster_size);
s->cow_bitmap = bitmap_new(length);
}
}
end = s->bdev_length / BDRV_SECTOR_SIZE;
s->buf = qemu_try_blockalign(bs, s->buf_size);
if (s->buf == NULL) {
ret = -ENOMEM;
goto immediate_exit;
}
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
mirror_free_init(s);
if (!s->is_none_mode) {
if (s->mode != MIRROR_SYNC_MODE_NONE) {
/* First part, loop on the sectors and initialize the dirty bitmap. */
BlockDriverState *base = s->base;
BlockDriverState *base;
base = s->mode == MIRROR_SYNC_MODE_FULL ? NULL : bs->backing_hd;
for (sector_num = 0; sector_num < end; ) {
int64_t next = (sector_num | (sectors_per_chunk - 1)) + 1;
ret = bdrv_is_allocated_above(bs, base,
sector_num, next - sector_num, &n);
int64_t next = (sector_num | (BDRV_SECTORS_PER_DIRTY_CHUNK - 1)) + 1;
ret = bdrv_co_is_allocated_above(bs, base,
sector_num, next - sector_num, &n);
if (ret < 0) {
goto immediate_exit;
@@ -440,7 +128,7 @@ static void coroutine_fn mirror_run(void *opaque)
assert(n > 0);
if (ret == 1) {
bdrv_set_dirty_bitmap(s->dirty_bitmap, sector_num, n);
bdrv_set_dirty(bs, sector_num, n);
sector_num = next;
} else {
sector_num += n;
@@ -448,50 +136,28 @@ static void coroutine_fn mirror_run(void *opaque)
}
}
bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
s->sector_num = -1;
for (;;) {
uint64_t delay_ns = 0;
uint64_t delay_ns;
int64_t cnt;
bool should_complete;
if (s->ret < 0) {
ret = s->ret;
goto immediate_exit;
}
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
/* s->common.offset contains the number of bytes already processed so
* far, cnt is the number of dirty sectors remaining and
* s->sectors_in_flight is the number of sectors currently being
* processed; together those are the current total operation length */
s->common.len = s->common.offset +
(cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
/* Note that even when no rate limit is applied we need to yield
* periodically with no pending I/O so that bdrv_drain_all() returns.
* We do so every SLICE_TIME nanoseconds, or when there is an error,
* or when the source is clean, whichever comes first.
*/
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - last_pause_ns < SLICE_TIME &&
s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
if (s->in_flight == MAX_IN_FLIGHT || s->buf_free_count == 0 ||
(cnt == 0 && s->in_flight > 0)) {
trace_mirror_yield(s, s->in_flight, s->buf_free_count, cnt);
qemu_coroutine_yield();
continue;
} else if (cnt != 0) {
delay_ns = mirror_iteration(s);
cnt = bdrv_get_dirty_count(bs);
if (cnt != 0) {
BlockErrorAction action = BDRV_ACTION_REPORT;
ret = mirror_iteration(s, &action);
if (ret < 0 && action == BDRV_ACTION_REPORT) {
goto immediate_exit;
}
cnt = bdrv_get_dirty_count(bs);
}
should_complete = false;
if (s->in_flight == 0 && cnt == 0) {
if (cnt == 0) {
trace_mirror_before_flush(s);
ret = bdrv_flush(s->target);
if (ret < 0) {
if (mirror_error_action(s, false, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
if (mirror_error_action(s, false, -ret) == BDRV_ACTION_REPORT) {
goto immediate_exit;
}
} else {
@@ -500,14 +166,15 @@ static void coroutine_fn mirror_run(void *opaque)
* report completion. This way, block-job-cancel will leave
* the target in a consistent state.
*/
s->common.offset = end * BDRV_SECTOR_SIZE;
if (!s->synced) {
block_job_event_ready(&s->common);
block_job_ready(&s->common);
s->synced = true;
}
should_complete = s->should_complete ||
block_job_is_cancelled(&s->common);
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
cnt = bdrv_get_dirty_count(bs);
}
}
@@ -521,20 +188,32 @@ static void coroutine_fn mirror_run(void *opaque)
* mirror_populate runs.
*/
trace_mirror_before_drain(s, cnt);
bdrv_drain(bs);
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
bdrv_drain_all();
cnt = bdrv_get_dirty_count(bs);
}
ret = 0;
trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
trace_mirror_before_sleep(s, cnt, s->synced);
if (!s->synced) {
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
/* Publish progress */
s->common.offset = end * BDRV_SECTOR_SIZE - cnt * BLOCK_SIZE;
if (s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, BDRV_SECTORS_PER_DIRTY_CHUNK);
} else {
delay_ns = 0;
}
/* Note that even when no rate limit is applied we need to yield
* with no pending I/O here so that qemu_aio_flush() returns.
*/
block_job_sleep_ns(&s->common, rt_clock, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
} else if (!should_complete) {
delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
delay_ns = (cnt == 0 ? SLICE_TIME : 0);
block_job_sleep_ns(&s->common, rt_clock, delay_ns);
} else if (cnt == 0) {
/* The two disks are in sync. Exit and report successful
* completion.
@@ -543,29 +222,21 @@ static void coroutine_fn mirror_run(void *opaque)
s->common.cancelled = false;
break;
}
last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
}
immediate_exit:
if (s->in_flight > 0) {
/* We get here only if something went wrong. Either the job failed,
* or it was cancelled prematurely so that we do not guarantee that
* the target is a copy of the source.
*/
assert(ret < 0 || (!s->synced && block_job_is_cancelled(&s->common)));
mirror_drain(s);
}
assert(s->in_flight == 0);
qemu_vfree(s->buf);
g_free(s->cow_bitmap);
g_free(s->in_flight_bitmap);
bdrv_release_dirty_bitmap(bs, s->dirty_bitmap);
g_free(s->buf);
bdrv_set_dirty_tracking(bs, false);
bdrv_iostatus_disable(s->target);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&s->common, mirror_exit, data);
if (s->should_complete && ret == 0) {
if (bdrv_get_flags(s->target) != bdrv_get_flags(s->common.bs)) {
bdrv_reopen(s->target, bdrv_get_flags(s->common.bs), NULL);
}
bdrv_swap(s->target, s->common.bs);
}
bdrv_close(s->target);
bdrv_delete(s->target);
block_job_completed(&s->common, ret);
}
static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -589,81 +260,42 @@ static void mirror_iostatus_reset(BlockJob *job)
static void mirror_complete(BlockJob *job, Error **errp)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
Error *local_err = NULL;
int ret;
ret = bdrv_open_backing_file(s->target, NULL, &local_err);
ret = bdrv_open_backing_file(s->target);
if (ret < 0) {
error_propagate(errp, local_err);
char backing_filename[PATH_MAX];
bdrv_get_full_backing_filename(s->target, backing_filename,
sizeof(backing_filename));
error_set(errp, QERR_OPEN_FILE_FAILED, backing_filename);
return;
}
if (!s->synced) {
error_set(errp, QERR_BLOCK_JOB_NOT_READY,
bdrv_get_device_name(job->bs));
error_set(errp, QERR_BLOCK_JOB_NOT_READY, job->bs->device_name);
return;
}
/* check the target bs is not blocked and block all operations on it */
if (s->replaces) {
AioContext *replace_aio_context;
s->to_replace = check_to_replace_node(s->replaces, &local_err);
if (!s->to_replace) {
error_propagate(errp, local_err);
return;
}
replace_aio_context = bdrv_get_aio_context(s->to_replace);
aio_context_acquire(replace_aio_context);
error_setg(&s->replace_blocker,
"block device is in use by block-job-complete");
bdrv_op_block_all(s->to_replace, s->replace_blocker);
bdrv_ref(s->to_replace);
aio_context_release(replace_aio_context);
}
s->should_complete = true;
block_job_enter(&s->common);
block_job_resume(job);
}
static const BlockJobDriver mirror_job_driver = {
static BlockJobType mirror_job_type = {
.instance_size = sizeof(MirrorBlockJob),
.job_type = BLOCK_JOB_TYPE_MIRROR,
.job_type = "mirror",
.set_speed = mirror_set_speed,
.iostatus_reset = mirror_iostatus_reset,
.complete = mirror_complete,
};
static const BlockJobDriver commit_active_job_driver = {
.instance_size = sizeof(MirrorBlockJob),
.job_type = BLOCK_JOB_TYPE_COMMIT,
.set_speed = mirror_set_speed,
.iostatus_reset = mirror_iostatus_reset,
.complete = mirror_complete,
};
static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, uint32_t granularity,
int64_t buf_size,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockCompletionFunc *cb,
void *opaque, Error **errp,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base)
void mirror_start(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, MirrorSyncMode mode,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
{
MirrorBlockJob *s;
if (granularity == 0) {
granularity = bdrv_get_default_bitmap_granularity(target);
}
assert ((granularity & (granularity - 1)) == 0);
if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
@@ -671,25 +303,16 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
return;
}
s = block_job_create(driver, bs, speed, cb, opaque, errp);
s = block_job_create(&mirror_job_type, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
s->target = target;
s->is_none_mode = is_none_mode;
s->base = base;
s->granularity = granularity;
s->buf_size = MAX(buf_size, granularity);
s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
if (!s->dirty_bitmap) {
return;
}
s->mode = mode;
bdrv_set_dirty_tracking(bs, true);
bdrv_set_enable_write_cache(s->target, true);
bdrv_set_on_error(s->target, on_target_error, on_target_error);
bdrv_iostatus_enable(s->target);
@@ -697,86 +320,3 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
trace_mirror_start(bs, s, s->common.co, opaque);
qemu_coroutine_enter(s->common.co, s);
}
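mirror_start_job() requires the dirty-bitmap granularity to be a power of two. The assert relies on the classic bit trick: g - 1 flips the lowest set bit and every bit below it, so g & (g - 1) is zero exactly when g has at most one bit set. A standalone check (the g != 0 guard is added here for completeness; in the code above a zero granularity has already been replaced by the device default):

#include <stdbool.h>
#include <stdio.h>

static bool is_power_of_two(unsigned g)
{
    /* g - 1 clears the lowest set bit and sets every bit below it,
     * so the AND survives only if a second, higher bit is set. */
    return g != 0 && (g & (g - 1)) == 0;
}

int main(void)
{
    unsigned samples[] = { 512, 65536, 12288, 0 };
    for (int i = 0; i < 4; i++) {
        printf("%u -> %s\n", samples[i],
               is_power_of_two(samples[i]) ? "power of two" : "rejected");
    }
    return 0;
}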
void mirror_start(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, uint32_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
bool is_none_mode;
BlockDriverState *base;
if (mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
error_setg(errp, "Sync mode 'dirty-bitmap' not supported");
return;
}
is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
base = mode == MIRROR_SYNC_MODE_TOP ? bs->backing_hd : NULL;
mirror_start_job(bs, target, replaces,
speed, granularity, buf_size,
on_source_error, on_target_error, cb, opaque, errp,
&mirror_job_driver, is_none_mode, base);
}
void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
int64_t speed,
BlockdevOnError on_error,
BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
int64_t length, base_length;
int orig_base_flags;
int ret;
Error *local_err = NULL;
orig_base_flags = bdrv_get_flags(base);
if (bdrv_reopen(base, bs->open_flags, errp)) {
return;
}
length = bdrv_getlength(bs);
if (length < 0) {
error_setg_errno(errp, -length,
"Unable to determine length of %s", bs->filename);
goto error_restore_flags;
}
base_length = bdrv_getlength(base);
if (base_length < 0) {
error_setg_errno(errp, -base_length,
"Unable to determine length of %s", base->filename);
goto error_restore_flags;
}
if (length > base_length) {
ret = bdrv_truncate(base, length);
if (ret < 0) {
error_setg_errno(errp, -ret,
"Top image %s is larger than base image %s, and "
"resize of base image failed",
bs->filename, base->filename);
goto error_restore_flags;
}
}
bdrv_ref(base);
mirror_start_job(bs, base, NULL, speed, 0, 0,
on_error, on_error, cb, opaque, &local_err,
&commit_active_job_driver, false, base);
if (local_err) {
error_propagate(errp, local_err);
goto error_restore_flags;
}
return;
error_restore_flags:
/* ignore error and errp for bdrv_reopen, because we want to propagate
* the original error */
bdrv_reopen(base, orig_base_flags, NULL);
return;
}

block/nbd-client.c

@@ -1,407 +0,0 @@
/*
* QEMU Block driver for NBD
*
* Copyright (C) 2008 Bull S.A.S.
* Author: Laurent Vivier <Laurent.Vivier@bull.net>
*
* Some parts:
* Copyright (C) 2007 Anthony Liguori <anthony@codemonkey.ws>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "nbd-client.h"
#include "qemu/sockets.h"
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs))
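HANDLE_TO_INDEX() and INDEX_TO_HANDLE() above lean on XOR being its own inverse: tagging a small request index with a per-device key (the BlockDriverState pointer) produces a distinctive wire handle, and the same XOR recovers the index; a reply carrying some other device's handle normally decodes to an out-of-range index that the i >= MAX_NBD_REQUESTS check catches. A standalone round-trip test:

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int dummy; /* stands in for the BlockDriverState pointer */
    uint64_t key = (uint64_t)(intptr_t)&dummy;

    for (uint64_t index = 0; index < 16; index++) {
        uint64_t handle = index ^ key;   /* INDEX_TO_HANDLE */
        assert((handle ^ key) == index); /* HANDLE_TO_INDEX undoes it */
    }
    printf("XOR handle mapping round-trips for all 16 indices\n");
    return 0;
}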
static void nbd_recv_coroutines_enter_all(NbdClientSession *s)
{
int i;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
}
}
}
static void nbd_teardown_connection(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
/* finish any pending coroutines */
shutdown(client->sock, 2);
nbd_recv_coroutines_enter_all(client);
nbd_client_detach_aio_context(bs);
closesocket(client->sock);
client->sock = -1;
}
static void nbd_reply_ready(void *opaque)
{
BlockDriverState *bs = opaque;
NbdClientSession *s = nbd_get_client_session(bs);
uint64_t i;
int ret;
if (s->reply.handle == 0) {
/* No reply already in flight. Fetch a header. It is possible
* that another thread has done the same thing in parallel, so
* the socket is not readable anymore.
*/
ret = nbd_receive_reply(s->sock, &s->reply);
if (ret == -EAGAIN) {
return;
}
if (ret < 0) {
s->reply.handle = 0;
goto fail;
}
}
/* There's no need for a mutex on the receive side, because the
* handler acts as a synchronization point and ensures that only
* one coroutine is called until the reply finishes. */
i = HANDLE_TO_INDEX(s, s->reply.handle);
if (i >= MAX_NBD_REQUESTS) {
goto fail;
}
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
return;
}
fail:
nbd_teardown_connection(bs);
}
static void nbd_restart_write(void *opaque)
{
BlockDriverState *bs = opaque;
qemu_coroutine_enter(nbd_get_client_session(bs)->send_coroutine, NULL);
}
static int nbd_co_send_request(BlockDriverState *bs,
struct nbd_request *request,
QEMUIOVector *qiov, int offset)
{
NbdClientSession *s = nbd_get_client_session(bs);
AioContext *aio_context;
int rc, ret, i;
qemu_co_mutex_lock(&s->send_mutex);
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i] == NULL) {
s->recv_coroutine[i] = qemu_coroutine_self();
break;
}
}
assert(i < MAX_NBD_REQUESTS);
request->handle = INDEX_TO_HANDLE(s, i);
s->send_coroutine = qemu_coroutine_self();
aio_context = bdrv_get_aio_context(bs);
aio_set_fd_handler(aio_context, s->sock,
nbd_reply_ready, nbd_restart_write, bs);
if (qiov) {
if (!s->is_unix) {
socket_set_cork(s->sock, 1);
}
rc = nbd_send_request(s->sock, request);
if (rc >= 0) {
ret = qemu_co_sendv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
rc = -EIO;
}
}
if (!s->is_unix) {
socket_set_cork(s->sock, 0);
}
} else {
rc = nbd_send_request(s->sock, request);
}
aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, NULL, bs);
s->send_coroutine = NULL;
qemu_co_mutex_unlock(&s->send_mutex);
return rc;
}
static void nbd_co_receive_reply(NbdClientSession *s,
struct nbd_request *request, struct nbd_reply *reply,
QEMUIOVector *qiov, int offset)
{
int ret;
/* Wait until we're woken up by the read handler. TODO: perhaps
* peek at the next reply and avoid yielding if it's ours? */
qemu_coroutine_yield();
*reply = s->reply;
if (reply->handle != request->handle) {
reply->error = EIO;
} else {
if (qiov && reply->error == 0) {
ret = qemu_co_recvv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
reply->error = EIO;
}
}
/* Tell the read handler to read another header. */
s->reply.handle = 0;
}
}
static void nbd_coroutine_start(NbdClientSession *s,
struct nbd_request *request)
{
/* Poor man's semaphore. The free_sema is locked when no other request
* can be accepted, and unlocked after receiving one reply. */
if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
qemu_co_mutex_lock(&s->free_sema);
assert(s->in_flight < MAX_NBD_REQUESTS);
}
s->in_flight++;
/* s->recv_coroutine[i] is set as soon as we get the send_lock. */
}
static void nbd_coroutine_end(NbdClientSession *s,
struct nbd_request *request)
{
int i = HANDLE_TO_INDEX(s, request->handle);
s->recv_coroutine[i] = NULL;
if (s->in_flight-- == MAX_NBD_REQUESTS) {
qemu_co_mutex_unlock(&s->free_sema);
}
}
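Worked through with MAX_NBD_REQUESTS at 16: the first fifteen starters pass the in_flight >= 15 check without touching free_sema; the sixteenth sees in_flight == 15, takes the mutex, and raises in_flight to 16, so a seventeenth starter blocks in qemu_co_mutex_lock(). When a completion drops in_flight from 16 back to 15, nbd_coroutine_end() unlocks the mutex and exactly one waiter proceeds, keeping at most sixteen requests outstanding.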
static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_READ };
struct nbd_reply reply;
ssize_t ret;
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, qiov, offset);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_WRITE };
struct nbd_reply reply;
ssize_t ret;
if (!bdrv_enable_write_cache(bs) &&
(client->nbdflags & NBD_FLAG_SEND_FUA)) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, qiov, offset);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL, 0);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
/* qemu-nbd has a limit of slightly less than 1M per request. Try to
* remain aligned to 4K. */
#define NBD_MAX_SECTORS 2040
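The arithmetic behind the constant: 2040 sectors × 512 bytes = 1,044,480 bytes (1020 KiB), which stays under the server's roughly 1 MiB per-request cap while remaining a multiple of 4096 (255 × 4096), so every chunk produced by the splitting loops below is 4K-aligned.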
int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_readv_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_readv_1(bs, sector_num, nb_sectors, qiov, offset);
}
int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_writev_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_writev_1(bs, sector_num, nb_sectors, qiov, offset);
}
int nbd_client_co_flush(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_FLUSH };
struct nbd_reply reply;
ssize_t ret;
if (!(client->nbdflags & NBD_FLAG_SEND_FLUSH)) {
return 0;
}
if (client->nbdflags & NBD_FLAG_SEND_FUA) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = 0;
request.len = 0;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL, 0);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_TRIM };
struct nbd_reply reply;
ssize_t ret;
if (!(client->nbdflags & NBD_FLAG_SEND_TRIM)) {
return 0;
}
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL, 0);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
void nbd_client_detach_aio_context(BlockDriverState *bs)
{
aio_set_fd_handler(bdrv_get_aio_context(bs),
nbd_get_client_session(bs)->sock, NULL, NULL, NULL);
}
void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sock,
nbd_reply_ready, NULL, bs);
}
void nbd_client_close(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = {
.type = NBD_CMD_DISC,
.from = 0,
.len = 0
};
if (client->sock == -1) {
return;
}
nbd_send_request(client->sock, &request);
nbd_teardown_connection(bs);
}
int nbd_client_init(BlockDriverState *bs, int sock, const char *export,
Error **errp)
{
NbdClientSession *client = nbd_get_client_session(bs);
int ret;
/* NBD handshake */
logout("session init %s\n", export);
qemu_set_block(sock);
ret = nbd_receive_negotiate(sock, export,
&client->nbdflags, &client->size, errp);
if (ret < 0) {
logout("Failed to negotiate with the NBD server\n");
closesocket(sock);
return ret;
}
qemu_co_mutex_init(&client->send_mutex);
qemu_co_mutex_init(&client->free_sema);
client->sock = sock;
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
qemu_set_nonblock(sock);
nbd_client_attach_aio_context(bs, bdrv_get_aio_context(bs));
logout("Established connection with NBD server\n");
return 0;
}

block/nbd-client.h

@@ -1,53 +0,0 @@
#ifndef NBD_CLIENT_H
#define NBD_CLIENT_H
#include "qemu-common.h"
#include "block/nbd.h"
#include "block/block_int.h"
/* #define DEBUG_NBD */
#if defined(DEBUG_NBD)
#define logout(fmt, ...) \
fprintf(stderr, "nbd\t%-24s" fmt, __func__, ##__VA_ARGS__)
#else
#define logout(fmt, ...) ((void)0)
#endif
#define MAX_NBD_REQUESTS 16
typedef struct NbdClientSession {
int sock;
uint32_t nbdflags;
off_t size;
CoMutex send_mutex;
CoMutex free_sema;
Coroutine *send_coroutine;
int in_flight;
Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
struct nbd_reply reply;
bool is_unix;
} NbdClientSession;
NbdClientSession *nbd_get_client_session(BlockDriverState *bs);
int nbd_client_init(BlockDriverState *bs, int sock, const char *export_name,
Error **errp);
void nbd_client_close(BlockDriverState *bs);
int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors);
int nbd_client_co_flush(BlockDriverState *bs);
int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
void nbd_client_detach_aio_context(BlockDriverState *bs);
void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context);
#endif /* NBD_CLIENT_H */

block/nbd.c

@@ -26,33 +26,56 @@
* THE SOFTWARE.
*/
#include "block/nbd-client.h"
#include "qemu/uri.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/sockets.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include "qemu-common.h"
#include "nbd.h"
#include "uri.h"
#include "block_int.h"
#include "module.h"
#include "qemu_socket.h"
#include <sys/types.h>
#include <unistd.h>
#define EN_OPTSTR ":exportname="
/* #define DEBUG_NBD */
#if defined(DEBUG_NBD)
#define logout(fmt, ...) \
fprintf(stderr, "nbd\t%-24s" fmt, __func__, ##__VA_ARGS__)
#else
#define logout(fmt, ...) ((void)0)
#endif
#define MAX_NBD_REQUESTS 16
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs))
typedef struct BDRVNBDState {
NbdClientSession client;
QemuOpts *socket_opts;
int sock;
uint32_t nbdflags;
off_t size;
size_t blocksize;
CoMutex send_mutex;
CoMutex free_sema;
Coroutine *send_coroutine;
int in_flight;
Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
struct nbd_reply reply;
int is_unix;
char *host_spec;
char *export_name; /* An NBD server may export several devices */
} BDRVNBDState;
static int nbd_parse_uri(const char *filename, QDict *options)
static int nbd_parse_uri(BDRVNBDState *s, const char *filename)
{
URI *uri;
const char *p;
QueryParams *qp = NULL;
int ret = 0;
bool is_unix;
uri = uri_parse(filename);
if (!uri) {
@@ -61,11 +84,11 @@ static int nbd_parse_uri(const char *filename, QDict *options)
/* transport */
if (!strcmp(uri->scheme, "nbd")) {
is_unix = false;
s->is_unix = false;
} else if (!strcmp(uri->scheme, "nbd+tcp")) {
is_unix = false;
s->is_unix = false;
} else if (!strcmp(uri->scheme, "nbd+unix")) {
is_unix = true;
s->is_unix = true;
} else {
ret = -EINVAL;
goto out;
@@ -74,44 +97,32 @@ static int nbd_parse_uri(const char *filename, QDict *options)
p = uri->path ? uri->path : "/";
p += strspn(p, "/");
if (p[0]) {
qdict_put(options, "export", qstring_from_str(p));
s->export_name = g_strdup(p);
}
qp = query_params_parse(uri->query);
if (qp->n > 1 || (is_unix && !qp->n) || (!is_unix && qp->n)) {
if (qp->n > 1 || (s->is_unix && !qp->n) || (!s->is_unix && qp->n)) {
ret = -EINVAL;
goto out;
}
if (is_unix) {
if (s->is_unix) {
/* nbd+unix:///export?socket=path */
if (uri->server || uri->port || strcmp(qp->p[0].name, "socket")) {
ret = -EINVAL;
goto out;
}
qdict_put(options, "path", qstring_from_str(qp->p[0].value));
s->host_spec = g_strdup(qp->p[0].value);
} else {
QString *host;
/* nbd[+tcp]://host[:port]/export */
/* nbd[+tcp]://host:port/export */
if (!uri->server) {
ret = -EINVAL;
goto out;
}
/* strip braces from literal IPv6 address */
if (uri->server[0] == '[') {
host = qstring_from_substr(uri->server, 1,
strlen(uri->server) - 2);
} else {
host = qstring_from_str(uri->server);
}
qdict_put(options, "host", host);
if (uri->port) {
char* port_str = g_strdup_printf("%d", uri->port);
qdict_put(options, "port", qstring_from_str(port_str));
g_free(port_str);
if (!uri->port) {
uri->port = NBD_DEFAULT_PORT;
}
s->host_spec = g_strdup_printf("%s:%d", uri->server, uri->port);
}
out:
@@ -122,29 +133,16 @@ out:
return ret;
}
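Concretely, the parser accepts nbd://10.0.0.1:30000/export and nbd+tcp://host/export (the port falling back to the NBD default when omitted), plus nbd+unix:///export?socket=/tmp/nbd.sock; a TCP URI carrying any query parameter, or a Unix URI lacking exactly one socket= parameter, fails with -EINVAL.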
static void nbd_parse_filename(const char *filename, QDict *options,
Error **errp)
static int nbd_config(BDRVNBDState *s, const char *filename)
{
char *file;
char *export_name;
const char *host_spec;
const char *unixpath;
if (qdict_haskey(options, "host")
|| qdict_haskey(options, "port")
|| qdict_haskey(options, "path"))
{
error_setg(errp, "host/port/path and a file name may not be specified "
"at the same time");
return;
}
int err = -EINVAL;
if (strstr(filename, "://")) {
int ret = nbd_parse_uri(filename, options);
if (ret < 0) {
error_setg(errp, "No valid URL specified");
}
return;
return nbd_parse_uri(s, filename);
}
file = g_strdup(filename);
@@ -156,286 +154,450 @@ static void nbd_parse_filename(const char *filename, QDict *options,
}
export_name[0] = 0; /* truncate 'file' */
export_name += strlen(EN_OPTSTR);
qdict_put(options, "export", qstring_from_str(export_name));
s->export_name = g_strdup(export_name);
}
/* extract the host_spec - fail if it's not nbd:... */
if (!strstart(file, "nbd:", &host_spec)) {
error_setg(errp, "File name string for NBD must start with 'nbd:'");
goto out;
}
if (!*host_spec) {
goto out;
}
/* are we a UNIX or TCP socket? */
if (strstart(host_spec, "unix:", &unixpath)) {
qdict_put(options, "path", qstring_from_str(unixpath));
s->is_unix = true;
s->host_spec = g_strdup(unixpath);
} else {
InetSocketAddress *addr = NULL;
addr = inet_parse(host_spec, errp);
if (!addr) {
goto out;
}
qdict_put(options, "host", qstring_from_str(addr->host));
qdict_put(options, "port", qstring_from_str(addr->port));
qapi_free_InetSocketAddress(addr);
s->is_unix = false;
s->host_spec = g_strdup(host_spec);
}
err = 0;
out:
g_free(file);
if (err != 0) {
g_free(s->export_name);
g_free(s->host_spec);
}
return err;
}
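The legacy (non-URI) spellings handled here look like nbd:192.168.0.1:30000:exportname=foo for TCP and nbd:unix:/tmp/nbd-socket for a Unix socket; the :exportname= suffix, when present, is split off before the host or path is parsed.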
static void nbd_config(BDRVNBDState *s, QDict *options, char **export,
Error **errp)
static void nbd_coroutine_start(BDRVNBDState *s, struct nbd_request *request)
{
Error *local_err = NULL;
int i;
if (qdict_haskey(options, "path") == qdict_haskey(options, "host")) {
if (qdict_haskey(options, "path")) {
error_setg(errp, "path and host may not be used at the same time.");
} else {
error_setg(errp, "one of path and host must be specified.");
/* Poor man's semaphore. The free_sema is locked when no other request
* can be accepted, and unlocked after receiving one reply. */
if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
qemu_co_mutex_lock(&s->free_sema);
assert(s->in_flight < MAX_NBD_REQUESTS);
}
s->in_flight++;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i] == NULL) {
s->recv_coroutine[i] = qemu_coroutine_self();
break;
}
return;
}
s->client.is_unix = qdict_haskey(options, "path");
s->socket_opts = qemu_opts_create(&socket_optslist, NULL, 0,
&error_abort);
qemu_opts_absorb_qdict(s->socket_opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
if (!qemu_opt_get(s->socket_opts, "port")) {
qemu_opt_set_number(s->socket_opts, "port", NBD_DEFAULT_PORT,
&error_abort);
}
*export = g_strdup(qdict_get_try_str(options, "export"));
if (*export) {
qdict_del(options, "export");
}
assert(i < MAX_NBD_REQUESTS);
request->handle = INDEX_TO_HANDLE(s, i);
}
NbdClientSession *nbd_get_client_session(BlockDriverState *bs)
static int nbd_have_request(void *opaque)
{
BDRVNBDState *s = bs->opaque;
return &s->client;
BDRVNBDState *s = opaque;
return s->in_flight > 0;
}
static int nbd_establish_connection(BlockDriverState *bs, Error **errp)
static void nbd_reply_ready(void *opaque)
{
BDRVNBDState *s = opaque;
uint64_t i;
int ret;
if (s->reply.handle == 0) {
/* No reply already in flight. Fetch a header. It is possible
* that another thread has done the same thing in parallel, so
* the socket is not readable anymore.
*/
ret = nbd_receive_reply(s->sock, &s->reply);
if (ret == -EAGAIN) {
return;
}
if (ret < 0) {
s->reply.handle = 0;
goto fail;
}
}
/* There's no need for a mutex on the receive side, because the
* handler acts as a synchronization point and ensures that only
* one coroutine is called until the reply finishes. */
i = HANDLE_TO_INDEX(s, s->reply.handle);
if (i >= MAX_NBD_REQUESTS) {
goto fail;
}
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
return;
}
fail:
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
}
}
}
static void nbd_restart_write(void *opaque)
{
BDRVNBDState *s = opaque;
qemu_coroutine_enter(s->send_coroutine, NULL);
}
static int nbd_co_send_request(BDRVNBDState *s, struct nbd_request *request,
QEMUIOVector *qiov, int offset)
{
int rc, ret;
qemu_co_mutex_lock(&s->send_mutex);
s->send_coroutine = qemu_coroutine_self();
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, nbd_restart_write,
nbd_have_request, s);
rc = nbd_send_request(s->sock, request);
if (rc >= 0 && qiov) {
ret = qemu_co_sendv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
return -EIO;
}
}
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL,
nbd_have_request, s);
s->send_coroutine = NULL;
qemu_co_mutex_unlock(&s->send_mutex);
return rc;
}
static void nbd_co_receive_reply(BDRVNBDState *s, struct nbd_request *request,
struct nbd_reply *reply,
QEMUIOVector *qiov, int offset)
{
int ret;
/* Wait until we're woken up by the read handler. TODO: perhaps
* peek at the next reply and avoid yielding if it's ours? */
qemu_coroutine_yield();
*reply = s->reply;
if (reply->handle != request->handle) {
reply->error = EIO;
} else {
if (qiov && reply->error == 0) {
ret = qemu_co_recvv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
reply->error = EIO;
}
}
/* Tell the read handler to read another header. */
s->reply.handle = 0;
}
}
static void nbd_coroutine_end(BDRVNBDState *s, struct nbd_request *request)
{
int i = HANDLE_TO_INDEX(s, request->handle);
s->recv_coroutine[i] = NULL;
if (s->in_flight-- == MAX_NBD_REQUESTS) {
qemu_co_mutex_unlock(&s->free_sema);
}
}
static int nbd_establish_connection(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
int sock;
int ret;
off_t size;
size_t blocksize;
if (s->client.is_unix) {
sock = unix_connect_opts(s->socket_opts, errp, NULL, NULL);
if (s->is_unix) {
sock = unix_socket_outgoing(s->host_spec);
} else {
sock = inet_connect_opts(s->socket_opts, errp, NULL, NULL);
if (sock >= 0) {
socket_set_nodelay(sock);
}
sock = tcp_socket_outgoing_spec(s->host_spec);
}
/* Failed to establish connection */
if (sock < 0) {
logout("Failed to establish connection to NBD server\n");
return -EIO;
return -errno;
}
return sock;
/* NBD handshake */
ret = nbd_receive_negotiate(sock, s->export_name, &s->nbdflags, &size,
&blocksize);
if (ret < 0) {
logout("Failed to negotiate with the NBD server\n");
closesocket(sock);
return ret;
}
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
socket_set_nonblock(sock);
qemu_aio_set_fd_handler(sock, nbd_reply_ready, NULL,
nbd_have_request, s);
s->sock = sock;
s->size = size;
s->blocksize = blocksize;
logout("Established connection with NBD server\n");
return 0;
}
static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static void nbd_teardown_connection(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
char *export = NULL;
int result, sock;
Error *local_err = NULL;
struct nbd_request request;
request.type = NBD_CMD_DISC;
request.from = 0;
request.len = 0;
nbd_send_request(s->sock, &request);
qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL, NULL);
closesocket(s->sock);
}
static int nbd_open(BlockDriverState *bs, const char* filename, int flags)
{
BDRVNBDState *s = bs->opaque;
int result;
qemu_co_mutex_init(&s->send_mutex);
qemu_co_mutex_init(&s->free_sema);
/* Pop the config into our state object. Exit if invalid. */
nbd_config(s, options, &export, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return -EINVAL;
result = nbd_config(s, filename);
if (result != 0) {
return result;
}
/* establish TCP connection, return error if it fails
* TODO: Configurable retry-until-timeout behaviour.
*/
sock = nbd_establish_connection(bs, errp);
if (sock < 0) {
g_free(export);
return sock;
}
result = nbd_establish_connection(bs);
/* NBD handshake */
result = nbd_client_init(bs, sock, export, errp);
g_free(export);
return result;
}
static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
BDRVNBDState *s = bs->opaque;
struct nbd_request request;
struct nbd_reply reply;
ssize_t ret;
request.type = NBD_CMD_READ;
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(s, &request);
ret = nbd_co_send_request(s, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(s, &request, &reply, qiov, offset);
}
nbd_coroutine_end(s, &request);
return -reply.error;
}
static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
BDRVNBDState *s = bs->opaque;
struct nbd_request request;
struct nbd_reply reply;
ssize_t ret;
request.type = NBD_CMD_WRITE;
if (!bdrv_enable_write_cache(bs) && (s->nbdflags & NBD_FLAG_SEND_FUA)) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(s, &request);
ret = nbd_co_send_request(s, &request, qiov, offset);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(s, &request, &reply, NULL, 0);
}
nbd_coroutine_end(s, &request);
return -reply.error;
}
/* qemu-nbd has a limit of slightly less than 1M per request. Try to
* remain aligned to 4K. */
#define NBD_MAX_SECTORS 2040
static int nbd_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
return nbd_client_co_readv(bs, sector_num, nb_sectors, qiov);
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_readv_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_readv_1(bs, sector_num, nb_sectors, qiov, offset);
}
static int nbd_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
return nbd_client_co_writev(bs, sector_num, nb_sectors, qiov);
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_writev_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_writev_1(bs, sector_num, nb_sectors, qiov, offset);
}
static int nbd_co_flush(BlockDriverState *bs)
{
return nbd_client_co_flush(bs);
}
BDRVNBDState *s = bs->opaque;
struct nbd_request request;
struct nbd_reply reply;
ssize_t ret;
static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl.max_discard = UINT32_MAX >> BDRV_SECTOR_BITS;
bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS;
if (!(s->nbdflags & NBD_FLAG_SEND_FLUSH)) {
return 0;
}
request.type = NBD_CMD_FLUSH;
if (s->nbdflags & NBD_FLAG_SEND_FUA) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = 0;
request.len = 0;
nbd_coroutine_start(s, &request);
ret = nbd_co_send_request(s, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(s, &request, &reply, NULL, 0);
}
nbd_coroutine_end(s, &request);
return -reply.error;
}
static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
return nbd_client_co_discard(bs, sector_num, nb_sectors);
BDRVNBDState *s = bs->opaque;
struct nbd_request request;
struct nbd_reply reply;
ssize_t ret;
if (!(s->nbdflags & NBD_FLAG_SEND_TRIM)) {
return 0;
}
request.type = NBD_CMD_TRIM;
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(s, &request);
ret = nbd_co_send_request(s, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(s, &request, &reply, NULL, 0);
}
nbd_coroutine_end(s, &request);
return -reply.error;
}
static void nbd_close(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
g_free(s->export_name);
g_free(s->host_spec);
qemu_opts_del(s->socket_opts);
nbd_client_close(bs);
nbd_teardown_connection(bs);
}
static int64_t nbd_getlength(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
return s->client.size;
}
static void nbd_detach_aio_context(BlockDriverState *bs)
{
nbd_client_detach_aio_context(bs);
}
static void nbd_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
nbd_client_attach_aio_context(bs, new_context);
}
static void nbd_refresh_filename(BlockDriverState *bs)
{
QDict *opts = qdict_new();
const char *path = qdict_get_try_str(bs->options, "path");
const char *host = qdict_get_try_str(bs->options, "host");
const char *port = qdict_get_try_str(bs->options, "port");
const char *export = qdict_get_try_str(bs->options, "export");
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));
if (path && export) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd+unix:///%s?socket=%s", export, path);
} else if (path && !export) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd+unix://?socket=%s", path);
} else if (!path && export && port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s:%s/%s", host, port, export);
} else if (!path && export && !port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s/%s", host, export);
} else if (!path && !export && port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s:%s", host, port);
} else if (!path && !export && !port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s", host);
}
if (path) {
qdict_put_obj(opts, "path", QOBJECT(qstring_from_str(path)));
} else if (port) {
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
} else {
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
}
if (export) {
qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(export)));
}
bs->full_open_options = opts;
return s->size;
}
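For example, the options host=localhost, port=10809, export=disk0 are rendered back as nbd://localhost:10809/disk0, while a lone path=/tmp/nbd.sock becomes nbd+unix://?socket=/tmp/nbd.sock, so the generated filename can be parsed again by nbd_parse_uri().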
static BlockDriver bdrv_nbd = {
.format_name = "nbd",
.protocol_name = "nbd",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_refresh_limits = nbd_refresh_limits,
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
.format_name = "nbd",
.protocol_name = "nbd",
.instance_size = sizeof(BDRVNBDState),
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
};
static BlockDriver bdrv_nbd_tcp = {
.format_name = "nbd",
.protocol_name = "nbd+tcp",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_refresh_limits = nbd_refresh_limits,
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
.format_name = "nbd",
.protocol_name = "nbd+tcp",
.instance_size = sizeof(BDRVNBDState),
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
};
static BlockDriver bdrv_nbd_unix = {
.format_name = "nbd",
.protocol_name = "nbd+unix",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_refresh_limits = nbd_refresh_limits,
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
.format_name = "nbd",
.protocol_name = "nbd+unix",
.instance_size = sizeof(BDRVNBDState),
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
};
static void bdrv_nbd_init(void)
{
bdrv_register(&bdrv_nbd);
bdrv_register(&bdrv_nbd_tcp);
bdrv_register(&bdrv_nbd_unix);
}
block_init(bdrv_nbd_init);

block/nfs.c

@@ -1,509 +0,0 @@
/*
* QEMU Block driver for native access to files on NFS shares
*
* Copyright (c) 2014 Peter Lieven <pl@kamp.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "config-host.h"
#include <poll.h>
#include "qemu-common.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
#include "block/block_int.h"
#include "trace.h"
#include "qemu/iov.h"
#include "qemu/uri.h"
#include "sysemu/sysemu.h"
#include <nfsc/libnfs.h>
typedef struct NFSClient {
struct nfs_context *context;
struct nfsfh *fh;
int events;
bool has_zero_init;
AioContext *aio_context;
} NFSClient;
typedef struct NFSRPC {
int ret;
int complete;
QEMUIOVector *iov;
struct stat *st;
Coroutine *co;
QEMUBH *bh;
NFSClient *client;
} NFSRPC;
static void nfs_process_read(void *arg);
static void nfs_process_write(void *arg);
static void nfs_set_events(NFSClient *client)
{
int ev = nfs_which_events(client->context);
if (ev != client->events) {
aio_set_fd_handler(client->aio_context,
nfs_get_fd(client->context),
(ev & POLLIN) ? nfs_process_read : NULL,
(ev & POLLOUT) ? nfs_process_write : NULL,
client);
}
client->events = ev;
}
static void nfs_process_read(void *arg)
{
NFSClient *client = arg;
nfs_service(client->context, POLLIN);
nfs_set_events(client);
}
static void nfs_process_write(void *arg)
{
NFSClient *client = arg;
nfs_service(client->context, POLLOUT);
nfs_set_events(client);
}
static void nfs_co_init_task(NFSClient *client, NFSRPC *task)
{
*task = (NFSRPC) {
.co = qemu_coroutine_self(),
.client = client,
};
}
static void nfs_co_generic_bh_cb(void *opaque)
{
NFSRPC *task = opaque;
task->complete = 1;
qemu_bh_delete(task->bh);
qemu_coroutine_enter(task->co, NULL);
}
static void
nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
void *private_data)
{
NFSRPC *task = private_data;
task->ret = ret;
if (task->ret > 0 && task->iov) {
if (task->ret <= task->iov->size) {
qemu_iovec_from_buf(task->iov, 0, data, task->ret);
} else {
task->ret = -EIO;
}
}
if (task->ret == 0 && task->st) {
memcpy(task->st, data, sizeof(struct stat));
}
if (task->ret < 0) {
error_report("NFS Error: %s", nfs_get_error(nfs));
}
if (task->co) {
task->bh = aio_bh_new(task->client->aio_context,
nfs_co_generic_bh_cb, task);
qemu_bh_schedule(task->bh);
} else {
task->complete = 1;
}
}
static int coroutine_fn nfs_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *iov)
{
NFSClient *client = bs->opaque;
NFSRPC task;
nfs_co_init_task(client, &task);
task.iov = iov;
if (nfs_pread_async(client->context, client->fh,
sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE,
nfs_co_generic_cb, &task) != 0) {
return -ENOMEM;
}
while (!task.complete) {
nfs_set_events(client);
qemu_coroutine_yield();
}
if (task.ret < 0) {
return task.ret;
}
/* zero pad short reads */
if (task.ret < iov->size) {
qemu_iovec_memset(iov, task.ret, 0, iov->size - task.ret);
}
return 0;
}
static int coroutine_fn nfs_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *iov)
{
NFSClient *client = bs->opaque;
NFSRPC task;
char *buf = NULL;
nfs_co_init_task(client, &task);
buf = g_try_malloc(nb_sectors * BDRV_SECTOR_SIZE);
if (nb_sectors && buf == NULL) {
return -ENOMEM;
}
qemu_iovec_to_buf(iov, 0, buf, nb_sectors * BDRV_SECTOR_SIZE);
if (nfs_pwrite_async(client->context, client->fh,
sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE,
buf, nfs_co_generic_cb, &task) != 0) {
g_free(buf);
return -ENOMEM;
}
while (!task.complete) {
nfs_set_events(client);
qemu_coroutine_yield();
}
g_free(buf);
if (task.ret != nb_sectors * BDRV_SECTOR_SIZE) {
return task.ret < 0 ? task.ret : -EIO;
}
return 0;
}
static int coroutine_fn nfs_co_flush(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
NFSRPC task;
nfs_co_init_task(client, &task);
if (nfs_fsync_async(client->context, client->fh, nfs_co_generic_cb,
&task) != 0) {
return -ENOMEM;
}
while (!task.complete) {
nfs_set_events(client);
qemu_coroutine_yield();
}
return task.ret;
}
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "nfs",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "URL to the NFS file",
},
{ /* end of list */ }
},
};
static void nfs_detach_aio_context(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
aio_set_fd_handler(client->aio_context,
nfs_get_fd(client->context),
NULL, NULL, NULL);
client->events = 0;
}
static void nfs_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
NFSClient *client = bs->opaque;
client->aio_context = new_context;
nfs_set_events(client);
}
static void nfs_client_close(NFSClient *client)
{
if (client->context) {
if (client->fh) {
nfs_close(client->context, client->fh);
}
aio_set_fd_handler(client->aio_context,
nfs_get_fd(client->context),
NULL, NULL, NULL);
nfs_destroy_context(client->context);
}
memset(client, 0, sizeof(NFSClient));
}
static void nfs_file_close(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
nfs_client_close(client);
}
static int64_t nfs_client_open(NFSClient *client, const char *filename,
int flags, Error **errp)
{
int ret = -EINVAL, i;
struct stat st;
URI *uri;
QueryParams *qp = NULL;
char *file = NULL, *strp = NULL;
uri = uri_parse(filename);
if (!uri) {
error_setg(errp, "Invalid URL specified");
goto fail;
}
if (!uri->server) {
error_setg(errp, "Invalid URL specified");
goto fail;
}
strp = strrchr(uri->path, '/');
if (strp == NULL) {
error_setg(errp, "Invalid URL specified");
goto fail;
}
file = g_strdup(strp);
*strp = 0;
client->context = nfs_init_context();
if (client->context == NULL) {
error_setg(errp, "Failed to init NFS context");
goto fail;
}
qp = query_params_parse(uri->query);
for (i = 0; i < qp->n; i++) {
unsigned long long val;
if (!qp->p[i].value) {
error_setg(errp, "Value for NFS parameter expected: %s",
qp->p[i].name);
goto fail;
}
if (parse_uint_full(qp->p[i].value, &val, 0)) {
error_setg(errp, "Illegal value for NFS parameter: %s",
qp->p[i].name);
goto fail;
}
if (!strcmp(qp->p[i].name, "uid")) {
nfs_set_uid(client->context, val);
} else if (!strcmp(qp->p[i].name, "gid")) {
nfs_set_gid(client->context, val);
} else if (!strcmp(qp->p[i].name, "tcp-syncnt")) {
nfs_set_tcp_syncnt(client->context, val);
#ifdef LIBNFS_FEATURE_READAHEAD
} else if (!strcmp(qp->p[i].name, "readahead")) {
nfs_set_readahead(client->context, val);
#endif
} else {
error_setg(errp, "Unknown NFS parameter name: %s",
qp->p[i].name);
goto fail;
}
}
ret = nfs_mount(client->context, uri->server, uri->path);
if (ret < 0) {
error_setg(errp, "Failed to mount nfs share: %s",
nfs_get_error(client->context));
goto fail;
}
if (flags & O_CREAT) {
ret = nfs_creat(client->context, file, 0600, &client->fh);
if (ret < 0) {
error_setg(errp, "Failed to create file: %s",
nfs_get_error(client->context));
goto fail;
}
} else {
ret = nfs_open(client->context, file, flags, &client->fh);
if (ret < 0) {
error_setg(errp, "Failed to open file : %s",
nfs_get_error(client->context));
goto fail;
}
}
ret = nfs_fstat(client->context, client->fh, &st);
if (ret < 0) {
error_setg(errp, "Failed to fstat file: %s",
nfs_get_error(client->context));
goto fail;
}
ret = DIV_ROUND_UP(st.st_size, BDRV_SECTOR_SIZE);
client->has_zero_init = S_ISREG(st.st_mode);
goto out;
fail:
nfs_client_close(client);
out:
if (qp) {
query_params_free(qp);
}
uri_free(uri);
g_free(file);
return ret;
}
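A typical source string accepted here is nfs://server/export/disk.img?uid=500&gid=500: the path is split at its last slash into the mount point (/export) and the file name (/disk.img), and the query may also carry tcp-syncnt= or, with a sufficiently new libnfs, readahead=; any other parameter name, or a non-numeric value, is rejected.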
static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp) {
NFSClient *client = bs->opaque;
int64_t ret;
QemuOpts *opts;
Error *local_err = NULL;
client->aio_context = bdrv_get_aio_context(bs);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
}
ret = nfs_client_open(client, qemu_opt_get(opts, "filename"),
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
errp);
if (ret < 0) {
goto out;
}
bs->total_sectors = ret;
ret = 0;
out:
qemu_opts_del(opts);
return ret;
}
static QemuOptsList nfs_create_opts = {
.name = "nfs-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(nfs_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{ /* end of list */ }
}
};
static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
{
int ret = 0;
int64_t total_size = 0;
NFSClient *client = g_new0(NFSClient, 1);
client->aio_context = qemu_get_aio_context();
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
ret = nfs_client_open(client, url, O_CREAT, errp);
if (ret < 0) {
goto out;
}
ret = nfs_ftruncate(client->context, client->fh, total_size);
nfs_client_close(client);
out:
g_free(client);
return ret;
}
static int nfs_has_zero_init(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
return client->has_zero_init;
}
static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
NFSRPC task = {0};
struct stat st;
task.st = &st;
if (nfs_fstat_async(client->context, client->fh, nfs_co_generic_cb,
&task) != 0) {
return -ENOMEM;
}
while (!task.complete) {
nfs_set_events(client);
aio_poll(client->aio_context, true);
}
return (task.ret < 0 ? task.ret : st.st_blocks * st.st_blksize);
}
static int nfs_file_truncate(BlockDriverState *bs, int64_t offset)
{
NFSClient *client = bs->opaque;
return nfs_ftruncate(client->context, client->fh, offset);
}
static BlockDriver bdrv_nfs = {
.format_name = "nfs",
.protocol_name = "nfs",
.instance_size = sizeof(NFSClient),
.bdrv_needs_filename = true,
.create_opts = &nfs_create_opts,
.bdrv_has_zero_init = nfs_has_zero_init,
.bdrv_get_allocated_file_size = nfs_get_allocated_file_size,
.bdrv_truncate = nfs_file_truncate,
.bdrv_file_open = nfs_file_open,
.bdrv_close = nfs_file_close,
.bdrv_create = nfs_file_create,
.bdrv_co_readv = nfs_co_readv,
.bdrv_co_writev = nfs_co_writev,
.bdrv_co_flush_to_disk = nfs_co_flush,
.bdrv_detach_aio_context = nfs_detach_aio_context,
.bdrv_attach_aio_context = nfs_attach_aio_context,
};
static void nfs_block_init(void)
{
bdrv_register(&bdrv_nfs);
}
block_init(nfs_block_init);
