Compare commits


60 Commits

Michael Roth
32d24131b2 Update version for 2.4.1 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-31 12:39:47 -05:00
Pavel Butsykin
fc63922556 virtio: sync the dataplane vring state to the virtqueue before virtio_save
When creating a snapshot with the dataplane enabled, the snapshot file
does not get the actual state of the virtqueue, because the current
state is stored in VirtIOBlockDataPlane. Therefore, before saving a
snapshot we need to sync the dataplane vring state to the virtqueue.
The dataplane will resume its work at the next virtqueue notification.

When the snapshot is loaded with loadvm, we get the message:
VQ 0 size 0x80 Guest index 0x15f5 inconsistent with Host index 0x0:
    delta 0x15f5
error while loading state for instance 0x0 of device
    '0000:00:08.0/virtio-blk'
Error -1 while loading VM state

To reproduce the error I used the following HMP commands:
savevm snap1
loadvm snap1

qemu parameters:
--enable-kvm -smp 4 -m 1024 -drive file=/var/lib/libvirt/images/centos6.4.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0 -set device.virtio-disk0.x-data-plane=on

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1445859777-2982-1-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 10a06fd65f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-31 12:39:47 -05:00
Max Filippov
36e1eee760 target-xtensa: add window overflow check to L32E/S32E
Although the primary use of L32E and S32E is in window underflow and
overflow exception handlers, they are just normal instructions, and
thus need to check for window overflow.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit f822b7e497)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-31 12:35:17 -05:00
Michael S. Tsirkin
9137bd24c8 net: don't set native endianness
commit 5be7d9f1b1
    vhost-net: tell tap backend about the vnet endianness
makes vhost-net always try to set LE, even if that matches the
native endianness.

This makes it fail on older kernels on x86 without TUNSETVNETLE support.

To fix, make qemu_set_vnet_le/qemu_set_vnet_be skip the
ioctl if it matches the host endianness.

Reported-by: Marcel Apfelbaum <marcel@redhat.com>
Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
(cherry picked from commit 052bd52fa9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-31 12:35:16 -05:00
Markus Armbruster
08231cbb76 device-introspect-test: New, covering device introspection
The test doesn't check that the output makes any sense, only that QEMU
survives.  Useful since we've had an astounding number of crash bugs
around there.

In fact, we have a bunch of them right now: a few devices crash or
hang, and some leave dangling pointers behind.  The test skips testing
the broken parts.  The next commits will fix them up, and drop the
skipping.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1443689999-12182-8-git-send-email-armbru@redhat.com>
(cherry picked from commit 2d1abb850f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-28 14:50:11 -05:00
Markus Armbruster
70a4483abb libqtest: New hmp() & friends
New convenience function hmp() to facilitate use of
human-monitor-command in tests.  Use it to simplify its existing uses.

To blend into existing libqtest code, also add qtest_hmpv() and
qtest_hmp().  That, and the egregiously verbose GTK-Doc comment format
make this patch look bigger than it is.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1443689999-12182-7-git-send-email-armbru@redhat.com>
(cherry picked from commit 5fb48d9673)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-28 14:50:05 -05:00
Markus Armbruster
39809852a7 tests: Fix how qom-test is run
We want to run qom-test for every architecture, without having to
manually add it to every architecture's list of tests.  Commit 3687d53
accomplished this by adding it to every architecture's list
automatically.

However, some architectures inherit their tests from others, like this:

    check-qtest-x86_64-y = $(check-qtest-i386-y)
    check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
    check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)

For such architectures, we ended up running the (slow!) test twice.
Commit 2b8419c attempted to avoid this by adding the test only when
it's not already present.  That works only as long as we consider
adding the test to the architectures on the left hand side *after*
the ones on the right hand side: x86_64 after i386, microblazeel
after microblaze, xtensaeb after xtensa.

It turns out we consider them in $(SYSEMU_TARGET_LIST) order, which is defined as

    SYSEMU_TARGET_LIST := $(subst -softmmu.mak,,$(notdir \
       $(wildcard $(SRC_PATH)/default-configs/*-softmmu.mak)))

On my machine, this results in the order xtensa, x86_64, microblazeel,
microblaze, i386.  Consequently, qom-test runs twice for microblazeel
and x86_64.

Replace this complex and flawed machinery with a much simpler one: add
generic tests (currently just qom-test) to check-qtest-generic-y
instead of check-qtest-$(target)-y for every target, then run
$(check-qtest-generic-y) for every target.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-Id: <1443689999-12182-5-git-send-email-armbru@redhat.com>
(cherry picked from commit e253c28715)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-28 14:50:00 -05:00
Paolo Bonzini
db97d9d886 macio: move DBDMA_init from instance_init to realize
DBDMA_init is not idempotent, and calling it from instance_init
breaks a simple object_new/object_unref pair.  Work around this,
pending qdev-ification of DBDMA, by moving the call to realize.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1443689999-12182-4-git-send-email-armbru@redhat.com>
(cherry picked from commit c710440235)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-28 14:49:54 -05:00
Paolo Bonzini
243b80c9c5 hw: do not pass NULL to memory_region_init from instance_init
This causes the region to outlive the object, because it attaches the
region to /machine.  This is not nice for the "realize" method, but
much worse for "instance_init" because it can cause dangling pointers
after a simple object_new/object_unref pair.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1443689999-12182-3-git-send-email-armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 81e0ab48dd)

Conflicts:
	hw/display/cg3.c
	hw/display/tcx.c

* removed context dependencies on &error_fatal/&error_abort

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-28 14:48:40 -05:00
Paolo Bonzini
91232d98da memory: allow destroying a non-empty MemoryRegion
This is legal; the MemoryRegion will simply unreference all the
existing subregions and possibly bring them down with it as well.
However, it requires a bit of care to avoid an infinite loop.
Finalizing a memory region cannot trigger an address space update,
but memory_region_del_subregion errs on the side of caution and
might trigger a spurious update: avoid that by resetting mr->enabled
first.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1443689999-12182-2-git-send-email-armbru@redhat.com>
(cherry picked from commit 2e2b8eb70f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-28 14:46:06 -05:00
Markus Armbruster
d68ba3cab3 update-linux-headers: Rename SW_MAX to SW_MAX_
The next commit will compile hw/input/virtio-input.c and
hw/input/virtio-input-hid.c even when CONFIG_LINUX is off.  These
files include both "include/standard-headers/linux/input.h" and
<windows.h> then.  Doesn't work, because both define SW_MAX.  We don't
actually use it.  Patch input.h to define SW_MAX_ instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1444320700-26260-2-git-send-email-armbru@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ac98fa849e)

Conflicts:
	scripts/update-linux-headers.sh

* remove dependency on eddb4de3

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-28 14:45:14 -05:00
Paolo Bonzini
381a290266 trace: remove malloc tracing
The malloc vtable is not supported anymore in glib, because it broke
when constructors called g_malloc.  Remove tracing of g_malloc,
g_realloc and g_free calls.

Note that, for systemtap users, glib also provides tracepoints
glib.mem_alloc, glib.mem_free, glib.mem_realloc, glib.slice_alloc
and glib.slice_free.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1442417924-25831-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 98cf48f60a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-28 14:34:11 -05:00
Jason Wang
696317f189 virtio-net: correctly drop truncated packets
When a packet is truncated during receiving, we drop the packet but
neither discard the descriptor nor add and signal a used
descriptor. This leads to several issues:

- sg mappings are leaked
- rx will stall if a lot of packets are truncated

In order to be consistent with vhost, fix by discarding the descriptor
in this case.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 0cf33fb6b4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 13:40:28 -05:00
Jason Wang
c2a550d3df virtio: introduce virtqueue_discard()
This patch introduces virtqueue_discard() to discard a descriptor and
unmap the sgs. This will be used by the patch that discards the
descriptor when a packet is truncated.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 29b9f5efd7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 13:40:24 -05:00
Jason Wang
a64d4cafa9 virtio: introduce virtqueue_unmap_sg()
Factor out the sg unmapping logic. This will be reused by the patch
that discards descriptors.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Andrew James <andrew.james@hpe.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit ce31746157)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 13:40:18 -05:00
Gerd Hoffmann
2f99c80963 virtio-input: ignore events until the guest driver is ready
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d9460a7557)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 12:35:21 -05:00
Dr. David Alan Gilbert
f62c10bd20 Migration: Generate the completed event only when we complete
The current migration-completed event is generated a bit too early,
which means that an eager libvirt that's ready to go as soon
as it sees the event ends up racing with the actual end of migration.

This corresponds to RH bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1271145

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ed1f3e0090)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 11:28:59 -05:00
Tony Krowiak
8c4fa92d01 util/qemu-config: fix missing machine command line options
Commit 0a7cf217 ("util/qemu-config: fix regression of
qmp_query_command_line_options") aimed to restore parsing of global
machine options, but missed two: "aes-key-wrap" and
"dea-key-wrap" (which were present in the initial version of that
patch). Let's add them to the machine_opts again.

Fixes: 0a7cf217 ("util/qemu-config: fix regression of
                  qmp_query_command_line_options")
CC: Marcel Apfelbaum <marcel@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1444664181-28023-1-git-send-email-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>

(cherry picked from commit 5bcfa0c543)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 11:27:15 -05:00
Christian Borntraeger
7c22dcdeb8 s390x/kvm: Fix vector validity bit in device machine checks
Device hotplugs trigger a crw machine check. All machine checks
have validity bits for certain register types. With vector support
we also have to claim that vector registers are valid.
This is a band-aid suitable for stable. Long term, we should
create the full mcic value dynamically depending on the active
features in the kernel interrupt handler.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 2ab75df38e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 11:26:52 -05:00
Peter Crosthwaite
16514367ef misc: zynq_slcr: Fix MMIO writes
The /4 offset calculation in MMIO writes was happening twice, giving
wrong write offsets. Fix that.

While touching the code, change the if-else to be a short returning if
and convert the debug message to a GUEST_ERROR, which is more accurate
for this condition.

Cc: qemu-stable@nongnu.org
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c209b05372)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 11:17:10 -05:00
Markus Armbruster
55b4efb034 Revert "qdev: Use qdev_get_device_class() for -device <type>,help"
This reverts commit 31bed5509d.

The reverted commit changed qdev_device_help() to reject abstract
devices and devices that have cannot_instantiate_with_device_add_yet
set, to fix crash bugs like -device x86_64-cpu,help.

Rejecting abstract devices makes sense: they're purely internal, and
the implementation of the help feature can't cope with them.

Rejecting non-pluggable devices makes less sense: even though you
can't use them with -device, the help may still be useful elsewhere,
for instance with -global.  This is a regression: -device FOO,help
used to help even for FOO that aren't pluggable.

The previous two commits fixed the crash bug at a lower layer, so
reverting this one is now safe.  Fixes the -device FOO,help
regression, except for the broken devices marked
cannot_even_create_with_object_new_yet.  For those, the error message
is improved.

Example of a device where the regression is fixed:

    $ qemu-system-x86_64 -device PIIX4_PM,help
    PIIX4_PM.command_serr_enable=bool (on/off)
    PIIX4_PM.multifunction=bool (on/off)
    PIIX4_PM.rombar=uint32
    PIIX4_PM.romfile=str
    PIIX4_PM.addr=int32 (Slot and optional function number, example: 06.0 or 06)
    PIIX4_PM.memory-hotplug-support=bool
    PIIX4_PM.acpi-pci-hotplug-with-bridge-support=bool
    PIIX4_PM.s4_val=uint8
    PIIX4_PM.disable_s4=uint8
    PIIX4_PM.disable_s3=uint8
    PIIX4_PM.smb_io_base=uint32

Example of a device where it isn't fixed:

    $ qemu-system-x86_64 -device host-x86_64-cpu,help
    Can't list properties of device 'host-x86_64-cpu'

Both failed with "Parameter 'driver' expects pluggable device type"
before.

Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1443689999-12182-11-git-send-email-armbru@redhat.com>
(cherry picked from commit 33fe968330)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 10:52:19 -05:00
Markus Armbruster
2874c6565e qdev: Protect device-list-properties against broken devices
Several devices don't survive object_unref(object_new(T)): they crash
or hang during cleanup, or they leave dangling pointers behind.

This breaks at least device-list-properties, because
qmp_device_list_properties() needs to create a device to find its
properties.  Broken in commit f4eb32b "qmp: show QOM properties in
device-list-properties", v2.1.  Example reproducer:

    $ qemu-system-aarch64 -nodefaults -display none -machine none -S -qmp stdio
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 4, "major": 2}, "package": ""}, "capabilities": []}}
    { "execute": "qmp_capabilities" }
    {"return": {}}
    { "execute": "device-list-properties", "arguments": { "typename": "pxa2xx-pcmcia" } }
    qemu-system-aarch64: /home/armbru/work/qemu/memory.c:1307: memory_region_finalize: Assertion `((&mr->subregions)->tqh_first == ((void *)0))' failed.
    Aborted (core dumped)
    [Exit 134 (SIGABRT)]

Unfortunately, I can't fix the problems in these devices right now.
Instead, add DeviceClass member cannot_destroy_with_object_finalize_yet
to mark them:

* Hang during cleanup (didn't debug, so I can't say why):
  "realview_pci", "versatile_pci".

* Dangling pointer in cpus: most CPUs, plus "allwinner-a10", "digic",
  "fsl,imx25", "fsl,imx31", "xlnx,zynqmp", because they create such
  CPUs

* Assert kvm_enabled(): "host-x86_64-cpu", host-i386-cpu",
  "host-powerpc64-cpu", "host-embedded-powerpc-cpu",
  "host-powerpc-cpu" (the powerpc ones can't currently reach the
  assertion, because the CPUs are only registered when KVM is enabled,
  but the assertion is arguably in the wrong place all the same)

Make qmp_device_list_properties() fail cleanly when the device is so
marked.  This improves device-list-properties from "crashes, hangs or
leaves dangling pointers behind" to "fails".  Not a complete fix, just
a better-than-nothing work-around.  In the above reproducer,
device-list-properties now fails with "Can't list properties of device
'pxa2xx-pcmcia'".

This also protects -device FOO,help, which uses the same machinery
since commit ef52358 "qdev-monitor: include QOM properties in -device
FOO, help output", v2.2.  Example reproducer:

    $ qemu-system-aarch64 -machine none -device pxa2xx-pcmcia,help

Before:

    qemu-system-aarch64: .../memory.c:1307: memory_region_finalize: Assertion `((&mr->subregions)->tqh_first == ((void *)0))' failed.

After:

    Can't list properties of device 'pxa2xx-pcmcia'

Cc: "Andreas Färber" <afaerber@suse.de>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Anthony Green <green@moxielogic.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Jia Liu <proljc@gmail.com>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: qemu-ppc@nongnu.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1443689999-12182-10-git-send-email-armbru@redhat.com>
(cherry picked from commit 4c315c2766)

Conflicts:
	hw/arm/fsl-imx25.c
	hw/arm/fsl-imx31.c
	target-tilegx/cpu.c
	tests/device-introspect-test.c

* removed hunks pertaining to devices/tests not in 2.4

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 10:49:58 -05:00
Markus Armbruster
2d0583fc79 qmp: Fix device-list-properties not to crash for abstract device
Broken in commit f4eb32b "qmp: show QOM properties in
device-list-properties", v2.1.

Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-Id: <1443689999-12182-9-git-send-email-armbru@redhat.com>
(cherry picked from commit edb1523d90)

Conflicts:
	tests/device-introspect-test.c

* removed hunk specific to QAPI introspection (not in 2.4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 10:46:48 -05:00
Fam Zheng
40161bf27b vmxnet3: Drop net_vmxnet3_info.can_receive
Commit 6e99c63 ("net/socket: Drop net_socket_can_send") changed the
semantics around .can_receive for sockets to now require the device to
flush queued pkts when transitioning to a .can_receive=true state. But
it's OK to drop incoming packets when the link is not active.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2734a20b81)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 10:42:31 -05:00
Jason Wang
2935ae915a virtio-net: unbreak self announcement and guest offloads after migration
After commit 019a3edbb2 ("virtio: make
features 64bit wide"), the device's guest_features was actually set
after vdc->load(). This breaks the assumption that a device-specific
load() function can check guest_features. For virtio-net, self
announcement and guest offloads won't work after migration.

Fix this by deferring them to virtio_net_load(), where guest_features
is guaranteed to be set. Other virtio devices look fine.

Fixes: 019a3edbb2
       ("virtio: make features 64bit wide")
Cc: qemu-stable@nongnu.org
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>

(cherry picked from commit 1f8828ef57)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 10:36:28 -05:00
Cornelia Huck
2f3c310818 virtio: avoid leading underscores for helpers
Commit ef546f1275 ("virtio: add
feature checking helpers") introduced a helper __virtio_has_feature.
We don't want to use reserved identifiers, though, so let's
rename __virtio_has_feature to virtio_has_feature and virtio_has_feature
to virtio_vdev_has_feature.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 95129d6fc9)
* prereq for 1f8828e
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-21 10:36:05 -05:00
Aurelien Jarno
1f21d3b8dc target-ppc: fix xscmpodp and xscmpudp decoding
The xscmpodp and xscmpudp instructions only have the AX, BX bits in
their encoding; the lowest bit (usually TX) is marked as an invalid
bit. We therefore can't decode them with GEN_XX2FORM, which decodes
the two lowest bits.

Introduce a new form GEN_XX2IFORM, which decodes AX and BX and marks
the lowest bit as invalid.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 8f60f8e2e5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 23:04:01 -05:00
Aurelien Jarno
bac9ce97d3 target-ppc: fix vcipher, vcipherlast, vncipherlast and vpermxor
For vector instructions, the helpers get pointers to the vector
registers as arguments. Some operands might point to the same
register, including the operand holding the result.

When emulating instructions which access the vector elements in a
non-linear way, we need to store the result in a temporary variable.

This fixes openssl when emulating a POWER8 CPU.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 65cf1f65be)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 23:03:39 -05:00
James Hogan
33fca8589c tcg/mips: Fix clobbering of qemu_ld inputs
The MIPS TCG backend implements qemu_ld with 64-bit targets using the v0
register (base) as a temporary to load the upper half of the QEMU TLB
comparator (see line 5 below), however this happens before the input
address is used (line 8 to mask off the low bits for the TLB
comparison, and line 12 to add the host-guest offset). If the input
address (addrl) also happens to have been placed in v0 (as in the second
column below), it gets clobbered before it is used.

     addrl in t2              addrl in v0

 1 srl     a0,t2,0x7        srl     a0,v0,0x7
 2 andi    a0,a0,0x1fe0     andi    a0,a0,0x1fe0
 3 addu    a0,a0,s0         addu    a0,a0,s0
 4 lw      at,9136(a0)      lw      at,9136(a0)      set TCG_TMP0 (at)
 5 lw      v0,9140(a0)      lw      v0,9140(a0)      set base (v0)
 6 li      t9,-4093         li      t9,-4093
 7 lw      a0,9160(a0)      lw      a0,9160(a0)      set addend (a0)
 8 and     t9,t9,t2         and     t9,t9,v0         use addrl
 9 bne     at,t9,0x836d8c8  bne     at,t9,0x836d838  use TCG_TMP0
10  nop                      nop
11 bne     v0,t8,0x836d8c8  bne     v0,a1,0x836d838  use base
12  addu   v0,a0,t2          addu   v0,a0,v0         use addrl, addend
13 lw      t0,0(v0)         lw      t0,0(v0)

Fix by using TCG_TMP0 (at) as the temporary instead of v0 (base),
pushing the load on line 5 forward into the delay slot of the low
comparison (line 10). The early load of the addend on line 7 also needs
pushing even further for 64-bit targets, or it will clobber a0 before
we're done with it. The output for 32-bit targets is unaffected.

 srl     a0,v0,0x7
 andi    a0,a0,0x1fe0
 addu    a0,a0,s0
 lw      at,9136(a0)
-lw      v0,9140(a0)      load high comparator
 li      t9,-4093
-lw      a0,9160(a0)      load addend
 and     t9,t9,v0
 bne     at,t9,0x836d838
- nop
+ lw     at,9140(a0)      load high comparator
+lw      a0,9160(a0)      load addend
-bne     v0,a1,0x836d838
+bne     at,a1,0x836d838
  addu   v0,a0,v0
 lw      t0,0(v0)

Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 5eb4f645eb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 23:00:05 -05:00
Markus Armbruster
a479b21c11 qom: Fix invalid error check in property_get_str()
When a function returns a null pointer on error and only on error, you
can do

    if (!foo(foos, errp)) {
        ... handle error ...
    }

instead of the more cumbersome

    Error *err = NULL;

    if (!foo(foos, &err)) {
        error_propagate(errp, err);
        ... handle error ...
    }

A StringProperty's getter, however, may return null on success!  We
then fail to call visit_type_str().

Screwed up in 6a146eb, v1.1.

Fails tests/qom-test in my current, heavily hacked QAPI branch.  No
reproducer for master known (but I didn't look hard).

Cc: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit e1c8237df5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 22:57:42 -05:00
Markus Armbruster
d11ff15fd5 qom: Do not reuse errp after a possible error
The argument for an Error **errp parameter must point to a null
pointer.  If it doesn't, and an error happens, error_set() fails its
assertion.

Instead of

    foo(foos, errp);
    bar(bars, errp);

you need to do something like

    Error *err = NULL;

    foo(foos, &err);
    if (err) {
        error_propagate(errp, err);
        goto out;
    }

    bar(bars, errp);
out:

Screwed up in commit 0e55884 (v1.3.0): property_get_bool().

Screwed up in commit 1f21772 (v2.1.0): object_property_get_enum() and
object_property_get_uint16List().

Screwed up in commit a8e3fbe (v2.4.0): property_get_enum(),
property_set_enum().

Found by inspection, no actual crashes observed.

Fix them up.

Cc: Anthony Liguori <anthony@codemonkey.ws>
Cc: Hu Tao <hutao@cn.fujitsu.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 4715d42efe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 22:57:17 -05:00
John Snow
1b8e1f7ad9 ide: unify io_buffer_offset increments
IDEState's io_buffer_offset was originally added to track offsets used
almost exclusively by AHCI, but it was added to IDEState instead of an
AHCI-specific structure.

AHCI fakes all PIO transfers using DMA and a scatter-gather list. When
the core or atapi layers invoke HBA-specific mechanisms for transfers,
they do not always know that it is being backed by DMA or a sglist, so
this offset is not always updated by the HBA code everywhere.

If we modify it in dma_buf_commit, however, any HBA that needs to use
this offset to manage operating on only part of a sglist will have
access to it.

This will fix ATAPI PIO transfers performed through the AHCI HBA,
which were previously not modifying this value appropriately.

This will fix ATAPI PIO transfers larger than one sector.

Reported-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1440546331-29087-2-git-send-email-jsnow@redhat.com
CC: qemu-stable@nongnu.org
(cherry picked from commit aaeda4a3c9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 22:56:27 -05:00
Stefan Weil
e00bf9ee70 slirp: Fix non blocking connect for w32
Signed-off-by: Stefan Weil <sw@weilnetz.de>
(cherry picked from commit a246a01631)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 22:55:32 -05:00
Wen Congyang
78aeb6984c nbd: release exp->blk after all clients are closed
If the socket fd is shut down, there may be some data which was received
before the shutdown. We will read that data and do read/write in nbd_trip(),
but the exp's blk is NULL, which causes QEMU to crash.

Reported-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Message-Id: <55F929E2.1020501@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d626834849)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 22:55:05 -05:00
Michael Roth
6d62d0e3dd spapr_pci: fix device tree props for MSI/MSI-X
PAPR requires ibm,req#msi and ibm,req#msi-x to be present in the
device node to define the number of msi/msi-x interrupts the device
supports, respectively.

Currently we have ibm,req#msi-x hardcoded to a nonsensical constant
that happens to be 2, and are missing ibm,req#msi entirely. The result
of that is that msi-x capable devices get limited to 2 msi-x
interrupts (which can impact performance), and msi-only devices likely
wouldn't work at all. Additionally, if devices expect a minimum that
exceeds 2, the guest driver may fail to load entirely.

SLOF still owns the generation of these properties at boot-time
(although other device properties have since been offloaded to QEMU),
but for hotplugged devices we rely on the values generated by QEMU
and thus hit the limitations above.

Fix this by generating these properties in QEMU as expected by guests.

In the future it may make sense to modify SLOF to pass through these
values directly as we do with other props since we're duplicating SLOF
code.

Cc: qemu-ppc@nongnu.org
Cc: qemu-stable@nongnu.org
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit a8ad731a00)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 22:53:50 -05:00
Alberto Garcia
5644f6f924 gtk: use setlocale() for LC_MESSAGES only
The QEMU code is not internationalized and assumes that it runs under
the C locale, but if we use the GTK+ UI we'll end up importing the
locale settings from the environment. This can break things, such as
the JSON generator and iotest 120 in locales that use a decimal comma.

We do however have translations for a few simple strings for the GTK+
menu items, so in order to run QEMU using the C locale, and yet have a
translated UI, let's use setlocale() for LC_MESSAGES only.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2cb5d2a47c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 22:51:42 -05:00
John Snow
63d761388d ide: fix ATAPI command permissions
We're a little too lenient with what we'll let an ATAPI drive handle.
Clamp down on the IDE command execution table to remove CD_OK permissions
from commands that are not and have never been ATAPI commands.

For ATAPI command validity, please see:
- ATA4 Section 6.5 ("PACKET Command feature set")
- ATA8/ACS Section 4.3 ("The PACKET feature set")
- ACS3 Section 4.3 ("The PACKET feature set")

ACS3 has a historical command validity table in Table B.4
("Historical Command Assignments") that can be referenced to find when
a command was introduced, deprecated, obsoleted, etc.

The only reference for ATAPI command validity is by checking that
version's PACKET feature set section.

ATAPI was introduced by T13 into ATA4; all commands retired prior to ATA4
are therefore assumed to have never been ATAPI commands.

Mandatory commands, as listed in ATA8-ACS3, are:

- DEVICE RESET
- EXECUTE DEVICE DIAGNOSTIC
- IDENTIFY DEVICE
- IDENTIFY PACKET DEVICE
- NOP
- PACKET
- READ SECTOR(S)
- SET FEATURES

Optional commands as listed in ATA8-ACS3, are:

- FLUSH CACHE
- READ LOG DMA EXT
- READ LOG EXT
- WRITE LOG DMA EXT
- WRITE LOG EXT

All other commands are illegal to send to an ATAPI device and should
be rejected by the device.

CD_OK removal justifications:

0x06 WIN_DSM              Defined in ACS2. Not valid for ATAPI.
0x21 WIN_READ_ONCE        Retired in ATA5. Not ATAPI in ATA4.
0x94 WIN_STANDBYNOW2      Retired in ATA4. Did not coexist with ATAPI.
0x95 WIN_IDLEIMMEDIATE2   Retired in ATA4. Did not coexist with ATAPI.
0x96 WIN_STANDBY2         Retired in ATA4. Did not coexist with ATAPI.
0x97 WIN_SETIDLE2         Retired in ATA4. Did not coexist with ATAPI.
0x98 WIN_CHECKPOWERMODE2  Retired in ATA4. Did not coexist with ATAPI.
0x99 WIN_SLEEPNOW2        Retired in ATA4. Did not coexist with ATAPI.
0xE0 WIN_STANDBYNOW1      Not part of ATAPI in ATA4, ACS or ACS3.
0xE1 WIN_IDLEIMMDIATE     Not part of ATAPI in ATA4, ACS or ACS3.
0xE2 WIN_STANDBY          Not part of ATAPI in ATA4, ACS or ACS3.
0xE3 WIN_SETIDLE1         Not part of ATAPI in ATA4, ACS or ACS3.
0xE4 WIN_CHECKPOWERMODE1  Not part of ATAPI in ATA4, ACS or ACS3.
0xE5 WIN_SLEEPNOW1        Not part of ATAPI in ATA4, ACS or ACS3.
0xF8 WIN_READ_NATIVE_MAX  Obsoleted in ACS3. Not ATAPI in ATA4 or ACS.

This patch fixes a divide by zero fault that can be caused by sending
the WIN_READ_NATIVE_MAX command to an ATAPI drive, which causes it to
attempt to use zeroed CHS values to perform sector arithmetic.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1441816082-21031-1-git-send-email-jsnow@redhat.com
CC: qemu-stable@nongnu.org
(cherry picked from commit d9033e1d3a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 22:47:36 -05:00
Max Reitz
c13b1c8314 qcow2: Make size_to_clusters() return uint64_t
Sadly, some images may have more clusters than what can be represented
using a plain int. We should be prepared for that case (in
qcow2_check_refcounts() we actually were trying to catch that case, but
since size_to_clusters() truncated the returned value, that check never
did anything useful).

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b6d36def6d)

Conflicts:
	block/qcow2-cluster.c
	block/qcow2.h

* removed context dependency on ff99129a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 20:18:17 -05:00
Richard Henderson
052677b2c8 target-arm: Share all common TCG temporaries
This is a bug fix for aarch64.  At present, we have branches using
the 32-bit (translate.c) versions of cpu_[NZCV]F, but we set the flags
using the 64-bit (translate-a64.c) versions of cpu_[NZCV]F.  From
the view of the TCG code generator, these are unrelated variables.

The bug is hard to see because we currently only read these variables
from branches, and upon reaching a branch TCG will first spill live
variables and then reload the arguments of the branch.  Since the
32-bit versions were never live until reaching the branch, we'd re-read
the data that had just been spilled from the 64-bit versions.

There is currently no such problem with the cpu_exclusive_* variables,
but there's no point in tempting fate.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1441909103-24666-2-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 78bcaa3e37)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 20:12:11 -05:00
Pierre Morel
0fdf9f756f virtio dataplane: adapt dataplane for virtio Version 1
Let dataplane allocate different region for the desc/avail/used
ring regions.
Take VIRTIO_RING_F_EVENT_IDX into account to increase the used/avail
rings accordingly.

[Fix 32-bit builds by changing 16lx format specifier to HWADDR_PRIx.
--Stefan]

Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-id: 1441625636-23773-1-git-send-email-pmorel@linux.vnet.ibm.com
(changed __virtio16 into uint16_t,
 map descriptor table and available ring read-only)
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

(cherry picked from commit a9718ef000)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 20:08:04 -05:00
Aníbal Limón
d077545dfe cpus.c: qemu_mutex_lock_iothread fix race condition at cpu thread init
When QEMU starts, the RCU thread executes qemu_mutex_lock_iothread,
causing the error "qemu:qemu_cpu_kick_thread: No such process", and QEMU
exits.

This doesn't occur frequently, but in glibc the thread id can exist
before the thread is actually in an active/running state. If a sleep(1)
is inserted after the newthread assignment [1], the issue appears.

So don't assume the thread exists just because first_cpu->thread is set;
instead validate against cpu->created, which is set by the CPU threads
themselves (kvm, tcg, dummy).

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/pthread_create.c;h=d10f4ea8004e1d8f3a268b95cc0f8d93b8d89867;hb=HEAD#l621

Cc: qemu-stable@nongnu.org
Signed-off-by: Aníbal Limón <anibal.limon@linux.intel.com>
Message-Id: <1441313313-3040-1-git-send-email-anibal.limon@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 46036b2462)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 20:00:17 -05:00
Vladislav Yasevich
f6737604da rtl8139: Do not consume the packet during overflow in standard mode.
When operating in standard mode, we currently return the size of the
packet during a buffer overflow, which consumes the overflowing packet.
Return 0 instead so that we can re-process the overflow packet when we
have room.

This fixes issues with lost/dropped fragments of large messages.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Message-id: 1441121206-6997-3-git-send-email-vyasevic@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 26c4e7ca72)
*removed dependency on b76f21a7
*removed context dependency on 4cbea598
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 19:57:30 -05:00
Vladislav Yasevich
d2b0f96fe2 rtl8139: Fix receive buffer overflow check
rtl8139_do_receive() tries to check for the overflow condition
by making sure that packet_size + 8 does not exceed the
available buffer space.  The issue here is that RxBuffAddr,
used to calculate available buffer space, is aligned to a
4 byte boundary after every update.  So it is possible that
every packet ends up being slightly padded when written
to the receive buffer.  This padding is not taken into
account when checking for overflow, and we may end up missing
the overflow condition, causing a buffer overwrite.

This patch takes alignment into consideration when
checking for overflow condition.

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Message-id: 1441121206-6997-2-git-send-email-vyasevic@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit fabdcd3392)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-20 13:30:05 -05:00
Cornelia Huck
a00431853f s390x/css: start with cleared cstat/dstat
When executing the start function, we should start with a clear state
regarding subchannel and device status; it is easy to forget updating one
of them after the ccw has been processed.

Note that we don't need to care about resetting the various control
fields: They are cleared by tsch(), and if they were still pending,
we wouldn't be able to execute the start function in the first
place.

Also note that we don't want to clear cstat/dstat if a suspended
subchannel is resumed.

This fixes a bug where we would continue to present channel-program
check in cstat even though later ccw requests for the subchannel
finished without error (i.e. cstat should be 0).

Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
(cherry picked from commit 6b7741c2be)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 23:20:42 -05:00
Alexander Graf
b51715e1c0 PPC: E500: Update u-boot to commit 79c884d7e4
The current U-Boot binary in QEMU has a bug where it fails to support
dynamic CCSR addressing. Without this support, u-boot can not boot the
ppce500 machine anymore. This has been fixed upstream in u-boot commit
e834975b.

Update the u-boot blob we carry in QEMU to the latest u-boot upstream,
so that we can successfully run u-boot with the ppce500 machine again.

CC: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
Tested-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit d4574435a6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 23:20:08 -05:00
Michael S. Tsirkin
267bc47438 scripts/dump-guest-memory.py: fix after RAMBlock change
commit 9b8424d573
    "exec: split length -> used_length/max_length"
changed field names in struct RAMBlock.

It turns out that scripts/dump-guest-memory.py was
poking at this field; update it accordingly.

Cc: qemu-stable@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <1440666378-3152-1-git-send-email-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0c71d41e2a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 23:17:43 -05:00
Gonglei
955ff148de vhost-scsi: fix wrong vhost-scsi firmware path
vhost-scsi bootindex doesn't work because QEMU passes the
wrong firmware path to SeaBIOS.

before:
  /pci@i0cf8/scsi@7channel@0/vhost-scsi@0,0
after applying the patch:
  /pci@i0cf8/scsi@7/channel@0/vhost-scsi@0,0

Reported-by: Subo <subo7@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1440553971-11108-1-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f42bf6a262)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 23:15:39 -05:00
Mark Cave-Ayland
71b685832d mac_dbdma: always clear FLUSH bit once DBDMA channel flush is complete
The code to flush the DBDMA channel was effectively duplicated in
dbdma_control_write(), except for the fact that the copy executed outside of a
RUN bit transition was broken by not clearing the FLUSH bit once the flush was
complete.

Newer PPC Linux kernels would timeout waiting for the FLUSH bit to clear again
after submitting a FLUSH command. Fix this by always clearing the FLUSH bit
once the channel flush is complete and removing the repeated code.

Reported-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 1cde732d88)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 18:19:14 -05:00
Max Reitz
9a20ccaecd qemu-img: Fix crash in amend invocation
Example:
$ ./qemu-img create -f qcow2 /tmp/t.qcow2 64M
$ ./qemu-img amend -f qcow2 -o backing_file=/tmp/t.qcow2, -o help \
    /tmp/t.qcow2

This should not crash. This actually is tested by iotest 082, but not
caught due to the segmentation fault being silent (which is something
that needs to be fixed, too).

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e814dffcc9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 18:18:12 -05:00
Peter Lieven
d9af73191c block/nfs: fix calculation of allocated file size
st.st_blocks is always counted in 512 byte units. Do not
use st.st_blksize as the multiplier, which may be larger.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1440067607-14547-1-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 055c6f912c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 18:17:25 -05:00
Peter Crosthwaite
637dd0bb7c exec-all: Translate TCI return addresses backwards too
This subtraction of return addresses applies directly to TCI as well as
host-TCG. This fixes Linux boots for at least Microblaze, CRIS, ARM and
SH4 when using TCI.

[sw: Removed indentation for preprocessor statement]
[sw: The patch also fixes Linux boot for x86_64]

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
(cherry picked from commit a17d448274)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 18:15:48 -05:00
Peter Lieven
2ac9fa162e block/iscsi: validate block size returned from target
It has been reported that at least tgtd returns a block size of 0
for LUN 0. To avoid running into a divide by zero later on, and to
protect against other problematic block sizes, validate the block size
right at connection time.

Cc: qemu-stable@nongnu.org
Reported-by: Andrey Korolyov <andrey@xdel.ru>
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1439552016-8557-1-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6d1f252d8c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 18:14:37 -05:00
Peter Maydell
5b7d840e74 target-arm/arm-semi.c: Fix broken SYS_WRITE0 via gdb
A spurious trailing "\n" in the gdb syscall format string used
for SYS_WRITE0 meant that gdb would reject the remote syscall,
with the effect that the output from the guest was silently dropped.
Remove the newline so that gdb accepts the packet.

Cc: qemu-stable@nongnu.org

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 857b55adb7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 18:13:53 -05:00
Kevin Wolf
0de7d2b793 mirror: Fix coroutine reentrance
This fixes a regression introduced by commit dcfb3beb ("mirror: Do zero
write on target if sectors not allocated"), which was reported to cause
aborts with the message "Co-routine re-entered recursively".

The cause for this bug is the following code in mirror_iteration_done():

    if (s->common.busy) {
        qemu_coroutine_enter(s->common.co, NULL);
    }

This has always been ugly because - unlike most places that reenter - it
doesn't have a specific yield that it pairs with, but is more
uncontrolled.  What we really mean here is "reenter the coroutine if
it's in one of the four explicit yields in mirror.c".

This used to be equivalent with s->common.busy because neither
mirror_run() nor mirror_iteration() call any function that could yield.
However since commit dcfb3beb this doesn't hold true any more:
bdrv_get_block_status_above() can yield.

So what happens is that bdrv_get_block_status_above() wants to take a
lock that is already held, so it adds itself to the queue of waiting
coroutines and yields. Instead of being woken up by the unlock function,
however, it gets woken up by mirror_iteration_done(), which is obviously
wrong.

In most cases the code actually happens to cope fairly well with such
cases, but in this specific case, the unlock must already have scheduled
the coroutine for wakeup when mirror_iteration_done() reentered it. And
then the coroutine happened to process the scheduled restarts and tried
to reenter itself recursively.

This patch fixes the problem by pairing the reenter in
mirror_iteration_done() with specific yields instead of abusing
s->common.busy.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1439455310-11263-1-git-send-email-kwolf@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit e424aff5f3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 18:13:21 -05:00
Fam Zheng
f399ea092e scsi-disk: Fix assertion failure on WRITE SAME
The last portion of an unaligned WRITE SAME command could fail the
assertion in bdrv_aligned_pwritev:

    assert(!qiov || bytes == qiov->size);

Because we updated data->iov.iov_len right above this if block, but
data->qiov still has the old size.

Reinitialize the qiov to make them equal and keep block layer happy.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1438159512-3871-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a56537a127)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-10-17 18:03:09 -05:00
Michael Roth
83c92b4514 Update version for 2.4.0.1 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-09-22 16:53:29 -05:00
P J P
5a1ccdfe44 net: avoid infinite loop when receiving packets(CVE-2015-5278)
The Ne2000 NIC uses a ring buffer of NE2000_MEM_SIZE (49152)
bytes to process network packets. While receiving packets
via the ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, leading to an infinite
loop situation.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 737d2b3c41)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-09-21 17:04:22 -05:00
P J P
7aa2bcad0c net: add checks to validate ring buffer pointers(CVE-2015-5279)
The Ne2000 NIC uses a ring buffer of NE2000_MEM_SIZE (49152)
bytes to process network packets. While receiving packets
via the ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, which could lead to a
memory buffer overflow. Other checks were added at initialisation.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9bbdbc66e5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-09-21 17:04:14 -05:00
P J P
3a56af1fbc e1000: Avoid infinite loop in processing transmit descriptor (CVE-2015-6815)
While processing transmit descriptors, the device could enter an
infinite loop if 'bytes' were to become zero; add a check to avoid it.

[The guest can force 'bytes' to 0 by setting the hdr_len and mss
descriptor fields to 0.
--Stefan]

Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1441383666-6590-1-git-send-email-stefanha@redhat.com
(cherry picked from commit b947ac2bf2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-09-21 17:04:05 -05:00
Gerd Hoffmann
efec4dcd25 vnc: fix memory corruption (CVE-2015-5225)
The _cmp_bytes variable added by commit "bea60dd ui/vnc: fix potential
memory corruption issues" can become negative.  The result is (possibly
exploitable) memory corruption.  The reason is that it uses the stride
instead of bytes per scanline to apply limits.

For the server surface this is actually fine.  vnc creates that itself,
there is never any padding, and thus scanline length always equals stride.

For the guest surface, scanline length and stride are typically identical
too, but it doesn't have to be that way.  So add and use a new variable
(guest_ll) for the guest scanline length.  Also rename min_stride to
line_bytes to make clearer what it actually is.  Finally, sprinkle
in an assert() to make sure we never use a negative _cmp_bytes again.

Reported-by: 范祚至(库特) <zuozhi.fzz@alibaba-inc.com>
Reviewed-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit eb8934b041)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2015-09-21 17:03:16 -05:00
3433 changed files with 120357 additions and 301951 deletions

@@ -1,2 +0,0 @@
-((c-mode . ((c-file-style . "stroustrup")
-            (indent-tabs-mode . nil))))

.gitignore

@@ -5,7 +5,6 @@
 /config-target.*
 /config.status
 /config-temp
-/trace-events-all
 /trace/generated-tracers.h
 /trace/generated-tracers.c
 /trace/generated-tracers-dtrace.h
@@ -20,13 +19,12 @@
 /trace/generated-ust.c
 /ui/shader/texture-blit-frag.h
 /ui/shader/texture-blit-vert.h
+/libcacard/trace/generated-tracers.c
 *-timestamp
 /*-softmmu
 /*-darwin-user
 /*-linux-user
 /*-bsd-user
-/ivshmem-client
-/ivshmem-server
 /libdis*
 /libuser
 /linux-headers/asm
@@ -36,7 +34,6 @@
 /qapi-visit.[ch]
 /qapi-event.[ch]
 /qmp-commands.h
-/qmp-introspect.[ch]
 /qmp-marshal.c
 /qemu-doc.html
 /qemu-tech.html
@@ -52,9 +49,7 @@
 /qemu-ga
 /qemu-bridge-helper
 /qemu-monitor.texi
-/qemu-monitor-info.texi
-/qemu-version.h
-/qemu-version.h.tmp
+/qmp-commands.txt
 /vscclient
 /fsdev/virtfs-proxy-helper
 *.[1-9]
@@ -63,7 +58,6 @@
 *.cp
 *.dvi
 *.exe
-*.msi
 *.dll
 *.so
 *.mo
@@ -96,10 +90,6 @@
 /pc-bios/optionrom/linuxboot.bin
 /pc-bios/optionrom/linuxboot.raw
 /pc-bios/optionrom/linuxboot.img
-/pc-bios/optionrom/linuxboot_dma.asm
-/pc-bios/optionrom/linuxboot_dma.bin
-/pc-bios/optionrom/linuxboot_dma.raw
-/pc-bios/optionrom/linuxboot_dma.img
 /pc-bios/optionrom/multiboot.asm
 /pc-bios/optionrom/multiboot.bin
 /pc-bios/optionrom/multiboot.raw
@@ -114,5 +104,4 @@
 cscope.*
 tags
 TAGS
-docker-src.*
 *~

@@ -1,101 +1,103 @@
-sudo: false
 language: c
 python:
   - "2.4"
 compiler:
   - gcc
   - clang
-cache: ccache
-addons:
-  apt:
-    packages:
-      - libaio-dev
-      - libattr1-dev
-      - libbrlapi-dev
-      - libcap-ng-dev
-      - libgnutls-dev
-      - libgtk-3-dev
-      - libiscsi-dev
-      - liblttng-ust-dev
-      - libnfs-dev
-      - libncurses5-dev
-      - libnss3-dev
-      - libpixman-1-dev
-      - libpng12-dev
-      - librados-dev
-      - libsdl1.2-dev
-      - libseccomp-dev
-      - libspice-protocol-dev
-      - libspice-server-dev
-      - libssh2-1-dev
-      - liburcu-dev
-      - libusb-1.0-0-dev
-      - libvte-2.90-dev
-      - sparse
-      - uuid-dev
-# The channel name "irc.oftc.net#qemu" is encrypted against qemu/qemu
-# to prevent IRC notifications from forks. This was created using:
-# $ travis encrypt -r "qemu/qemu" "irc.oftc.net#qemu"
 notifications:
   irc:
     channels:
-      - secure: "F7GDRgjuOo5IUyRLqSkmDL7kvdU4UcH3Lm/W2db2JnDHTGCqgEdaYEYKciyCLZ57vOTsTsOgesN8iUT7hNHBd1KWKjZe9KDTZWppWRYVwAwQMzVeSOsbbU4tRoJ6Pp+3qhH1Z0eGYR9ZgKYAoTumDFgSAYRp4IscKS8jkoedOqM="
+      - "irc.oftc.net#qemu"
     on_success: change
     on_failure: always
 env:
   global:
-    - TEST_CMD="make check"
+    - TEST_CMD=""
+    - EXTRA_CONFIG=""
+    # Development packages, EXTRA_PKGS saved for additional builds
+    - CORE_PKGS="libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev"
+    - NET_PKGS="libseccomp-dev libgnutls-dev libssh2-1-dev libspice-server-dev libspice-protocol-dev libnss3-dev"
+    - GUI_PKGS="libgtk-3-dev libvte-2.90-dev libsdl1.2-dev libpng12-dev libpixman-1-dev"
+    - EXTRA_PKGS=""
   matrix:
-    - CONFIG=""
-    - CONFIG="--enable-debug --enable-debug-tcg --enable-trace-backends=log"
-    - CONFIG="--disable-linux-aio --disable-cap-ng --disable-attr --disable-brlapi --disable-uuid --disable-libusb"
-    - CONFIG="--enable-modules"
-    - CONFIG="--with-coroutine=ucontext"
-    - CONFIG="--with-coroutine=sigaltstack"
+    # Group major targets together with their linux-user counterparts
+    - TARGETS=alpha-softmmu,alpha-linux-user
+    - TARGETS=arm-softmmu,arm-linux-user,armeb-linux-user,aarch64-softmmu,aarch64-linux-user
+    - TARGETS=cris-softmmu,cris-linux-user
+    - TARGETS=i386-softmmu,i386-linux-user,x86_64-softmmu,x86_64-linux-user
+    - TARGETS=m68k-softmmu,m68k-linux-user
+    - TARGETS=microblaze-softmmu,microblazeel-softmmu,microblaze-linux-user,microblazeel-linux-user
+    - TARGETS=mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu
+    - TARGETS=mips-linux-user,mips64-linux-user,mips64el-linux-user,mipsel-linux-user,mipsn32-linux-user,mipsn32el-linux-user
+    - TARGETS=or32-softmmu,or32-linux-user
+    - TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,ppc-linux-user,ppc64-linux-user,ppc64abi32-linux-user,ppc64le-linux-user
+    - TARGETS=s390x-softmmu,s390x-linux-user
+    - TARGETS=sh4-softmmu,sh4eb-softmmu,sh4-linux-user sh4eb-linux-user
+    - TARGETS=sparc-softmmu,sparc64-softmmu,sparc-linux-user,sparc32plus-linux-user,sparc64-linux-user
+    - TARGETS=unicore32-softmmu,unicore32-linux-user
+    # Group remaining softmmu only targets into one build
+    - TARGETS=lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
 git:
   # we want to do this ourselves
   submodules: false
 before_install:
-  - if [ "$TRAVIS_OS_NAME" == "osx" ]; then brew update ; fi
-  - if [ "$TRAVIS_OS_NAME" == "osx" ]; then brew install libffi gettext glib pixman ; fi
   - wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
   - git submodule update --init --recursive
+  - sudo apt-get update -qq
+  - sudo apt-get install -qq ${CORE_PKGS} ${NET_PKGS} ${GUI_PKGS} ${EXTRA_PKGS}
 before_script:
-  - ./configure ${CONFIG}
+  - ./configure --target-list=${TARGETS} --enable-debug-tcg ${EXTRA_CONFIG}
 script:
-  - make -j3 && ${TEST_CMD}
+  - make -j2 && ${TEST_CMD}
 matrix:
-  # We manually include a number of additional build for non-standard bits
   include:
-    # gprof/gcov are GCC features
-    - env: CONFIG="--enable-gprof --enable-gcov --disable-pie"
+    # Make check target (we only do this once)
+    - env: TARGETS=alpha-softmmu,arm-softmmu,aarch64-softmmu,cris-softmmu,
+        i386-softmmu,x86_64-softmmu,m68k-softmmu,microblaze-softmmu,
+        microblazeel-softmmu,mips-softmmu,mips64-softmmu,
+        mips64el-softmmu,mipsel-softmmu,or32-softmmu,ppc-softmmu,
+        ppc64-softmmu,ppcemb-softmmu,s390x-softmmu,sh4-softmmu,
+        sh4eb-softmmu,sparc-softmmu,sparc64-softmmu,
+        unicore32-softmmu,unicore32-linux-user,
+        lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,
+        xtensaeb-softmmu
+        TEST_CMD="make check"
       compiler: gcc
-    # We manually include builds which we disable "make check" for
-    - env: CONFIG="--enable-debug --enable-tcg-interpreter"
-      TEST_CMD=""
+    # Debug related options
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_CONFIG="--enable-debug"
       compiler: gcc
-    - env: CONFIG="--enable-trace-backends=simple"
-      TEST_CMD=""
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_CONFIG="--enable-debug --enable-tcg-interpreter"
       compiler: gcc
-    - env: CONFIG="--enable-trace-backends=ftrace"
-      TEST_CMD=""
+    # All the extra -dev packages
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_PKGS="libaio-dev libcap-ng-dev libattr1-dev libbrlapi-dev uuid-dev libusb-1.0.0-dev"
       compiler: gcc
-    - env: CONFIG="--enable-trace-backends=ust"
-      TEST_CMD=""
+    # Currently configure doesn't force --disable-pie
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_CONFIG="--enable-gprof --enable-gcov --disable-pie"
       compiler: gcc
-    - env: CONFIG="--with-coroutine=gthread"
-      TEST_CMD=""
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_PKGS="sparse"
+        EXTRA_CONFIG="--enable-sparse"
       compiler: gcc
-    - env: CONFIG=""
-      os: osx
-      compiler: clang
-    - env: CONFIG=""
-      sudo: required
-      addons:
-      dist: trusty
+    # All the trace backends (apart from dtrace)
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_CONFIG="--enable-trace-backends=stderr"
+      compiler: gcc
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_CONFIG="--enable-trace-backends=simple"
+      compiler: gcc
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_CONFIG="--enable-trace-backends=ftrace"
+      compiler: gcc
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_PKGS="liblttng-ust-dev liburcu-dev"
+        EXTRA_CONFIG="--enable-trace-backends=ust"
+      compiler: gcc
+    - env: TARGETS=i386-softmmu,x86_64-softmmu
+        EXTRA_CONFIG="--enable-modules"
       compiler: gcc
+      before_install:
+        - sudo apt-get update -qq
+        - sudo apt-get build-dep -qq qemu
+        - wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
+        - git submodule update --init --recursive


@@ -31,11 +31,7 @@ Do not leave whitespace dangling off the ends of lines.
 2. Line width
 
-Lines should be 80 characters; try not to make them longer.
-
-Sometimes it is hard to do, especially when dealing with QEMU subsystems
-that use long function or symbol names.  Even in that case, do not make
-lines much longer than 80 characters.
+Lines are 80 characters; not longer.
 
 Rationale:
 - Some people like to tile their 24" screens with a 6x4 matrix of 80x24
@@ -43,8 +39,6 @@ Rationale:
   let them keep doing it.
 - Code and especially patches is much more readable if limited to a sane
   line length.  Eighty is traditional.
-- The four-space indentation makes the most common excuse ("But look
-  at all that white space on the left!") moot.
 - It is the QEMU coding style.
 
 3. Naming
@@ -93,15 +87,10 @@ Furthermore, it is the QEMU coding style.
 5. Declarations
 
-Mixed declarations (interleaving statements and declarations within
-blocks) are generally not allowed; declarations should be at the beginning
-of blocks.
-
-Every now and then, an exception is made for declarations inside a
-#ifdef or #ifndef block: if the code looks nicer, such declarations can
-be placed at the top of the block even if there are statements above.
-On the other hand, however, it's often best to move that #ifdef/#ifndef
-block to a separate function altogether.
+Mixed declarations (interleaving statements and declarations within blocks)
+are not allowed; declarations should be at the beginning of blocks.  In other
+words, the code should not generate warnings if using GCC's
+-Wdeclaration-after-statement option.
 
 6. Conditional statements

HACKING

@@ -157,62 +157,3 @@ painful. These are:
 * you may assume that integers are 2s complement representation
 * you may assume that right shift of a signed integer duplicates
   the sign bit (ie it is an arithmetic shift, not a logical shift)
-
-In addition, QEMU assumes that the compiler does not use the latitude
-given in C99 and C11 to treat aspects of signed '<<' as undefined, as
-documented in the GNU Compiler Collection manual starting at version 4.0.
-
-7. Error handling and reporting
-
-7.1 Reporting errors to the human user
-
-Do not use printf(), fprintf() or monitor_printf().  Instead, use
-error_report() or error_vreport() from error-report.h.  This ensures the
-error is reported in the right place (current monitor or stderr), and in
-a uniform format.
-
-Use error_printf() & friends to print additional information.
-
-error_report() prints the current location.  In certain common cases
-like command line parsing, the current location is tracked
-automatically.  To manipulate it manually, use the loc_*() from
-error-report.h.
-
-7.2 Propagating errors
-
-An error can't always be reported to the user right where it's detected,
-but often needs to be propagated up the call chain to a place that can
-handle it.  This can be done in various ways.
-
-The most flexible one is Error objects.  See error.h for usage
-information.
-
-Use the simplest suitable method to communicate success / failure to
-callers.  Stick to common methods: non-negative on success / -1 on
-error, non-negative / -errno, non-null / null, or Error objects.
-
-Example: when a function returns a non-null pointer on success, and it
-can fail only in one way (as far as the caller is concerned), returning
-null on failure is just fine, and certainly simpler and a lot easier on
-the eyes than propagating an Error object through an Error ** parameter.
-
-Example: when a function's callers need to report details on failure
-only the function really knows, use Error **, and set suitable errors.
-
-Do not report an error to the user when you're also returning an error
-for somebody else to handle.  Leave the reporting to the place that
-consumes the error returned.
-
-7.3 Handling errors
-
-Calling exit() is fine when handling configuration errors during
-startup.  It's problematic during normal operation.  In particular,
-monitor commands should never exit().
-
-Do not call exit() or abort() to handle an error that can be triggered
-by the guest (e.g., some unimplemented corner case in guest code
-translation or device emulation).  Guests should not be able to
-terminate QEMU.
-
-Note that &error_fatal is just another way to exit(1), and &error_abort
-is just another way to abort().

(File diff suppressed because it is too large.)

Makefile

@@ -6,7 +6,7 @@ BUILD_DIR=$(CURDIR)
 # Before including a proper config-host.mak, assume we are in the source tree
 SRC_PATH=.
-UNCHECKED_GOALS := %clean TAGS cscope ctags docker docker-%
+UNCHECKED_GOALS := %clean TAGS cscope ctags
 # All following code might depend on configuration variables
 ifneq ($(wildcard config-host.mak),)
@@ -30,7 +30,8 @@ CONFIG_ALL=y
 -include config-all-devices.mak
 -include config-all-disas.mak
-config-host.mak: $(SRC_PATH)/configure $(SRC_PATH)/pc-bios
+include $(SRC_PATH)/rules.mak
+config-host.mak: $(SRC_PATH)/configure
 @echo $@ is out-of-date, running configure
 @# TODO: The next lines include code which supports a smooth
 @# transition from old configurations without config.status.
@@ -48,13 +49,9 @@ ifneq ($(filter-out $(UNCHECKED_GOALS),$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fa
 endif
 endif
-include $(SRC_PATH)/rules.mak
-GENERATED_HEADERS = qemu-version.h config-host.h qemu-options.def
+GENERATED_HEADERS = config-host.h qemu-options.def
 GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
 GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
-GENERATED_HEADERS += qmp-introspect.h
-GENERATED_SOURCES += qmp-introspect.c
 GENERATED_HEADERS += trace/generated-events.h
 GENERATED_SOURCES += trace/generated-events.c
@@ -76,15 +73,13 @@ GENERATED_HEADERS += trace/generated-ust-provider.h
 GENERATED_SOURCES += trace/generated-ust.c
 endif
-GENERATED_HEADERS += module_block.h
 # Don't try to regenerate Makefile or configure
 # We don't generate any of them
 Makefile: ;
 configure: ;
 .PHONY: all clean cscope distclean dvi html info install install-doc \
-	pdf recurse-all speed test dist msi FORCE
+	pdf recurse-all speed test dist msi
 $(call set-vpath, $(SRC_PATH))
@@ -93,7 +88,10 @@ LIBS+=-lz $(LIBS_TOOLS)
 HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
 ifdef BUILD_DOCS
-DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
+DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 qmp-commands.txt
+ifdef CONFIG_LINUX
+DOCS+=kvm_stat.1
+endif
 ifdef CONFIG_VIRTFS
 DOCS+=fsdev/virtfs-proxy-helper.1
 endif
@@ -118,7 +116,7 @@ endif
 -include $(SUBDIR_DEVICES_MAK_DEP)
-%/config-devices.mak: default-configs/%.mak $(SRC_PATH)/scripts/make_device_config.sh
+%/config-devices.mak: default-configs/%.mak
 	$(call quiet-command, \
 	$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $< $*-config-devices.mak.d $@ > $@.tmp, " GEN $@.tmp")
 	$(call quiet-command, if test -f $@; then \
@@ -150,55 +148,30 @@ dummy := $(call unnest-vars,, \
 stub-obj-y \
 util-obj-y \
 qga-obj-y \
-ivshmem-client-obj-y \
-ivshmem-server-obj-y \
 qga-vss-dll-obj-y \
 block-obj-y \
 block-obj-m \
-crypto-obj-y \
-crypto-aes-obj-y \
-qom-obj-y \
-io-obj-y \
 common-obj-y \
 common-obj-m)
 ifneq ($(wildcard config-host.mak),)
-include $(SRC_PATH)/tests/Makefile.include
+include $(SRC_PATH)/tests/Makefile
 endif
+ifeq ($(CONFIG_SMARTCARD_NSS),y)
+include $(SRC_PATH)/libcacard/Makefile
+endif
 all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
-qemu-version.h: FORCE
-	$(call quiet-command, \
-	(cd $(SRC_PATH); \
-	printf '#define QEMU_PKGVERSION '; \
-	if test -n "$(PKGVERSION)"; then \
-	printf '"$(PKGVERSION)"\n'; \
-	else \
-	if test -d .git; then \
-	printf '" ('; \
-	git describe --match 'v*' 2>/dev/null | tr -d '\n'; \
-	if ! git diff-index --quiet HEAD &>/dev/null; then \
-	printf -- '-dirty'; \
-	fi; \
-	printf ')"\n'; \
-	else \
-	printf '""\n'; \
-	fi; \
-	fi) > $@.tmp)
-	$(call quiet-command, cmp -s $@ $@.tmp || mv $@.tmp $@)
 config-host.h: config-host.h-timestamp
 config-host.h-timestamp: config-host.mak
-qemu-options.def: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
+qemu-options.def: $(SRC_PATH)/qemu-options.hx
 	$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@")
 SUBDIR_RULES=$(patsubst %,subdir-%, $(TARGET_DIRS))
 SOFTMMU_SUBDIR_RULES=$(filter %-softmmu,$(SUBDIR_RULES))
 $(SOFTMMU_SUBDIR_RULES): $(block-obj-y)
-$(SOFTMMU_SUBDIR_RULES): $(crypto-obj-y)
-$(SOFTMMU_SUBDIR_RULES): $(io-obj-y)
 $(SOFTMMU_SUBDIR_RULES): config-all-devices.mak
 subdir-%:
@@ -223,12 +196,11 @@ subdir-dtc:dtc/libfdt dtc/tests
 dtc/%:
 	mkdir -p $@
-$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y) $(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY))
+$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y)
 ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
-# Only keep -O and -g cflags
 romsubdir-%:
-	$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C pc-bios/$* V="$(V)" TARGET_DIR="$*/" CFLAGS="$(filter -O% -g%,$(CFLAGS))",)
+	$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C pc-bios/$* V="$(V)" TARGET_DIR="$*/",)
 ALL_SUBDIRS=$(TARGET_DIRS) $(patsubst %,pc-bios/%, $(ROMS))
@@ -247,20 +219,23 @@ Makefile: $(version-obj-y) $(version-lobj-y)
 libqemustub.a: $(stub-obj-y)
 libqemuutil.a: $(util-obj-y)
-block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL
-util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)'
 ######################################################################
 qemu-img.o: qemu-img-cmds.h
-qemu-img$(EXESUF): qemu-img.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
-qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
-qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
-qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o libqemuutil.a libqemustub.a
-fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/9p-marshal.o fsdev/9p-iov-marshal.o libqemuutil.a libqemustub.a
+qemu-img$(EXESUF): qemu-img.o $(block-obj-y) libqemuutil.a libqemustub.a
+qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) libqemuutil.a libqemustub.a
+qemu-io$(EXESUF): qemu-io.o $(block-obj-y) libqemuutil.a libqemustub.a
+qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
+fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o libqemuutil.a libqemustub.a
 fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
-qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
+qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
 	$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@")
 qemu-ga$(EXESUF): LIBS = $(LIBS_QGA)
@@ -288,9 +263,7 @@ $(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
 qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
 	$(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
-	$(SRC_PATH)/qapi/event.json $(SRC_PATH)/qapi/introspect.json \
-	$(SRC_PATH)/qapi/crypto.json $(SRC_PATH)/qapi/rocker.json \
-	$(SRC_PATH)/qapi/trace.json
+	$(SRC_PATH)/qapi/event.json
 qapi-types.c qapi-types.h :\
 $(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
@@ -310,12 +283,7 @@ $(qapi-modules) $(SRC_PATH)/scripts/qapi-event.py $(qapi-py)
 qmp-commands.h qmp-marshal.c :\
 $(qapi-modules) $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
 	$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
-	$(gen-out-type) -o "." $<, \
-	" GEN $@")
-qmp-introspect.h qmp-introspect.c :\
-$(qapi-modules) $(SRC_PATH)/scripts/qapi-introspect.py $(qapi-py)
-	$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-introspect.py \
-	$(gen-out-type) -o "." $<, \
+	$(gen-out-type) -o "." -m $<, \
 	" GEN $@")
 QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qmp-commands.h)
@@ -327,35 +295,24 @@ qemu-ga$(EXESUF): $(qga-obj-y) libqemuutil.a libqemustub.a
 ifdef QEMU_GA_MSI_ENABLED
 QEMU_GA_MSI=qemu-ga-$(ARCH).msi
-msi: $(QEMU_GA_MSI)
-$(QEMU_GA_MSI): qemu-ga.exe $(QGA_VSS_PROVIDER)
+msi: ${QEMU_GA_MSI}
+$(QEMU_GA_MSI): qemu-ga.exe
+ifdef QEMU_GA_MSI_WITH_VSS
+$(QEMU_GA_MSI): qga/vss-win32/qga-vss.dll
+endif
 $(QEMU_GA_MSI): config-host.mak
-$(QEMU_GA_MSI): $(SRC_PATH)/qga/installer/qemu-ga.wxs
-	$(call quiet-command,QEMU_GA_VERSION="$(QEMU_GA_VERSION)" QEMU_GA_MANUFACTURER="$(QEMU_GA_MANUFACTURER)" QEMU_GA_DISTRO="$(QEMU_GA_DISTRO)" BUILD_DIR="$(BUILD_DIR)" \
+$(QEMU_GA_MSI): qga/installer/qemu-ga.wxs
+	$(call quiet-command,QEMU_GA_VERSION="$(QEMU_GA_VERSION)" QEMU_GA_MANUFACTURER="$(QEMU_GA_MANUFACTURER)" QEMU_GA_DISTRO="$(QEMU_GA_DISTRO)" \
 	wixl -o $@ $(QEMU_GA_MSI_ARCH) $(QEMU_GA_MSI_WITH_VSS) $(QEMU_GA_MSI_MINGW_DLL_PATH) $<, " WIXL $@")
 else
 msi:
-	@echo "MSI build not configured or dependency resolution failed (reconfigure with --enable-guest-agent-msi option)"
+	@echo MSI build not configured or dependency resolution failed (reconfigure with --enable-guest-agent-msi option)
 endif
-ifneq ($(EXESUF),)
-.PHONY: qemu-ga
-qemu-ga: qemu-ga$(EXESUF) $(QGA_VSS_PROVIDER) $(QEMU_GA_MSI)
-endif
-ivshmem-client$(EXESUF): $(ivshmem-client-obj-y) libqemuutil.a libqemustub.a
-	$(call LINK, $^)
-ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) libqemuutil.a libqemustub.a
-	$(call LINK, $^)
-module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
-	$(call quiet-command,$(PYTHON) $< $@ \
-	$(addprefix $(SRC_PATH)/,$(patsubst %.mo,%.c,$(block-obj-m))), \
-	" GEN $@")
 clean:
 # avoid old build problems by removing potentially incorrect old files
 	rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h
@@ -378,7 +335,6 @@ clean:
 	if test -d $$d; then $(MAKE) -C $$d $@ || exit 1; fi; \
 	rm -f $$d/qemu-options.def; \
 	done
-	rm -f $(SUBDIR_DEVICES_MAK) config-all-devices.mak
 VERSION ?= $(shell cat VERSION)
@@ -388,7 +344,7 @@ qemu-%.tar.bz2:
 	$(SRC_PATH)/scripts/make-release "$(SRC_PATH)" "$(patsubst qemu-%.tar.bz2,%,$@)"
 distclean: clean
-	rm -f config-host.mak config-host.h* config-host.ld $(DOCS) qemu-options.texi qemu-img-cmds.texi qemu-monitor.texi qemu-monitor-info.texi
+	rm -f config-host.mak config-host.h* config-host.ld $(DOCS) qemu-options.texi qemu-img-cmds.texi qemu-monitor.texi
 	rm -f config-all-devices.mak config-all-disas.mak config.status
 	rm -f po/*.mo tests/qemu-iotests/common.env
 	rm -f roms/seabios/config.mak roms/vgabios/config.mak
@@ -414,16 +370,16 @@ bepo cz
 ifdef INSTALL_BLOBS
 BLOBS=bios.bin bios-256k.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
 vgabios-stdvga.bin vgabios-vmware.bin vgabios-qxl.bin vgabios-virtio.bin \
-acpi-dsdt.aml \
+acpi-dsdt.aml q35-acpi-dsdt.aml \
 ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin QEMU,cgthree.bin \
 pxe-e1000.rom pxe-eepro100.rom pxe-ne2k_pci.rom \
 pxe-pcnet.rom pxe-rtl8139.rom pxe-virtio.rom \
 efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
 efi-pcnet.rom efi-rtl8139.rom efi-virtio.rom \
-efi-e1000e.rom efi-vmxnet3.rom \
 qemu-icon.bmp qemu_logo_no_text.svg \
 bamboo.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
-multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin \
+multiboot.bin linuxboot.bin kvmvapic.bin \
+s390-zipl.rom \
 s390-ccw.img \
 spapr-rtas.bin slof.bin \
 palcode-clipper \
@@ -435,7 +391,7 @@ endif
 install-doc: $(DOCS)
 	$(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
 	$(INSTALL_DATA) qemu-doc.html qemu-tech.html "$(DESTDIR)$(qemu_docdir)"
-	$(INSTALL_DATA) $(SRC_PATH)/docs/qmp-commands.txt "$(DESTDIR)$(qemu_docdir)"
+	$(INSTALL_DATA) qmp-commands.txt "$(DESTDIR)$(qemu_docdir)"
 ifdef CONFIG_POSIX
 	$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
 	$(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
@@ -444,9 +400,6 @@ ifneq ($(TOOLS),)
 	$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man8"
 	$(INSTALL_DATA) qemu-nbd.8 "$(DESTDIR)$(mandir)/man8"
 endif
-ifneq (,$(findstring qemu-ga,$(TOOLS)))
-	$(INSTALL_DATA) qemu-ga.8 "$(DESTDIR)$(mandir)/man8"
-endif
 endif
 ifdef CONFIG_VIRTFS
 	$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
@@ -467,7 +420,7 @@ endif
 install: all $(if $(BUILD_DOCS),install-doc) \
 	install-datadir install-localstatedir
 ifneq ($(TOOLS),)
-	$(call install-prog,$(subst qemu-ga,qemu-ga$(EXESUF),$(TOOLS)),$(DESTDIR)$(bindir))
+	$(call install-prog,$(TOOLS),$(DESTDIR)$(bindir))
 endif
 ifneq ($(CONFIG_MODULES),)
 	$(INSTALL_DIR) "$(DESTDIR)$(qemu_moddir)"
@@ -492,7 +445,7 @@ endif
 	set -e; for x in $(KEYMAPS); do \
 	$(INSTALL_DATA) $(SRC_PATH)/pc-bios/keymaps/$$x "$(DESTDIR)$(qemu_datadir)/keymaps"; \
 	done
-	$(INSTALL_DATA) $(BUILD_DIR)/trace-events-all "$(DESTDIR)$(qemu_datadir)/trace-events-all"
+	$(INSTALL_DATA) $(SRC_PATH)/trace-events "$(DESTDIR)$(qemu_datadir)/trace-events"
 	for d in $(TARGET_DIRS); do \
 	$(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \
 	done
@@ -503,12 +456,12 @@ test speed: all
 .PHONY: ctags
 ctags:
-	rm -f tags
+	rm -f $@
 	find "$(SRC_PATH)" -name '*.[hc]' -exec ctags --append {} +
 .PHONY: TAGS
 TAGS:
-	rm -f TAGS
+	rm -f $@
 	find "$(SRC_PATH)" -name '*.[hc]' -exec etags --append {} +
 cscope:
@@ -549,26 +502,25 @@ TEXIFLAG=$(if $(V),,--quiet)
 %.pdf: %.texi
 	$(call quiet-command,texi2pdf $(TEXIFLAG) -I . $<," GEN $@")
-qemu-options.texi: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
+qemu-options.texi: $(SRC_PATH)/qemu-options.hx
 	$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
-qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx $(SRC_PATH)/scripts/hxtool
+qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx
 	$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
-qemu-monitor-info.texi: $(SRC_PATH)/hmp-commands-info.hx $(SRC_PATH)/scripts/hxtool
-	$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
+qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
+	$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -q < $< > $@," GEN $@")
-qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
+qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx
 	$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
-qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi qemu-monitor-info.texi
+qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi
 	$(call quiet-command, \
 	perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu.pod && \
 	$(POD2MAN) --section=1 --center=" " --release=" " qemu.pod > $@, \
 	" GEN $@")
-qemu.1: qemu-option-trace.texi
-qemu-img.1: qemu-img.texi qemu-option-trace.texi qemu-img-cmds.texi
+qemu-img.1: qemu-img.texi qemu-img-cmds.texi
 	$(call quiet-command, \
 	perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-img.pod && \
 	$(POD2MAN) --section=1 --center=" " --release=" " qemu-img.pod > $@, \
@@ -580,16 +532,16 @@ fsdev/virtfs-proxy-helper.1: fsdev/virtfs-proxy-helper.texi
 	$(POD2MAN) --section=1 --center=" " --release=" " fsdev/virtfs-proxy-helper.pod > $@, \
 	" GEN $@")
-qemu-nbd.8: qemu-nbd.texi qemu-option-trace.texi
+qemu-nbd.8: qemu-nbd.texi
 	$(call quiet-command, \
 	perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-nbd.pod && \
 	$(POD2MAN) --section=8 --center=" " --release=" " qemu-nbd.pod > $@, \
 	" GEN $@")
-qemu-ga.8: qemu-ga.texi
+kvm_stat.1: scripts/kvm/kvm_stat.texi
 	$(call quiet-command, \
-	perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-ga.pod && \
-	$(POD2MAN) --section=8 --center=" " --release=" " qemu-ga.pod > $@, \
+	perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< kvm_stat.pod && \
+	$(POD2MAN) --section=1 --center=" " --release=" " kvm_stat.pod > $@, \
 	" GEN $@")
 dvi: qemu-doc.dvi qemu-tech.dvi
@@ -598,9 +550,8 @@ info: qemu-doc.info qemu-tech.info
 pdf: qemu-doc.pdf qemu-tech.pdf
 qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
-	qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
-	qemu-monitor.texi qemu-img-cmds.texi qemu-ga.texi \
-	qemu-monitor-info.texi
+	qemu-img.texi qemu-nbd.texi qemu-options.texi \
+	qemu-monitor.texi qemu-img-cmds.texi
 ifdef CONFIG_WIN32
@@ -650,7 +601,6 @@ endif # SIGNCODE
 	$(if $(DLL_PATH),-DDLLDIR="$(DLL_PATH)") \
 	-DSRCDIR="$(SRC_PATH)" \
 	-DOUTFILE="$(INSTALLER)" \
-	-DDISPLAYVERSION="$(VERSION)" \
 	$(SRC_PATH)/qemu.nsi
 	rm -r ${INSTDIR}
 ifdef SIGNCODE
@@ -667,42 +617,3 @@ endif
 # Include automatically generated dependency files
 # Dependencies in Makefile.objs files come from our recursive subdir rules
 -include $(wildcard *.d tests/*.d)
-
-include $(SRC_PATH)/tests/docker/Makefile.include
-
-.PHONY: help
-help:
-	@echo 'Generic targets:'
-	@echo ' all             - Build all'
-	@echo ' dir/file.o      - Build specified target only'
-	@echo ' install         - Install QEMU, documentation and tools'
-	@echo ' ctags/TAGS      - Generate tags file for editors'
-	@echo ' cscope          - Generate cscope index'
-	@echo ''
-	@$(if $(TARGET_DIRS), \
-	echo 'Architecture specific targets:'; \
-	$(foreach t, $(TARGET_DIRS), \
-	printf " %-30s - Build for %s\\n" $(patsubst %,subdir-%,$(t)) $(t);) \
-	echo '')
-	@echo 'Cleaning targets:'
-	@echo ' clean           - Remove most generated files but keep the config'
-	@echo ' distclean       - Remove all generated files'
-	@echo ' dist            - Build a distributable tarball'
-	@echo ''
-	@echo 'Test targets:'
-	@echo ' check           - Run all tests (check-help for details)'
-	@echo ' docker          - Help about targets running tests inside Docker containers'
-	@echo ''
-	@echo 'Documentation targets:'
-	@echo ' dvi html info pdf'
-	@echo '                 - Build documentation in specified format'
-	@echo ''
-ifdef CONFIG_WIN32
-	@echo 'Windows targets:'
-	@echo ' installer       - Build NSIS-based installer for qemu-ga'
-ifdef QEMU_GA_MSI_ENABLED
-	@echo ' msi             - Build MSI-based installer for qemu-ga'
-endif
-	@echo ''
-endif
-	@echo ' make V=0|1 [targets] 0 => quiet build (default), 1 => verbose build'


@@ -1,39 +1,38 @@
 #######################################################################
 # Common libraries for tools and emulators
-stub-obj-y = stubs/ crypto/
-util-obj-y = util/ qobject/ qapi/
-util-obj-y += qmp-introspect.o qapi-types.o qapi-visit.o qapi-event.o
+stub-obj-y = stubs/
+util-obj-y = util/ qobject/ qapi/ qapi-types.o qapi-visit.o qapi-event.o
+util-obj-y += crypto/
 #######################################################################
 # block-obj-y is code used by both qemu system emulation and qemu-img
 block-obj-y = async.o thread-pool.o
-block-obj-y += nbd/
-block-obj-y += block.o blockjob.o
+block-obj-y += nbd.o block.o blockjob.o
 block-obj-y += main-loop.o iohandler.o qemu-timer.o
 block-obj-$(CONFIG_POSIX) += aio-posix.o
 block-obj-$(CONFIG_WIN32) += aio-win32.o
 block-obj-y += block/
 block-obj-y += qemu-io-cmds.o
-block-obj-$(CONFIG_REPLICATION) += replication.o
+block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
+block-obj-y += qemu-coroutine-sleep.o
+block-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
 block-obj-m = block/
-#######################################################################
-# crypto-obj-y is code used by both qemu system emulation and qemu-img
-crypto-obj-y = crypto/
-crypto-aes-obj-y = crypto/
-#######################################################################
-# qom-obj-y is code used by both qemu system emulation and qemu-img
-qom-obj-y = qom/
-#######################################################################
-# io-obj-y is code used by both qemu system emulation and qemu-img
-io-obj-y = io/
+######################################################################
+# smartcard
+libcacard-y += libcacard/cac.o libcacard/event.o
+libcacard-y += libcacard/vcard.o libcacard/vreader.o
+libcacard-y += libcacard/vcard_emul_nss.o
+libcacard-y += libcacard/vcard_emul_type.o
+libcacard-y += libcacard/card_7816.o
+libcacard-y += libcacard/vcardt.o
+libcacard/vcard_emul_nss.o-cflags := $(NSS_CFLAGS)
+libcacard/vcard_emul_nss.o-libs := $(NSS_LIBS)
 ######################################################################
 # Target independent part of system emulation. The long term path is to
@@ -53,6 +52,7 @@ common-obj-$(CONFIG_LINUX) += fsdev/
 common-obj-y += migration/
 common-obj-y += qemu-char.o #aio.o
 common-obj-y += page_cache.o
+common-obj-y += qjson.o
 common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
@@ -60,8 +60,6 @@ common-obj-y += audio/
 common-obj-y += hw/
 common-obj-y += accel.o
-common-obj-y += replay/
 common-obj-y += ui/
 common-obj-y += bt-host.o bt-vhci.o
 bt-host.o-cflags := $(BLUEZ_CFLAGS)
@@ -77,18 +75,20 @@ common-obj-y += backends/
 common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
+common-obj-$(CONFIG_SMARTCARD_NSS) += $(libcacard-y)
 common-obj-$(CONFIG_FDT) += device_tree.o
 ######################################################################
 # qapi
 common-obj-y += qmp-marshal.o
-common-obj-y += qmp-introspect.o
 common-obj-y += qmp.o hmp.o
 endif
 #######################################################################
 # Target-independent parts used in system and user emulation
+common-obj-y += qemu-log.o
 common-obj-y += tcg-runtime.o
 common-obj-y += hw/
 common-obj-y += qom/
@@ -111,52 +111,3 @@ target-obj-y += trace/
 # by libqemuutil.a. These should be moved to a separate .json schema.
 qga-obj-y = qga/
 qga-vss-dll-obj-y = qga/
-
-######################################################################
-# contrib
-ivshmem-client-obj-y = contrib/ivshmem-client/
-ivshmem-server-obj-y = contrib/ivshmem-server/
-
-######################################################################
-trace-events-y = trace-events
-trace-events-y += util/trace-events
-trace-events-y += crypto/trace-events
-trace-events-y += io/trace-events
-trace-events-y += migration/trace-events
-trace-events-y += block/trace-events
-trace-events-y += hw/block/trace-events
-trace-events-y += hw/char/trace-events
-trace-events-y += hw/intc/trace-events
-trace-events-y += hw/net/trace-events
-trace-events-y += hw/virtio/trace-events
trace-events-y += hw/audio/trace-events
trace-events-y += hw/misc/trace-events
trace-events-y += hw/usb/trace-events
trace-events-y += hw/scsi/trace-events
trace-events-y += hw/nvram/trace-events
trace-events-y += hw/display/trace-events
trace-events-y += hw/input/trace-events
trace-events-y += hw/timer/trace-events
trace-events-y += hw/dma/trace-events
trace-events-y += hw/sparc/trace-events
trace-events-y += hw/sd/trace-events
trace-events-y += hw/isa/trace-events
trace-events-y += hw/i386/trace-events
trace-events-y += hw/9pfs/trace-events
trace-events-y += hw/ppc/trace-events
trace-events-y += hw/pci/trace-events
trace-events-y += hw/s390x/trace-events
trace-events-y += hw/vfio/trace-events
trace-events-y += hw/acpi/trace-events
trace-events-y += hw/arm/trace-events
trace-events-y += hw/alpha/trace-events
trace-events-y += ui/trace-events
trace-events-y += audio/trace-events
trace-events-y += net/trace-events
trace-events-y += target-i386/trace-events
trace-events-y += target-sparc/trace-events
trace-events-y += target-s390x/trace-events
trace-events-y += target-ppc/trace-events
trace-events-y += qom/trace-events
trace-events-y += linux-user/trace-events


Makefile.target
@@ -7,7 +7,7 @@ include config-target.mak
 include config-devices.mak
 include $(SRC_PATH)/rules.mak
-$(call set-vpath, $(SRC_PATH):$(BUILD_DIR))
+$(call set-vpath, $(SRC_PATH))
 ifdef CONFIG_LINUX
 QEMU_CFLAGS += -I../linux-headers
 endif
@@ -48,7 +48,7 @@ else
 TARGET_TYPE=system
 endif
-$(QEMU_PROG).stp-installed: $(BUILD_DIR)/trace-events-all
+$(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
 	$(call quiet-command,$(TRACETOOL) \
 		--format=stap \
 		--backends=$(TRACE_BACKENDS) \
@@ -57,7 +57,7 @@ $(QEMU_PROG).stp-installed: $(BUILD_DIR)/trace-events-all
 		--target-type=$(TARGET_TYPE) \
 		< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp-installed")
-$(QEMU_PROG).stp: $(BUILD_DIR)/trace-events-all
+$(QEMU_PROG).stp: $(SRC_PATH)/trace-events
 	$(call quiet-command,$(TRACETOOL) \
 		--format=stap \
 		--backends=$(TRACE_BACKENDS) \
@@ -66,7 +66,7 @@ $(QEMU_PROG).stp: $(BUILD_DIR)/trace-events-all
 		--target-type=$(TARGET_TYPE) \
 		< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp")
-$(QEMU_PROG)-simpletrace.stp: $(BUILD_DIR)/trace-events-all
+$(QEMU_PROG)-simpletrace.stp: $(SRC_PATH)/trace-events
 	$(call quiet-command,$(TRACETOOL) \
 		--format=simpletrace-stap \
 		--backends=$(TRACE_BACKENDS) \
@@ -85,11 +85,8 @@ all: $(PROGS) stap
 #########################################################
 # cpu emulator library
 obj-y = exec.o translate-all.o cpu-exec.o
-obj-y += translate-common.o
-obj-y += cpu-exec-common.o
 obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
 obj-$(CONFIG_TCG_INTERPRETER) += tci.o
-obj-y += tcg/tcg-common.o
 obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
 obj-y += fpu/softfloat.o
 obj-y += target-$(TARGET_BASE_ARCH)/
@@ -108,9 +105,7 @@ obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal128.o
 ifdef CONFIG_LINUX_USER
-QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) \
-             -I$(SRC_PATH)/linux-user/host/$(ARCH) \
-             -I$(SRC_PATH)/linux-user
+QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) -I$(SRC_PATH)/linux-user
 obj-y += linux-user/
 obj-y += gdbstub.o thunk.o user-exec.o
@@ -156,7 +151,7 @@ else
 obj-y += hw/$(TARGET_BASE_ARCH)/
 endif
-GENERATED_HEADERS += hmp-commands.h hmp-commands-info.h
+GENERATED_HEADERS += hmp-commands.h qmp-commands-old.h
 endif # CONFIG_SOFTMMU
@@ -175,20 +170,12 @@ target-obj-y-save := $(target-obj-y)
 dummy := $(call unnest-vars,.., \
                block-obj-y \
                block-obj-m \
-               crypto-obj-y \
-               crypto-aes-obj-y \
-               qom-obj-y \
-               io-obj-y \
                common-obj-y \
                common-obj-m)
 target-obj-y := $(target-obj-y-save)
 all-obj-y += $(common-obj-y)
 all-obj-y += $(target-obj-y)
-all-obj-y += $(qom-obj-y)
 all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
-all-obj-$(CONFIG_USER_ONLY) += $(crypto-aes-obj-y)
-all-obj-$(CONFIG_SOFTMMU) += $(crypto-obj-y)
-all-obj-$(CONFIG_SOFTMMU) += $(io-obj-y)
 $(QEMU_PROG_BUILD): config-devices.mak
@@ -203,16 +190,16 @@ endif
 gdbstub-xml.c: $(TARGET_XML_FILES) $(SRC_PATH)/scripts/feature_to_c.sh
 	$(call quiet-command,rm -f $@ && $(SHELL) $(SRC_PATH)/scripts/feature_to_c.sh $@ $(TARGET_XML_FILES)," GEN $(TARGET_DIR)$@")
-hmp-commands.h: $(SRC_PATH)/hmp-commands.hx $(SRC_PATH)/scripts/hxtool
+hmp-commands.h: $(SRC_PATH)/hmp-commands.hx
 	$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $(TARGET_DIR)$@")
-hmp-commands-info.h: $(SRC_PATH)/hmp-commands-info.hx $(SRC_PATH)/scripts/hxtool
+qmp-commands-old.h: $(SRC_PATH)/qmp-commands.hx
 	$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $(TARGET_DIR)$@")
 clean: clean-target
 	rm -f *.a *~ $(PROGS)
 	rm -f $(shell find . -name '*.[od]')
-	rm -f hmp-commands.h gdbstub-xml.c
+	rm -f hmp-commands.h qmp-commands-old.h gdbstub-xml.c
 ifdef CONFIG_TRACE_SYSTEMTAP
 	rm -f *.stp
 endif

README

@@ -1,107 +1,3 @@
-QEMU README
-===========
-
-QEMU is a generic and open source machine & userspace emulator and
-virtualizer.
-
-QEMU is capable of emulating a complete machine in software without any
-need for hardware virtualization support. By using dynamic translation,
-it achieves very good performance. QEMU can also integrate with the Xen
-and KVM hypervisors to provide emulated hardware while allowing the
-hypervisor to manage the CPU. With hypervisor support, QEMU can achieve
-near native performance for CPUs. When QEMU emulates CPUs directly it is
-capable of running operating systems made for one machine (e.g. an ARMv7
-board) on a different machine (e.g. an x86_64 PC board).
-
-QEMU is also capable of providing userspace API virtualization for Linux
-and BSD kernel interfaces. This allows binaries compiled against one
-architecture ABI (e.g. the Linux PPC64 ABI) to be run on a host using a
-different architecture ABI (e.g. the Linux x86_64 ABI). This does not
-involve any hardware emulation, simply CPU and syscall emulation.
-
-QEMU aims to fit into a variety of use cases. It can be invoked directly
-by users wishing to have full control over its behaviour and settings.
-It also aims to facilitate integration into higher level management
-layers, by providing a stable command line interface and monitor API.
-It is commonly invoked indirectly via the libvirt library when using
-open source applications such as oVirt, OpenStack and virt-manager.
-
-QEMU as a whole is released under the GNU General Public License,
-version 2. For full licensing details, consult the LICENSE file.
-
-Building
-========
-
-QEMU is multi-platform software intended to be buildable on all modern
-Linux platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety
-of other UNIX targets. The simple steps to build QEMU are:
-
-  mkdir build
-  cd build
-  ../configure
-  make
-
-Complete details of the process for building and configuring QEMU for
-all supported host platforms can be found in the qemu-tech.html file.
-Additional information can also be found online via the QEMU website:
-
-  http://qemu-project.org/Hosts/Linux
-  http://qemu-project.org/Hosts/W32
-
-Submitting patches
-==================
-
-The QEMU source code is maintained under the GIT version control system.
-
-  git clone git://git.qemu-project.org/qemu.git
-
-When submitting patches, the preferred approach is to use 'git
-format-patch' and/or 'git send-email' to format & send the mail to the
-qemu-devel@nongnu.org mailing list. All patches submitted must contain
-a 'Signed-off-by' line from the author. Patches should follow the
-guidelines set out in the HACKING and CODING_STYLE files.
-
-Additional information on submitting patches can be found online via
-the QEMU website
-
-  http://qemu-project.org/Contribute/SubmitAPatch
-  http://qemu-project.org/Contribute/TrivialPatches
-
-Bug reporting
-=============
-
-The QEMU project uses Launchpad as its primary upstream bug tracker. Bugs
-found when running code built from QEMU git or upstream released sources
-should be reported via:
-
-  https://bugs.launchpad.net/qemu/
-
-If using QEMU via an operating system vendor pre-built binary package, it
-is preferable to report bugs to the vendor's own bug tracker first. If
-the bug is also known to affect latest upstream code, it can also be
-reported via launchpad.
-
-For additional information on bug reporting consult:
-
-  http://qemu-project.org/Contribute/ReportABug
-
-Contact
-=======
-
-The QEMU community can be contacted in a number of ways, with the two
-main methods being email and IRC
-
- - qemu-devel@nongnu.org
-   http://lists.nongnu.org/mailman/listinfo/qemu-devel
- - #qemu on irc.oftc.net
-
-Information on additional methods of contacting the community can be
-found online via the QEMU website:
-
-  http://qemu-project.org/Contribute/StartHere
-
--- End
+Read the documentation in qemu-doc.html or on http://wiki.qemu-project.org
+
+- QEMU team


VERSION
@@ -1 +1 @@
-2.7.50
+2.4.1


accel.c
@@ -23,7 +23,6 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include "sysemu/accel.h"
 #include "hw/boards.h"
 #include "qemu-common.h"
@@ -77,7 +76,7 @@ static int accel_init_machine(AccelClass *acc, MachineState *ms)
     return ret;
 }
-void configure_accelerator(MachineState *ms)
+int configure_accelerator(MachineState *ms)
 {
     const char *p;
     char buf[10];
@@ -128,6 +127,8 @@ void configure_accelerator(MachineState *ms)
     if (init_failed) {
         fprintf(stderr, "Back to %s accelerator.\n", acc->name);
     }
+
+    return !accel_initialised;
 }


aio-posix.c
@@ -13,14 +13,10 @@
  * GNU GPL, version 2 or (at your option) any later version.
  */
-#include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "block/block.h"
 #include "qemu/queue.h"
 #include "qemu/sockets.h"
-#ifdef CONFIG_EPOLL_CREATE1
-#include <sys/epoll.h>
-#endif
 
 struct AioHandler
 {
@@ -29,166 +25,9 @@ struct AioHandler
     IOHandler *io_write;
     int deleted;
     void *opaque;
-    bool is_external;
     QLIST_ENTRY(AioHandler) node;
 };
 
-#ifdef CONFIG_EPOLL_CREATE1
-
-/* The fd number threashold to switch to epoll */
-#define EPOLL_ENABLE_THRESHOLD 64
-
-static void aio_epoll_disable(AioContext *ctx)
-{
-    ctx->epoll_available = false;
-    if (!ctx->epoll_enabled) {
-        return;
-    }
-    ctx->epoll_enabled = false;
-    close(ctx->epollfd);
-}
-
-static inline int epoll_events_from_pfd(int pfd_events)
-{
-    return (pfd_events & G_IO_IN ? EPOLLIN : 0) |
-           (pfd_events & G_IO_OUT ? EPOLLOUT : 0) |
-           (pfd_events & G_IO_HUP ? EPOLLHUP : 0) |
-           (pfd_events & G_IO_ERR ? EPOLLERR : 0);
-}
-
-static bool aio_epoll_try_enable(AioContext *ctx)
-{
-    AioHandler *node;
-    struct epoll_event event;
-
-    QLIST_FOREACH(node, &ctx->aio_handlers, node) {
-        int r;
-        if (node->deleted || !node->pfd.events) {
-            continue;
-        }
-        event.events = epoll_events_from_pfd(node->pfd.events);
-        event.data.ptr = node;
-        r = epoll_ctl(ctx->epollfd, EPOLL_CTL_ADD, node->pfd.fd, &event);
-        if (r) {
-            return false;
-        }
-    }
-    ctx->epoll_enabled = true;
-    return true;
-}
-
-static void aio_epoll_update(AioContext *ctx, AioHandler *node, bool is_new)
-{
-    struct epoll_event event;
-    int r;
-
-    if (!ctx->epoll_enabled) {
-        return;
-    }
-    if (!node->pfd.events) {
-        r = epoll_ctl(ctx->epollfd, EPOLL_CTL_DEL, node->pfd.fd, &event);
-        if (r) {
-            aio_epoll_disable(ctx);
-        }
-    } else {
-        event.data.ptr = node;
-        event.events = epoll_events_from_pfd(node->pfd.events);
-        if (is_new) {
-            r = epoll_ctl(ctx->epollfd, EPOLL_CTL_ADD, node->pfd.fd, &event);
-            if (r) {
-                aio_epoll_disable(ctx);
-            }
-        } else {
-            r = epoll_ctl(ctx->epollfd, EPOLL_CTL_MOD, node->pfd.fd, &event);
-            if (r) {
-                aio_epoll_disable(ctx);
-            }
-        }
-    }
-}
-
-static int aio_epoll(AioContext *ctx, GPollFD *pfds,
-                     unsigned npfd, int64_t timeout)
-{
-    AioHandler *node;
-    int i, ret = 0;
-    struct epoll_event events[128];
-
-    assert(npfd == 1);
-    assert(pfds[0].fd == ctx->epollfd);
-    if (timeout > 0) {
-        ret = qemu_poll_ns(pfds, npfd, timeout);
-    }
-    if (timeout <= 0 || ret > 0) {
-        ret = epoll_wait(ctx->epollfd, events,
-                         sizeof(events) / sizeof(events[0]),
-                         timeout);
-        if (ret <= 0) {
-            goto out;
-        }
-        for (i = 0; i < ret; i++) {
-            int ev = events[i].events;
-            node = events[i].data.ptr;
-            node->pfd.revents = (ev & EPOLLIN ? G_IO_IN : 0) |
-                (ev & EPOLLOUT ? G_IO_OUT : 0) |
-                (ev & EPOLLHUP ? G_IO_HUP : 0) |
-                (ev & EPOLLERR ? G_IO_ERR : 0);
-        }
-    }
-out:
-    return ret;
-}
-
-static bool aio_epoll_enabled(AioContext *ctx)
-{
-    /* Fall back to ppoll when external clients are disabled. */
-    return !aio_external_disabled(ctx) && ctx->epoll_enabled;
-}
-
-static bool aio_epoll_check_poll(AioContext *ctx, GPollFD *pfds,
-                                 unsigned npfd, int64_t timeout)
-{
-    if (!ctx->epoll_available) {
-        return false;
-    }
-    if (aio_epoll_enabled(ctx)) {
-        return true;
-    }
-    if (npfd >= EPOLL_ENABLE_THRESHOLD) {
-        if (aio_epoll_try_enable(ctx)) {
-            return true;
-        } else {
-            aio_epoll_disable(ctx);
-        }
-    }
-    return false;
-}
-
-#else
-
-static void aio_epoll_update(AioContext *ctx, AioHandler *node, bool is_new)
-{
-}
-
-static int aio_epoll(AioContext *ctx, GPollFD *pfds,
-                     unsigned npfd, int64_t timeout)
-{
-    assert(false);
-}
-
-static bool aio_epoll_enabled(AioContext *ctx)
-{
-    return false;
-}
-
-static bool aio_epoll_check_poll(AioContext *ctx, GPollFD *pfds,
-                                 unsigned npfd, int64_t timeout)
-{
-    return false;
-}
-
-#endif
-
 static AioHandler *find_aio_handler(AioContext *ctx, int fd)
 {
     AioHandler *node;
@@ -204,14 +43,11 @@ static AioHandler *find_aio_handler(AioContext *ctx, int fd)
 void aio_set_fd_handler(AioContext *ctx,
                         int fd,
-                        bool is_external,
                         IOHandler *io_read,
                         IOHandler *io_write,
                         void *opaque)
 {
     AioHandler *node;
-    bool is_new = false;
-    bool deleted = false;
 
     node = find_aio_handler(ctx, fd);
@@ -230,7 +66,7 @@ void aio_set_fd_handler(AioContext *ctx,
                  * releasing the walking_handlers lock.
                  */
                 QLIST_REMOVE(node, node);
-                deleted = true;
+                g_free(node);
             }
         }
     } else {
@@ -241,32 +77,25 @@ void aio_set_fd_handler(AioContext *ctx,
             QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
             g_source_add_poll(&ctx->source, &node->pfd);
-            is_new = true;
         }
         /* Update handler with latest information */
         node->io_read = io_read;
         node->io_write = io_write;
        node->opaque = opaque;
-        node->is_external = is_external;
         node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0);
         node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
     }
-    aio_epoll_update(ctx, node, is_new);
     aio_notify(ctx);
-    if (deleted) {
-        g_free(node);
-    }
 }
 
 void aio_set_event_notifier(AioContext *ctx,
                             EventNotifier *notifier,
-                            bool is_external,
                             EventNotifierHandler *io_read)
 {
     aio_set_fd_handler(ctx, event_notifier_get_fd(notifier),
-                       is_external, (IOHandler *)io_read, NULL, notifier);
+                       (IOHandler *)io_read, NULL, notifier);
 }
 
 bool aio_prepare(AioContext *ctx)
@@ -282,12 +111,10 @@ bool aio_pending(AioContext *ctx)
         int revents;
 
         revents = node->pfd.revents & node->pfd.events;
-        if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR) && node->io_read &&
-            aio_node_check(ctx, node->is_external)) {
+        if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR) && node->io_read) {
             return true;
         }
-        if (revents & (G_IO_OUT | G_IO_ERR) && node->io_write &&
-            aio_node_check(ctx, node->is_external)) {
+        if (revents & (G_IO_OUT | G_IO_ERR) && node->io_write) {
             return true;
         }
     }
@@ -325,7 +152,6 @@ bool aio_dispatch(AioContext *ctx)
         if (!node->deleted &&
             (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) &&
-            aio_node_check(ctx, node->is_external) &&
             node->io_read) {
             node->io_read(node->opaque);
@@ -336,7 +162,6 @@ bool aio_dispatch(AioContext *ctx)
         }
         if (!node->deleted &&
             (revents & (G_IO_OUT | G_IO_ERR)) &&
-            aio_node_check(ctx, node->is_external) &&
             node->io_write) {
             node->io_write(node->opaque);
             progress = true;
@@ -432,9 +257,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
     /* fill pollfds */
     QLIST_FOREACH(node, &ctx->aio_handlers, node) {
-        if (!node->deleted && node->pfd.events
-            && !aio_epoll_enabled(ctx)
-            && aio_node_check(ctx, node->is_external)) {
+        if (!node->deleted && node->pfd.events) {
             add_pollfd(node);
         }
     }
@@ -445,17 +268,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
     if (timeout) {
         aio_context_release(ctx);
     }
-    if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
-        AioHandler epoll_handler;
-
-        epoll_handler.pfd.fd = ctx->epollfd;
-        epoll_handler.pfd.events = G_IO_IN | G_IO_OUT | G_IO_HUP | G_IO_ERR;
-        npfd = 0;
-        add_pollfd(&epoll_handler);
-        ret = aio_epoll(ctx, pollfds, npfd, timeout);
-    } else {
-        ret = qemu_poll_ns(pollfds, npfd, timeout);
-    }
+    ret = qemu_poll_ns((GPollFD *)pollfds, npfd, timeout);
     if (blocking) {
         atomic_sub(&ctx->notify_me, 2);
     }
@@ -484,17 +297,3 @@ bool aio_poll(AioContext *ctx, bool blocking)
 
     return progress;
 }
-
-void aio_context_setup(AioContext *ctx)
-{
-#ifdef CONFIG_EPOLL_CREATE1
-    assert(!ctx->epollfd);
-    ctx->epollfd = epoll_create1(EPOLL_CLOEXEC);
-    if (ctx->epollfd == -1) {
-        fprintf(stderr, "Failed to create epoll instance: %s", strerror(errno));
-        ctx->epoll_available = false;
-    } else {
-        ctx->epoll_available = true;
-    }
-#endif
-}


aio-win32.c
@@ -15,7 +15,6 @@
  * GNU GPL, version 2 or (at your option) any later version.
  */
-#include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "block/block.h"
 #include "qemu/queue.h"
@@ -29,13 +28,11 @@ struct AioHandler {
     GPollFD pfd;
     int deleted;
     void *opaque;
-    bool is_external;
     QLIST_ENTRY(AioHandler) node;
 };
 
 void aio_set_fd_handler(AioContext *ctx,
                         int fd,
-                        bool is_external,
                         IOHandler *io_read,
                         IOHandler *io_write,
                         void *opaque)
@@ -89,7 +86,6 @@ void aio_set_fd_handler(AioContext *ctx,
         node->opaque = opaque;
         node->io_read = io_read;
         node->io_write = io_write;
-        node->is_external = is_external;
 
         event = event_notifier_get_handle(&ctx->notifier);
         WSAEventSelect(node->pfd.fd, event,
@@ -102,7 +98,6 @@ void aio_set_fd_handler(AioContext *ctx,
 
 void aio_set_event_notifier(AioContext *ctx,
                             EventNotifier *e,
-                            bool is_external,
                             EventNotifierHandler *io_notify)
 {
     AioHandler *node;
@@ -138,7 +133,6 @@ void aio_set_event_notifier(AioContext *ctx,
             node->e = e;
             node->pfd.fd = (uintptr_t)event_notifier_get_handle(e);
             node->pfd.events = G_IO_IN;
-            node->is_external = is_external;
             QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
 
             g_source_add_poll(&ctx->source, &node->pfd);
@@ -310,8 +304,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
         /* fill fd sets */
         count = 0;
         QLIST_FOREACH(node, &ctx->aio_handlers, node) {
-            if (!node->deleted && node->io_notify
-                && aio_node_check(ctx, node->is_external)) {
+            if (!node->deleted && node->io_notify) {
                 events[count++] = event_notifier_get_handle(node->e);
             }
         }
@@ -370,7 +363,3 @@ bool aio_poll(AioContext *ctx, bool blocking)
     aio_context_release(ctx);
     return progress;
 }
-
-void aio_context_setup(AioContext *ctx)
-{
-}


arch_init.c
@@ -21,19 +21,16 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qemu-common.h"
-#include "cpu.h"
+#include <stdint.h>
 #include "sysemu/sysemu.h"
 #include "sysemu/arch_init.h"
 #include "hw/pci/pci.h"
 #include "hw/audio/audio.h"
-#include "hw/smbios/smbios.h"
+#include "hw/i386/smbios.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
 #include "qmp-commands.h"
 #include "hw/acpi/acpi.h"
-#include "qemu/help_option.h"
 
 #ifdef TARGET_SPARC
 int graphic_width = 1024;
@@ -235,6 +232,25 @@ void audio_init(void)
     }
 }
 
+int qemu_uuid_parse(const char *str, uint8_t *uuid)
+{
+    int ret;
+
+    if (strlen(str) != 36) {
+        return -1;
+    }
+
+    ret = sscanf(str, UUID_FMT, &uuid[0], &uuid[1], &uuid[2], &uuid[3],
+                 &uuid[4], &uuid[5], &uuid[6], &uuid[7], &uuid[8], &uuid[9],
+                 &uuid[10], &uuid[11], &uuid[12], &uuid[13], &uuid[14],
+                 &uuid[15]);
+
+    if (ret != 16) {
+        return -1;
+    }
+    return 0;
+}
+
 void do_acpitable_option(const QemuOpts *opts)
 {
 #ifdef TARGET_I386
@@ -242,7 +258,9 @@ void do_acpitable_option(const QemuOpts *opts)
     acpi_table_add(opts, &err);
     if (err) {
-        error_reportf_err(err, "Wrong acpi table provided: ");
+        error_report("Wrong acpi table provided: %s",
+                     error_get_pretty(err));
+        error_free(err);
         exit(1);
     }
 #endif
@@ -255,6 +273,13 @@ void do_smbios_option(QemuOpts *opts)
 #endif
 }
 
+void cpudef_init(void)
+{
+#if defined(cpudef_setup)
+    cpudef_setup(); /* parse cpu definitions in target config file */
+#endif
+}
+
 int kvm_available(void)
 {
 #ifdef CONFIG_KVM

async.c

@@ -22,14 +22,11 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "qemu-common.h"
 #include "block/aio.h"
 #include "block/thread-pool.h"
 #include "qemu/main-loop.h"
 #include "qemu/atomic.h"
-#include "block/raw-aio.h"
 
 /***********************************************************/
 /* bottom halves (can be seen as timers which expire ASAP) */
@@ -62,11 +59,6 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
     return bh;
 }
 
-void aio_bh_call(QEMUBH *bh)
-{
-    bh->cb(bh->opaque);
-}
-
 /* Multiple occurrences of aio_bh_poll cannot be called concurrently */
 int aio_bh_poll(AioContext *ctx)
 {
@@ -92,7 +84,7 @@ int aio_bh_poll(AioContext *ctx)
                 ret = 1;
             }
             bh->idle = 0;
-            aio_bh_call(bh);
+            bh->cb(bh->opaque);
         }
     }
 
@@ -218,7 +210,7 @@ aio_ctx_check(GSource *source)
     for (bh = ctx->first_bh; bh; bh = bh->next) {
         if (!bh->deleted && bh->scheduled) {
             return true;
         }
     }
     return aio_pending(ctx) || (timerlistgroup_deadline_ns(&ctx->tlg) == 0);
 }
@@ -243,14 +235,6 @@ aio_ctx_finalize(GSource *source)
     qemu_bh_delete(ctx->notify_dummy_bh);
     thread_pool_free(ctx->thread_pool);
 
-#ifdef CONFIG_LINUX_AIO
-    if (ctx->linux_aio) {
-        laio_detach_aio_context(ctx->linux_aio, ctx);
-        laio_cleanup(ctx->linux_aio);
-        ctx->linux_aio = NULL;
-    }
-#endif
-
     qemu_mutex_lock(&ctx->bh_lock);
     while (ctx->first_bh) {
         QEMUBH *next = ctx->first_bh->next;
@@ -263,7 +247,7 @@ aio_ctx_finalize(GSource *source)
     }
     qemu_mutex_unlock(&ctx->bh_lock);
 
-    aio_set_event_notifier(ctx, &ctx->notifier, false, NULL);
+    aio_set_event_notifier(ctx, &ctx->notifier, NULL);
     event_notifier_cleanup(&ctx->notifier);
     rfifolock_destroy(&ctx->lock);
     qemu_mutex_destroy(&ctx->bh_lock);
@@ -291,17 +275,6 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
     return ctx->thread_pool;
 }
 
-#ifdef CONFIG_LINUX_AIO
-LinuxAioState *aio_get_linux_aio(AioContext *ctx)
-{
-    if (!ctx->linux_aio) {
-        ctx->linux_aio = laio_init();
-        laio_attach_aio_context(ctx->linux_aio, ctx);
-    }
-    return ctx->linux_aio;
-}
-#endif
-
 void aio_notify(AioContext *ctx)
 {
     /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
@@ -347,23 +320,17 @@ AioContext *aio_context_new(Error **errp)
 {
     int ret;
     AioContext *ctx;
 
     ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
-    aio_context_setup(ctx);
-
     ret = event_notifier_init(&ctx->notifier, false);
     if (ret < 0) {
-        g_source_destroy(&ctx->source);
         error_setg_errno(errp, -ret, "Failed to initialize event notifier");
-        goto fail;
+        return NULL;
     }
     g_source_set_can_recurse(&ctx->source, true);
     aio_set_event_notifier(ctx, &ctx->notifier,
-                           false,
                            (EventNotifierHandler *)
                            event_notifier_dummy_cb);
-#ifdef CONFIG_LINUX_AIO
-    ctx->linux_aio = NULL;
-#endif
     ctx->thread_pool = NULL;
     qemu_mutex_init(&ctx->bh_lock);
     rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
@@ -372,9 +339,6 @@ AioContext *aio_context_new(Error **errp)
     ctx->notify_dummy_bh = aio_bh_new(ctx, notify_dummy_bh, NULL);
 
     return ctx;
-fail:
-    g_source_destroy(&ctx->source);
-    return NULL;
 }
 
 void aio_context_ref(AioContext *ctx)
View File

audio/alsaaudio.c
@@ -21,7 +21,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include <alsa/asoundlib.h>
 #include "qemu-common.h"
 #include "qemu/main-loop.h"

@@ -21,13 +21,11 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include "hw/hw.h"
 #include "audio.h"
 #include "monitor/monitor.h"
 #include "qemu/timer.h"
 #include "sysemu/sysemu.h"
-#include "qemu/cutils.h"
 #define AUDIO_CAP "audio"
 #include "audio_int.h"
@@ -1131,6 +1129,8 @@ static void audio_timer (void *opaque)
  */
 int AUD_write (SWVoiceOut *sw, void *buf, int size)
 {
+    int bytes;
     if (!sw) {
         /* XXX: Consider options */
         return size;
@@ -1141,11 +1141,14 @@ int AUD_write (SWVoiceOut *sw, void *buf, int size)
         return 0;
     }
-    return sw->hw->pcm_ops->write(sw, buf, size);
+    bytes = sw->hw->pcm_ops->write (sw, buf, size);
+    return bytes;
 }
 int AUD_read (SWVoiceIn *sw, void *buf, int size)
 {
+    int bytes;
     if (!sw) {
         /* XXX: Consider options */
         return size;
@@ -1156,7 +1159,8 @@ int AUD_read (SWVoiceIn *sw, void *buf, int size)
         return 0;
     }
-    return sw->hw->pcm_ops->read(sw, buf, size);
+    bytes = sw->hw->pcm_ops->read (sw, buf, size);
+    return bytes;
 }
 int AUD_get_buffer_size_out (SWVoiceOut *sw)
@@ -1739,21 +1743,13 @@ static void audio_vm_change_state_handler (void *opaque, int running,
     audio_reset_timer (s);
 }
-static bool is_cleaning_up;
-bool audio_is_cleaning_up(void)
-{
-    return is_cleaning_up;
-}
-void audio_cleanup(void)
+static void audio_atexit (void)
 {
     AudioState *s = &glob_audio_state;
-    HWVoiceOut *hwo, *hwon;
-    HWVoiceIn *hwi, *hwin;
+    HWVoiceOut *hwo = NULL;
+    HWVoiceIn *hwi = NULL;
-    is_cleaning_up = true;
-    QLIST_FOREACH_SAFE(hwo, &glob_audio_state.hw_head_out, entries, hwon) {
+    while ((hwo = audio_pcm_hw_find_any_out (hwo))) {
         SWVoiceCap *sc;
         if (hwo->enabled) {
@@ -1769,20 +1765,17 @@ void audio_cleanup(void)
             cb->ops.destroy (cb->opaque);
         }
     }
-        QLIST_REMOVE(hwo, entries);
     }
-    QLIST_FOREACH_SAFE(hwi, &glob_audio_state.hw_head_in, entries, hwin) {
+    while ((hwi = audio_pcm_hw_find_any_in (hwi))) {
         if (hwi->enabled) {
             hwi->pcm_ops->ctl_in (hwi, VOICE_DISABLE);
         }
         hwi->pcm_ops->fini_in (hwi);
-        QLIST_REMOVE(hwi, entries);
     }
     if (s->drv) {
         s->drv->fini (s->drv_opaque);
-        s->drv = NULL;
     }
 }
@@ -1810,9 +1803,12 @@ static void audio_init (void)
     QLIST_INIT (&s->hw_head_out);
     QLIST_INIT (&s->hw_head_in);
     QLIST_INIT (&s->cap_head);
-    atexit(audio_cleanup);
+    atexit (audio_atexit);
     s->ts = timer_new_ns(QEMU_CLOCK_VIRTUAL, audio_timer, s);
+    if (!s->ts) {
+        hw_error("Could not create audio timer\n");
+    }
     audio_process_options ("AUDIO", audio_options);
@@ -1863,8 +1859,12 @@ static void audio_init (void)
     if (!done) {
         done = !audio_driver_init (s, &no_audio_driver);
-        assert(done);
-        dolog("warning: Using timer based audio emulation\n");
+        if (!done) {
+            hw_error("Could not initialize audio subsystem\n");
+        }
+        else {
+            dolog ("warning: Using timer based audio emulation\n");
+        }
     }
     if (conf.period.hertz <= 0) {
@@ -1875,7 +1875,8 @@ static void audio_init (void)
     }
         conf.period.ticks = 1;
     } else {
-        conf.period.ticks = NANOSECONDS_PER_SECOND / conf.period.hertz;
+        conf.period.ticks =
+            muldiv64 (1, get_ticks_per_sec (), conf.period.hertz);
     }
     e = qemu_add_vm_change_state_handler (audio_vm_change_state_handler, s);
@@ -1977,7 +1978,8 @@ CaptureVoiceOut *AUD_add_capture (
     QLIST_INSERT_HEAD (&s->cap_head, cap, entries);
     QLIST_INSERT_HEAD (&cap->cb_head, cb, entries);
-    QLIST_FOREACH(hw, &glob_audio_state.hw_head_out, entries) {
+    hw = NULL;
+    while ((hw = audio_pcm_hw_find_any_out (hw))) {
         audio_attach_capture (hw);
     }
     return cap;


@@ -21,10 +21,10 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
 #ifndef QEMU_AUDIO_H
 #define QEMU_AUDIO_H
+#include "config-host.h"
 #include "qemu/queue.h"
 typedef void (*audio_callback_fn) (void *opaque, int avail);
@@ -163,7 +163,4 @@ static inline void *advance (void *p, int incr)
 int wav_start_capture (CaptureState *s, const char *path, int freq,
                        int bits, int nchannels);
-bool audio_is_cleaning_up(void);
-void audio_cleanup(void);
-#endif /* QEMU_AUDIO_H */
+#endif /* audio.h */


@@ -21,7 +21,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
 #ifndef QEMU_AUDIO_INT_H
 #define QEMU_AUDIO_INT_H
@@ -258,4 +257,4 @@ static inline int audio_ring_dist (int dst, int src, int len)
 #define AUDIO_FUNC __FILE__ ":" AUDIO_STRINGIFY (__LINE__)
 #endif
-#endif /* QEMU_AUDIO_INT_H */
+#endif /* audio_int.h */


@@ -1,4 +1,3 @@
-#include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "audio.h"


@@ -19,4 +19,4 @@ int audio_pt_wait (struct audio_pt *, const char *);
 int audio_pt_unlock_and_signal (struct audio_pt *, const char *);
 int audio_pt_join (struct audio_pt *, void **, const char *);
-#endif /* QEMU_AUDIO_PT_INT_H */
+#endif /* audio_pt_int.h */


@@ -1,6 +1,5 @@
 /* public domain */
-#include "qemu/osdep.h"
 #include "qemu-common.h"
 #define AUDIO_CAP "win-int"


@@ -22,8 +22,8 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include <CoreAudio/CoreAudio.h>
+#include <string.h> /* strerror */
 #include <pthread.h> /* pthread_X */
 #include "qemu-common.h"
@@ -32,9 +32,7 @@
 #define AUDIO_CAP "coreaudio"
 #include "audio_int.h"
-#ifndef MAC_OS_X_VERSION_10_6
-#define MAC_OS_X_VERSION_10_6 1060
-#endif
+static int isAtexit;
 typedef struct {
     int buffer_frames;
@@ -47,233 +45,11 @@ typedef struct coreaudioVoiceOut {
     AudioDeviceID outputDeviceID;
     UInt32 audioDevicePropertyBufferFrameSize;
     AudioStreamBasicDescription outputStreamBasicDescription;
-    AudioDeviceIOProcID ioprocid;
     int live;
     int decr;
     int rpos;
 } coreaudioVoiceOut;
#if MAC_OS_X_VERSION_MAX_ALLOWED >= MAC_OS_X_VERSION_10_6
/* The APIs used here only become available from 10.6 */
static OSStatus coreaudio_get_voice(AudioDeviceID *id)
{
UInt32 size = sizeof(*id);
AudioObjectPropertyAddress addr = {
kAudioHardwarePropertyDefaultOutputDevice,
kAudioObjectPropertyScopeGlobal,
kAudioObjectPropertyElementMaster
};
return AudioObjectGetPropertyData(kAudioObjectSystemObject,
&addr,
0,
NULL,
&size,
id);
}
static OSStatus coreaudio_get_framesizerange(AudioDeviceID id,
AudioValueRange *framerange)
{
UInt32 size = sizeof(*framerange);
AudioObjectPropertyAddress addr = {
kAudioDevicePropertyBufferFrameSizeRange,
kAudioDevicePropertyScopeOutput,
kAudioObjectPropertyElementMaster
};
return AudioObjectGetPropertyData(id,
&addr,
0,
NULL,
&size,
framerange);
}
static OSStatus coreaudio_get_framesize(AudioDeviceID id, UInt32 *framesize)
{
UInt32 size = sizeof(*framesize);
AudioObjectPropertyAddress addr = {
kAudioDevicePropertyBufferFrameSize,
kAudioDevicePropertyScopeOutput,
kAudioObjectPropertyElementMaster
};
return AudioObjectGetPropertyData(id,
&addr,
0,
NULL,
&size,
framesize);
}
static OSStatus coreaudio_set_framesize(AudioDeviceID id, UInt32 *framesize)
{
UInt32 size = sizeof(*framesize);
AudioObjectPropertyAddress addr = {
kAudioDevicePropertyBufferFrameSize,
kAudioDevicePropertyScopeOutput,
kAudioObjectPropertyElementMaster
};
return AudioObjectSetPropertyData(id,
&addr,
0,
NULL,
size,
framesize);
}
static OSStatus coreaudio_get_streamformat(AudioDeviceID id,
AudioStreamBasicDescription *d)
{
UInt32 size = sizeof(*d);
AudioObjectPropertyAddress addr = {
kAudioDevicePropertyStreamFormat,
kAudioDevicePropertyScopeOutput,
kAudioObjectPropertyElementMaster
};
return AudioObjectGetPropertyData(id,
&addr,
0,
NULL,
&size,
d);
}
static OSStatus coreaudio_set_streamformat(AudioDeviceID id,
AudioStreamBasicDescription *d)
{
UInt32 size = sizeof(*d);
AudioObjectPropertyAddress addr = {
kAudioDevicePropertyStreamFormat,
kAudioDevicePropertyScopeOutput,
kAudioObjectPropertyElementMaster
};
return AudioObjectSetPropertyData(id,
&addr,
0,
NULL,
size,
d);
}
static OSStatus coreaudio_get_isrunning(AudioDeviceID id, UInt32 *result)
{
UInt32 size = sizeof(*result);
AudioObjectPropertyAddress addr = {
kAudioDevicePropertyDeviceIsRunning,
kAudioDevicePropertyScopeOutput,
kAudioObjectPropertyElementMaster
};
return AudioObjectGetPropertyData(id,
&addr,
0,
NULL,
&size,
result);
}
#else
/* Legacy versions of functions using deprecated APIs */
static OSStatus coreaudio_get_voice(AudioDeviceID *id)
{
UInt32 size = sizeof(*id);
return AudioHardwareGetProperty(
kAudioHardwarePropertyDefaultOutputDevice,
&size,
id);
}
static OSStatus coreaudio_get_framesizerange(AudioDeviceID id,
AudioValueRange *framerange)
{
UInt32 size = sizeof(*framerange);
return AudioDeviceGetProperty(
id,
0,
0,
kAudioDevicePropertyBufferFrameSizeRange,
&size,
framerange);
}
static OSStatus coreaudio_get_framesize(AudioDeviceID id, UInt32 *framesize)
{
UInt32 size = sizeof(*framesize);
return AudioDeviceGetProperty(
id,
0,
false,
kAudioDevicePropertyBufferFrameSize,
&size,
framesize);
}
static OSStatus coreaudio_set_framesize(AudioDeviceID id, UInt32 *framesize)
{
UInt32 size = sizeof(*framesize);
return AudioDeviceSetProperty(
id,
NULL,
0,
false,
kAudioDevicePropertyBufferFrameSize,
size,
framesize);
}
static OSStatus coreaudio_get_streamformat(AudioDeviceID id,
AudioStreamBasicDescription *d)
{
UInt32 size = sizeof(*d);
return AudioDeviceGetProperty(
id,
0,
false,
kAudioDevicePropertyStreamFormat,
&size,
d);
}
static OSStatus coreaudio_set_streamformat(AudioDeviceID id,
AudioStreamBasicDescription *d)
{
UInt32 size = sizeof(*d);
return AudioDeviceSetProperty(
id,
0,
0,
0,
kAudioDevicePropertyStreamFormat,
size,
d);
}
static OSStatus coreaudio_get_isrunning(AudioDeviceID id, UInt32 *result)
{
UInt32 size = sizeof(*result);
return AudioDeviceGetProperty(
id,
0,
0,
kAudioDevicePropertyDeviceIsRunning,
&size,
result);
}
#endif
 static void coreaudio_logstatus (OSStatus status)
 {
     const char *str = "BUG";
@@ -368,7 +144,10 @@ static inline UInt32 isPlaying (AudioDeviceID outputDeviceID)
 {
     OSStatus status;
     UInt32 result = 0;
-    status = coreaudio_get_isrunning(outputDeviceID, &result);
+    UInt32 propertySize = sizeof(outputDeviceID);
+    status = AudioDeviceGetProperty(
+        outputDeviceID, 0, 0,
+        kAudioDevicePropertyDeviceIsRunning, &propertySize, &result);
     if (status != kAudioHardwareNoError) {
         coreaudio_logerr(status,
             "Could not determine whether Device is playing\n");
@@ -376,6 +155,11 @@ static inline UInt32 isPlaying (AudioDeviceID outputDeviceID)
     return result;
 }
+static void coreaudio_atexit (void)
+{
+    isAtexit = 1;
+}
 static int coreaudio_lock (coreaudioVoiceOut *core, const char *fn_name)
 {
     int err;
@@ -504,6 +288,7 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
 {
     OSStatus status;
     coreaudioVoiceOut *core = (coreaudioVoiceOut *) hw;
+    UInt32 propertySize;
     int err;
     const char *typ = "playback";
     AudioValueRange frameRange;
@@ -518,7 +303,12 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
     audio_pcm_init_info (&hw->info, as);
-    status = coreaudio_get_voice(&core->outputDeviceID);
+    /* open default output device */
+    propertySize = sizeof(core->outputDeviceID);
+    status = AudioHardwareGetProperty(
+        kAudioHardwarePropertyDefaultOutputDevice,
+        &propertySize,
+        &core->outputDeviceID);
     if (status != kAudioHardwareNoError) {
         coreaudio_logerr2 (status, typ,
                            "Could not get default output Device\n");
@@ -530,8 +320,14 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
     }
     /* get minimum and maximum buffer frame sizes */
-    status = coreaudio_get_framesizerange(core->outputDeviceID,
-                                          &frameRange);
+    propertySize = sizeof(frameRange);
+    status = AudioDeviceGetProperty(
+        core->outputDeviceID,
+        0,
+        0,
+        kAudioDevicePropertyBufferFrameSizeRange,
+        &propertySize,
+        &frameRange);
     if (status != kAudioHardwareNoError) {
         coreaudio_logerr2 (status, typ,
                            "Could not get device buffer frame range\n");
@@ -551,8 +347,15 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
     }
     /* set Buffer Frame Size */
-    status = coreaudio_set_framesize(core->outputDeviceID,
-                                     &core->audioDevicePropertyBufferFrameSize);
+    propertySize = sizeof(core->audioDevicePropertyBufferFrameSize);
+    status = AudioDeviceSetProperty(
+        core->outputDeviceID,
+        NULL,
+        0,
+        false,
+        kAudioDevicePropertyBufferFrameSize,
+        propertySize,
+        &core->audioDevicePropertyBufferFrameSize);
     if (status != kAudioHardwareNoError) {
         coreaudio_logerr2 (status, typ,
                            "Could not set device buffer frame size %" PRIu32 "\n",
@@ -561,8 +364,14 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
     }
     /* get Buffer Frame Size */
-    status = coreaudio_get_framesize(core->outputDeviceID,
-                                     &core->audioDevicePropertyBufferFrameSize);
+    propertySize = sizeof(core->audioDevicePropertyBufferFrameSize);
+    status = AudioDeviceGetProperty(
+        core->outputDeviceID,
+        0,
+        false,
+        kAudioDevicePropertyBufferFrameSize,
+        &propertySize,
+        &core->audioDevicePropertyBufferFrameSize);
     if (status != kAudioHardwareNoError) {
         coreaudio_logerr2 (status, typ,
                            "Could not get device buffer frame size\n");
@@ -571,8 +380,14 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
     hw->samples = conf->nbuffers * core->audioDevicePropertyBufferFrameSize;
     /* get StreamFormat */
-    status = coreaudio_get_streamformat(core->outputDeviceID,
-                                        &core->outputStreamBasicDescription);
+    propertySize = sizeof(core->outputStreamBasicDescription);
+    status = AudioDeviceGetProperty(
+        core->outputDeviceID,
+        0,
+        false,
+        kAudioDevicePropertyStreamFormat,
+        &propertySize,
+        &core->outputStreamBasicDescription);
     if (status != kAudioHardwareNoError) {
         coreaudio_logerr2 (status, typ,
                            "Could not get Device Stream properties\n");
@@ -582,8 +397,15 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
     /* set Samplerate */
     core->outputStreamBasicDescription.mSampleRate = (Float64) as->freq;
-    status = coreaudio_set_streamformat(core->outputDeviceID,
-                                        &core->outputStreamBasicDescription);
+    propertySize = sizeof(core->outputStreamBasicDescription);
+    status = AudioDeviceSetProperty(
+        core->outputDeviceID,
+        0,
+        0,
+        0,
+        kAudioDevicePropertyStreamFormat,
+        propertySize,
+        &core->outputStreamBasicDescription);
     if (status != kAudioHardwareNoError) {
         coreaudio_logerr2 (status, typ, "Could not set samplerate %d\n",
                            as->freq);
@@ -592,12 +414,8 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
     }
     /* set Callback */
-    core->ioprocid = NULL;
-    status = AudioDeviceCreateIOProcID(core->outputDeviceID,
-                                       audioDeviceIOProc,
-                                       hw,
-                                       &core->ioprocid);
-    if (status != kAudioHardwareNoError || core->ioprocid == NULL) {
+    status = AudioDeviceAddIOProc(core->outputDeviceID, audioDeviceIOProc, hw);
+    if (status != kAudioHardwareNoError) {
         coreaudio_logerr2 (status, typ, "Could not set IOProc\n");
         core->outputDeviceID = kAudioDeviceUnknown;
         return -1;
@@ -605,10 +423,10 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
     /* start Playback */
     if (!isPlaying(core->outputDeviceID)) {
-        status = AudioDeviceStart(core->outputDeviceID, core->ioprocid);
+        status = AudioDeviceStart(core->outputDeviceID, audioDeviceIOProc);
         if (status != kAudioHardwareNoError) {
             coreaudio_logerr2 (status, typ, "Could not start playback\n");
-            AudioDeviceDestroyIOProcID(core->outputDeviceID, core->ioprocid);
+            AudioDeviceRemoveIOProc(core->outputDeviceID, audioDeviceIOProc);
             core->outputDeviceID = kAudioDeviceUnknown;
             return -1;
         }
@@ -623,18 +441,18 @@ static void coreaudio_fini_out (HWVoiceOut *hw)
     int err;
     coreaudioVoiceOut *core = (coreaudioVoiceOut *) hw;
-    if (!audio_is_cleaning_up()) {
+    if (!isAtexit) {
         /* stop playback */
         if (isPlaying(core->outputDeviceID)) {
-            status = AudioDeviceStop(core->outputDeviceID, core->ioprocid);
+            status = AudioDeviceStop(core->outputDeviceID, audioDeviceIOProc);
             if (status != kAudioHardwareNoError) {
                 coreaudio_logerr (status, "Could not stop playback\n");
             }
         }
         /* remove callback */
-        status = AudioDeviceDestroyIOProcID(core->outputDeviceID,
-                                            core->ioprocid);
+        status = AudioDeviceRemoveIOProc(core->outputDeviceID,
+                                         audioDeviceIOProc);
         if (status != kAudioHardwareNoError) {
             coreaudio_logerr (status, "Could not remove IOProc\n");
         }
@@ -657,7 +475,7 @@ static int coreaudio_ctl_out (HWVoiceOut *hw, int cmd, ...)
     case VOICE_ENABLE:
         /* start playback */
         if (!isPlaying(core->outputDeviceID)) {
-            status = AudioDeviceStart(core->outputDeviceID, core->ioprocid);
+            status = AudioDeviceStart(core->outputDeviceID, audioDeviceIOProc);
             if (status != kAudioHardwareNoError) {
                 coreaudio_logerr (status, "Could not resume playback\n");
             }
@@ -666,10 +484,9 @@ static int coreaudio_ctl_out (HWVoiceOut *hw, int cmd, ...)
     case VOICE_DISABLE:
         /* stop playback */
-        if (!audio_is_cleaning_up()) {
+        if (!isAtexit) {
             if (isPlaying(core->outputDeviceID)) {
-                status = AudioDeviceStop(core->outputDeviceID,
-                                         core->ioprocid);
+                status = AudioDeviceStop(core->outputDeviceID, audioDeviceIOProc);
                 if (status != kAudioHardwareNoError) {
                     coreaudio_logerr (status, "Could not pause playback\n");
                 }
@@ -690,6 +507,7 @@ static void *coreaudio_audio_init (void)
     CoreaudioConf *conf = g_malloc(sizeof(CoreaudioConf));
     *conf = glob_conf;
+    atexit(coreaudio_atexit);
     return conf;
 }


@@ -26,7 +26,6 @@
  * SEAL 1.07 by Carlos 'pel' Hasan was used as documentation
  */
-#include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "audio.h"


@@ -22,9 +22,7 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include "qemu-common.h"
-#include "qemu/bswap.h"
 #include "audio.h"
 #define AUDIO_CAP "mixeng"
@@ -271,7 +269,7 @@ f_sample *mixeng_clip[2][2][2][3] = {
  * August 21, 1998
  * Copyright 1998 Fabrice Bellard.
  *
- * [Rewrote completely the code of Lance Norskog And Sundry
+ * [Rewrote completly the code of Lance Norskog And Sundry
  *  Contributors with a more efficient algorithm.]
  *
  * This source code is freely redistributable and may be used for


@@ -21,7 +21,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
 #ifndef QEMU_MIXENG_H
 #define QEMU_MIXENG_H
@@ -49,4 +48,4 @@ void st_rate_stop (void *opaque);
 void mixeng_clear (struct st_sample *buf, int len);
 void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol);
-#endif /* QEMU_MIXENG_H */
+#endif /* mixeng.h */


@@ -21,9 +21,7 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include "qemu-common.h"
-#include "qemu/host-utils.h"
 #include "audio.h"
 #include "qemu/timer.h"
@@ -50,8 +48,8 @@ static int no_run_out (HWVoiceOut *hw, int live)
     now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
     ticks = now - no->old_ticks;
-    bytes = muldiv64(ticks, hw->info.bytes_per_second, NANOSECONDS_PER_SECOND);
-    bytes = audio_MIN(bytes, INT_MAX);
+    bytes = muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());
+    bytes = audio_MIN (bytes, INT_MAX);
     samples = bytes >> hw->info.shift;
     no->old_ticks = now;
@@ -62,7 +60,7 @@ static int no_run_out (HWVoiceOut *hw, int live)
 static int no_write (SWVoiceOut *sw, void *buf, int len)
 {
-    return audio_pcm_sw_write(sw, buf, len);
+    return audio_pcm_sw_write (sw, buf, len);
 }
 static int no_init_out(HWVoiceOut *hw, struct audsettings *as, void *drv_opaque)
@@ -107,7 +105,7 @@ static int no_run_in (HWVoiceIn *hw)
     int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
     int64_t ticks = now - no->old_ticks;
     int64_t bytes =
-        muldiv64(ticks, hw->info.bytes_per_second, NANOSECONDS_PER_SECOND);
+        muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());
     no->old_ticks = now;
     bytes = audio_MIN (bytes, INT_MAX);


@@ -21,7 +21,9 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <sys/types.h>
 #include <sys/ioctl.h>
 #include <sys/soundcard.h>
 #include "qemu-common.h"
@@ -897,7 +899,7 @@ static struct audio_option oss_options[] = {
         .name = "EXCLUSIVE",
         .tag = AUD_OPT_BOOL,
         .valp = &glob_conf.exclusive,
-        .descr = "Open device in exclusive mode (vmix won't work)"
+        .descr = "Open device in exclusive mode (vmix wont work)"
     },
 #ifdef USE_DSP_POLICY
     {


@@ -1,5 +1,4 @@
 /* public domain */
-#include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "audio.h"
@@ -781,22 +780,23 @@ static int qpa_ctl_in (HWVoiceIn *hw, int cmd, ...)
     pa_threaded_mainloop_lock (g->mainloop);
-    op = pa_context_set_source_output_volume (g->context,
-        pa_stream_get_index (pa->stream),
+    /* FIXME: use the upcoming "set_source_output_{volume,mute}" */
+    op = pa_context_set_source_volume_by_index (g->context,
+        pa_stream_get_device_index (pa->stream),
         &v, NULL, NULL);
     if (!op) {
         qpa_logerr (pa_context_errno (g->context),
-                    "set_source_output_volume() failed\n");
+                    "set_source_volume() failed\n");
     } else {
         pa_operation_unref(op);
     }
-    op = pa_context_set_source_output_mute (g->context,
+    op = pa_context_set_source_mute_by_index (g->context,
         pa_stream_get_index (pa->stream),
         sw->vol.mute, NULL, NULL);
     if (!op) {
         qpa_logerr (pa_context_errno (g->context),
-                    "set_source_output_mute() failed\n");
+                    "set_source_mute() failed\n");
     } else {
         pa_operation_unref (op);
     }


@@ -21,7 +21,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include <SDL.h>
 #include <SDL_thread.h>
 #include "qemu-common.h"


@@ -17,9 +17,7 @@
  * along with this program; if not, see <http://www.gnu.org/licenses/>.
  */
-#include "qemu/osdep.h"
 #include "hw/hw.h"
-#include "qemu/host-utils.h"
 #include "qemu/error-report.h"
 #include "qemu/timer.h"
 #include "ui/qemu-spice.h"
@@ -105,11 +103,11 @@ static int rate_get_samples (struct audio_pcm_info *info, SpiceRateCtl *rate)
     now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
     ticks = now - rate->start_ticks;
-    bytes = muldiv64(ticks, info->bytes_per_second, NANOSECONDS_PER_SECOND);
+    bytes = muldiv64 (ticks, info->bytes_per_second, get_ticks_per_sec ());
     samples = (bytes - rate->bytes_sent) >> info->shift;
     if (samples < 0 || samples > 65536) {
         error_report("Resetting rate control (%" PRId64 " samples)", samples);
-        rate_start(rate);
+        rate_start (rate);
         samples = 0;
     }
     rate->bytes_sent += samples << info->shift;


@@ -1,17 +0,0 @@
-# See docs/tracing.txt for syntax documentation.
-# audio/alsaaudio.c
-alsa_revents(int revents) "revents = %d"
-alsa_pollout(int i, int fd) "i = %d fd = %d"
-alsa_set_handler(int events, int index, int fd, int err) "events=%#x index=%d fd=%d err=%d"
-alsa_wrote_zero(int len) "Failed to write %d frames (wrote zero)"
-alsa_read_zero(long len) "Failed to read %ld frames (read zero)"
-alsa_xrun_out(void) "Recovering from playback xrun"
-alsa_xrun_in(void) "Recovering from capture xrun"
-alsa_resume_out(void) "Resuming suspended output stream"
-alsa_resume_in(void) "Resuming suspended input stream"
-alsa_no_frames(int state) "No frames available and ALSA state is %d"
-# audio/ossaudio.c
-oss_version(int version) "OSS version = %#x"
-oss_invalid_available_size(int size, int bufsize) "Invalid available size, size=%d bufsize=%d"


@@ -21,8 +21,7 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qemu/host-utils.h"
+#include "hw/hw.h"
 #include "qemu/timer.h"
 #include "audio.h"
@@ -51,7 +50,7 @@ static int wav_run_out (HWVoiceOut *hw, int live)
     int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
     int64_t ticks = now - wav->old_ticks;
     int64_t bytes =
-        muldiv64(ticks, hw->info.bytes_per_second, NANOSECONDS_PER_SECOND);
+        muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());
     if (bytes > INT_MAX) {
         samples = INT_MAX >> hw->info.shift;


@@ -1,4 +1,3 @@
-#include "qemu/osdep.h"
 #include "hw/hw.h"
 #include "monitor/monitor.h"
 #include "qemu/error-report.h"


@@ -21,8 +21,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "qemu-common.h"
 #include "sysemu/char.h"
 #include "qemu/timer.h"
@@ -305,7 +303,7 @@ static int baum_eat_packet(BaumDriverState *baum, const uint8_t *buf, int len)
                 return 0;
             cur++;
         }
-        DPRINTF("Dropped %td bytes!\n", cur - buf);
+        DPRINTF("Dropped %d bytes!\n", cur - buf);
     }

 #define EAT(c) do {\
@@ -337,7 +335,7 @@ static int baum_eat_packet(BaumDriverState *baum, const uint8_t *buf, int len)
         /* Allow 100ms to complete the DisplayData packet */
         timer_mod(baum->cellCount_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
-                  NANOSECONDS_PER_SECOND / 10);
+                  get_ticks_per_sec() / 10);
         for (i = 0; i < baum->x * baum->y ; i++) {
             EAT(c);
             cells[i] = c;
@@ -563,12 +561,8 @@ static void baum_close(struct CharDriverState *chr)
     g_free(baum);
 }

-static CharDriverState *chr_baum_init(const char *id,
-                                      ChardevBackend *backend,
-                                      ChardevReturn *ret,
-                                      Error **errp)
+CharDriverState *chr_baum_init(void)
 {
-    ChardevCommon *common = backend->u.braille.data;
     BaumDriverState *baum;
     CharDriverState *chr;
     brlapi_handle_t *handle;
@@ -579,12 +573,8 @@ static CharDriverState *chr_baum_init(const char *id,
 #endif
     int tty;

-    chr = qemu_chr_alloc(common, errp);
-    if (!chr) {
-        return NULL;
-    }
     baum = g_malloc0(sizeof(BaumDriverState));
-    baum->chr = chr;
+    baum->chr = chr = qemu_chr_alloc();

     chr->opaque = baum;
     chr->chr_write = baum_write;
@@ -596,16 +586,14 @@ static CharDriverState *chr_baum_init(const char *id,
     baum->brlapi_fd = brlapi__openConnection(handle, NULL, NULL);
     if (baum->brlapi_fd == -1) {
-        error_setg(errp, "brlapi__openConnection: %s",
-                   brlapi_strerror(brlapi_error_location()));
+        brlapi_perror("baum_init: brlapi_openConnection");
         goto fail_handle;
     }

     baum->cellCount_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, baum_cellCount_timer_cb, baum);

     if (brlapi__getDisplaySize(handle, &baum->x, &baum->y) == -1) {
-        error_setg(errp, "brlapi__getDisplaySize: %s",
-                   brlapi_strerror(brlapi_error_location()));
+        brlapi_perror("baum_init: brlapi_getDisplaySize");
         goto fail;
     }
@@ -621,8 +609,7 @@ static CharDriverState *chr_baum_init(const char *id,
         tty = BRLAPI_TTY_DEFAULT;

     if (brlapi__enterTtyMode(handle, tty, NULL) == -1) {
-        error_setg(errp, "brlapi__enterTtyMode: %s",
-                   brlapi_strerror(brlapi_error_location()));
+        brlapi_perror("baum_init: brlapi_enterTtyMode");
         goto fail;
     }
@@ -642,8 +629,7 @@ fail_handle:
 static void register_types(void)
 {
-    register_char_driver("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL,
-                         chr_baum_init);
+    register_char_driver("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL);
 }

 type_init(register_types);


@@ -9,8 +9,6 @@
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "qemu-common.h"
 #include "sysemu/hostmem.h"
 #include "sysemu/sysemu.h"
@@ -52,14 +50,11 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     error_setg(errp, "-mem-path not supported on this host");
 #else
     if (!memory_region_size(&backend->mr)) {
-        gchar *path;
         backend->force_prealloc = mem_prealloc;
-        path = object_get_canonical_path(OBJECT(backend));
         memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
-                                         path,
+                                         object_get_canonical_path(OBJECT(backend)),
                                          backend->size, fb->share,
                                          fb->mem_path, errp);
-        g_free(path);
     }
 #endif
 }
@@ -88,7 +83,9 @@ static void set_mem_path(Object *o, const char *str, Error **errp)
         error_setg(errp, "cannot change property value");
         return;
     }
-    g_free(fb->mem_path);
+    if (fb->mem_path) {
+        g_free(fb->mem_path);
+    }
     fb->mem_path = g_strdup(str);
 }
@@ -121,19 +118,11 @@ file_backend_instance_init(Object *o)
                         set_mem_path, NULL);
 }

-static void file_backend_instance_finalize(Object *o)
-{
-    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
-
-    g_free(fb->mem_path);
-}
-
 static const TypeInfo file_backend_info = {
     .name = TYPE_MEMORY_BACKEND_FILE,
     .parent = TYPE_MEMORY_BACKEND,
     .class_init = file_backend_class_init,
     .instance_init = file_backend_instance_init,
-    .instance_finalize = file_backend_instance_finalize,
     .instance_size = sizeof(HostMemoryBackendFile),
 };


@@ -9,9 +9,7 @@
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
  */
-#include "qemu/osdep.h"
 #include "sysemu/hostmem.h"
-#include "qapi/error.h"
 #include "qom/object_interfaces.h"

 #define TYPE_MEMORY_BACKEND_RAM "memory-backend-ram"


@@ -9,10 +9,8 @@
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
  */
-#include "qemu/osdep.h"
 #include "sysemu/hostmem.h"
 #include "hw/boards.h"
-#include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "qapi-types.h"
 #include "qapi-visit.h"
@@ -28,18 +26,18 @@ QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != MPOL_INTERLEAVE);
 #endif

 static void
-host_memory_backend_get_size(Object *obj, Visitor *v, const char *name,
-                             void *opaque, Error **errp)
+host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
+                             const char *name, Error **errp)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
     uint64_t value = backend->size;

-    visit_type_size(v, name, &value, errp);
+    visit_type_size(v, &value, name, errp);
 }

 static void
-host_memory_backend_set_size(Object *obj, Visitor *v, const char *name,
-                             void *opaque, Error **errp)
+host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
+                             const char *name, Error **errp)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
     Error *local_err = NULL;
@@ -50,7 +48,7 @@ host_memory_backend_set_size(Object *obj, Visitor *v, const char *name,
         goto out;
     }

-    visit_type_size(v, name, &value, &local_err);
+    visit_type_size(v, &value, name, &local_err);
     if (local_err) {
         goto out;
     }
@@ -64,17 +62,9 @@ out:
     error_propagate(errp, local_err);
 }

-static uint16List **host_memory_append_node(uint16List **node,
-                                            unsigned long value)
-{
-    *node = g_malloc0(sizeof(**node));
-    (*node)->value = value;
-    return &(*node)->next;
-}
-
 static void
-host_memory_backend_get_host_nodes(Object *obj, Visitor *v, const char *name,
-                                   void *opaque, Error **errp)
+host_memory_backend_get_host_nodes(Object *obj, Visitor *v, void *opaque,
+                                   const char *name, Error **errp)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
     uint16List *host_nodes = NULL;
@@ -82,35 +72,37 @@ host_memory_backend_get_host_nodes(Object *obj, Visitor *v, const char *name,
     unsigned long value;

     value = find_first_bit(backend->host_nodes, MAX_NODES);
-
-    node = host_memory_append_node(node, value);
-
     if (value == MAX_NODES) {
-        goto out;
+        return;
     }

+    *node = g_malloc0(sizeof(**node));
+    (*node)->value = value;
+    node = &(*node)->next;
+
     do {
         value = find_next_bit(backend->host_nodes, MAX_NODES, value + 1);
         if (value == MAX_NODES) {
             break;
         }

-        node = host_memory_append_node(node, value);
+        *node = g_malloc0(sizeof(**node));
+        (*node)->value = value;
+        node = &(*node)->next;
     } while (true);

-out:
-    visit_type_uint16List(v, name, &host_nodes, errp);
+    visit_type_uint16List(v, &host_nodes, name, errp);
 }

 static void
-host_memory_backend_set_host_nodes(Object *obj, Visitor *v, const char *name,
-                                   void *opaque, Error **errp)
+host_memory_backend_set_host_nodes(Object *obj, Visitor *v, void *opaque,
+                                   const char *name, Error **errp)
 {
 #ifdef CONFIG_NUMA
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
     uint16List *l = NULL;

-    visit_type_uint16List(v, name, &l, errp);
+    visit_type_uint16List(v, &l, name, errp);

     while (l) {
         bitmap_set(backend->host_nodes, l->value, 1);
@@ -203,7 +195,6 @@ static bool host_memory_backend_get_prealloc(Object *obj, Error **errp)
 static void host_memory_backend_set_prealloc(Object *obj, bool value,
                                              Error **errp)
 {
-    Error *local_err = NULL;
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);

     if (backend->force_prealloc) {
@@ -224,11 +215,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
         void *ptr = memory_region_get_ram_ptr(&backend->mr);
         uint64_t sz = memory_region_size(&backend->mr);

-        os_mem_prealloc(fd, ptr, sz, &local_err);
-        if (local_err) {
-            error_propagate(errp, local_err);
-            return;
-        }
+        os_mem_prealloc(fd, ptr, sz);
         backend->prealloc = true;
     }
 }
@@ -269,16 +256,6 @@ host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
     return memory_region_size(&backend->mr) ? &backend->mr : NULL;
 }

-void host_memory_backend_set_mapped(HostMemoryBackend *backend, bool mapped)
-{
-    backend->is_mapped = mapped;
-}
-
-bool host_memory_backend_is_mapped(HostMemoryBackend *backend)
-{
-    return backend->is_mapped;
-}
-
 static void
 host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
 {
@@ -291,7 +268,8 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
     if (bc->alloc) {
         bc->alloc(backend, &local_err);
         if (local_err) {
-            goto out;
+            error_propagate(errp, local_err);
+            return;
         }

         ptr = memory_region_get_ram_ptr(&backend->mr);
@@ -335,11 +313,9 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
             assert(maxnode <= MAX_NODES);
             if (mbind(ptr, sz, backend->policy,
                       maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
-                if (backend->policy != MPOL_DEFAULT || errno != ENOSYS) {
-                    error_setg_errno(errp, errno,
-                                     "cannot bind memory to host NUMA nodes");
-                    return;
-                }
+                error_setg_errno(errp, errno,
+                                 "cannot bind memory to host NUMA nodes");
+                return;
             }
 #endif
             /* Preallocate memory after the NUMA policy has been instantiated.
@@ -347,21 +323,18 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
              * specified NUMA policy in place.
              */
             if (backend->prealloc) {
-                os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz,
-                                &local_err);
-                if (local_err) {
-                    goto out;
-                }
+                os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz);
             }
         }
     }
-out:
-    error_propagate(errp, local_err);
 }

 static bool
 host_memory_backend_can_be_deleted(UserCreatable *uc, Error **errp)
 {
-    if (host_memory_backend_is_mapped(MEMORY_BACKEND(uc))) {
+    MemoryRegion *mr;
+
+    mr = host_memory_backend_get_memory(MEMORY_BACKEND(uc), errp);
+    if (memory_region_is_mapped(mr)) {
         return false;
     } else {
         return true;


@@ -21,55 +21,20 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
+#include <stdlib.h>
 #include "qemu-common.h"
 #include "sysemu/char.h"
 #include "ui/console.h"
-#include "ui/input.h"

 #define MSMOUSE_LO6(n) ((n) & 0x3f)
 #define MSMOUSE_HI2(n) (((n) & 0xc0) >> 6)

-typedef struct {
-    CharDriverState *chr;
-    QemuInputHandlerState *hs;
-    int axis[INPUT_AXIS__MAX];
-    bool btns[INPUT_BUTTON__MAX];
-    bool btnc[INPUT_BUTTON__MAX];
-    uint8_t outbuf[32];
-    int outlen;
-} MouseState;
-
-static void msmouse_chr_accept_input(CharDriverState *chr)
+static void msmouse_event(void *opaque,
+                          int dx, int dy, int dz, int buttons_state)
 {
-    MouseState *mouse = chr->opaque;
-    int len;
-
-    len = qemu_chr_be_can_write(chr);
-    if (len > mouse->outlen) {
-        len = mouse->outlen;
-    }
-    if (!len) {
-        return;
-    }
-
-    qemu_chr_be_write(chr, mouse->outbuf, len);
-    mouse->outlen -= len;
-    if (mouse->outlen) {
-        memmove(mouse->outbuf, mouse->outbuf + len, mouse->outlen);
-    }
-}
-
-static void msmouse_queue_event(MouseState *mouse)
-{
+    CharDriverState *chr = (CharDriverState *)opaque;
+
     unsigned char bytes[4] = { 0x40, 0x00, 0x00, 0x00 };
-    int dx, dy, count = 3;
-
-    dx = mouse->axis[INPUT_AXIS_X];
-    mouse->axis[INPUT_AXIS_X] = 0;
-
-    dy = mouse->axis[INPUT_AXIS_Y];
-    mouse->axis[INPUT_AXIS_Y] = 0;

     /* Movement deltas */
     bytes[0] |= (MSMOUSE_HI2(dy) << 2) | MSMOUSE_HI2(dx);
@@ -77,54 +42,14 @@ static void msmouse_queue_event(MouseState *mouse)
     bytes[2] |= MSMOUSE_LO6(dy);

     /* Buttons */
-    bytes[0] |= (mouse->btns[INPUT_BUTTON_LEFT] ? 0x20 : 0x00);
-    bytes[0] |= (mouse->btns[INPUT_BUTTON_RIGHT] ? 0x10 : 0x00);
-    if (mouse->btns[INPUT_BUTTON_MIDDLE] ||
-        mouse->btnc[INPUT_BUTTON_MIDDLE]) {
-        bytes[3] |= (mouse->btns[INPUT_BUTTON_MIDDLE] ? 0x20 : 0x00);
-        mouse->btnc[INPUT_BUTTON_MIDDLE] = false;
-        count = 4;
-    }
+    bytes[0] |= (buttons_state & 0x01 ? 0x20 : 0x00);
+    bytes[0] |= (buttons_state & 0x02 ? 0x10 : 0x00);
+    bytes[3] |= (buttons_state & 0x04 ? 0x20 : 0x00);

-    if (mouse->outlen <= sizeof(mouse->outbuf) - count) {
-        memcpy(mouse->outbuf + mouse->outlen, bytes, count);
-        mouse->outlen += count;
-    } else {
-        /* queue full -> drop event */
-    }
-}
-
-static void msmouse_input_event(DeviceState *dev, QemuConsole *src,
-                                InputEvent *evt)
-{
-    MouseState *mouse = (MouseState *)dev;
-    InputMoveEvent *move;
-    InputBtnEvent *btn;
-
-    switch (evt->type) {
-    case INPUT_EVENT_KIND_REL:
-        move = evt->u.rel.data;
-        mouse->axis[move->axis] += move->value;
-        break;
-
-    case INPUT_EVENT_KIND_BTN:
-        btn = evt->u.btn.data;
-        mouse->btns[btn->button] = btn->down;
-        mouse->btnc[btn->button] = true;
-        break;
-
-    default:
-        /* keep gcc happy */
-        break;
-    }
-}
-
-static void msmouse_input_sync(DeviceState *dev)
-{
-    MouseState *mouse = (MouseState *)dev;
-
-    msmouse_queue_event(mouse);
-    msmouse_chr_accept_input(mouse->chr);
+    /* We always send the packet of, so that we do not have to keep track
+       of previous state of the middle button. This can potentially confuse
+       some very old drivers for two button mice though. */
+    qemu_chr_be_write(chr, bytes, 4);
 }

 static int msmouse_chr_write (struct CharDriverState *s, const uint8_t *buf, int len)
@@ -135,51 +60,26 @@ static int msmouse_chr_write (struct CharDriverState *s, const uint8_t *buf, int

 static void msmouse_chr_close (struct CharDriverState *chr)
 {
-    MouseState *mouse = chr->opaque;
-
-    qemu_input_handler_unregister(mouse->hs);
-    g_free(mouse);
+    g_free (chr);
 }

-static QemuInputHandler msmouse_handler = {
-    .name  = "QEMU Microsoft Mouse",
-    .mask  = INPUT_EVENT_MASK_BTN | INPUT_EVENT_MASK_REL,
-    .event = msmouse_input_event,
-    .sync  = msmouse_input_sync,
-};
-
-static CharDriverState *qemu_chr_open_msmouse(const char *id,
-                                              ChardevBackend *backend,
-                                              ChardevReturn *ret,
-                                              Error **errp)
+CharDriverState *qemu_chr_open_msmouse(void)
 {
-    ChardevCommon *common = backend->u.msmouse.data;
-    MouseState *mouse;
     CharDriverState *chr;

-    chr = qemu_chr_alloc(common, errp);
-    if (!chr) {
-        return NULL;
-    }
+    chr = qemu_chr_alloc();
     chr->chr_write = msmouse_chr_write;
     chr->chr_close = msmouse_chr_close;
-    chr->chr_accept_input = msmouse_chr_accept_input;
     chr->explicit_be_open = true;

-    mouse = g_new0(MouseState, 1);
-    mouse->hs = qemu_input_handler_register((DeviceState *)mouse,
-                                            &msmouse_handler);
-
-    mouse->chr = chr;
-    chr->opaque = mouse;
+    qemu_add_mouse_event_handler(msmouse_event, chr, 0, "QEMU Microsoft Mouse");

     return chr;
 }

 static void register_types(void)
 {
-    register_char_driver("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL,
-                         qemu_chr_open_msmouse);
+    register_char_driver("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL);
 }

 type_init(register_types);


@@ -10,10 +10,8 @@
  * See the COPYING file in the top-level directory.
  */

-#include "qemu/osdep.h"
 #include "sysemu/rng.h"
 #include "sysemu/char.h"
-#include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "hw/qdev.h" /* just for DEFINE_PROP_CHR */
@@ -26,12 +24,33 @@ typedef struct RngEgd
     CharDriverState *chr;
     char *chr_name;
+
+    GSList *requests;
 } RngEgd;

-static void rng_egd_request_entropy(RngBackend *b, RngRequest *req)
+typedef struct RngRequest
+{
+    EntropyReceiveFunc *receive_entropy;
+    uint8_t *data;
+    void *opaque;
+    size_t offset;
+    size_t size;
+} RngRequest;
+
+static void rng_egd_request_entropy(RngBackend *b, size_t size,
+                                    EntropyReceiveFunc *receive_entropy,
+                                    void *opaque)
 {
     RngEgd *s = RNG_EGD(b);
-    size_t size = req->size;
+    RngRequest *req;
+
+    req = g_malloc(sizeof(*req));
+    req->offset = 0;
+    req->size = size;
+    req->receive_entropy = receive_entropy;
+    req->opaque = opaque;
+    req->data = g_malloc(req->size);

     while (size > 0) {
         uint8_t header[2];
@@ -41,21 +60,28 @@ static void rng_egd_request_entropy(RngBackend *b, RngRequest *req)
         header[0] = 0x02;
         header[1] = len;

-        /* XXX this blocks entire thread. Rewrite to use
-         * qemu_chr_fe_write and background I/O callbacks */
-        qemu_chr_fe_write_all(s->chr, header, sizeof(header));
+        qemu_chr_fe_write(s->chr, header, sizeof(header));

         size -= len;
     }
+
+    s->requests = g_slist_append(s->requests, req);
+}
+
+static void rng_egd_free_request(RngRequest *req)
+{
+    g_free(req->data);
+    g_free(req);
 }

 static int rng_egd_chr_can_read(void *opaque)
 {
     RngEgd *s = RNG_EGD(opaque);
-    RngRequest *req;
+    GSList *i;
     int size = 0;

-    QSIMPLEQ_FOREACH(req, &s->parent.requests, next) {
+    for (i = s->requests; i; i = i->next) {
+        RngRequest *req = i->data;
         size += req->size - req->offset;
     }
@@ -67,8 +93,8 @@ static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
     RngEgd *s = RNG_EGD(opaque);
     size_t buf_offset = 0;

-    while (size > 0 && !QSIMPLEQ_EMPTY(&s->parent.requests)) {
-        RngRequest *req = QSIMPLEQ_FIRST(&s->parent.requests);
+    while (size > 0 && s->requests) {
+        RngRequest *req = s->requests->data;
         int len = MIN(size, req->size - req->offset);

         memcpy(req->data + req->offset, buf + buf_offset, len);
@@ -77,13 +103,38 @@ static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
         size -= len;

         if (req->offset == req->size) {
+            s->requests = g_slist_remove_link(s->requests, s->requests);
+
             req->receive_entropy(req->opaque, req->data, req->size);
-            rng_backend_finalize_request(&s->parent, req);
+
+            rng_egd_free_request(req);
         }
     }
 }

+static void rng_egd_free_requests(RngEgd *s)
+{
+    GSList *i;
+
+    for (i = s->requests; i; i = i->next) {
+        rng_egd_free_request(i->data);
+    }
+
+    g_slist_free(s->requests);
+    s->requests = NULL;
+}
+
+static void rng_egd_cancel_requests(RngBackend *b)
+{
+    RngEgd *s = RNG_EGD(b);
+
+    /* We simply delete the list of pending requests. If there is data in the
+     * queue waiting to be read, this is okay, because there will always be
+     * more data than we requested originally
+     */
+    rng_egd_free_requests(s);
+}
+
 static void rng_egd_opened(RngBackend *b, Error **errp)
 {
     RngEgd *s = RNG_EGD(b);
@@ -152,6 +203,8 @@ static void rng_egd_finalize(Object *obj)
     }

     g_free(s->chr_name);
+
+    rng_egd_free_requests(s);
 }

 static void rng_egd_class_init(ObjectClass *klass, void *data)
@@ -159,6 +212,7 @@ static void rng_egd_class_init(ObjectClass *klass, void *data)
     RngBackendClass *rbc = RNG_BACKEND_CLASS(klass);

     rbc->request_entropy = rng_egd_request_entropy;
+    rbc->cancel_requests = rng_egd_cancel_requests;
     rbc->opened = rng_egd_opened;
 }


@@ -10,19 +10,21 @@
  * See the COPYING file in the top-level directory.
  */

-#include "qemu/osdep.h"
 #include "sysemu/rng-random.h"
 #include "sysemu/rng.h"
-#include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/main-loop.h"

-struct RngRandom
+struct RndRandom
 {
     RngBackend parent;

     int fd;
     char *filename;
+
+    EntropyReceiveFunc *receive_func;
+    void *opaque;
+    size_t size;
 };

 /**
@@ -34,41 +36,42 @@ struct RngRandom
 static void entropy_available(void *opaque)
 {
-    RngRandom *s = RNG_RANDOM(opaque);
+    RndRandom *s = RNG_RANDOM(opaque);
+    uint8_t buffer[s->size];
+    ssize_t len;

-    while (!QSIMPLEQ_EMPTY(&s->parent.requests)) {
-        RngRequest *req = QSIMPLEQ_FIRST(&s->parent.requests);
-        ssize_t len;
-
-        len = read(s->fd, req->data, req->size);
-        if (len < 0 && errno == EAGAIN) {
-            return;
-        }
-        g_assert(len != -1);
-
-        req->receive_entropy(req->opaque, req->data, len);
-
-        rng_backend_finalize_request(&s->parent, req);
+    len = read(s->fd, buffer, s->size);
+    if (len < 0 && errno == EAGAIN) {
+        return;
     }
+    g_assert(len != -1);
+
+    s->receive_func(s->opaque, buffer, len);
+    s->receive_func = NULL;

-    /* We've drained all requests, the fd handler can be reset. */
     qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
 }

-static void rng_random_request_entropy(RngBackend *b, RngRequest *req)
+static void rng_random_request_entropy(RngBackend *b, size_t size,
+                                       EntropyReceiveFunc *receive_entropy,
+                                       void *opaque)
 {
-    RngRandom *s = RNG_RANDOM(b);
+    RndRandom *s = RNG_RANDOM(b);

-    if (QSIMPLEQ_EMPTY(&s->parent.requests)) {
-        /* If there are no pending requests yet, we need to
-         * install our fd handler. */
-        qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
+    if (s->receive_func) {
+        s->receive_func(s->opaque, NULL, 0);
     }
+
+    s->receive_func = receive_entropy;
+    s->opaque = opaque;
+    s->size = size;
+
+    qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
 }

 static void rng_random_opened(RngBackend *b, Error **errp)
 {
-    RngRandom *s = RNG_RANDOM(b);
+    RndRandom *s = RNG_RANDOM(b);

     if (s->filename == NULL) {
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
@@ -83,7 +86,7 @@ static void rng_random_opened(RngBackend *b, Error **errp)
 static char *rng_random_get_filename(Object *obj, Error **errp)
 {
-    RngRandom *s = RNG_RANDOM(obj);
+    RndRandom *s = RNG_RANDOM(obj);

     return g_strdup(s->filename);
 }
@@ -92,7 +95,7 @@ static void rng_random_set_filename(Object *obj, const char *filename,
                                     Error **errp)
 {
     RngBackend *b = RNG_BACKEND(obj);
-    RngRandom *s = RNG_RANDOM(obj);
+    RndRandom *s = RNG_RANDOM(obj);

     if (b->opened) {
         error_setg(errp, QERR_PERMISSION_DENIED);
@@ -105,7 +108,7 @@ static void rng_random_set_filename(Object *obj, const char *filename,
 static void rng_random_init(Object *obj)
 {
-    RngRandom *s = RNG_RANDOM(obj);
+    RndRandom *s = RNG_RANDOM(obj);

     object_property_add_str(obj, "filename",
                             rng_random_get_filename,
@@ -118,7 +121,7 @@ static void rng_random_init(Object *obj)
 static void rng_random_finalize(Object *obj)
 {
-    RngRandom *s = RNG_RANDOM(obj);
+    RndRandom *s = RNG_RANDOM(obj);

     if (s->fd != -1) {
         qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
@@ -139,7 +142,7 @@ static void rng_random_class_init(ObjectClass *klass, void *data)
 static const TypeInfo rng_random_info = {
     .name = TYPE_RNG_RANDOM,
     .parent = TYPE_RNG_BACKEND,
-    .instance_size = sizeof(RngRandom),
+    .instance_size = sizeof(RndRandom),
     .class_init = rng_random_class_init,
     .instance_init = rng_random_init,
     .instance_finalize = rng_random_finalize,


@@ -10,9 +10,7 @@
  * See the COPYING file in the top-level directory.
  */

-#include "qemu/osdep.h"
 #include "sysemu/rng.h"
-#include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qom/object_interfaces.h"
@@ -21,20 +19,18 @@ void rng_backend_request_entropy(RngBackend *s, size_t size,
                                  void *opaque)
 {
     RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
-    RngRequest *req;

     if (k->request_entropy) {
-        req = g_malloc(sizeof(*req));
-
-        req->offset = 0;
-        req->size = size;
-        req->receive_entropy = receive_entropy;
-        req->opaque = opaque;
-        req->data = g_malloc(req->size);
-
-        k->request_entropy(s, req);
+        k->request_entropy(s, size, receive_entropy, opaque);
+    }
+}
+
+void rng_backend_cancel_requests(RngBackend *s)
+{
+    RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);

-        QSIMPLEQ_INSERT_TAIL(&s->requests, req, next);
+    if (k->cancel_requests) {
+        k->cancel_requests(s);
     }
 }
@@ -76,48 +72,14 @@ static void rng_backend_prop_set_opened(Object *obj, bool value, Error **errp)
     s->opened = true;
 }

-static void rng_backend_free_request(RngRequest *req)
-{
-    g_free(req->data);
-    g_free(req);
-}
-
-static void rng_backend_free_requests(RngBackend *s)
-{
-    RngRequest *req, *next;
-
-    QSIMPLEQ_FOREACH_SAFE(req, &s->requests, next, next) {
-        rng_backend_free_request(req);
-    }
-
-    QSIMPLEQ_INIT(&s->requests);
-}
-
-void rng_backend_finalize_request(RngBackend *s, RngRequest *req)
-{
-    QSIMPLEQ_REMOVE(&s->requests, req, RngRequest, next);
-    rng_backend_free_request(req);
-}
-
 static void rng_backend_init(Object *obj)
 {
-    RngBackend *s = RNG_BACKEND(obj);
-
-    QSIMPLEQ_INIT(&s->requests);
-
     object_property_add_bool(obj, "opened",
                              rng_backend_prop_get_opened,
                              rng_backend_prop_set_opened,
                              NULL);
 }

-static void rng_backend_finalize(Object *obj)
-{
-    RngBackend *s = RNG_BACKEND(obj);
-
-    rng_backend_free_requests(s);
-}
-
 static void rng_backend_class_init(ObjectClass *oc, void *data)
 {
     UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -130,7 +92,6 @@ static const TypeInfo rng_backend_info = {
     .parent = TYPE_OBJECT,
     .instance_size = sizeof(RngBackend),
     .instance_init = rng_backend_init,
-    .instance_finalize = rng_backend_finalize,
     .class_size = sizeof(RngBackendClass),
     .class_init = rng_backend_class_init,
     .abstract = true,


@@ -23,7 +23,6 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "sysemu/char.h"
@@ -109,16 +108,13 @@ static void testdev_close(struct CharDriverState *chr)
     g_free(testdev);
 }

-static CharDriverState *chr_testdev_init(const char *id,
-                                         ChardevBackend *backend,
-                                         ChardevReturn *ret,
-                                         Error **errp)
+CharDriverState *chr_testdev_init(void)
 {
     TestdevCharState *testdev;
     CharDriverState *chr;

-    testdev = g_new0(TestdevCharState, 1);
-    testdev->chr = chr = g_new0(CharDriverState, 1);
+    testdev = g_malloc0(sizeof(TestdevCharState));
+    testdev->chr = chr = g_malloc0(sizeof(CharDriverState));

     chr->opaque = testdev;
     chr->chr_write = testdev_write;
@@ -129,8 +125,7 @@ static CharDriverState *chr_testdev_init(const char *id,
 static void register_types(void)
 {
-    register_char_driver("testdev", CHARDEV_BACKEND_KIND_TESTDEV, NULL,
-                         chr_testdev_init);
+    register_char_driver("testdev", CHARDEV_BACKEND_KIND_TESTDEV, NULL);
 }

 type_init(register_types);


@@ -12,9 +12,7 @@
  * Based on backends/rng.c by Anthony Liguori
  */

-#include "qemu/osdep.h"
 #include "sysemu/tpm_backend.h"
-#include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "sysemu/tpm.h"
 #include "qemu/thread.h"


@@ -24,7 +24,6 @@
  * THE SOFTWARE.
  */
 
-#include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "exec/cpu-common.h"
 #include "sysemu/kvm.h"
@@ -37,17 +36,6 @@
 static QEMUBalloonEvent *balloon_event_fn;
 static QEMUBalloonStatus *balloon_stat_fn;
 static void *balloon_opaque;
-static bool balloon_inhibited;
-
-bool qemu_balloon_is_inhibited(void)
-{
-    return balloon_inhibited;
-}
-
-void qemu_balloon_inhibit(bool state)
-{
-    balloon_inhibited = state;
-}
 
 static bool have_balloon(Error **errp)
 {

block.c: 2556 changes (file diff suppressed because it is too large)


@@ -1,15 +1,15 @@
-block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o dmg.o
+block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o
 block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
-block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
+block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
 block-obj-y += quorum.o
-block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o
+block-obj-y += parallels.o blkdebug.o blkverify.o
 block-obj-y += block-backend.o snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
 block-obj-$(CONFIG_POSIX) += raw-posix.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
-block-obj-y += null.o mirror.o commit.o io.o
+block-obj-y += null.o mirror.o io.o
 block-obj-y += throttle-groups.o
 block-obj-y += nbd.o nbd-client.o sheepdog.o
@@ -20,16 +20,13 @@ block-obj-$(CONFIG_RBD) += rbd.o
 block-obj-$(CONFIG_GLUSTERFS) += gluster.o
 block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
 block-obj-$(CONFIG_LIBSSH2) += ssh.o
-block-obj-y += accounting.o dirty-bitmap.o
+block-obj-y += accounting.o
 block-obj-y += write-threshold.o
-block-obj-y += backup.o
-block-obj-$(CONFIG_REPLICATION) += replication.o
-block-obj-y += crypto.o
 
 common-obj-y += stream.o
+common-obj-y += commit.o
+common-obj-y += backup.o
 
-nfs.o-libs := $(LIBNFS_LIBS)
 iscsi.o-cflags := $(LIBISCSI_CFLAGS)
 iscsi.o-libs := $(LIBISCSI_LIBS)
 curl.o-cflags := $(CURL_CFLAGS)
@@ -41,6 +38,7 @@ gluster.o-libs := $(GLUSTERFS_LIBS)
 ssh.o-cflags := $(LIBSSH2_CFLAGS)
 ssh.o-libs := $(LIBSSH2_LIBS)
 archipelago.o-libs := $(ARCHIPELAGO_LIBS)
+block-obj-m += dmg.o
 dmg.o-libs := $(BZIP2_LIBS)
 qcow.o-libs := -lz
 linux-aio.o-libs := -laio


@@ -2,7 +2,6 @@
  * QEMU System Emulator block accounting
  *
  * Copyright (c) 2011 Christoph Hellwig
- * Copyright (c) 2015 Igalia, S.L.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
@@ -23,58 +22,9 @@
  * THE SOFTWARE.
  */
 
-#include "qemu/osdep.h"
 #include "block/accounting.h"
 #include "block/block_int.h"
 #include "qemu/timer.h"
-#include "sysemu/qtest.h"
-
-static QEMUClockType clock_type = QEMU_CLOCK_REALTIME;
-static const int qtest_latency_ns = NANOSECONDS_PER_SECOND / 1000;
-
-void block_acct_init(BlockAcctStats *stats, bool account_invalid,
-                     bool account_failed)
-{
-    stats->account_invalid = account_invalid;
-    stats->account_failed = account_failed;
-
-    if (qtest_enabled()) {
-        clock_type = QEMU_CLOCK_VIRTUAL;
-    }
-}
-
-void block_acct_cleanup(BlockAcctStats *stats)
-{
-    BlockAcctTimedStats *s, *next;
-    QSLIST_FOREACH_SAFE(s, &stats->intervals, entries, next) {
-        g_free(s);
-    }
-}
-
-void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length)
-{
-    BlockAcctTimedStats *s;
-    unsigned i;
-
-    s = g_new0(BlockAcctTimedStats, 1);
-    s->interval_length = interval_length;
-    QSLIST_INSERT_HEAD(&stats->intervals, s, entries);
-
-    for (i = 0; i < BLOCK_MAX_IOTYPE; i++) {
-        timed_average_init(&s->latency[i], clock_type,
-                           (uint64_t) interval_length * NANOSECONDS_PER_SECOND);
-    }
-}
-
-BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats,
-                                              BlockAcctTimedStats *s)
-{
-    if (s == NULL) {
-        return QSLIST_FIRST(&stats->intervals);
-    } else {
-        return QSLIST_NEXT(s, entries);
-    }
-}
 
 void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
                       int64_t bytes, enum BlockAcctType type)
@@ -82,69 +32,26 @@ void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
     assert(type < BLOCK_MAX_IOTYPE);
 
     cookie->bytes = bytes;
-    cookie->start_time_ns = qemu_clock_get_ns(clock_type);
+    cookie->start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     cookie->type = type;
 }
 
 void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
 {
-    BlockAcctTimedStats *s;
-    int64_t time_ns = qemu_clock_get_ns(clock_type);
-    int64_t latency_ns = time_ns - cookie->start_time_ns;
-
-    if (qtest_enabled()) {
-        latency_ns = qtest_latency_ns;
-    }
-
     assert(cookie->type < BLOCK_MAX_IOTYPE);
 
     stats->nr_bytes[cookie->type] += cookie->bytes;
     stats->nr_ops[cookie->type]++;
-    stats->total_time_ns[cookie->type] += latency_ns;
-    stats->last_access_time_ns = time_ns;
-
-    QSLIST_FOREACH(s, &stats->intervals, entries) {
-        timed_average_account(&s->latency[cookie->type], latency_ns);
-    }
+    stats->total_time_ns[cookie->type] +=
+        qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - cookie->start_time_ns;
 }
 
-void block_acct_failed(BlockAcctStats *stats, BlockAcctCookie *cookie)
+void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num,
+                               unsigned int nb_sectors)
 {
-    assert(cookie->type < BLOCK_MAX_IOTYPE);
-
-    stats->failed_ops[cookie->type]++;
-
-    if (stats->account_failed) {
-        BlockAcctTimedStats *s;
-        int64_t time_ns = qemu_clock_get_ns(clock_type);
-        int64_t latency_ns = time_ns - cookie->start_time_ns;
-
-        if (qtest_enabled()) {
-            latency_ns = qtest_latency_ns;
-        }
-
-        stats->total_time_ns[cookie->type] += latency_ns;
-        stats->last_access_time_ns = time_ns;
-
-        QSLIST_FOREACH(s, &stats->intervals, entries) {
-            timed_average_account(&s->latency[cookie->type], latency_ns);
-        }
-    }
-}
-
-void block_acct_invalid(BlockAcctStats *stats, enum BlockAcctType type)
-{
-    assert(type < BLOCK_MAX_IOTYPE);
-
-    /* block_acct_done() and block_acct_failed() update
-     * total_time_ns[], but this one does not. The reason is that
-     * invalid requests are accounted during their submission,
-     * therefore there's no actual I/O involved. */
-    stats->invalid_ops[type]++;
-
-    if (stats->account_invalid) {
-        stats->last_access_time_ns = qemu_clock_get_ns(clock_type);
+    if (stats->wr_highest_sector < sector_num + nb_sectors - 1) {
+        stats->wr_highest_sector = sector_num + nb_sectors - 1;
     }
 }
@@ -154,20 +61,3 @@ void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
     assert(type < BLOCK_MAX_IOTYPE);
     stats->merged[type] += num_requests;
 }
-
-int64_t block_acct_idle_time_ns(BlockAcctStats *stats)
-{
-    return qemu_clock_get_ns(clock_type) - stats->last_access_time_ns;
-}
-
-double block_acct_queue_depth(BlockAcctTimedStats *stats,
-                              enum BlockAcctType type)
-{
-    uint64_t sum, elapsed;
-
-    assert(type < BLOCK_MAX_IOTYPE);
-
-    sum = timed_average_sum(&stats->latency[type], &elapsed);
-
-    return (double) sum / elapsed;
-}


@@ -50,8 +50,7 @@
  *
  */
 
-#include "qemu/osdep.h"
-#include "qemu/cutils.h"
+#include "qemu-common.h"
 #include "block/block_int.h"
 #include "qemu/error-report.h"
 #include "qemu/thread.h"
@@ -60,6 +59,7 @@
 #include "qapi/qmp/qjson.h"
 #include "qemu/atomic.h"
+#include <inttypes.h>
 
 #include <xseg/xseg.h>
 #include <xseg/protocol.h>
@@ -974,9 +974,11 @@ err_exit2:
 static int64_t qemu_archipelago_getlength(BlockDriverState *bs)
 {
+    int64_t ret;
     BDRVArchipelagoState *s = bs->opaque;
 
-    return archipelago_volume_info(s);
+    ret = archipelago_volume_info(s);
+    return ret;
 }
 
 static int qemu_archipelago_truncate(BlockDriverState *bs, int64_t offset)


@@ -11,26 +11,33 @@
  *
  */
 
-#include "qemu/osdep.h"
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
 
 #include "trace.h"
 #include "block/block.h"
 #include "block/block_int.h"
 #include "block/blockjob.h"
-#include "block/block_backup.h"
-#include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/ratelimit.h"
-#include "qemu/cutils.h"
-#include "sysemu/block-backend.h"
-#include "qemu/bitmap.h"
 
-#define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
+#define BACKUP_CLUSTER_BITS 16
+#define BACKUP_CLUSTER_SIZE (1 << BACKUP_CLUSTER_BITS)
+#define BACKUP_SECTORS_PER_CLUSTER (BACKUP_CLUSTER_SIZE / BDRV_SECTOR_SIZE)
+
 #define SLICE_TIME 100000000ULL /* ns */
 
+typedef struct CowRequest {
+    int64_t start;
+    int64_t end;
+    QLIST_ENTRY(CowRequest) list;
+    CoQueue wait_queue; /* coroutines blocked on this request */
+} CowRequest;
+
 typedef struct BackupBlockJob {
     BlockJob common;
-    BlockBackend *target;
+    BlockDriverState *target;
     /* bitmap for sync=incremental */
     BdrvDirtyBitmap *sync_bitmap;
     MirrorSyncMode sync_mode;
@@ -39,19 +46,10 @@ typedef struct BackupBlockJob {
     BlockdevOnError on_target_error;
     CoRwlock flush_rwlock;
     uint64_t sectors_read;
-    unsigned long *done_bitmap;
-    int64_t cluster_size;
-    bool compress;
-    NotifierWithReturn before_write;
+    HBitmap *bitmap;
     QLIST_HEAD(, CowRequest) inflight_reqs;
 } BackupBlockJob;
 
-/* Size of a cluster in sectors, instead of bytes. */
-static inline int64_t cluster_size_sectors(BackupBlockJob *job)
-{
-    return job->cluster_size / BDRV_SECTOR_SIZE;
-}
-
 /* See if in-flight requests overlap and wait for them to complete */
 static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
                                                        int64_t start,
@@ -89,25 +87,23 @@ static void cow_request_end(CowRequest *req)
     qemu_co_queue_restart_all(&req->wait_queue);
 }
 
-static int coroutine_fn backup_do_cow(BackupBlockJob *job,
+static int coroutine_fn backup_do_cow(BlockDriverState *bs,
                                       int64_t sector_num, int nb_sectors,
-                                      bool *error_is_read,
-                                      bool is_write_notifier)
+                                      bool *error_is_read)
 {
-    BlockBackend *blk = job->common.blk;
+    BackupBlockJob *job = (BackupBlockJob *)bs->job;
     CowRequest cow_request;
     struct iovec iov;
     QEMUIOVector bounce_qiov;
     void *bounce_buffer = NULL;
     int ret = 0;
-    int64_t sectors_per_cluster = cluster_size_sectors(job);
    int64_t start, end;
     int n;
 
     qemu_co_rwlock_rdlock(&job->flush_rwlock);
 
-    start = sector_num / sectors_per_cluster;
-    end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
+    start = sector_num / BACKUP_SECTORS_PER_CLUSTER;
+    end = DIV_ROUND_UP(sector_num + nb_sectors, BACKUP_SECTORS_PER_CLUSTER);
 
     trace_backup_do_cow_enter(job, start, sector_num, nb_sectors);
@@ -115,27 +111,26 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
     cow_request_begin(&cow_request, job, start, end);
 
     for (; start < end; start++) {
-        if (test_bit(start, job->done_bitmap)) {
+        if (hbitmap_get(job->bitmap, start)) {
             trace_backup_do_cow_skip(job, start);
             continue; /* already copied */
         }
 
         trace_backup_do_cow_process(job, start);
 
-        n = MIN(sectors_per_cluster,
+        n = MIN(BACKUP_SECTORS_PER_CLUSTER,
                 job->common.len / BDRV_SECTOR_SIZE -
-                start * sectors_per_cluster);
+                start * BACKUP_SECTORS_PER_CLUSTER);
 
         if (!bounce_buffer) {
-            bounce_buffer = blk_blockalign(blk, job->cluster_size);
+            bounce_buffer = qemu_blockalign(bs, BACKUP_CLUSTER_SIZE);
         }
         iov.iov_base = bounce_buffer;
         iov.iov_len = n * BDRV_SECTOR_SIZE;
         qemu_iovec_init_external(&bounce_qiov, &iov, 1);
 
-        ret = blk_co_preadv(blk, start * job->cluster_size,
-                            bounce_qiov.size, &bounce_qiov,
-                            is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
+        ret = bdrv_co_readv(bs, start * BACKUP_SECTORS_PER_CLUSTER, n,
+                            &bounce_qiov);
         if (ret < 0) {
             trace_backup_do_cow_read_fail(job, start, ret);
             if (error_is_read) {
@@ -145,12 +140,13 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
         }
 
         if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
-            ret = blk_co_pwrite_zeroes(job->target, start * job->cluster_size,
-                                       bounce_qiov.size, BDRV_REQ_MAY_UNMAP);
+            ret = bdrv_co_write_zeroes(job->target,
+                                       start * BACKUP_SECTORS_PER_CLUSTER,
+                                       n, BDRV_REQ_MAY_UNMAP);
         } else {
-            ret = blk_co_pwritev(job->target, start * job->cluster_size,
-                                 bounce_qiov.size, &bounce_qiov,
-                                 job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
+            ret = bdrv_co_writev(job->target,
+                                 start * BACKUP_SECTORS_PER_CLUSTER, n,
+                                 &bounce_qiov);
         }
         if (ret < 0) {
             trace_backup_do_cow_write_fail(job, start, ret);
@@ -160,7 +156,7 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
             goto out;
         }
 
-        set_bit(start, job->done_bitmap);
+        hbitmap_set(job->bitmap, start, 1);
 
         /* Publish progress, guest I/O counts as progress too.  Note that the
          * offset field is an opaque progress value, it is not a disk offset.
@@ -187,16 +183,14 @@ static int coroutine_fn backup_before_write_notify(
         NotifierWithReturn *notifier,
         void *opaque)
 {
-    BackupBlockJob *job = container_of(notifier, BackupBlockJob, before_write);
     BdrvTrackedRequest *req = opaque;
     int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
     int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
 
-    assert(req->bs == blk_bs(job->common.blk));
     assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
     assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
 
-    return backup_do_cow(job, sector_num, nb_sectors, NULL, true);
+    return backup_do_cow(req->bs, sector_num, nb_sectors, NULL);
 }
 
 static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -210,114 +204,29 @@ static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
     ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
 }
 
-static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
-{
-    BdrvDirtyBitmap *bm;
-    BlockDriverState *bs = blk_bs(job->common.blk);
-
-    if (ret < 0 || block_job_is_cancelled(&job->common)) {
-        /* Merge the successor back into the parent, delete nothing. */
-        bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
-        assert(bm);
-    } else {
-        /* Everything is fine, delete this bitmap and install the backup. */
-        bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
-        assert(bm);
-    }
-}
-
-static void backup_commit(BlockJob *job)
-{
-    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
-    if (s->sync_bitmap) {
-        backup_cleanup_sync_bitmap(s, 0);
-    }
-}
-
-static void backup_abort(BlockJob *job)
-{
-    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
-    if (s->sync_bitmap) {
-        backup_cleanup_sync_bitmap(s, -1);
-    }
-}
-
-static void backup_attached_aio_context(BlockJob *job, AioContext *aio_context)
+static void backup_iostatus_reset(BlockJob *job)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common);
 
-    blk_set_aio_context(s->target, aio_context);
-}
-
-void backup_do_checkpoint(BlockJob *job, Error **errp)
-{
-    BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
-    int64_t len;
-
-    assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
-
-    if (backup_job->sync_mode != MIRROR_SYNC_MODE_NONE) {
-        error_setg(errp, "The backup job only supports block checkpoint in"
-                   " sync=none mode");
-        return;
-    }
-
-    len = DIV_ROUND_UP(backup_job->common.len, backup_job->cluster_size);
-    bitmap_zero(backup_job->done_bitmap, len);
-}
-
-void backup_wait_for_overlapping_requests(BlockJob *job, int64_t sector_num,
-                                          int nb_sectors)
-{
-    BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
-    int64_t sectors_per_cluster = cluster_size_sectors(backup_job);
-    int64_t start, end;
-
-    assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
-
-    start = sector_num / sectors_per_cluster;
-    end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
-    wait_for_overlapping_requests(backup_job, start, end);
-}
-
-void backup_cow_request_begin(CowRequest *req, BlockJob *job,
-                              int64_t sector_num,
-                              int nb_sectors)
-{
-    BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
-    int64_t sectors_per_cluster = cluster_size_sectors(backup_job);
-    int64_t start, end;
-
-    assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
-
-    start = sector_num / sectors_per_cluster;
-    end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
-    cow_request_begin(req, backup_job, start, end);
-}
-
-void backup_cow_request_end(CowRequest *req)
-{
-    cow_request_end(req);
+    bdrv_iostatus_reset(s->target);
 }
 
 static const BlockJobDriver backup_job_driver = {
     .instance_size  = sizeof(BackupBlockJob),
     .job_type       = BLOCK_JOB_TYPE_BACKUP,
     .set_speed      = backup_set_speed,
-    .commit         = backup_commit,
-    .abort          = backup_abort,
-    .attached_aio_context = backup_attached_aio_context,
+    .iostatus_reset = backup_iostatus_reset,
 };
 
 static BlockErrorAction backup_error_action(BackupBlockJob *job,
                                             bool read, int error)
 {
     if (read) {
-        return block_job_error_action(&job->common, job->on_source_error,
-                                      true, error);
+        return block_job_error_action(&job->common, job->common.bs,
+                                      job->on_source_error, true, error);
     } else {
-        return block_job_error_action(&job->common, job->on_target_error,
-                                      false, error);
+        return block_job_error_action(&job->common, job->target,
+                                      job->on_target_error, false, error);
     }
 }
@@ -330,7 +239,7 @@ static void backup_complete(BlockJob *job, void *opaque)
     BackupBlockJob *s = container_of(job, BackupBlockJob, common);
     BackupCompleteData *data = opaque;
 
-    blk_unref(s->target);
+    bdrv_unref(s->target);
 
     block_job_completed(job, data->ret);
     g_free(data);
@@ -371,21 +280,21 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
     int64_t cluster;
     int64_t end;
     int64_t last_cluster = -1;
-    int64_t sectors_per_cluster = cluster_size_sectors(job);
+    BlockDriverState *bs = job->common.bs;
     HBitmapIter hbi;
 
     granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
-    clusters_per_iter = MAX((granularity / job->cluster_size), 1);
+    clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
     bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
 
     /* Find the next dirty sector(s) */
     while ((sector = hbitmap_iter_next(&hbi)) != -1) {
-        cluster = sector / sectors_per_cluster;
+        cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
 
         /* Fake progress updates for any clusters we skipped */
         if (cluster != last_cluster + 1) {
             job->common.offset += ((cluster - last_cluster - 1) *
-                                   job->cluster_size);
+                                   BACKUP_CLUSTER_SIZE);
         }
 
         for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
@@ -393,9 +302,8 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
             if (yield_and_check(job)) {
                 return ret;
             }
-            ret = backup_do_cow(job, cluster * sectors_per_cluster,
-                                sectors_per_cluster, &error_is_read,
-                                false);
+            ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
+                                BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
             if ((ret < 0) &&
                 backup_error_action(job, error_is_read, -ret) ==
                 BLOCK_ERROR_ACTION_REPORT) {
@@ -406,17 +314,17 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
         /* If the bitmap granularity is smaller than the backup granularity,
          * we need to advance the iterator pointer to the next cluster. */
-        if (granularity < job->cluster_size) {
-            bdrv_set_dirty_iter(&hbi, cluster * sectors_per_cluster);
+        if (granularity < BACKUP_CLUSTER_SIZE) {
+            bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
         }
 
         last_cluster = cluster - 1;
     }
 
     /* Play some final catchup with the progress meter */
-    end = DIV_ROUND_UP(job->common.len, job->cluster_size);
+    end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
     if (last_cluster + 1 < end) {
-        job->common.offset += ((end - last_cluster - 1) * job->cluster_size);
+        job->common.offset += ((end - last_cluster - 1) * BACKUP_CLUSTER_SIZE);
     }
 
     return ret;
@@ -426,28 +334,36 @@ static void coroutine_fn backup_run(void *opaque)
 {
     BackupBlockJob *job = opaque;
     BackupCompleteData *data;
-    BlockDriverState *bs = blk_bs(job->common.blk);
-    BlockBackend *target = job->target;
+    BlockDriverState *bs = job->common.bs;
+    BlockDriverState *target = job->target;
+    BlockdevOnError on_target_error = job->on_target_error;
+    NotifierWithReturn before_write = {
+        .notify = backup_before_write_notify,
+    };
     int64_t start, end;
-    int64_t sectors_per_cluster = cluster_size_sectors(job);
     int ret = 0;
 
     QLIST_INIT(&job->inflight_reqs);
     qemu_co_rwlock_init(&job->flush_rwlock);
 
     start = 0;
-    end = DIV_ROUND_UP(job->common.len, job->cluster_size);
+    end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
 
-    job->done_bitmap = bitmap_new(end);
+    job->bitmap = hbitmap_alloc(end, 0);
 
-    job->before_write.notify = backup_before_write_notify;
-    bdrv_add_before_write_notifier(bs, &job->before_write);
+    bdrv_set_enable_write_cache(target, true);
+    bdrv_set_on_error(target, on_target_error, on_target_error);
+    bdrv_iostatus_enable(target);
+
+    bdrv_add_before_write_notifier(bs, &before_write);
 
     if (job->sync_mode == MIRROR_SYNC_MODE_NONE) {
         while (!block_job_is_cancelled(&job->common)) {
             /* Yield until the job is cancelled.  We just let our before_write
              * notify callback service CoW requests. */
-            block_job_yield(&job->common);
+            job->common.busy = false;
+            qemu_coroutine_yield();
+            job->common.busy = true;
         }
     } else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
         ret = backup_run_incremental(job);
@@ -466,7 +382,7 @@ static void coroutine_fn backup_run(void *opaque)
             /* Check to see if these blocks are already in the
              * backing file. */
 
-            for (i = 0; i < sectors_per_cluster;) {
+            for (i = 0; i < BACKUP_SECTORS_PER_CLUSTER;) {
                 /* bdrv_is_allocated() only returns true/false based
                  * on the first set of sectors it comes across that
                  * are are all in the same state.
@@ -475,8 +391,8 @@ static void coroutine_fn backup_run(void *opaque)
                  * needed but at some point that is always the case. */
                 alloced =
                     bdrv_is_allocated(bs,
-                            start * sectors_per_cluster + i,
-                            sectors_per_cluster - i, &n);
+                            start * BACKUP_SECTORS_PER_CLUSTER + i,
+                            BACKUP_SECTORS_PER_CLUSTER - i, &n);
                 i += n;
 
                 if (alloced == 1 || n == 0) {
@@ -491,8 +407,8 @@ static void coroutine_fn backup_run(void *opaque)
             }
         }
         /* FULL sync mode we copy the whole drive. */
-        ret = backup_do_cow(job, start * sectors_per_cluster,
-                            sectors_per_cluster, &error_is_read, false);
+        ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER,
+                            BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
         if (ret < 0) {
             /* Depending on error action, fail now or retry cluster */
             BlockErrorAction action =
@@ -507,42 +423,60 @@ static void coroutine_fn backup_run(void *opaque)
         }
     }
 
-    notifier_with_return_remove(&job->before_write);
+    notifier_with_return_remove(&before_write);
 
     /* wait until pending backup_do_cow() calls have completed */
     qemu_co_rwlock_wrlock(&job->flush_rwlock);
     qemu_co_rwlock_unlock(&job->flush_rwlock);
-    g_free(job->done_bitmap);
 
-    bdrv_op_unblock_all(blk_bs(target), job->common.blocker);
+    if (job->sync_bitmap) {
+        BdrvDirtyBitmap *bm;
+        if (ret < 0 || block_job_is_cancelled(&job->common)) {
+            /* Merge the successor back into the parent, delete nothing. */
+            bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
+            assert(bm);
+        } else {
+            /* Everything is fine, delete this bitmap and install the backup. */
+            bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
+            assert(bm);
+        }
+    }
+    hbitmap_free(job->bitmap);
+
+    bdrv_iostatus_disable(target);
+    bdrv_op_unblock_all(target, job->common.blocker);
 
     data = g_malloc(sizeof(*data));
     data->ret = ret;
     block_job_defer_to_main_loop(&job->common, backup_complete, data);
 }
 
-void backup_start(const char *job_id, BlockDriverState *bs,
-                  BlockDriverState *target, int64_t speed,
-                  MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
-                  bool compress,
+void backup_start(BlockDriverState *bs, BlockDriverState *target,
+                  int64_t speed, MirrorSyncMode sync_mode,
+                  BdrvDirtyBitmap *sync_bitmap,
                   BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
                   BlockCompletionFunc *cb, void *opaque,
-                  BlockJobTxn *txn, Error **errp)
+                  Error **errp)
 {
     int64_t len;
-    BlockDriverInfo bdi;
-    BackupBlockJob *job = NULL;
-    int ret;
 
     assert(bs);
     assert(target);
+    assert(cb);
 
     if (bs == target) {
         error_setg(errp, "Source and target cannot be the same");
         return;
     }
 
+    if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
+         on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
+        !bdrv_iostatus_is_enabled(bs)) {
+        error_setg(errp, QERR_INVALID_PARAMETER, "on-source-error");
+        return;
+    }
+
     if (!bdrv_is_inserted(bs)) {
         error_setg(errp, "Device is not inserted: %s",
                    bdrv_get_device_name(bs));
@@ -555,12 +489,6 @@ void backup_start(const char *job_id, BlockDriverState *bs,
         return;
     }
 
-    if (compress && target->drv->bdrv_co_pwritev_compressed == NULL) {
-        error_setg(errp, "Compression is not supported for this drive %s",
-                   bdrv_get_device_name(target));
-        return;
-    }
-
     if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
         return;
     }
@@ -595,53 +523,27 @@ void backup_start(const char *job_id, BlockDriverState *bs,
         goto error;
     }
 
-    job = block_job_create(job_id, &backup_job_driver, bs, speed,
-                           cb, opaque, errp);
+    BackupBlockJob *job = block_job_create(&backup_job_driver, bs, speed,
+                                           cb, opaque, errp);
     if (!job) {
         goto error;
     }
 
-    job->target = blk_new();
-    blk_insert_bs(job->target, target);
+    bdrv_op_block_all(target, job->common.blocker);
 
     job->on_source_error = on_source_error;
     job->on_target_error = on_target_error;
+    job->target = target;
     job->sync_mode = sync_mode;
     job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
                        sync_bitmap : NULL;
-    job->compress = compress;
-
-    /* If there is no backing file on the target, we cannot rely on COW if our
-     * backup cluster size is smaller than the target cluster size. Even for
-     * targets with a backing file, try to avoid COW if possible. */
-    ret = bdrv_get_info(target, &bdi);
-    if (ret < 0 && !target->backing) {
-        error_setg_errno(errp, -ret,
-            "Couldn't determine the cluster size of the target image, "
-            "which has no backing file");
-        error_append_hint(errp,
-            "Aborting, since this may create an unusable destination image\n");
-        goto error;
-    } else if (ret < 0 && target->backing) {
-        /* Not fatal; just trudge on ahead. */
-        job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
-    } else {
-        job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
-    }
-
-    bdrv_op_block_all(target, job->common.blocker);
 
     job->common.len = len;
-    job->common.co = qemu_coroutine_create(backup_run, job);
-    block_job_txn_add_job(txn, &job->common);
-    qemu_coroutine_enter(job->common.co);
+    job->common.co = qemu_coroutine_create(backup_run);
+    qemu_coroutine_enter(job->common.co, job);
     return;
 
 error:
     if (sync_bitmap) {
         bdrv_reclaim_dirty_bitmap(bs, sync_bitmap, NULL);
     }
-    if (job) {
-        blk_unref(job->target);
-        block_job_unref(&job->common);
-    }
 }


@@ -22,9 +22,7 @@
  * THE SOFTWARE.
  */

-#include "qemu/osdep.h"
-#include "qapi/error.h"
-#include "qemu/cutils.h"
+#include "qemu-common.h"
 #include "qemu/config-file.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
@@ -32,17 +30,12 @@
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qint.h"
 #include "qapi/qmp/qstring.h"
-#include "sysemu/qtest.h"

 typedef struct BDRVBlkdebugState {
     int state;
     int new_state;
-    int align;
-
-    /* For blkdebug_refresh_filename() */
-    char *config_file;
-
-    QLIST_HEAD(, BlkdebugRule) rules[BLKDBG__MAX];
+    QLIST_HEAD(, BlkdebugRule) rules[BLKDBG_EVENT_MAX];
     QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
     QLIST_HEAD(, BlkdebugSuspendedReq) suspended_reqs;
 } BDRVBlkdebugState;
@@ -70,7 +63,7 @@
 };

 typedef struct BlkdebugRule {
-    BlkdebugEvent event;
+    BlkDebugEvent event;
     int action;
     int state;
     union {
@@ -149,12 +142,69 @@ static QemuOptsList *config_groups[] = {
     NULL
 };

-static int get_event_by_name(const char *name, BlkdebugEvent *event)
+static const char *event_names[BLKDBG_EVENT_MAX] = {
+    [BLKDBG_L1_UPDATE]                      = "l1_update",
+    [BLKDBG_L1_GROW_ALLOC_TABLE]            = "l1_grow.alloc_table",
+    [BLKDBG_L1_GROW_WRITE_TABLE]            = "l1_grow.write_table",
+    [BLKDBG_L1_GROW_ACTIVATE_TABLE]         = "l1_grow.activate_table",
+    [BLKDBG_L2_LOAD]                        = "l2_load",
+    [BLKDBG_L2_UPDATE]                      = "l2_update",
+    [BLKDBG_L2_UPDATE_COMPRESSED]           = "l2_update_compressed",
+    [BLKDBG_L2_ALLOC_COW_READ]              = "l2_alloc.cow_read",
+    [BLKDBG_L2_ALLOC_WRITE]                 = "l2_alloc.write",
+    [BLKDBG_READ_AIO]                       = "read_aio",
+    [BLKDBG_READ_BACKING_AIO]               = "read_backing_aio",
+    [BLKDBG_READ_COMPRESSED]                = "read_compressed",
+    [BLKDBG_WRITE_AIO]                      = "write_aio",
+    [BLKDBG_WRITE_COMPRESSED]               = "write_compressed",
+    [BLKDBG_VMSTATE_LOAD]                   = "vmstate_load",
+    [BLKDBG_VMSTATE_SAVE]                   = "vmstate_save",
+    [BLKDBG_COW_READ]                       = "cow_read",
+    [BLKDBG_COW_WRITE]                      = "cow_write",
+    [BLKDBG_REFTABLE_LOAD]                  = "reftable_load",
+    [BLKDBG_REFTABLE_GROW]                  = "reftable_grow",
+    [BLKDBG_REFTABLE_UPDATE]                = "reftable_update",
+    [BLKDBG_REFBLOCK_LOAD]                  = "refblock_load",
+    [BLKDBG_REFBLOCK_UPDATE]                = "refblock_update",
+    [BLKDBG_REFBLOCK_UPDATE_PART]           = "refblock_update_part",
+    [BLKDBG_REFBLOCK_ALLOC]                 = "refblock_alloc",
+    [BLKDBG_REFBLOCK_ALLOC_HOOKUP]          = "refblock_alloc.hookup",
+    [BLKDBG_REFBLOCK_ALLOC_WRITE]           = "refblock_alloc.write",
+    [BLKDBG_REFBLOCK_ALLOC_WRITE_BLOCKS]    = "refblock_alloc.write_blocks",
+    [BLKDBG_REFBLOCK_ALLOC_WRITE_TABLE]     = "refblock_alloc.write_table",
+    [BLKDBG_REFBLOCK_ALLOC_SWITCH_TABLE]    = "refblock_alloc.switch_table",
+    [BLKDBG_CLUSTER_ALLOC]                  = "cluster_alloc",
+    [BLKDBG_CLUSTER_ALLOC_BYTES]            = "cluster_alloc_bytes",
+    [BLKDBG_CLUSTER_FREE]                   = "cluster_free",
+    [BLKDBG_FLUSH_TO_OS]                    = "flush_to_os",
+    [BLKDBG_FLUSH_TO_DISK]                  = "flush_to_disk",
+    [BLKDBG_PWRITEV_RMW_HEAD]               = "pwritev_rmw.head",
+    [BLKDBG_PWRITEV_RMW_AFTER_HEAD]         = "pwritev_rmw.after_head",
+    [BLKDBG_PWRITEV_RMW_TAIL]               = "pwritev_rmw.tail",
+    [BLKDBG_PWRITEV_RMW_AFTER_TAIL]         = "pwritev_rmw.after_tail",
+    [BLKDBG_PWRITEV]                        = "pwritev",
+    [BLKDBG_PWRITEV_ZERO]                   = "pwritev_zero",
+    [BLKDBG_PWRITEV_DONE]                   = "pwritev_done",
+    [BLKDBG_EMPTY_IMAGE_PREPARE]            = "empty_image_prepare",
+};
+
+static int get_event_by_name(const char *name, BlkDebugEvent *event)
 {
     int i;

-    for (i = 0; i < BLKDBG__MAX; i++) {
-        if (!strcmp(BlkdebugEvent_lookup[i], name)) {
+    for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
+        if (!strcmp(event_names[i], name)) {
             *event = i;
             return 0;
         }
@@ -173,7 +223,7 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
     struct add_rule_data *d = opaque;
     BDRVBlkdebugState *s = d->s;
     const char* event_name;
-    BlkdebugEvent event;
+    BlkDebugEvent event;
     struct BlkdebugRule *rule;

     /* Find the right event for the rule */
@@ -354,6 +404,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     BDRVBlkdebugState *s = bs->opaque;
     QemuOpts *opts;
     Error *local_err = NULL;
+    const char *config;
     uint64_t align;
     int ret;
@@ -366,8 +417,8 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     }

     /* Read rules from config file or command line options */
-    s->config_file = g_strdup(qemu_opt_get(opts, "config"));
-    ret = read_config(s, s->config_file, options, errp);
+    config = qemu_opt_get(opts, "config");
+    ret = read_config(s, config, options, errp);
     if (ret) {
         goto out;
     }
@@ -375,20 +426,20 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     /* Set initial state */
     s->state = 1;

-    /* Open the image file */
-    bs->file = bdrv_open_child(qemu_opt_get(opts, "x-image"), options, "image",
-                               bs, &child_file, false, &local_err);
-    if (local_err) {
-        ret = -EINVAL;
+    /* Open the backing file */
+    assert(bs->file == NULL);
+    ret = bdrv_open_image(&bs->file, qemu_opt_get(opts, "x-image"), options,
+                          "image", bs, &child_file, false, &local_err);
+    if (ret < 0) {
         error_propagate(errp, local_err);
         goto out;
     }

     /* Set request alignment */
-    align = qemu_opt_get_size(opts, "align", 0);
-    if (align < INT_MAX && is_power_of_2(align)) {
-        s->align = align;
-    } else if (align) {
+    align = qemu_opt_get_size(opts, "align", bs->request_alignment);
+    if (align > 0 && align < INT_MAX && !(align & (align - 1))) {
+        bs->request_alignment = align;
+    } else {
         error_setg(errp, "Invalid alignment");
         ret = -EINVAL;
         goto fail_unref;
@@ -398,11 +449,8 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
     goto out;

 fail_unref:
-    bdrv_unref_child(bs, bs->file);
+    bdrv_unref(bs->file);
 out:
-    if (ret < 0) {
-        g_free(s->config_file);
-    }
     qemu_opts_del(opts);
     return ret;
 }
@@ -462,8 +510,7 @@ static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
         return inject_error(bs, cb, opaque, rule);
     }

-    return bdrv_aio_readv(bs->file, sector_num, qiov, nb_sectors,
-                          cb, opaque);
+    return bdrv_aio_readv(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
 }

 static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
@@ -485,8 +532,7 @@ static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
         return inject_error(bs, cb, opaque, rule);
     }

-    return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors,
-                           cb, opaque);
+    return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
 }

 static BlockAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
@@ -505,7 +551,7 @@ static BlockAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
         return inject_error(bs, cb, opaque, rule);
     }

-    return bdrv_aio_flush(bs->file->bs, cb, opaque);
+    return bdrv_aio_flush(bs->file, cb, opaque);
 }

@@ -515,13 +561,11 @@ static void blkdebug_close(BlockDriverState *bs)
     BlkdebugRule *rule, *next;
     int i;

-    for (i = 0; i < BLKDBG__MAX; i++) {
+    for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
         QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
             remove_rule(rule);
         }
     }
-
-    g_free(s->config_file);
 }

 static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
@@ -537,13 +581,9 @@ static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
     remove_rule(rule);
     QLIST_INSERT_HEAD(&s->suspended_reqs, &r, next);

-    if (!qtest_enabled()) {
-        printf("blkdebug: Suspended request '%s'\n", r.tag);
-    }
+    printf("blkdebug: Suspended request '%s'\n", r.tag);
     qemu_coroutine_yield();
-    if (!qtest_enabled()) {
-        printf("blkdebug: Resuming request '%s'\n", r.tag);
-    }
+    printf("blkdebug: Resuming request '%s'\n", r.tag);

     QLIST_REMOVE(&r, next);
     g_free(r.tag);
@@ -580,13 +620,13 @@ static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
     return injected;
 }

-static void blkdebug_debug_event(BlockDriverState *bs, BlkdebugEvent event)
+static void blkdebug_debug_event(BlockDriverState *bs, BlkDebugEvent event)
 {
     BDRVBlkdebugState *s = bs->opaque;
     struct BlkdebugRule *rule, *next;
     bool injected;

-    assert((int)event >= 0 && event < BLKDBG__MAX);
+    assert((int)event >= 0 && event < BLKDBG_EVENT_MAX);

     injected = false;
     s->new_state = s->state;
@@ -601,7 +641,7 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
 {
     BDRVBlkdebugState *s = bs->opaque;
     struct BlkdebugRule *rule;
-    BlkdebugEvent blkdebug_event;
+    BlkDebugEvent blkdebug_event;

     if (get_event_by_name(event, &blkdebug_event) < 0) {
         return -ENOENT;
@@ -628,7 +668,7 @@ static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)

     QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, next) {
         if (!strcmp(r->tag, tag)) {
-            qemu_coroutine_enter(r->co);
+            qemu_coroutine_enter(r->co, NULL);
             return 0;
         }
     }
@@ -643,7 +683,7 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
     BlkdebugRule *rule, *next;
     int i, ret = -ENOENT;

-    for (i = 0; i < BLKDBG__MAX; i++) {
+    for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
         QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
             if (rule->action == ACTION_SUSPEND &&
                 !strcmp(rule->options.suspend.tag, tag)) {
@@ -654,7 +694,7 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
     }
     QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, r_next) {
         if (!strcmp(r->tag, tag)) {
-            qemu_coroutine_enter(r->co);
+            qemu_coroutine_enter(r->co, NULL);
             ret = 0;
         }
     }
@@ -676,50 +716,55 @@ static bool blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)

 static int64_t blkdebug_getlength(BlockDriverState *bs)
 {
-    return bdrv_getlength(bs->file->bs);
+    return bdrv_getlength(bs->file);
 }

 static int blkdebug_truncate(BlockDriverState *bs, int64_t offset)
 {
-    return bdrv_truncate(bs->file->bs, offset);
+    return bdrv_truncate(bs->file, offset);
 }

-static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
+static void blkdebug_refresh_filename(BlockDriverState *bs)
 {
-    BDRVBlkdebugState *s = bs->opaque;
     QDict *opts;
     const QDictEntry *e;
     bool force_json = false;

-    for (e = qdict_first(options); e; e = qdict_next(options, e)) {
+    for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
         if (strcmp(qdict_entry_key(e), "config") &&
-            strcmp(qdict_entry_key(e), "x-image"))
+            strcmp(qdict_entry_key(e), "x-image") &&
+            strcmp(qdict_entry_key(e), "image") &&
+            strncmp(qdict_entry_key(e), "image.", strlen("image.")))
         {
             force_json = true;
             break;
         }
     }

-    if (force_json && !bs->file->bs->full_open_options) {
+    if (force_json && !bs->file->full_open_options) {
         /* The config file cannot be recreated, so creating a plain filename
          * is impossible */
         return;
     }

-    if (!force_json && bs->file->bs->exact_filename[0]) {
+    if (!force_json && bs->file->exact_filename[0]) {
         snprintf(bs->exact_filename, sizeof(bs->exact_filename),
-                 "blkdebug:%s:%s", s->config_file ?: "",
-                 bs->file->bs->exact_filename);
+                 "blkdebug:%s:%s",
+                 qdict_get_try_str(bs->options, "config") ?: "",
+                 bs->file->exact_filename);
     }

     opts = qdict_new();
     qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));

-    QINCREF(bs->file->bs->full_open_options);
-    qdict_put_obj(opts, "image", QOBJECT(bs->file->bs->full_open_options));
+    QINCREF(bs->file->full_open_options);
+    qdict_put_obj(opts, "image", QOBJECT(bs->file->full_open_options));

-    for (e = qdict_first(options); e; e = qdict_next(options, e)) {
-        if (strcmp(qdict_entry_key(e), "x-image")) {
+    for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
+        if (strcmp(qdict_entry_key(e), "x-image") &&
+            strcmp(qdict_entry_key(e), "image") &&
+            strncmp(qdict_entry_key(e), "image.", strlen("image.")))
+        {
             qobject_incref(qdict_entry_value(e));
             qdict_put_obj(opts, qdict_entry_key(e), qdict_entry_value(e));
         }
@@ -728,21 +773,6 @@ static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
     bs->full_open_options = opts;
 }

-static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
-{
-    BDRVBlkdebugState *s = bs->opaque;
-
-    if (s->align) {
-        bs->bl.request_alignment = s->align;
-    }
-}
-
-static int blkdebug_reopen_prepare(BDRVReopenState *reopen_state,
-                                   BlockReopenQueue *queue, Error **errp)
-{
-    return 0;
-}
-
 static BlockDriver bdrv_blkdebug = {
     .format_name            = "blkdebug",
     .protocol_name          = "blkdebug",
@@ -751,11 +781,9 @@ static BlockDriver bdrv_blkdebug = {
     .bdrv_parse_filename    = blkdebug_parse_filename,
     .bdrv_file_open         = blkdebug_open,
     .bdrv_close             = blkdebug_close,
-    .bdrv_reopen_prepare    = blkdebug_reopen_prepare,
     .bdrv_getlength         = blkdebug_getlength,
     .bdrv_truncate          = blkdebug_truncate,
     .bdrv_refresh_filename  = blkdebug_refresh_filename,
-    .bdrv_refresh_limits    = blkdebug_refresh_limits,

     .bdrv_aio_readv         = blkdebug_aio_readv,
     .bdrv_aio_writev        = blkdebug_aio_writev,


@@ -1,160 +0,0 @@
/*
* Block protocol for record/replay
*
* Copyright (c) 2010-2016 Institute for System Programming
* of the Russian Academy of Sciences.
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "sysemu/replay.h"
#include "qapi/error.h"
typedef struct Request {
Coroutine *co;
QEMUBH *bh;
} Request;
/* Next request id.
This counter is global, because requests from different
block devices should not get overlapping ids. */
static uint64_t request_id;
static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
Error *local_err = NULL;
int ret;
/* Open the image file */
bs->file = bdrv_open_child(NULL, options, "image",
bs, &child_file, false, &local_err);
if (local_err) {
ret = -EINVAL;
error_propagate(errp, local_err);
goto fail;
}
ret = 0;
fail:
if (ret < 0) {
bdrv_unref_child(bs, bs->file);
}
return ret;
}
static void blkreplay_close(BlockDriverState *bs)
{
}
static int64_t blkreplay_getlength(BlockDriverState *bs)
{
return bdrv_getlength(bs->file->bs);
}
/* This bh is used for synchronization of return from coroutines.
It continues yielded coroutine which then finishes its execution.
BH is called adjusted to some replay checkpoint, therefore
record and replay will always finish coroutines deterministically.
*/
static void blkreplay_bh_cb(void *opaque)
{
Request *req = opaque;
qemu_coroutine_enter(req->co);
qemu_bh_delete(req->bh);
g_free(req);
}
static void block_request_create(uint64_t reqid, BlockDriverState *bs,
Coroutine *co)
{
Request *req = g_new(Request, 1);
*req = (Request) {
.co = co,
.bh = aio_bh_new(bdrv_get_aio_context(bs), blkreplay_bh_cb, req),
};
replay_block_event(req->bh, reqid);
}
static int coroutine_fn blkreplay_co_preadv(BlockDriverState *bs,
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
{
uint64_t reqid = request_id++;
int ret = bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
return ret;
}
static int coroutine_fn blkreplay_co_pwritev(BlockDriverState *bs,
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
{
uint64_t reqid = request_id++;
int ret = bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
return ret;
}
static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count, BdrvRequestFlags flags)
{
uint64_t reqid = request_id++;
int ret = bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
return ret;
}
static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs,
int64_t offset, int count)
{
uint64_t reqid = request_id++;
int ret = bdrv_co_pdiscard(bs->file->bs, offset, count);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
return ret;
}
static int coroutine_fn blkreplay_co_flush(BlockDriverState *bs)
{
uint64_t reqid = request_id++;
int ret = bdrv_co_flush(bs->file->bs);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
return ret;
}
static BlockDriver bdrv_blkreplay = {
.format_name = "blkreplay",
.protocol_name = "blkreplay",
.instance_size = 0,
.bdrv_file_open = blkreplay_open,
.bdrv_close = blkreplay_close,
.bdrv_getlength = blkreplay_getlength,
.bdrv_co_preadv = blkreplay_co_preadv,
.bdrv_co_pwritev = blkreplay_co_pwritev,
.bdrv_co_pwrite_zeroes = blkreplay_co_pwrite_zeroes,
.bdrv_co_pdiscard = blkreplay_co_pdiscard,
.bdrv_co_flush = blkreplay_co_flush,
};
static void bdrv_blkreplay_init(void)
{
bdrv_register(&bdrv_blkreplay);
}
block_init(bdrv_blkreplay_init);


@@ -7,16 +7,14 @@
  * See the COPYING file in the top-level directory.
  */

-#include "qemu/osdep.h"
-#include "qapi/error.h"
+#include <stdarg.h>
 #include "qemu/sockets.h" /* for EINPROGRESS on Windows */
 #include "block/block_int.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qstring.h"
-#include "qemu/cutils.h"

 typedef struct {
-    BdrvChild *test_file;
+    BlockDriverState *test_file;
 } BDRVBlkverifyState;

 typedef struct BlkverifyAIOCB BlkverifyAIOCB;
@@ -125,29 +123,26 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
     }

     /* Open the raw file */
-    bs->file = bdrv_open_child(qemu_opt_get(opts, "x-raw"), options, "raw",
-                               bs, &child_file, false, &local_err);
-    if (local_err) {
-        ret = -EINVAL;
+    assert(bs->file == NULL);
+    ret = bdrv_open_image(&bs->file, qemu_opt_get(opts, "x-raw"), options,
+                          "raw", bs, &child_file, false, &local_err);
+    if (ret < 0) {
         error_propagate(errp, local_err);
         goto fail;
     }

     /* Open the test file */
-    s->test_file = bdrv_open_child(qemu_opt_get(opts, "x-image"), options,
-                                   "test", bs, &child_format, false,
-                                   &local_err);
-    if (local_err) {
-        ret = -EINVAL;
+    assert(s->test_file == NULL);
+    ret = bdrv_open_image(&s->test_file, qemu_opt_get(opts, "x-image"), options,
+                          "test", bs, &child_format, false, &local_err);
+    if (ret < 0) {
         error_propagate(errp, local_err);
-        s->test_file = NULL;
         goto fail;
     }

     ret = 0;
 fail:
-    if (ret < 0) {
-        bdrv_unref_child(bs, bs->file);
-    }
     qemu_opts_del(opts);
     return ret;
 }
@@ -156,7 +151,7 @@ static void blkverify_close(BlockDriverState *bs)
 {
     BDRVBlkverifyState *s = bs->opaque;

-    bdrv_unref_child(bs, s->test_file);
+    bdrv_unref(s->test_file);
     s->test_file = NULL;
 }

@@ -164,7 +159,7 @@ static int64_t blkverify_getlength(BlockDriverState *bs)
 {
     BDRVBlkverifyState *s = bs->opaque;

-    return bdrv_getlength(s->test_file->bs);
+    return bdrv_getlength(s->test_file);
 }

 static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
@@ -243,7 +238,7 @@ static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
                                       nb_sectors, cb, opaque);

     acb->verify = blkverify_verify_readv;
-    acb->buf = qemu_blockalign(bs->file->bs, qiov->size);
+    acb->buf = qemu_blockalign(bs->file, qiov->size);

     qemu_iovec_init(&acb->raw_qiov, acb->qiov->niov);
     qemu_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
@@ -276,7 +271,7 @@ static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
     BDRVBlkverifyState *s = bs->opaque;

     /* Only flush test file, the raw file is not important */
-    return bdrv_aio_flush(s->test_file->bs, cb, opaque);
+    return bdrv_aio_flush(s->test_file, cb, opaque);
 }

 static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
@@ -284,44 +279,54 @@ static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
 {
     BDRVBlkverifyState *s = bs->opaque;

-    bool perm = bdrv_recurse_is_first_non_filter(bs->file->bs, candidate);
+    bool perm = bdrv_recurse_is_first_non_filter(bs->file, candidate);

     if (perm) {
         return true;
     }

-    return bdrv_recurse_is_first_non_filter(s->test_file->bs, candidate);
+    return bdrv_recurse_is_first_non_filter(s->test_file, candidate);
 }

-static void blkverify_refresh_filename(BlockDriverState *bs, QDict *options)
+/* Propagate AioContext changes to ->test_file */
+static void blkverify_detach_aio_context(BlockDriverState *bs)
+{
+    BDRVBlkverifyState *s = bs->opaque;
+
+    bdrv_detach_aio_context(s->test_file);
+}
+
+static void blkverify_attach_aio_context(BlockDriverState *bs,
+                                         AioContext *new_context)
+{
+    BDRVBlkverifyState *s = bs->opaque;
+
+    bdrv_attach_aio_context(s->test_file, new_context);
+}
+
+static void blkverify_refresh_filename(BlockDriverState *bs)
 {
     BDRVBlkverifyState *s = bs->opaque;

-    /* bs->file->bs has already been refreshed */
-    bdrv_refresh_filename(s->test_file->bs);
+    /* bs->file has already been refreshed */
+    bdrv_refresh_filename(s->test_file);

-    if (bs->file->bs->full_open_options
-        && s->test_file->bs->full_open_options)
-    {
+    if (bs->file->full_open_options && s->test_file->full_open_options) {
         QDict *opts = qdict_new();
         qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));

-        QINCREF(bs->file->bs->full_open_options);
-        qdict_put_obj(opts, "raw", QOBJECT(bs->file->bs->full_open_options));
-        QINCREF(s->test_file->bs->full_open_options);
-        qdict_put_obj(opts, "test",
-                      QOBJECT(s->test_file->bs->full_open_options));
+        QINCREF(bs->file->full_open_options);
+        qdict_put_obj(opts, "raw", QOBJECT(bs->file->full_open_options));
+        QINCREF(s->test_file->full_open_options);
+        qdict_put_obj(opts, "test", QOBJECT(s->test_file->full_open_options));

         bs->full_open_options = opts;
     }

-    if (bs->file->bs->exact_filename[0]
-        && s->test_file->bs->exact_filename[0])
-    {
+    if (bs->file->exact_filename[0] && s->test_file->exact_filename[0]) {
         snprintf(bs->exact_filename, sizeof(bs->exact_filename),
                  "blkverify:%s:%s",
-                 bs->file->bs->exact_filename,
-                 s->test_file->bs->exact_filename);
+                 bs->file->exact_filename, s->test_file->exact_filename);
     }
 }
@@ -340,6 +345,9 @@ static BlockDriver bdrv_blkverify = {
     .bdrv_aio_writev                  = blkverify_aio_writev,
     .bdrv_aio_flush                   = blkverify_aio_flush,

+    .bdrv_attach_aio_context          = blkverify_attach_aio_context,
+    .bdrv_detach_aio_context          = blkverify_detach_aio_context,
+
     .is_filter                        = true,
     .bdrv_recurse_is_first_non_filter = blkverify_recurse_is_first_non_filter,
 };



@@ -22,12 +22,9 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */

-#include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "qemu-common.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
-#include "qemu/bswap.h"

 /**************************************************************/
@@ -104,7 +101,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
     struct bochs_header bochs;
     int ret;

-    bs->read_only = true; /* no write support yet */
+    bs->read_only = 1; // no write support yet

     ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
     if (ret < 0) {
@@ -188,11 +185,6 @@ fail:
     return ret;
 }

-static void bochs_refresh_limits(BlockDriverState *bs, Error **errp)
-{
-    bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
-}
-
 static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
 {
     BDRVBochsState *s = bs->opaque;
@@ -227,52 +219,38 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
     return bitmap_offset + (512 * (s->bitmap_blocks + extent_offset));
 }

-static int coroutine_fn
-bochs_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
-                QEMUIOVector *qiov, int flags)
+static int bochs_read(BlockDriverState *bs, int64_t sector_num,
+                      uint8_t *buf, int nb_sectors)
 {
-    BDRVBochsState *s = bs->opaque;
-    uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
-    int nb_sectors = bytes >> BDRV_SECTOR_BITS;
-    uint64_t bytes_done = 0;
-    QEMUIOVector local_qiov;
     int ret;

-    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-
-    qemu_iovec_init(&local_qiov, qiov->niov);
-    qemu_co_mutex_lock(&s->lock);
-
     while (nb_sectors > 0) {
         int64_t block_offset = seek_to_sector(bs, sector_num);
         if (block_offset < 0) {
-            ret = block_offset;
-            goto fail;
-        }
-
-        qemu_iovec_reset(&local_qiov);
-        qemu_iovec_concat(&local_qiov, qiov, bytes_done, 512);
-
-        if (block_offset > 0) {
-            ret = bdrv_co_preadv(bs->file, block_offset, 512,
-                                 &local_qiov, 0);
+            return block_offset;
+        } else if (block_offset > 0) {
+            ret = bdrv_pread(bs->file, block_offset, buf, 512);
             if (ret < 0) {
-                goto fail;
+                return ret;
             }
         } else {
-            qemu_iovec_memset(&local_qiov, 0, 0, 512);
+            memset(buf, 0, 512);
         }
         nb_sectors--;
         sector_num++;
-        bytes_done += 512;
+        buf += 512;
     }
+    return 0;
+}

-    ret = 0;
-fail:
+static coroutine_fn int bochs_co_read(BlockDriverState *bs, int64_t sector_num,
+                                      uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVBochsState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = bochs_read(bs, sector_num, buf, nb_sectors);
     qemu_co_mutex_unlock(&s->lock);
-    qemu_iovec_destroy(&local_qiov);
     return ret;
 }
@@ -287,8 +265,7 @@ static BlockDriver bdrv_bochs = {
     .instance_size       = sizeof(BDRVBochsState),
     .bdrv_probe          = bochs_probe,
     .bdrv_open           = bochs_open,
-    .bdrv_refresh_limits = bochs_refresh_limits,
-    .bdrv_co_preadv      = bochs_co_preadv,
+    .bdrv_read           = bochs_co_read,
     .bdrv_close          = bochs_close,
 };


@@ -21,12 +21,9 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "qemu-common.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
-#include "qemu/bswap.h"
 #include <zlib.h>
 /* Maximum compressed block size */
@@ -66,7 +63,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
     uint32_t offsets_size, max_compressed_block_size = 1, i;
     int ret;
-    bs->read_only = true;
+    bs->read_only = 1;
     /* read header */
     ret = bdrv_pread(bs->file, 128, &s->block_size, 4);
@@ -198,11 +195,6 @@ fail:
     return ret;
 }
-static void cloop_refresh_limits(BlockDriverState *bs, Error **errp)
-{
-    bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
-}
 static inline int cloop_read_block(BlockDriverState *bs, int block_num)
 {
     BDRVCloopState *s = bs->opaque;
@@ -211,8 +203,8 @@ static inline int cloop_read_block(BlockDriverState *bs, int block_num)
         int ret;
         uint32_t bytes = s->offsets[block_num + 1] - s->offsets[block_num];
-        ret = bdrv_pread(bs->file, s->offsets[block_num],
-                         s->compressed_block, bytes);
+        ret = bdrv_pread(bs->file, s->offsets[block_num], s->compressed_block,
+                         bytes);
         if (ret != bytes) {
             return -1;
         }
@@ -235,38 +227,33 @@ static inline int cloop_read_block(BlockDriverState *bs, int block_num)
     return 0;
 }
-static int coroutine_fn
-cloop_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
-                QEMUIOVector *qiov, int flags)
+static int cloop_read(BlockDriverState *bs, int64_t sector_num,
+                      uint8_t *buf, int nb_sectors)
 {
     BDRVCloopState *s = bs->opaque;
-    uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
-    int nb_sectors = bytes >> BDRV_SECTOR_BITS;
-    int ret, i;
-    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-    qemu_co_mutex_lock(&s->lock);
+    int i;
     for (i = 0; i < nb_sectors; i++) {
-        void *data;
         uint32_t sector_offset_in_block =
             ((sector_num + i) % s->sectors_per_block),
             block_num = (sector_num + i) / s->sectors_per_block;
         if (cloop_read_block(bs, block_num) != 0) {
-            ret = -EIO;
-            goto fail;
+            return -1;
         }
-        data = s->uncompressed_block + sector_offset_in_block * 512;
-        qemu_iovec_from_buf(qiov, i * 512, data, 512);
+        memcpy(buf + i * 512,
+               s->uncompressed_block + sector_offset_in_block * 512, 512);
     }
+    return 0;
+}
-    ret = 0;
-fail:
+static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
+                                      uint8_t *buf, int nb_sectors)
+{
+    int ret;
+    BDRVCloopState *s = bs->opaque;
+    qemu_co_mutex_lock(&s->lock);
+    ret = cloop_read(bs, sector_num, buf, nb_sectors);
     qemu_co_mutex_unlock(&s->lock);
     return ret;
 }
@@ -284,8 +271,7 @@ static BlockDriver bdrv_cloop = {
     .instance_size = sizeof(BDRVCloopState),
     .bdrv_probe = cloop_probe,
     .bdrv_open = cloop_open,
-    .bdrv_refresh_limits = cloop_refresh_limits,
-    .bdrv_co_preadv = cloop_co_preadv,
+    .bdrv_read = cloop_co_read,
     .bdrv_close = cloop_close,
 };

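The cloop read loop above maps each 512-byte guest sector onto a compressed block: the block index is `sector / sectors_per_block`, and the sector's data sits at byte offset `(sector % sectors_per_block) * 512` inside the inflated block. A minimal standalone sketch of that mapping (the struct and helper names are illustrative, not from the QEMU source):

```c
#include <stdint.h>

/* Illustrative helper: locate a 512-byte sector inside cloop's
 * compressed-block layout, as done per-iteration in cloop_read(). */
typedef struct {
    uint32_t block_num;        /* which compressed block to inflate */
    uint32_t offset_in_block;  /* byte offset of the sector's data */
} SectorLocation;

static SectorLocation locate_sector(int64_t sector_num,
                                    uint32_t sectors_per_block)
{
    SectorLocation loc = {
        .block_num = (uint32_t)(sector_num / sectors_per_block),
        .offset_in_block =
            (uint32_t)(sector_num % sectors_per_block) * 512,
    };
    return loc;
}
```

With a 64 KiB block size (`sectors_per_block == 128`), sector 130 lands in block 1 at byte offset 1024; `cloop_read_block()` caches the most recently inflated block, so consecutive sectors in the same block cost only one decompression.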

@@ -12,14 +12,11 @@
  *
  */
-#include "qemu/osdep.h"
 #include "trace.h"
 #include "block/block_int.h"
 #include "block/blockjob.h"
-#include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/ratelimit.h"
-#include "sysemu/block-backend.h"
 enum {
     /*
@@ -36,36 +33,28 @@ typedef struct CommitBlockJob {
     BlockJob common;
     RateLimit limit;
     BlockDriverState *active;
-    BlockBackend *top;
-    BlockBackend *base;
+    BlockDriverState *top;
+    BlockDriverState *base;
     BlockdevOnError on_error;
     int base_flags;
     int orig_overlay_flags;
     char *backing_file_str;
 } CommitBlockJob;
-static int coroutine_fn commit_populate(BlockBackend *bs, BlockBackend *base,
+static int coroutine_fn commit_populate(BlockDriverState *bs,
+                                        BlockDriverState *base,
                                         int64_t sector_num, int nb_sectors,
                                         void *buf)
 {
     int ret = 0;
-    QEMUIOVector qiov;
-    struct iovec iov = {
-        .iov_base = buf,
-        .iov_len = nb_sectors * BDRV_SECTOR_SIZE,
-    };
-    qemu_iovec_init_external(&qiov, &iov, 1);
-    ret = blk_co_preadv(bs, sector_num * BDRV_SECTOR_SIZE,
-                        qiov.size, &qiov, 0);
-    if (ret < 0) {
+    ret = bdrv_read(bs, sector_num, buf, nb_sectors);
+    if (ret) {
         return ret;
     }
-    ret = blk_co_pwritev(base, sector_num * BDRV_SECTOR_SIZE,
-                         qiov.size, &qiov, 0);
-    if (ret < 0) {
+    ret = bdrv_write(base, sector_num, buf, nb_sectors);
+    if (ret) {
         return ret;
     }
@@ -81,9 +70,9 @@ static void commit_complete(BlockJob *job, void *opaque)
     CommitBlockJob *s = container_of(job, CommitBlockJob, common);
     CommitCompleteData *data = opaque;
     BlockDriverState *active = s->active;
-    BlockDriverState *top = blk_bs(s->top);
-    BlockDriverState *base = blk_bs(s->base);
-    BlockDriverState *overlay_bs = bdrv_find_overlay(active, top);
+    BlockDriverState *top = s->top;
+    BlockDriverState *base = s->base;
+    BlockDriverState *overlay_bs;
     int ret = data->ret;
     if (!block_job_is_cancelled(&s->common) && ret == 0) {
@@ -97,12 +86,11 @@ static void commit_complete(BlockJob *job, void *opaque)
     if (s->base_flags != bdrv_get_flags(base)) {
         bdrv_reopen(base, s->base_flags, NULL);
     }
+    overlay_bs = bdrv_find_overlay(active, top);
     if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
         bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
     }
     g_free(s->backing_file_str);
-    blk_unref(s->top);
-    blk_unref(s->base);
     block_job_completed(&s->common, ret);
     g_free(data);
 }
@@ -111,39 +99,42 @@ static void coroutine_fn commit_run(void *opaque)
 {
     CommitBlockJob *s = opaque;
     CommitCompleteData *data;
+    BlockDriverState *top = s->top;
+    BlockDriverState *base = s->base;
     int64_t sector_num, end;
-    uint64_t delay_ns = 0;
     int ret = 0;
     int n = 0;
     void *buf = NULL;
     int bytes_written = 0;
     int64_t base_len;
-    ret = s->common.len = blk_getlength(s->top);
+    ret = s->common.len = bdrv_getlength(top);
     if (s->common.len < 0) {
         goto out;
     }
-    ret = base_len = blk_getlength(s->base);
+    ret = base_len = bdrv_getlength(base);
     if (base_len < 0) {
         goto out;
     }
     if (base_len < s->common.len) {
-        ret = blk_truncate(s->base, s->common.len);
+        ret = bdrv_truncate(base, s->common.len);
         if (ret) {
             goto out;
         }
     }
     end = s->common.len >> BDRV_SECTOR_BITS;
-    buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
+    buf = qemu_blockalign(top, COMMIT_BUFFER_SIZE);
     for (sector_num = 0; sector_num < end; sector_num += n) {
+        uint64_t delay_ns = 0;
         bool copy;
+wait:
         /* Note that even when no rate limit is applied we need to yield
          * with no pending I/O here so that bdrv_drain_all() returns.
          */
@@ -152,20 +143,25 @@ static void coroutine_fn commit_run(void *opaque)
             break;
         }
         /* Copy if allocated above the base */
-        ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base),
-                                      sector_num,
+        ret = bdrv_is_allocated_above(top, base, sector_num,
                                       COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
                                       &n);
         copy = (ret == 1);
         trace_commit_one_iteration(s, sector_num, n, ret);
         if (copy) {
-            ret = commit_populate(s->top, s->base, sector_num, n, buf);
+            if (s->common.speed) {
+                delay_ns = ratelimit_calculate_delay(&s->limit, n);
+                if (delay_ns > 0) {
+                    goto wait;
+                }
+            }
+            ret = commit_populate(top, base, sector_num, n, buf);
             bytes_written += n * BDRV_SECTOR_SIZE;
         }
         if (ret < 0) {
-            BlockErrorAction action =
-                block_job_error_action(&s->common, false, s->on_error, -ret);
-            if (action == BLOCK_ERROR_ACTION_REPORT) {
+            if (s->on_error == BLOCKDEV_ON_ERROR_STOP ||
+                s->on_error == BLOCKDEV_ON_ERROR_REPORT ||
+                (s->on_error == BLOCKDEV_ON_ERROR_ENOSPC && ret == -ENOSPC)) {
                 goto out;
             } else {
                 n = 0;
@@ -174,10 +170,6 @@ static void coroutine_fn commit_run(void *opaque)
         }
         /* Publish progress */
         s->common.offset += n * BDRV_SECTOR_SIZE;
-        if (copy && s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit, n);
-        }
     }
     ret = 0;
@@ -207,8 +199,8 @@ static const BlockJobDriver commit_job_driver = {
     .set_speed = commit_set_speed,
 };
-void commit_start(const char *job_id, BlockDriverState *bs,
-                  BlockDriverState *base, BlockDriverState *top, int64_t speed,
+void commit_start(BlockDriverState *bs, BlockDriverState *base,
+                  BlockDriverState *top, int64_t speed,
                   BlockdevOnError on_error, BlockCompletionFunc *cb,
                   void *opaque, const char *backing_file_str, Error **errp)
 {
@@ -219,6 +211,13 @@ void commit_start(const char *job_id, BlockDriverState *bs,
     BlockDriverState *overlay_bs;
     Error *local_err = NULL;
+    if ((on_error == BLOCKDEV_ON_ERROR_STOP ||
+         on_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
+        !bdrv_iostatus_is_enabled(bs)) {
+        error_setg(errp, "Invalid parameter combination");
+        return;
+    }
     assert(top != bs);
     if (top == base) {
         error_setg(errp, "Invalid files for merge: top and base are the same");
@@ -232,40 +231,34 @@ void commit_start(const char *job_id, BlockDriverState *bs,
         return;
     }
-    s = block_job_create(job_id, &commit_job_driver, bs, speed,
-                         cb, opaque, errp);
-    if (!s) {
-        return;
-    }
     orig_base_flags = bdrv_get_flags(base);
     orig_overlay_flags = bdrv_get_flags(overlay_bs);
     /* convert base & overlay_bs to r/w, if necessary */
     if (!(orig_base_flags & BDRV_O_RDWR)) {
-        reopen_queue = bdrv_reopen_queue(reopen_queue, base, NULL,
+        reopen_queue = bdrv_reopen_queue(reopen_queue, base,
                                          orig_base_flags | BDRV_O_RDWR);
     }
     if (!(orig_overlay_flags & BDRV_O_RDWR)) {
-        reopen_queue = bdrv_reopen_queue(reopen_queue, overlay_bs, NULL,
+        reopen_queue = bdrv_reopen_queue(reopen_queue, overlay_bs,
                                          orig_overlay_flags | BDRV_O_RDWR);
     }
     if (reopen_queue) {
         bdrv_reopen_multiple(reopen_queue, &local_err);
         if (local_err != NULL) {
             error_propagate(errp, local_err);
-            block_job_unref(&s->common);
             return;
         }
     }
-    s->base = blk_new();
-    blk_insert_bs(s->base, base);
-    s->top = blk_new();
-    blk_insert_bs(s->top, top);
+    s = block_job_create(&commit_job_driver, bs, speed, cb, opaque, errp);
+    if (!s) {
+        return;
+    }
+    s->base = base;
+    s->top = top;
     s->active = bs;
     s->base_flags = orig_base_flags;
@@ -274,129 +267,8 @@ void commit_start(const char *job_id, BlockDriverState *bs,
     s->backing_file_str = g_strdup(backing_file_str);
     s->on_error = on_error;
-    s->common.co = qemu_coroutine_create(commit_run, s);
+    s->common.co = qemu_coroutine_create(commit_run);
     trace_commit_start(bs, base, top, s, s->common.co, opaque);
-    qemu_coroutine_enter(s->common.co);
+    qemu_coroutine_enter(s->common.co, s);
}
#define COMMIT_BUF_SECTORS 2048
/* commit COW file into the raw image */
int bdrv_commit(BlockDriverState *bs)
{
BlockBackend *src, *backing;
BlockDriver *drv = bs->drv;
int64_t sector, total_sectors, length, backing_length;
int n, ro, open_flags;
int ret = 0;
uint8_t *buf = NULL;
if (!drv)
return -ENOMEDIUM;
if (!bs->backing) {
return -ENOTSUP;
}
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_COMMIT_SOURCE, NULL) ||
bdrv_op_is_blocked(bs->backing->bs, BLOCK_OP_TYPE_COMMIT_TARGET, NULL)) {
return -EBUSY;
}
ro = bs->backing->bs->read_only;
open_flags = bs->backing->bs->open_flags;
if (ro) {
if (bdrv_reopen(bs->backing->bs, open_flags | BDRV_O_RDWR, NULL)) {
return -EACCES;
}
}
src = blk_new();
blk_insert_bs(src, bs);
backing = blk_new();
blk_insert_bs(backing, bs->backing->bs);
length = blk_getlength(src);
if (length < 0) {
ret = length;
goto ro_cleanup;
}
backing_length = blk_getlength(backing);
if (backing_length < 0) {
ret = backing_length;
goto ro_cleanup;
}
/* If our top snapshot is larger than the backing file image,
* grow the backing file image if possible. If not possible,
* we must return an error */
if (length > backing_length) {
ret = blk_truncate(backing, length);
if (ret < 0) {
goto ro_cleanup;
}
}
total_sectors = length >> BDRV_SECTOR_BITS;
/* blk_try_blockalign() for src will choose an alignment that works for
* backing as well, so no need to compare the alignment manually. */
buf = blk_try_blockalign(src, COMMIT_BUF_SECTORS * BDRV_SECTOR_SIZE);
if (buf == NULL) {
ret = -ENOMEM;
goto ro_cleanup;
}
for (sector = 0; sector < total_sectors; sector += n) {
ret = bdrv_is_allocated(bs, sector, COMMIT_BUF_SECTORS, &n);
if (ret < 0) {
goto ro_cleanup;
}
if (ret) {
ret = blk_pread(src, sector * BDRV_SECTOR_SIZE, buf,
n * BDRV_SECTOR_SIZE);
if (ret < 0) {
goto ro_cleanup;
}
ret = blk_pwrite(backing, sector * BDRV_SECTOR_SIZE, buf,
n * BDRV_SECTOR_SIZE, 0);
if (ret < 0) {
goto ro_cleanup;
}
}
}
if (drv->bdrv_make_empty) {
ret = drv->bdrv_make_empty(bs);
if (ret < 0) {
goto ro_cleanup;
}
blk_flush(src);
}
/*
* Make sure all data we wrote to the backing device is actually
* stable on disk.
*/
blk_flush(backing);
ret = 0;
ro_cleanup:
qemu_vfree(buf);
blk_unref(src);
blk_unref(backing);
if (ro) {
/* ignoring error return here */
bdrv_reopen(bs->backing->bs, open_flags & ~BDRV_O_RDWR, NULL);
}
return ret;
} }

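On the v2.4.1 side of the commit_run() diff above, a failed chunk is handled by an inline policy check rather than the later block_job_error_action() helper: the job aborts on STOP or REPORT, aborts on the ENOSPC policy only when the failure really was -ENOSPC, and otherwise skips the chunk and retries. A standalone model of that decision (the enum and function are illustrative stand-ins, not the QEMU definitions):

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-in for the BLOCKDEV_ON_ERROR_* policy values. */
enum OnErrorPolicy {
    ON_ERROR_REPORT,
    ON_ERROR_IGNORE,
    ON_ERROR_ENOSPC,
    ON_ERROR_STOP,
};

/* Mirrors the v2.4-era check in commit_run(): should a failing chunk
 * (negative errno in ret) abort the whole commit job? */
static bool commit_should_abort(enum OnErrorPolicy on_error, int ret)
{
    return on_error == ON_ERROR_STOP ||
           on_error == ON_ERROR_REPORT ||
           (on_error == ON_ERROR_ENOSPC && ret == -ENOSPC);
}
```

When the job does not abort, the loop sets `n = 0` so progress publication advances nothing and the same region is retried on the next iteration.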

@@ -1,641 +0,0 @@
/*
* QEMU block full disk encryption
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "crypto/block.h"
#include "qapi/opts-visitor.h"
#include "qapi-visit.h"
#include "qapi/error.h"
#define BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET "key-secret"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG "cipher-alg"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE "cipher-mode"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG "ivgen-alg"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG "ivgen-hash-alg"
#define BLOCK_CRYPTO_OPT_LUKS_HASH_ALG "hash-alg"
#define BLOCK_CRYPTO_OPT_LUKS_ITER_TIME "iter-time"
typedef struct BlockCrypto BlockCrypto;
struct BlockCrypto {
QCryptoBlock *block;
};
static int block_crypto_probe_generic(QCryptoBlockFormat format,
const uint8_t *buf,
int buf_size,
const char *filename)
{
if (qcrypto_block_has_format(format, buf, buf_size)) {
return 100;
} else {
return 0;
}
}
static ssize_t block_crypto_read_func(QCryptoBlock *block,
size_t offset,
uint8_t *buf,
size_t buflen,
Error **errp,
void *opaque)
{
BlockDriverState *bs = opaque;
ssize_t ret;
ret = bdrv_pread(bs->file, offset, buf, buflen);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not read encryption header");
return ret;
}
return ret;
}
struct BlockCryptoCreateData {
const char *filename;
QemuOpts *opts;
BlockBackend *blk;
uint64_t size;
};
static ssize_t block_crypto_write_func(QCryptoBlock *block,
size_t offset,
const uint8_t *buf,
size_t buflen,
Error **errp,
void *opaque)
{
struct BlockCryptoCreateData *data = opaque;
ssize_t ret;
ret = blk_pwrite(data->blk, offset, buf, buflen, 0);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not write encryption header");
return ret;
}
return ret;
}
static ssize_t block_crypto_init_func(QCryptoBlock *block,
size_t headerlen,
Error **errp,
void *opaque)
{
struct BlockCryptoCreateData *data = opaque;
int ret;
/* User provided size should reflect amount of space made
* available to the guest, so we must take account of that
* which will be used by the crypto header
*/
data->size += headerlen;
qemu_opt_set_number(data->opts, BLOCK_OPT_SIZE, data->size, &error_abort);
ret = bdrv_create_file(data->filename, data->opts, errp);
if (ret < 0) {
return -1;
}
data->blk = blk_new_open(data->filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, errp);
if (!data->blk) {
return -1;
}
return 0;
}
static QemuOptsList block_crypto_runtime_opts_luks = {
.name = "crypto",
.head = QTAILQ_HEAD_INITIALIZER(block_crypto_runtime_opts_luks.head),
.desc = {
{
.name = BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of the secret that provides the encryption key",
},
{ /* end of list */ }
},
};
static QemuOptsList block_crypto_create_opts_luks = {
.name = "crypto",
.head = QTAILQ_HEAD_INITIALIZER(block_crypto_create_opts_luks.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of the secret that provides the encryption key",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of encryption cipher algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE,
.type = QEMU_OPT_STRING,
.help = "Name of encryption cipher mode",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of IV generator algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of IV generator hash algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_HASH_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of encryption hash algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_ITER_TIME,
.type = QEMU_OPT_NUMBER,
.help = "Time to spend in PBKDF in milliseconds",
},
{ /* end of list */ }
},
};
static QCryptoBlockOpenOptions *
block_crypto_open_opts_init(QCryptoBlockFormat format,
QemuOpts *opts,
Error **errp)
{
Visitor *v;
QCryptoBlockOpenOptions *ret = NULL;
Error *local_err = NULL;
ret = g_new0(QCryptoBlockOpenOptions, 1);
ret->format = format;
v = opts_visitor_new(opts);
visit_start_struct(v, NULL, NULL, 0, &local_err);
if (local_err) {
goto out;
}
switch (format) {
case Q_CRYPTO_BLOCK_FORMAT_LUKS:
visit_type_QCryptoBlockOptionsLUKS_members(
v, &ret->u.luks, &local_err);
break;
default:
error_setg(&local_err, "Unsupported block format %d", format);
break;
}
if (!local_err) {
visit_check_struct(v, &local_err);
}
visit_end_struct(v, NULL);
out:
if (local_err) {
error_propagate(errp, local_err);
qapi_free_QCryptoBlockOpenOptions(ret);
ret = NULL;
}
visit_free(v);
return ret;
}
static QCryptoBlockCreateOptions *
block_crypto_create_opts_init(QCryptoBlockFormat format,
QemuOpts *opts,
Error **errp)
{
Visitor *v;
QCryptoBlockCreateOptions *ret = NULL;
Error *local_err = NULL;
ret = g_new0(QCryptoBlockCreateOptions, 1);
ret->format = format;
v = opts_visitor_new(opts);
visit_start_struct(v, NULL, NULL, 0, &local_err);
if (local_err) {
goto out;
}
switch (format) {
case Q_CRYPTO_BLOCK_FORMAT_LUKS:
visit_type_QCryptoBlockCreateOptionsLUKS_members(
v, &ret->u.luks, &local_err);
break;
default:
error_setg(&local_err, "Unsupported block format %d", format);
break;
}
if (!local_err) {
visit_check_struct(v, &local_err);
}
visit_end_struct(v, NULL);
out:
if (local_err) {
error_propagate(errp, local_err);
qapi_free_QCryptoBlockCreateOptions(ret);
ret = NULL;
}
visit_free(v);
return ret;
}
static int block_crypto_open_generic(QCryptoBlockFormat format,
QemuOptsList *opts_spec,
BlockDriverState *bs,
QDict *options,
int flags,
Error **errp)
{
BlockCrypto *crypto = bs->opaque;
QemuOpts *opts = NULL;
Error *local_err = NULL;
int ret = -EINVAL;
QCryptoBlockOpenOptions *open_opts = NULL;
unsigned int cflags = 0;
opts = qemu_opts_create(opts_spec, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto cleanup;
}
open_opts = block_crypto_open_opts_init(format, opts, errp);
if (!open_opts) {
goto cleanup;
}
if (flags & BDRV_O_NO_IO) {
cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
}
crypto->block = qcrypto_block_open(open_opts,
block_crypto_read_func,
bs,
cflags,
errp);
if (!crypto->block) {
ret = -EIO;
goto cleanup;
}
bs->encrypted = true;
bs->valid_key = true;
ret = 0;
cleanup:
qapi_free_QCryptoBlockOpenOptions(open_opts);
return ret;
}
static int block_crypto_create_generic(QCryptoBlockFormat format,
const char *filename,
QemuOpts *opts,
Error **errp)
{
int ret = -EINVAL;
QCryptoBlockCreateOptions *create_opts = NULL;
QCryptoBlock *crypto = NULL;
struct BlockCryptoCreateData data = {
.size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE),
.opts = opts,
.filename = filename,
};
create_opts = block_crypto_create_opts_init(format, opts, errp);
if (!create_opts) {
return -1;
}
crypto = qcrypto_block_create(create_opts,
block_crypto_init_func,
block_crypto_write_func,
&data,
errp);
if (!crypto) {
ret = -EIO;
goto cleanup;
}
ret = 0;
cleanup:
qcrypto_block_free(crypto);
blk_unref(data.blk);
qapi_free_QCryptoBlockCreateOptions(create_opts);
return ret;
}
static int block_crypto_truncate(BlockDriverState *bs, int64_t offset)
{
BlockCrypto *crypto = bs->opaque;
size_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block);
offset += payload_offset;
return bdrv_truncate(bs->file->bs, offset);
}
static void block_crypto_close(BlockDriverState *bs)
{
BlockCrypto *crypto = bs->opaque;
qcrypto_block_free(crypto->block);
}
#define BLOCK_CRYPTO_MAX_SECTORS 32
static coroutine_fn int
block_crypto_co_readv(BlockDriverState *bs, int64_t sector_num,
int remaining_sectors, QEMUIOVector *qiov)
{
BlockCrypto *crypto = bs->opaque;
int cur_nr_sectors; /* number of sectors in current iteration */
uint64_t bytes_done = 0;
uint8_t *cipher_data = NULL;
QEMUIOVector hd_qiov;
int ret = 0;
size_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block) / 512;
qemu_iovec_init(&hd_qiov, qiov->niov);
/* Bounce buffer so we have a linear mem region for
* entire sector. XXX optimize so we avoid bounce
* buffer in case that qiov->niov == 1
*/
cipher_data =
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_SECTORS * 512,
qiov->size));
if (cipher_data == NULL) {
ret = -ENOMEM;
goto cleanup;
}
while (remaining_sectors) {
cur_nr_sectors = remaining_sectors;
if (cur_nr_sectors > BLOCK_CRYPTO_MAX_SECTORS) {
cur_nr_sectors = BLOCK_CRYPTO_MAX_SECTORS;
}
qemu_iovec_reset(&hd_qiov);
qemu_iovec_add(&hd_qiov, cipher_data, cur_nr_sectors * 512);
ret = bdrv_co_readv(bs->file,
payload_offset + sector_num,
cur_nr_sectors, &hd_qiov);
if (ret < 0) {
goto cleanup;
}
if (qcrypto_block_decrypt(crypto->block,
sector_num,
cipher_data, cur_nr_sectors * 512,
NULL) < 0) {
ret = -EIO;
goto cleanup;
}
qemu_iovec_from_buf(qiov, bytes_done,
cipher_data, cur_nr_sectors * 512);
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
}
cleanup:
qemu_iovec_destroy(&hd_qiov);
qemu_vfree(cipher_data);
return ret;
}
static coroutine_fn int
block_crypto_co_writev(BlockDriverState *bs, int64_t sector_num,
int remaining_sectors, QEMUIOVector *qiov)
{
BlockCrypto *crypto = bs->opaque;
int cur_nr_sectors; /* number of sectors in current iteration */
uint64_t bytes_done = 0;
uint8_t *cipher_data = NULL;
QEMUIOVector hd_qiov;
int ret = 0;
size_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block) / 512;
qemu_iovec_init(&hd_qiov, qiov->niov);
/* Bounce buffer so we have a linear mem region for
* entire sector. XXX optimize so we avoid bounce
* buffer in case that qiov->niov == 1
*/
cipher_data =
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_SECTORS * 512,
qiov->size));
if (cipher_data == NULL) {
ret = -ENOMEM;
goto cleanup;
}
while (remaining_sectors) {
cur_nr_sectors = remaining_sectors;
if (cur_nr_sectors > BLOCK_CRYPTO_MAX_SECTORS) {
cur_nr_sectors = BLOCK_CRYPTO_MAX_SECTORS;
}
qemu_iovec_to_buf(qiov, bytes_done,
cipher_data, cur_nr_sectors * 512);
if (qcrypto_block_encrypt(crypto->block,
sector_num,
cipher_data, cur_nr_sectors * 512,
NULL) < 0) {
ret = -EIO;
goto cleanup;
}
qemu_iovec_reset(&hd_qiov);
qemu_iovec_add(&hd_qiov, cipher_data, cur_nr_sectors * 512);
ret = bdrv_co_writev(bs->file,
payload_offset + sector_num,
cur_nr_sectors, &hd_qiov);
if (ret < 0) {
goto cleanup;
}
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
}
cleanup:
qemu_iovec_destroy(&hd_qiov);
qemu_vfree(cipher_data);
return ret;
}
static int64_t block_crypto_getlength(BlockDriverState *bs)
{
BlockCrypto *crypto = bs->opaque;
int64_t len = bdrv_getlength(bs->file->bs);
ssize_t offset = qcrypto_block_get_payload_offset(crypto->block);
len -= offset;
return len;
}
static int block_crypto_probe_luks(const uint8_t *buf,
int buf_size,
const char *filename) {
return block_crypto_probe_generic(Q_CRYPTO_BLOCK_FORMAT_LUKS,
buf, buf_size, filename);
}
static int block_crypto_open_luks(BlockDriverState *bs,
QDict *options,
int flags,
Error **errp)
{
return block_crypto_open_generic(Q_CRYPTO_BLOCK_FORMAT_LUKS,
&block_crypto_runtime_opts_luks,
bs, options, flags, errp);
}
static int block_crypto_create_luks(const char *filename,
QemuOpts *opts,
Error **errp)
{
return block_crypto_create_generic(Q_CRYPTO_BLOCK_FORMAT_LUKS,
filename, opts, errp);
}
static int block_crypto_get_info_luks(BlockDriverState *bs,
BlockDriverInfo *bdi)
{
BlockDriverInfo subbdi;
int ret;
ret = bdrv_get_info(bs->file->bs, &subbdi);
if (ret != 0) {
return ret;
}
bdi->unallocated_blocks_are_zero = false;
bdi->can_write_zeroes_with_unmap = false;
bdi->cluster_size = subbdi.cluster_size;
return 0;
}
static ImageInfoSpecific *
block_crypto_get_specific_info_luks(BlockDriverState *bs)
{
BlockCrypto *crypto = bs->opaque;
ImageInfoSpecific *spec_info;
QCryptoBlockInfo *info;
info = qcrypto_block_get_info(crypto->block, NULL);
if (!info) {
return NULL;
}
if (info->format != Q_CRYPTO_BLOCK_FORMAT_LUKS) {
qapi_free_QCryptoBlockInfo(info);
return NULL;
}
spec_info = g_new(ImageInfoSpecific, 1);
spec_info->type = IMAGE_INFO_SPECIFIC_KIND_LUKS;
spec_info->u.luks.data = g_new(QCryptoBlockInfoLUKS, 1);
*spec_info->u.luks.data = info->u.luks;
/* Blank out pointers we've just stolen to avoid double free */
memset(&info->u.luks, 0, sizeof(info->u.luks));
qapi_free_QCryptoBlockInfo(info);
return spec_info;
}
BlockDriver bdrv_crypto_luks = {
.format_name = "luks",
.instance_size = sizeof(BlockCrypto),
.bdrv_probe = block_crypto_probe_luks,
.bdrv_open = block_crypto_open_luks,
.bdrv_close = block_crypto_close,
.bdrv_create = block_crypto_create_luks,
.bdrv_truncate = block_crypto_truncate,
.create_opts = &block_crypto_create_opts_luks,
.bdrv_co_readv = block_crypto_co_readv,
.bdrv_co_writev = block_crypto_co_writev,
.bdrv_getlength = block_crypto_getlength,
.bdrv_get_info = block_crypto_get_info_luks,
.bdrv_get_specific_info = block_crypto_get_specific_info_luks,
};
static void block_crypto_init(void)
{
bdrv_register(&bdrv_crypto_luks);
}
block_init(block_crypto_init);

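The block_crypto_co_readv()/co_writev() functions in the removed crypto driver above both process a request through a fixed bounce buffer, at most BLOCK_CRYPTO_MAX_SECTORS (32) sectors per iteration, so a large guest request becomes several encrypt-or-decrypt-plus-I/O round trips. A self-contained sketch of that chunking loop (the helper name and its chunk-counting return value are illustrative only):

```c
#define MAX_SECTORS_PER_CHUNK 32  /* mirrors BLOCK_CRYPTO_MAX_SECTORS */

/* Illustrative skeleton of the loop shape in block_crypto_co_readv()
 * and co_writev(): walk a request through a bounce buffer at most
 * MAX_SECTORS_PER_CHUNK sectors at a time. Returns how many chunks
 * (i.e. encrypt/decrypt + I/O iterations) the request needs. */
static int count_chunks(int remaining_sectors)
{
    int chunks = 0;
    while (remaining_sectors > 0) {
        int cur_nr_sectors = remaining_sectors;
        if (cur_nr_sectors > MAX_SECTORS_PER_CHUNK) {
            cur_nr_sectors = MAX_SECTORS_PER_CHUNK;
        }
        /* real code: copy to/from the qiov, (de)crypt cur_nr_sectors
         * worth of data, and issue bdrv_co_readv/writev here */
        remaining_sectors -= cur_nr_sectors;
        chunks++;
    }
    return chunks;
}
```

The bounce buffer exists because qcrypto_block_encrypt()/decrypt() need a linear memory region, while the guest's QEMUIOVector may be scattered; the XXX comment in the original notes the single-iovec case could skip the copy.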

@@ -21,31 +21,21 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "qemu-common.h"
 #include "qemu/error-report.h"
 #include "block/block_int.h"
 #include "qapi/qmp/qbool.h"
 #include "qapi/qmp/qstring.h"
-#include "crypto/secret.h"
 #include <curl/curl.h>
-#include "qemu/cutils.h"
 // #define DEBUG_CURL
 // #define DEBUG_VERBOSE
 #ifdef DEBUG_CURL
-#define DEBUG_CURL_PRINT 1
+#define DPRINTF(fmt, ...) do { printf(fmt, ## __VA_ARGS__); } while (0)
 #else
-#define DEBUG_CURL_PRINT 0
+#define DPRINTF(fmt, ...) do { } while (0)
 #endif
-#define DPRINTF(fmt, ...) \
-    do { \
-        if (DEBUG_CURL_PRINT) { \
-            fprintf(stderr, fmt, ## __VA_ARGS__); \
-        } \
-    } while (0)
 #if LIBCURL_VERSION_NUM >= 0x071000
 /* The multi interface timer callback was introduced in 7.16.0 */
@@ -87,10 +77,6 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
 #define CURL_BLOCK_OPT_TIMEOUT "timeout"
 #define CURL_BLOCK_OPT_COOKIE "cookie"
-#define CURL_BLOCK_OPT_USERNAME "username"
-#define CURL_BLOCK_OPT_PASSWORD_SECRET "password-secret"
-#define CURL_BLOCK_OPT_PROXY_USERNAME "proxy-username"
-#define CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET "proxy-password-secret"
 struct BDRVCURLState;
@@ -133,10 +119,6 @@ typedef struct BDRVCURLState {
     char *cookie;
     bool accept_range;
     AioContext *aio_context;
-    char *username;
-    char *password;
-    char *proxyusername;
-    char *proxypassword;
 } BDRVCURLState;
 static void curl_clean_state(CURLState *s);
@@ -169,23 +151,21 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
     state->sock_fd = fd;
     s = state->s;
-    DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, (int)fd);
+    DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
     switch (action) {
         case CURL_POLL_IN:
-            aio_set_fd_handler(s->aio_context, fd, false,
-                               curl_multi_read, NULL, state);
+            aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
+                               NULL, state);
             break;
         case CURL_POLL_OUT:
-            aio_set_fd_handler(s->aio_context, fd, false,
-                               NULL, curl_multi_do, state);
+            aio_set_fd_handler(s->aio_context, fd, NULL, curl_multi_do, state);
             break;
         case CURL_POLL_INOUT:
-            aio_set_fd_handler(s->aio_context, fd, false,
-                               curl_multi_read, curl_multi_do, state);
+            aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
+                               curl_multi_do, state);
             break;
         case CURL_POLL_REMOVE:
-            aio_set_fd_handler(s->aio_context, fd, false,
-                               NULL, NULL, NULL);
+            aio_set_fd_handler(s->aio_context, fd, NULL, NULL, NULL);
             break;
     }
@@ -436,21 +416,6 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
    curl_easy_setopt(state->curl, CURLOPT_ERRORBUFFER, state->errmsg);
    curl_easy_setopt(state->curl, CURLOPT_FAILONERROR, 1);
if (s->username) {
curl_easy_setopt(state->curl, CURLOPT_USERNAME, s->username);
}
if (s->password) {
curl_easy_setopt(state->curl, CURLOPT_PASSWORD, s->password);
}
if (s->proxyusername) {
curl_easy_setopt(state->curl,
CURLOPT_PROXYUSERNAME, s->proxyusername);
}
if (s->proxypassword) {
curl_easy_setopt(state->curl,
CURLOPT_PROXYPASSWORD, s->proxypassword);
}
    /* Restrict supported protocols to avoid security issues in the more
     * obscure protocols. For example, do not allow POP3/SMTP/IMAP see
     * CVE-2013-0249.
@@ -557,31 +522,10 @@ static QemuOptsList runtime_opts = {
        .type = QEMU_OPT_STRING,
        .help = "Pass the cookie or list of cookies with each request"
    },
{
.name = CURL_BLOCK_OPT_USERNAME,
.type = QEMU_OPT_STRING,
.help = "Username for HTTP auth"
},
{
.name = CURL_BLOCK_OPT_PASSWORD_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of secret used as password for HTTP auth",
},
{
.name = CURL_BLOCK_OPT_PROXY_USERNAME,
.type = QEMU_OPT_STRING,
.help = "Username for HTTP proxy auth"
},
{
.name = CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of secret used as password for HTTP proxy auth",
},
    { /* end of list */ }
    },
};
static int curl_open(BlockDriverState *bs, QDict *options, int flags,
                     Error **errp)
{
@@ -592,7 +536,6 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
    const char *file;
    const char *cookie;
    double d;
const char *secretid;
    static int inited = 0;
@@ -634,26 +577,6 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
        goto out_noclean;
    }
s->username = g_strdup(qemu_opt_get(opts, CURL_BLOCK_OPT_USERNAME));
secretid = qemu_opt_get(opts, CURL_BLOCK_OPT_PASSWORD_SECRET);
if (secretid) {
s->password = qcrypto_secret_lookup_as_utf8(secretid, errp);
if (!s->password) {
goto out_noclean;
}
}
s->proxyusername = g_strdup(
qemu_opt_get(opts, CURL_BLOCK_OPT_PROXY_USERNAME));
secretid = qemu_opt_get(opts, CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET);
if (secretid) {
s->proxypassword = qcrypto_secret_lookup_as_utf8(secretid, errp);
if (!s->proxypassword) {
goto out_noclean;
}
}
    if (!inited) {
        curl_global_init(CURL_GLOBAL_ALL);
        inited = 1;
@@ -675,28 +598,11 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
    curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s);
    if (curl_easy_perform(state->curl))
        goto out;
-   if (curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &d)) {
-       goto out;
-   }
+   curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &d);
+   if (d)
+       s->len = (size_t)d;
+   else if(!s->len)
+       goto out;
/* Prior CURL 7.19.4 return value of 0 could mean that the file size is not
* know or the size is zero. From 7.19.4 CURL returns -1 if size is not
* known and zero if it is realy zero-length file. */
#if LIBCURL_VERSION_NUM >= 0x071304
if (d < 0) {
pstrcpy(state->errmsg, CURL_ERROR_SIZE,
"Server didn't report file size.");
goto out;
}
#else
if (d <= 0) {
pstrcpy(state->errmsg, CURL_ERROR_SIZE,
"Unknown file size or zero-length file.");
goto out;
}
#endif
s->len = (size_t)d;
    if ((!strncasecmp(s->url, "http://", strlen("http://"))
        || !strncasecmp(s->url, "https://", strlen("https://")))
        && !s->accept_range) {


@@ -1,387 +0,0 @@
/*
* Block Dirty Bitmap
*
* Copyright (c) 2016 Red Hat. Inc
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "trace.h"
#include "block/block_int.h"
#include "block/blockjob.h"
/**
* A BdrvDirtyBitmap can be in three possible states:
* (1) successor is NULL and disabled is false: full r/w mode
* (2) successor is NULL and disabled is true: read only mode ("disabled")
* (3) successor is set: frozen mode.
* A frozen bitmap cannot be renamed, deleted, anonymized, cleared, set,
* or enabled. A frozen bitmap can only abdicate() or reclaim().
*/
struct BdrvDirtyBitmap {
HBitmap *bitmap; /* Dirty sector bitmap implementation */
BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
char *name; /* Optional non-empty unique ID */
int64_t size; /* Size of the bitmap (Number of sectors) */
bool disabled; /* Bitmap is read-only */
QLIST_ENTRY(BdrvDirtyBitmap) list;
};
BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
{
BdrvDirtyBitmap *bm;
assert(name);
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
if (bm->name && !strcmp(name, bm->name)) {
return bm;
}
}
return NULL;
}
void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
g_free(bitmap->name);
bitmap->name = NULL;
}
BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
uint32_t granularity,
const char *name,
Error **errp)
{
int64_t bitmap_size;
BdrvDirtyBitmap *bitmap;
uint32_t sector_granularity;
assert((granularity & (granularity - 1)) == 0);
if (name && bdrv_find_dirty_bitmap(bs, name)) {
error_setg(errp, "Bitmap already exists: %s", name);
return NULL;
}
sector_granularity = granularity >> BDRV_SECTOR_BITS;
assert(sector_granularity);
bitmap_size = bdrv_nb_sectors(bs);
if (bitmap_size < 0) {
error_setg_errno(errp, -bitmap_size, "could not get length of device");
errno = -bitmap_size;
return NULL;
}
bitmap = g_new0(BdrvDirtyBitmap, 1);
bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
bitmap->size = bitmap_size;
bitmap->name = g_strdup(name);
bitmap->disabled = false;
QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
return bitmap;
}
bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
{
return bitmap->successor;
}
bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
{
return !(bitmap->disabled || bitmap->successor);
}
DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
{
if (bdrv_dirty_bitmap_frozen(bitmap)) {
return DIRTY_BITMAP_STATUS_FROZEN;
} else if (!bdrv_dirty_bitmap_enabled(bitmap)) {
return DIRTY_BITMAP_STATUS_DISABLED;
} else {
return DIRTY_BITMAP_STATUS_ACTIVE;
}
}
/**
* Create a successor bitmap destined to replace this bitmap after an operation.
* Requires that the bitmap is not frozen and has no successor.
*/
int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, Error **errp)
{
uint64_t granularity;
BdrvDirtyBitmap *child;
if (bdrv_dirty_bitmap_frozen(bitmap)) {
error_setg(errp, "Cannot create a successor for a bitmap that is "
"currently frozen");
return -1;
}
assert(!bitmap->successor);
/* Create an anonymous successor */
granularity = bdrv_dirty_bitmap_granularity(bitmap);
child = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
if (!child) {
return -1;
}
/* Successor will be on or off based on our current state. */
child->disabled = bitmap->disabled;
/* Install the successor and freeze the parent */
bitmap->successor = child;
return 0;
}
/**
* For a bitmap with a successor, yield our name to the successor,
* delete the old bitmap, and return a handle to the new bitmap.
*/
BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
Error **errp)
{
char *name;
BdrvDirtyBitmap *successor = bitmap->successor;
if (successor == NULL) {
error_setg(errp, "Cannot relinquish control if "
"there's no successor present");
return NULL;
}
name = bitmap->name;
bitmap->name = NULL;
successor->name = name;
bitmap->successor = NULL;
bdrv_release_dirty_bitmap(bs, bitmap);
return successor;
}
/**
* In cases of failure where we can no longer safely delete the parent,
* we may wish to re-join the parent and child/successor.
* The merged parent will be un-frozen, but not explicitly re-enabled.
*/
BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
Error **errp)
{
BdrvDirtyBitmap *successor = parent->successor;
if (!successor) {
error_setg(errp, "Cannot reclaim a successor when none is present");
return NULL;
}
if (!hbitmap_merge(parent->bitmap, successor->bitmap)) {
error_setg(errp, "Merging of parent and successor bitmap failed");
return NULL;
}
bdrv_release_dirty_bitmap(bs, successor);
parent->successor = NULL;
return parent;
}
/**
* Truncates _all_ bitmaps attached to a BDS.
*/
void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
{
BdrvDirtyBitmap *bitmap;
uint64_t size = bdrv_nb_sectors(bs);
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
assert(!bdrv_dirty_bitmap_frozen(bitmap));
hbitmap_truncate(bitmap->bitmap, size);
bitmap->size = size;
}
}
static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
bool only_named)
{
BdrvDirtyBitmap *bm, *next;
QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
assert(!bdrv_dirty_bitmap_frozen(bm));
QLIST_REMOVE(bm, list);
hbitmap_free(bm->bitmap);
g_free(bm->name);
g_free(bm);
if (bitmap) {
return;
}
}
}
}
void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
{
bdrv_do_release_matching_dirty_bitmap(bs, bitmap, false);
}
/**
* Release all named dirty bitmaps attached to a BDS (for use in bdrv_close()).
* There must not be any frozen bitmaps attached.
*/
void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs)
{
bdrv_do_release_matching_dirty_bitmap(bs, NULL, true);
}
void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
bitmap->disabled = true;
}
void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
bitmap->disabled = false;
}
BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
{
BdrvDirtyBitmap *bm;
BlockDirtyInfoList *list = NULL;
BlockDirtyInfoList **plist = &list;
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
info->count = bdrv_get_dirty_count(bm);
info->granularity = bdrv_dirty_bitmap_granularity(bm);
info->has_name = !!bm->name;
info->name = g_strdup(bm->name);
info->status = bdrv_dirty_bitmap_status(bm);
entry->value = info;
*plist = entry;
plist = &entry->next;
}
return list;
}
int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t sector)
{
if (bitmap) {
return hbitmap_get(bitmap->bitmap, sector);
} else {
return 0;
}
}
/**
* Chooses a default granularity based on the existing cluster size,
* but clamped between [4K, 64K]. Defaults to 64K in the case that there
* is no cluster size information available.
*/
uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
{
BlockDriverInfo bdi;
uint32_t granularity;
if (bdrv_get_info(bs, &bdi) >= 0 && bdi.cluster_size > 0) {
granularity = MAX(4096, bdi.cluster_size);
granularity = MIN(65536, granularity);
} else {
granularity = 65536;
}
return granularity;
}
uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
{
return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
}
void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
{
hbitmap_iter_init(hbi, bitmap->bitmap, 0);
}
void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int64_t nr_sectors)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int64_t nr_sectors)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
if (!out) {
hbitmap_reset_all(bitmap->bitmap);
} else {
HBitmap *backup = bitmap->bitmap;
bitmap->bitmap = hbitmap_alloc(bitmap->size,
hbitmap_granularity(backup));
*out = backup;
}
}
void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
{
HBitmap *tmp = bitmap->bitmap;
assert(bdrv_dirty_bitmap_enabled(bitmap));
bitmap->bitmap = in;
hbitmap_free(tmp);
}
void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
int64_t nr_sectors)
{
BdrvDirtyBitmap *bitmap;
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
if (!bdrv_dirty_bitmap_enabled(bitmap)) {
continue;
}
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
}
/**
* Advance an HBitmapIter to an arbitrary offset.
*/
void bdrv_set_dirty_iter(HBitmapIter *hbi, int64_t offset)
{
assert(hbi->hb);
hbitmap_iter_init(hbi, hbi->hb, offset);
}
int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
{
return hbitmap_count(bitmap->bitmap);
}


@@ -21,8 +21,6 @@
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/bswap.h"
@@ -32,6 +30,7 @@
#ifdef CONFIG_BZIP2
#include <bzlib.h>
#endif
+#include <glib.h>
enum {
    /* Limit chunk sizes to prevent unreasonable amounts of memory being used
@@ -153,9 +152,8 @@ static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
    }
}
-static int64_t dmg_find_koly_offset(BdrvChild *file, Error **errp)
+static int64_t dmg_find_koly_offset(BlockDriverState *file_bs, Error **errp)
{
-   BlockDriverState *file_bs = file->bs;
    int64_t length;
    int64_t offset = 0;
    uint8_t buffer[515];
@@ -179,7 +177,7 @@ static int64_t dmg_find_koly_offset(BdrvChild *file, Error **errp)
        offset = length - 511 - 512;
    }
    length = length < 515 ? length : 515;
-   ret = bdrv_pread(file, offset, buffer, length);
+   ret = bdrv_pread(file_bs, offset, buffer, length);
    if (ret < 0) {
        error_setg_errno(errp, -ret, "Failed while reading UDIF trailer");
        return ret;
@@ -439,8 +437,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
    int64_t offset;
    int ret;
-   bs->read_only = true;
+   bs->read_only = 1;
    s->n_chunks = 0;
    s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
    /* used by dmg_read_mish_block to keep track of the current I/O position */
@@ -517,9 +514,9 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
    }
    /* initialize zlib engine */
-   s->compressed_chunk = qemu_try_blockalign(bs->file->bs,
+   s->compressed_chunk = qemu_try_blockalign(bs->file,
                                              ds.max_compressed_size + 1);
-   s->uncompressed_chunk = qemu_try_blockalign(bs->file->bs,
+   s->uncompressed_chunk = qemu_try_blockalign(bs->file,
                                                512 * ds.max_sectors_per_chunk);
    if (s->compressed_chunk == NULL || s->uncompressed_chunk == NULL) {
        ret = -ENOMEM;
@@ -547,11 +544,6 @@ fail:
    return ret;
}
static void dmg_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
}
static inline int is_sector_in_chunk(BDRVDMGState* s,
                                     uint32_t chunk_num, uint64_t sector_num)
{
@@ -665,42 +657,38 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
    return 0;
}
-static int coroutine_fn
-dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
-              QEMUIOVector *qiov, int flags)
+static int dmg_read(BlockDriverState *bs, int64_t sector_num,
+                    uint8_t *buf, int nb_sectors)
{
    BDRVDMGState *s = bs->opaque;
-   uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
-   int nb_sectors = bytes >> BDRV_SECTOR_BITS;
-   int ret, i;
-   assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-   assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-   qemu_co_mutex_lock(&s->lock);
+   int i;
    for (i = 0; i < nb_sectors; i++) {
        uint32_t sector_offset_in_chunk;
-       void *data;
        if (dmg_read_chunk(bs, sector_num + i) != 0) {
-           ret = -EIO;
-           goto fail;
+           return -1;
        }
        /* Special case: current chunk is all zeroes. Do not perform a memcpy as
         * s->uncompressed_chunk may be too small to cover the large all-zeroes
         * section. dmg_read_chunk is called to find s->current_chunk */
        if (s->types[s->current_chunk] == 2) { /* all zeroes block entry */
-           qemu_iovec_memset(qiov, i * 512, 0, 512);
+           memset(buf + i * 512, 0, 512);
            continue;
        }
        sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
-       data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
-       qemu_iovec_from_buf(qiov, i * 512, data, 512);
+       memcpy(buf + i * 512,
+              s->uncompressed_chunk + sector_offset_in_chunk * 512, 512);
    }
+   return 0;
+}
-   ret = 0;
-fail:
+static coroutine_fn int dmg_co_read(BlockDriverState *bs, int64_t sector_num,
+                                    uint8_t *buf, int nb_sectors)
+{
+   int ret;
+   BDRVDMGState *s = bs->opaque;
+   qemu_co_mutex_lock(&s->lock);
+   ret = dmg_read(bs, sector_num, buf, nb_sectors);
    qemu_co_mutex_unlock(&s->lock);
    return ret;
}
@@ -725,8 +713,7 @@ static BlockDriver bdrv_dmg = {
    .instance_size  = sizeof(BDRVDMGState),
    .bdrv_probe     = dmg_probe,
    .bdrv_open      = dmg_open,
-   .bdrv_refresh_limits = dmg_refresh_limits,
-   .bdrv_co_preadv = dmg_co_preadv,
+   .bdrv_read      = dmg_co_read,
    .bdrv_close     = dmg_close,
};

[Three file diffs suppressed because they are too large; one of them, block/io.c, has 2070 changed lines.]

@@ -7,14 +7,11 @@
 * This work is licensed under the terms of the GNU GPL, version 2 or later.
 * See the COPYING file in the top-level directory.
 */
-#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/aio.h"
#include "qemu/queue.h"
-#include "block/block.h"
#include "block/raw-aio.h"
#include "qemu/event_notifier.h"
-#include "qemu/coroutine.h"
#include <libaio.h>
@@ -28,10 +25,11 @@
 */
#define MAX_EVENTS 128
+#define MAX_QUEUED_IO 128
struct qemu_laiocb {
    BlockAIOCB common;
-   Coroutine *co;
-   LinuxAioState *ctx;
+   struct qemu_laio_state *ctx;
    struct iocb iocb;
    ssize_t ret;
    size_t nbytes;
@@ -42,15 +40,12 @@ struct qemu_laiocb {
typedef struct {
    int plugged;
-   unsigned int in_queue;
-   unsigned int in_flight;
+   unsigned int n;
    bool blocked;
    QSIMPLEQ_HEAD(, qemu_laiocb) pending;
} LaioQueue;
-struct LinuxAioState {
-   AioContext *aio_context;
+struct qemu_laio_state {
    io_context_t ctx;
    EventNotifier e;
@@ -59,11 +54,12 @@ struct LinuxAioState {
    /* I/O completion processing */
    QEMUBH *completion_bh;
+   struct io_event events[MAX_EVENTS];
    int event_idx;
    int event_max;
};
-static void ioq_submit(LinuxAioState *s);
+static void ioq_submit(struct qemu_laio_state *s);
static inline ssize_t io_event_ret(struct io_event *ev)
{
@@ -73,7 +69,8 @@ static inline ssize_t io_event_ret(struct io_event *ev)
/*
 * Completes an AIO request (calls the callback and frees the ACB).
 */
-static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
+static void qemu_laio_process_completion(struct qemu_laio_state *s,
+                                         struct qemu_laiocb *laiocb)
{
    int ret;
@@ -87,168 +84,71 @@ static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
            qemu_iovec_memset(laiocb->qiov, ret, 0,
                              laiocb->qiov->size - ret);
        } else {
-           ret = -ENOSPC;
+           ret = -EINVAL;
        }
    }
}
-   laiocb->ret = ret;
-   if (laiocb->co) {
-       /* Jump and continue completion for foreign requests, don't do
-        * anything for current request, it will be completed shortly. */
-       if (laiocb->co != qemu_coroutine_self()) {
-           qemu_coroutine_enter(laiocb->co);
-       }
-   } else {
-       laiocb->common.cb(laiocb->common.opaque, ret);
-       qemu_aio_unref(laiocb);
-   }
+   laiocb->common.cb(laiocb->common.opaque, ret);
+   qemu_aio_unref(laiocb);
}
-/**
- * aio_ring buffer which is shared between userspace and kernel.
+/* The completion BH fetches completed I/O requests and invokes their
+ * callbacks.
 *
- * This copied from linux/fs/aio.c, common header does not exist
- * but AIO exists for ages so we assume ABI is stable.
- */
struct aio_ring {
unsigned id; /* kernel internal index number */
unsigned nr; /* number of io_events */
unsigned head; /* Written to by userland or by kernel. */
unsigned tail;
unsigned magic;
unsigned compat_features;
unsigned incompat_features;
unsigned header_length; /* size of aio_ring */
struct io_event io_events[0];
};
/**
* io_getevents_peek:
* @ctx: AIO context
* @events: pointer on events array, output value
* Returns the number of completed events and sets a pointer
* on events array. This function does not update the internal
* ring buffer, only reads head and tail. When @events has been
* processed io_getevents_commit() must be called.
*/
static inline unsigned int io_getevents_peek(io_context_t ctx,
struct io_event **events)
{
struct aio_ring *ring = (struct aio_ring *)ctx;
unsigned int head = ring->head, tail = ring->tail;
unsigned int nr;
nr = tail >= head ? tail - head : ring->nr - head;
*events = ring->io_events + head;
/* To avoid speculative loads of s->events[i] before observing tail.
Paired with smp_wmb() inside linux/fs/aio.c: aio_complete(). */
smp_rmb();
return nr;
}
/**
* io_getevents_commit:
* @ctx: AIO context
* @nr: the number of events on which head should be advanced
*
* Advances head of a ring buffer.
*/
static inline void io_getevents_commit(io_context_t ctx, unsigned int nr)
{
struct aio_ring *ring = (struct aio_ring *)ctx;
if (nr) {
ring->head = (ring->head + nr) % ring->nr;
}
}
/**
* io_getevents_advance_and_peek:
* @ctx: AIO context
* @events: pointer on events array, output value
* @nr: the number of events on which head should be advanced
*
* Advances head of a ring buffer and returns number of elements left.
*/
static inline unsigned int
io_getevents_advance_and_peek(io_context_t ctx,
struct io_event **events,
unsigned int nr)
{
io_getevents_commit(ctx, nr);
return io_getevents_peek(ctx, events);
}
-/**
- * qemu_laio_process_completions:
- * @s: AIO state
- *
- * Fetches completed I/O requests and invokes their callbacks.
 *
 * The function is somewhat tricky because it supports nested event loops, for
 * example when a request callback invokes aio_poll(). In order to do this,
- * indices are kept in LinuxAioState. Function schedules BH completion so it
- * can be called again in a nested event loop. When there are no events left
- * to complete the BH is being canceled.
+ * the completion events array and index are kept in qemu_laio_state. The BH
+ * reschedules itself as long as there are completions pending so it will
+ * either be called again in a nested event loop or will be called after all
+ * events have been completed. When there are no events left to complete, the
+ * BH returns without rescheduling.
 */
-static void qemu_laio_process_completions(LinuxAioState *s)
+static void qemu_laio_completion_bh(void *opaque)
{
-   struct io_event *events;
+   struct qemu_laio_state *s = opaque;
/* Fetch more completion events when empty */
if (s->event_idx == s->event_max) {
do {
struct timespec ts = { 0 };
s->event_max = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS,
s->events, &ts);
} while (s->event_max == -EINTR);
s->event_idx = 0;
if (s->event_max <= 0) {
s->event_max = 0;
return; /* no more events */
}
}
    /* Reschedule so nested event loops see currently pending completions */
    qemu_bh_schedule(s->completion_bh);
-   while ((s->event_max = io_getevents_advance_and_peek(s->ctx, &events,
-                                                        s->event_idx))) {
-       for (s->event_idx = 0; s->event_idx < s->event_max; ) {
-           struct iocb *iocb = events[s->event_idx].obj;
-           struct qemu_laiocb *laiocb =
-                   container_of(iocb, struct qemu_laiocb, iocb);
-           laiocb->ret = io_event_ret(&events[s->event_idx]);
-           /* Change counters one-by-one because we can be nested. */
-           s->io_q.in_flight--;
-           s->event_idx++;
-           qemu_laio_process_completion(laiocb);
-       }
+   /* Process completion events */
+   while (s->event_idx < s->event_max) {
+       struct iocb *iocb = s->events[s->event_idx].obj;
+       struct qemu_laiocb *laiocb =
+               container_of(iocb, struct qemu_laiocb, iocb);
+       laiocb->ret = io_event_ret(&s->events[s->event_idx]);
+       s->event_idx++;
+       qemu_laio_process_completion(s, laiocb);
    }
qemu_bh_cancel(s->completion_bh);
/* If we are nested we have to notify the level above that we are done
* by setting event_max to zero, upper level will then jump out of it's
* own `for` loop. If we are the last all counters droped to zero. */
s->event_max = 0;
s->event_idx = 0;
}
static void qemu_laio_process_completions_and_submit(LinuxAioState *s)
{
qemu_laio_process_completions(s);
    if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
        ioq_submit(s);
    }
}
static void qemu_laio_completion_bh(void *opaque)
{
LinuxAioState *s = opaque;
qemu_laio_process_completions_and_submit(s);
}
static void qemu_laio_completion_cb(EventNotifier *e)
{
-   LinuxAioState *s = container_of(e, LinuxAioState, e);
+   struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
    if (event_notifier_test_and_clear(&s->e)) {
-       qemu_laio_process_completions_and_submit(s);
+       qemu_bh_schedule(s->completion_bh);
    }
}
@@ -280,26 +180,22 @@ static void ioq_init(LaioQueue *io_q)
{
    QSIMPLEQ_INIT(&io_q->pending);
    io_q->plugged = 0;
-   io_q->in_queue = 0;
-   io_q->in_flight = 0;
+   io_q->n = 0;
    io_q->blocked = false;
}
-static void ioq_submit(LinuxAioState *s)
+static void ioq_submit(struct qemu_laio_state *s)
{
    int ret, len;
    struct qemu_laiocb *aiocb;
-   struct iocb *iocbs[MAX_EVENTS];
+   struct iocb *iocbs[MAX_QUEUED_IO];
    QSIMPLEQ_HEAD(, qemu_laiocb) completed;
    do {
if (s->io_q.in_flight >= MAX_EVENTS) {
break;
}
        len = 0;
        QSIMPLEQ_FOREACH(aiocb, &s->io_q.pending, next) {
            iocbs[len++] = &aiocb->iocb;
-           if (s->io_q.in_flight + len >= MAX_EVENTS) {
+           if (len == MAX_QUEUED_IO) {
                break;
            }
        }
@@ -309,56 +205,55 @@ static void ioq_submit(LinuxAioState *s)
            break;
        }
        if (ret < 0) {
-           /* Fail the first request, retry the rest */
+           abort();
aiocb = QSIMPLEQ_FIRST(&s->io_q.pending);
QSIMPLEQ_REMOVE_HEAD(&s->io_q.pending, next);
s->io_q.in_queue--;
aiocb->ret = ret;
qemu_laio_process_completion(aiocb);
continue;
        }
-       s->io_q.in_flight += ret;
-       s->io_q.in_queue -= ret;
+       s->io_q.n -= ret;
        aiocb = container_of(iocbs[ret - 1], struct qemu_laiocb, iocb);
        QSIMPLEQ_SPLIT_AFTER(&s->io_q.pending, aiocb, next, &completed);
    } while (ret == len && !QSIMPLEQ_EMPTY(&s->io_q.pending));
-   s->io_q.blocked = (s->io_q.in_queue > 0);
+   s->io_q.blocked = (s->io_q.n > 0);
if (s->io_q.in_flight) {
/* We can try to complete something just right away if there are
* still requests in-flight. */
qemu_laio_process_completions(s);
/*
* Even we have completed everything (in_flight == 0), the queue can
* have still pended requests (in_queue > 0). We do not attempt to
* repeat submission to avoid IO hang. The reason is simple: s->e is
* still set and completion callback will be called shortly and all
* pended requests will be submitted from there.
*/
}
}
-void laio_io_plug(BlockDriverState *bs, LinuxAioState *s)
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
{
+   struct qemu_laio_state *s = aio_ctx;
    s->io_q.plugged++;
}
-void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s)
+void laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
{
-   assert(s->io_q.plugged);
-   if (--s->io_q.plugged == 0 &&
-       !s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
+   struct qemu_laio_state *s = aio_ctx;
+   assert(s->io_q.plugged > 0 || !unplug);
+   if (unplug && --s->io_q.plugged > 0) {
+       return;
+   }
+   if (!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
        ioq_submit(s);
    }
}
-static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
-                          int type)
+BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
+                        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+                        BlockCompletionFunc *cb, void *opaque, int type)
{
-   LinuxAioState *s = laiocb->ctx;
-   struct iocb *iocbs = &laiocb->iocb;
-   QEMUIOVector *qiov = laiocb->qiov;
+   struct qemu_laio_state *s = aio_ctx;
+   struct qemu_laiocb *laiocb;
+   struct iocb *iocbs;
+   off_t offset = sector_num * 512;
+   laiocb = qemu_aio_get(&laio_aiocb_info, bs, cb, opaque);
+   laiocb->nbytes = nb_sectors * 512;
+   laiocb->ctx = s;
+   laiocb->ret = -EINPROGRESS;
+   laiocb->is_read = (type == QEMU_AIO_READ);
+   laiocb->qiov = qiov;
+   iocbs = &laiocb->iocb;
switch (type) { switch (type) {
case QEMU_AIO_WRITE: case QEMU_AIO_WRITE:
@@ -371,86 +266,42 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
default: default:
fprintf(stderr, "%s: invalid AIO request type 0x%x.\n", fprintf(stderr, "%s: invalid AIO request type 0x%x.\n",
__func__, type); __func__, type);
return -EIO; goto out_free_aiocb;
} }
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e)); io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
QSIMPLEQ_INSERT_TAIL(&s->io_q.pending, laiocb, next); QSIMPLEQ_INSERT_TAIL(&s->io_q.pending, laiocb, next);
s->io_q.in_queue++; s->io_q.n++;
if (!s->io_q.blocked && if (!s->io_q.blocked &&
(!s->io_q.plugged || (!s->io_q.plugged || s->io_q.n >= MAX_QUEUED_IO)) {
s->io_q.in_flight + s->io_q.in_queue >= MAX_EVENTS)) {
ioq_submit(s); ioq_submit(s);
} }
return 0;
}
int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
uint64_t offset, QEMUIOVector *qiov, int type)
{
int ret;
struct qemu_laiocb laiocb = {
.co = qemu_coroutine_self(),
.nbytes = qiov->size,
.ctx = s,
.ret = -EINPROGRESS,
.is_read = (type == QEMU_AIO_READ),
.qiov = qiov,
};
ret = laio_do_submit(fd, &laiocb, offset, type);
if (ret < 0) {
return ret;
}
if (laiocb.ret == -EINPROGRESS) {
qemu_coroutine_yield();
}
return laiocb.ret;
}
BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
{
struct qemu_laiocb *laiocb;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
int ret;
laiocb = qemu_aio_get(&laio_aiocb_info, bs, cb, opaque);
laiocb->nbytes = nb_sectors * BDRV_SECTOR_SIZE;
laiocb->ctx = s;
laiocb->ret = -EINPROGRESS;
laiocb->is_read = (type == QEMU_AIO_READ);
laiocb->qiov = qiov;
ret = laio_do_submit(fd, laiocb, offset, type);
if (ret < 0) {
qemu_aio_unref(laiocb);
return NULL;
}
return &laiocb->common; return &laiocb->common;
out_free_aiocb:
qemu_aio_unref(laiocb);
return NULL;
} }
void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context) void laio_detach_aio_context(void *s_, AioContext *old_context)
{ {
aio_set_event_notifier(old_context, &s->e, false, NULL); struct qemu_laio_state *s = s_;
aio_set_event_notifier(old_context, &s->e, NULL);
qemu_bh_delete(s->completion_bh); qemu_bh_delete(s->completion_bh);
} }
void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context) void laio_attach_aio_context(void *s_, AioContext *new_context)
{ {
s->aio_context = new_context; struct qemu_laio_state *s = s_;
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s); s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, false, aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
qemu_laio_completion_cb);
} }
LinuxAioState *laio_init(void) void *laio_init(void)
{ {
LinuxAioState *s; struct qemu_laio_state *s;
s = g_malloc0(sizeof(*s)); s = g_malloc0(sizeof(*s));
if (event_notifier_init(&s->e, false) < 0) { if (event_notifier_init(&s->e, false) < 0) {
@@ -472,8 +323,10 @@ out_free_state:
return NULL; return NULL;
} }
void laio_cleanup(LinuxAioState *s) void laio_cleanup(void *s_)
{ {
struct qemu_laio_state *s = s_;
event_notifier_cleanup(&s->e); event_notifier_cleanup(&s->e);
if (io_destroy(s->ctx) != 0) { if (io_destroy(s->ctx) != 0) {

File diff suppressed because it is too large.

@@ -26,8 +26,8 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include "nbd-client.h"
+#include "qemu/sockets.h"
 #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
 #define INDEX_TO_HANDLE(bs, index)  ((index)  ^ ((uint64_t)(intptr_t)bs))
@@ -38,7 +38,7 @@ static void nbd_recv_coroutines_enter_all(NbdClientSession *s)
     for (i = 0; i < MAX_NBD_REQUESTS; i++) {
         if (s->recv_coroutine[i]) {
-            qemu_coroutine_enter(s->recv_coroutine[i]);
+            qemu_coroutine_enter(s->recv_coroutine[i], NULL);
         }
     }
 }
@@ -47,21 +47,13 @@ static void nbd_teardown_connection(BlockDriverState *bs)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
-    if (!client->ioc) { /* Already closed */
-        return;
-    }
     /* finish any pending coroutines */
-    qio_channel_shutdown(client->ioc,
-                         QIO_CHANNEL_SHUTDOWN_BOTH,
-                         NULL);
+    shutdown(client->sock, 2);
     nbd_recv_coroutines_enter_all(client);
     nbd_client_detach_aio_context(bs);
-    object_unref(OBJECT(client->sioc));
-    client->sioc = NULL;
-    object_unref(OBJECT(client->ioc));
-    client->ioc = NULL;
+    closesocket(client->sock);
+    client->sock = -1;
 }
 static void nbd_reply_ready(void *opaque)
@@ -71,16 +63,12 @@ static void nbd_reply_ready(void *opaque)
     uint64_t i;
     int ret;
-    if (!s->ioc) { /* Already closed */
-        return;
-    }
     if (s->reply.handle == 0) {
         /* No reply already in flight.  Fetch a header.  It is possible
          * that another thread has done the same thing in parallel, so
          * the socket is not readable anymore.
          */
-        ret = nbd_receive_reply(s->ioc, &s->reply);
+        ret = nbd_receive_reply(s->sock, &s->reply);
         if (ret == -EAGAIN) {
             return;
         }
@@ -99,7 +87,7 @@ static void nbd_reply_ready(void *opaque)
     }
     if (s->recv_coroutine[i]) {
-        qemu_coroutine_enter(s->recv_coroutine[i]);
+        qemu_coroutine_enter(s->recv_coroutine[i], NULL);
         return;
     }
@@ -111,12 +99,12 @@ static void nbd_restart_write(void *opaque)
 {
     BlockDriverState *bs = opaque;
-    qemu_coroutine_enter(nbd_get_client_session(bs)->send_coroutine);
+    qemu_coroutine_enter(nbd_get_client_session(bs)->send_coroutine, NULL);
 }
 static int nbd_co_send_request(BlockDriverState *bs,
                                struct nbd_request *request,
-                               QEMUIOVector *qiov)
+                               QEMUIOVector *qiov, int offset)
 {
     NbdClientSession *s = nbd_get_client_session(bs);
     AioContext *aio_context;
@@ -131,45 +119,40 @@ static int nbd_co_send_request(BlockDriverState *bs,
         }
     }
-    g_assert(qemu_in_coroutine());
     assert(i < MAX_NBD_REQUESTS);
     request->handle = INDEX_TO_HANDLE(s, i);
-    if (!s->ioc) {
-        qemu_co_mutex_unlock(&s->send_mutex);
-        return -EPIPE;
-    }
     s->send_coroutine = qemu_coroutine_self();
     aio_context = bdrv_get_aio_context(bs);
-    aio_set_fd_handler(aio_context, s->sioc->fd, false,
+    aio_set_fd_handler(aio_context, s->sock,
                        nbd_reply_ready, nbd_restart_write, bs);
     if (qiov) {
-        qio_channel_set_cork(s->ioc, true);
-        rc = nbd_send_request(s->ioc, request);
+        if (!s->is_unix) {
+            socket_set_cork(s->sock, 1);
+        }
+        rc = nbd_send_request(s->sock, request);
         if (rc >= 0) {
-            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
-                               false);
+            ret = qemu_co_sendv(s->sock, qiov->iov, qiov->niov,
+                                offset, request->len);
             if (ret != request->len) {
                 rc = -EIO;
             }
         }
-        qio_channel_set_cork(s->ioc, false);
+        if (!s->is_unix) {
+            socket_set_cork(s->sock, 0);
+        }
     } else {
-        rc = nbd_send_request(s->ioc, request);
+        rc = nbd_send_request(s->sock, request);
     }
-    aio_set_fd_handler(aio_context, s->sioc->fd, false,
-                       nbd_reply_ready, NULL, bs);
+    aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, NULL, bs);
     s->send_coroutine = NULL;
     qemu_co_mutex_unlock(&s->send_mutex);
     return rc;
 }
 static void nbd_co_receive_reply(NbdClientSession *s,
-                                 struct nbd_request *request,
-                                 struct nbd_reply *reply,
-                                 QEMUIOVector *qiov)
+                                 struct nbd_request *request, struct nbd_reply *reply,
+                                 QEMUIOVector *qiov, int offset)
 {
     int ret;
@@ -177,13 +160,12 @@ static void nbd_co_receive_reply(NbdClientSession *s,
      * peek at the next reply and avoid yielding if it's ours?  */
     qemu_coroutine_yield();
     *reply = s->reply;
-    if (reply->handle != request->handle ||
-        !s->ioc) {
+    if (reply->handle != request->handle) {
         reply->error = EIO;
     } else {
         if (qiov && reply->error == 0) {
-            ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
-                               true);
+            ret = qemu_co_recvv(s->sock, qiov->iov, qiov->niov,
                                 offset, request->len);
             if (ret != request->len) {
                 reply->error = EIO;
             }
@@ -218,60 +200,94 @@ static void nbd_coroutine_end(NbdClientSession *s,
     }
 }
-int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
-                         uint64_t bytes, QEMUIOVector *qiov, int flags)
+static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
+                          int nb_sectors, QEMUIOVector *qiov,
+                          int offset)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
-    struct nbd_request request = {
-        .type = NBD_CMD_READ,
-        .from = offset,
-        .len = bytes,
-    };
+    struct nbd_request request = { .type = NBD_CMD_READ };
     struct nbd_reply reply;
     ssize_t ret;
-    assert(bytes <= NBD_MAX_BUFFER_SIZE);
-    assert(!flags);
+    request.from = sector_num * 512;
+    request.len = nb_sectors * 512;
     nbd_coroutine_start(client, &request);
-    ret = nbd_co_send_request(bs, &request, NULL);
+    ret = nbd_co_send_request(bs, &request, NULL, 0);
     if (ret < 0) {
         reply.error = -ret;
     } else {
-        nbd_co_receive_reply(client, &request, &reply, qiov);
+        nbd_co_receive_reply(client, &request, &reply, qiov, offset);
+    }
+    nbd_coroutine_end(client, &request);
+    return -reply.error;
+}
+static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
+                           int nb_sectors, QEMUIOVector *qiov,
+                           int offset)
+{
+    NbdClientSession *client = nbd_get_client_session(bs);
+    struct nbd_request request = { .type = NBD_CMD_WRITE };
+    struct nbd_reply reply;
+    ssize_t ret;
+    if (!bdrv_enable_write_cache(bs) &&
+        (client->nbdflags & NBD_FLAG_SEND_FUA)) {
+        request.type |= NBD_CMD_FLAG_FUA;
+    }
+    request.from = sector_num * 512;
+    request.len = nb_sectors * 512;
+    nbd_coroutine_start(client, &request);
+    ret = nbd_co_send_request(bs, &request, qiov, offset);
+    if (ret < 0) {
+        reply.error = -ret;
+    } else {
+        nbd_co_receive_reply(client, &request, &reply, NULL, 0);
     }
     nbd_coroutine_end(client, &request);
     return -reply.error;
 }
-int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
-                          uint64_t bytes, QEMUIOVector *qiov, int flags)
-{
-    NbdClientSession *client = nbd_get_client_session(bs);
-    struct nbd_request request = {
-        .type = NBD_CMD_WRITE,
-        .from = offset,
-        .len = bytes,
-    };
-    struct nbd_reply reply;
-    ssize_t ret;
-    if (flags & BDRV_REQ_FUA) {
-        assert(client->nbdflags & NBD_FLAG_SEND_FUA);
-        request.type |= NBD_CMD_FLAG_FUA;
-    }
-    assert(bytes <= NBD_MAX_BUFFER_SIZE);
-    nbd_coroutine_start(client, &request);
-    ret = nbd_co_send_request(bs, &request, qiov);
-    if (ret < 0) {
-        reply.error = -ret;
-    } else {
-        nbd_co_receive_reply(client, &request, &reply, NULL);
-    }
-    nbd_coroutine_end(client, &request);
-    return -reply.error;
-}
+/* qemu-nbd has a limit of slightly less than 1M per request.  Try to
+ * remain aligned to 4K. */
+#define NBD_MAX_SECTORS 2040
+int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
+                        int nb_sectors, QEMUIOVector *qiov)
+{
+    int offset = 0;
+    int ret;
+    while (nb_sectors > NBD_MAX_SECTORS) {
+        ret = nbd_co_readv_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
+        if (ret < 0) {
+            return ret;
+        }
+        offset += NBD_MAX_SECTORS * 512;
+        sector_num += NBD_MAX_SECTORS;
+        nb_sectors -= NBD_MAX_SECTORS;
+    }
+    return nbd_co_readv_1(bs, sector_num, nb_sectors, qiov, offset);
+}
+int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
+                         int nb_sectors, QEMUIOVector *qiov)
+{
+    int offset = 0;
+    int ret;
+    while (nb_sectors > NBD_MAX_SECTORS) {
+        ret = nbd_co_writev_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
+        if (ret < 0) {
+            return ret;
+        }
+        offset += NBD_MAX_SECTORS * 512;
+        sector_num += NBD_MAX_SECTORS;
+        nb_sectors -= NBD_MAX_SECTORS;
+    }
+    return nbd_co_writev_1(bs, sector_num, nb_sectors, qiov, offset);
+}
 int nbd_client_co_flush(BlockDriverState *bs)
@@ -285,41 +301,44 @@ int nbd_client_co_flush(BlockDriverState *bs)
         return 0;
     }
-    if (client->nbdflags & NBD_FLAG_SEND_FUA) {
-        request.type |= NBD_CMD_FLAG_FUA;
-    }
     request.from = 0;
     request.len = 0;
     nbd_coroutine_start(client, &request);
-    ret = nbd_co_send_request(bs, &request, NULL);
+    ret = nbd_co_send_request(bs, &request, NULL, 0);
     if (ret < 0) {
         reply.error = -ret;
     } else {
-        nbd_co_receive_reply(client, &request, &reply, NULL);
+        nbd_co_receive_reply(client, &request, &reply, NULL, 0);
     }
     nbd_coroutine_end(client, &request);
     return -reply.error;
 }
-int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
+int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
+                          int nb_sectors)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
-    struct nbd_request request = {
-        .type = NBD_CMD_TRIM,
-        .from = offset,
-        .len = count,
-    };
+    struct nbd_request request = { .type = NBD_CMD_TRIM };
     struct nbd_reply reply;
     ssize_t ret;
     if (!(client->nbdflags & NBD_FLAG_SEND_TRIM)) {
         return 0;
     }
+    request.from = sector_num * 512;
+    request.len = nb_sectors * 512;
     nbd_coroutine_start(client, &request);
-    ret = nbd_co_send_request(bs, &request, NULL);
+    ret = nbd_co_send_request(bs, &request, NULL, 0);
     if (ret < 0) {
         reply.error = -ret;
     } else {
-        nbd_co_receive_reply(client, &request, &reply, NULL);
+        nbd_co_receive_reply(client, &request, &reply, NULL, 0);
     }
     nbd_coroutine_end(client, &request);
     return -reply.error;
@@ -329,15 +348,14 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
 void nbd_client_detach_aio_context(BlockDriverState *bs)
 {
     aio_set_fd_handler(bdrv_get_aio_context(bs),
-                       nbd_get_client_session(bs)->sioc->fd,
-                       false, NULL, NULL, NULL);
+                       nbd_get_client_session(bs)->sock, NULL, NULL, NULL);
 }
 void nbd_client_attach_aio_context(BlockDriverState *bs,
                                    AioContext *new_context)
 {
-    aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sioc->fd,
-                       false, nbd_reply_ready, NULL, bs);
+    aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sock,
+                       nbd_reply_ready, NULL, bs);
 }
 void nbd_client_close(BlockDriverState *bs)
@@ -349,20 +367,16 @@ void nbd_client_close(BlockDriverState *bs)
         .len = 0
     };
-    if (client->ioc == NULL) {
+    if (client->sock == -1) {
         return;
     }
-    nbd_send_request(client->ioc, &request);
+    nbd_send_request(client->sock, &request);
     nbd_teardown_connection(bs);
 }
-int nbd_client_init(BlockDriverState *bs,
-                    QIOChannelSocket *sioc,
-                    const char *export,
-                    QCryptoTLSCreds *tlscreds,
-                    const char *hostname,
-                    Error **errp)
+int nbd_client_init(BlockDriverState *bs, int sock, const char *export,
+                    Error **errp)
 {
     NbdClientSession *client = nbd_get_client_session(bs);
@@ -370,35 +384,22 @@ int nbd_client_init(BlockDriverState *bs,
     /* NBD handshake */
     logout("session init %s\n", export);
-    qio_channel_set_blocking(QIO_CHANNEL(sioc), true, NULL);
-    ret = nbd_receive_negotiate(QIO_CHANNEL(sioc), export,
-                                &client->nbdflags,
-                                tlscreds, hostname,
-                                &client->ioc,
-                                &client->size, errp);
+    qemu_set_block(sock);
+    ret = nbd_receive_negotiate(sock, export,
+                                &client->nbdflags, &client->size, errp);
     if (ret < 0) {
         logout("Failed to negotiate with the NBD server\n");
+        closesocket(sock);
         return ret;
     }
-    if (client->nbdflags & NBD_FLAG_SEND_FUA) {
-        bs->supported_write_flags = BDRV_REQ_FUA;
-    }
     qemu_co_mutex_init(&client->send_mutex);
     qemu_co_mutex_init(&client->free_sema);
-    client->sioc = sioc;
-    object_ref(OBJECT(client->sioc));
-    if (!client->ioc) {
-        client->ioc = QIO_CHANNEL(sioc);
-        object_ref(OBJECT(client->ioc));
-    }
+    client->sock = sock;
     /* Now that we're connected, set the socket to be non-blocking and
      * kick the reply mechanism.  */
-    qio_channel_set_blocking(QIO_CHANNEL(sioc), false, NULL);
+    qemu_set_nonblock(sock);
     nbd_client_attach_aio_context(bs, bdrv_get_aio_context(bs));
     logout("Established connection with NBD server\n");


@@ -4,7 +4,6 @@
 #include "qemu-common.h"
 #include "block/nbd.h"
 #include "block/block_int.h"
-#include "io/channel-socket.h"
 /* #define DEBUG_NBD */
@@ -18,9 +17,8 @@
 #define MAX_NBD_REQUESTS    16
 typedef struct NbdClientSession {
-    QIOChannelSocket *sioc; /* The master data channel */
-    QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
-    uint16_t nbdflags;
+    int sock;
+    uint32_t nbdflags;
     off_t size;
     CoMutex send_mutex;
@@ -36,20 +34,17 @@ typedef struct NbdClientSession {
 NbdClientSession *nbd_get_client_session(BlockDriverState *bs);
-int nbd_client_init(BlockDriverState *bs,
-                    QIOChannelSocket *sock,
-                    const char *export_name,
-                    QCryptoTLSCreds *tlscreds,
-                    const char *hostname,
-                    Error **errp);
+int nbd_client_init(BlockDriverState *bs, int sock, const char *export_name,
+                    Error **errp);
 void nbd_client_close(BlockDriverState *bs);
-int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count);
+int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
+                          int nb_sectors);
 int nbd_client_co_flush(BlockDriverState *bs);
-int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
-                          uint64_t bytes, QEMUIOVector *qiov, int flags);
-int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
-                         uint64_t bytes, QEMUIOVector *qiov, int flags);
+int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
+                         int nb_sectors, QEMUIOVector *qiov);
+int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
+                        int nb_sectors, QEMUIOVector *qiov);
 void nbd_client_detach_aio_context(BlockDriverState *bs);
 void nbd_client_attach_aio_context(BlockDriverState *bs,


@@ -26,25 +26,24 @@
* THE SOFTWARE. * THE SOFTWARE.
*/ */
#include "qemu/osdep.h"
#include "block/nbd-client.h" #include "block/nbd-client.h"
#include "qapi/error.h"
#include "qemu/uri.h" #include "qemu/uri.h"
#include "block/block_int.h" #include "block/block_int.h"
#include "qemu/module.h" #include "qemu/module.h"
#include "qemu/sockets.h"
#include "qapi/qmp/qdict.h" #include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h" #include "qapi/qmp/qjson.h"
#include "qapi/qmp/qint.h" #include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h" #include "qapi/qmp/qstring.h"
#include "qemu/cutils.h"
#include <sys/types.h>
#include <unistd.h>
#define EN_OPTSTR ":exportname=" #define EN_OPTSTR ":exportname="
typedef struct BDRVNBDState { typedef struct BDRVNBDState {
NbdClientSession client; NbdClientSession client;
QemuOpts *socket_opts;
/* For nbd_refresh_filename() */
char *path, *host, *port, *export, *tlscredsid;
} BDRVNBDState; } BDRVNBDState;
static int nbd_parse_uri(const char *filename, QDict *options) static int nbd_parse_uri(const char *filename, QDict *options)
@@ -191,48 +190,39 @@ out:
g_free(file); g_free(file);
} }
static SocketAddress *nbd_config(BDRVNBDState *s, QemuOpts *opts, Error **errp) static void nbd_config(BDRVNBDState *s, QDict *options, char **export,
Error **errp)
{ {
SocketAddress *saddr; Error *local_err = NULL;
s->path = g_strdup(qemu_opt_get(opts, "path")); if (qdict_haskey(options, "path") == qdict_haskey(options, "host")) {
s->host = g_strdup(qemu_opt_get(opts, "host")); if (qdict_haskey(options, "path")) {
if (!s->path == !s->host) {
if (s->path) {
error_setg(errp, "path and host may not be used at the same time."); error_setg(errp, "path and host may not be used at the same time.");
} else { } else {
error_setg(errp, "one of path and host must be specified."); error_setg(errp, "one of path and host must be specified.");
} }
return NULL; return;
} }
saddr = g_new0(SocketAddress, 1); s->client.is_unix = qdict_haskey(options, "path");
s->socket_opts = qemu_opts_create(&socket_optslist, NULL, 0,
&error_abort);
if (s->path) { qemu_opts_absorb_qdict(s->socket_opts, options, &local_err);
UnixSocketAddress *q_unix; if (local_err) {
saddr->type = SOCKET_ADDRESS_KIND_UNIX; error_propagate(errp, local_err);
q_unix = saddr->u.q_unix.data = g_new0(UnixSocketAddress, 1); return;
q_unix->path = g_strdup(s->path);
} else {
InetSocketAddress *inet;
s->port = g_strdup(qemu_opt_get(opts, "port"));
saddr->type = SOCKET_ADDRESS_KIND_INET;
inet = saddr->u.inet.data = g_new0(InetSocketAddress, 1);
inet->host = g_strdup(s->host);
inet->port = g_strdup(s->port);
if (!inet->port) {
inet->port = g_strdup_printf("%d", NBD_DEFAULT_PORT);
}
} }
s->client.is_unix = saddr->type == SOCKET_ADDRESS_KIND_UNIX; if (!qemu_opt_get(s->socket_opts, "port")) {
qemu_opt_set_number(s->socket_opts, "port", NBD_DEFAULT_PORT,
&error_abort);
}
s->export = g_strdup(qemu_opt_get(opts, "export")); *export = g_strdup(qdict_get_try_str(options, "export"));
if (*export) {
return saddr; qdict_del(options, "export");
}
} }
NbdClientSession *nbd_get_client_session(BlockDriverState *bs) NbdClientSession *nbd_get_client_session(BlockDriverState *bs)
@@ -241,158 +231,69 @@ NbdClientSession *nbd_get_client_session(BlockDriverState *bs)
return &s->client; return &s->client;
} }
static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr, static int nbd_establish_connection(BlockDriverState *bs, Error **errp)
Error **errp)
{ {
QIOChannelSocket *sioc; BDRVNBDState *s = bs->opaque;
Error *local_err = NULL; int sock;
sioc = qio_channel_socket_new(); if (s->client.is_unix) {
sock = unix_connect_opts(s->socket_opts, errp, NULL, NULL);
qio_channel_socket_connect_sync(sioc, } else {
saddr, sock = inet_connect_opts(s->socket_opts, errp, NULL, NULL);
&local_err); if (sock >= 0) {
if (local_err) { socket_set_nodelay(sock);
error_propagate(errp, local_err); }
return NULL;
} }
qio_channel_set_delay(QIO_CHANNEL(sioc), false); /* Failed to establish connection */
if (sock < 0) {
logout("Failed to establish connection to NBD server\n");
return -EIO;
}
return sioc; return sock;
} }
static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, Error **errp)
{
Object *obj;
QCryptoTLSCreds *creds;
obj = object_resolve_path_component(
object_get_objects_root(), id);
if (!obj) {
error_setg(errp, "No TLS credentials with id '%s'",
id);
return NULL;
}
creds = (QCryptoTLSCreds *)
object_dynamic_cast(obj, TYPE_QCRYPTO_TLS_CREDS);
if (!creds) {
error_setg(errp, "Object with id '%s' is not TLS credentials",
id);
return NULL;
}
if (creds->endpoint != QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT) {
error_setg(errp,
"Expecting TLS credentials with a client endpoint");
return NULL;
}
object_ref(obj);
return creds;
}
static QemuOptsList nbd_runtime_opts = {
.name = "nbd",
.head = QTAILQ_HEAD_INITIALIZER(nbd_runtime_opts.head),
.desc = {
{
.name = "host",
.type = QEMU_OPT_STRING,
.help = "TCP host to connect to",
},
{
.name = "port",
.type = QEMU_OPT_STRING,
.help = "TCP port to connect to",
},
{
.name = "path",
.type = QEMU_OPT_STRING,
.help = "Unix socket path to connect to",
},
{
.name = "export",
.type = QEMU_OPT_STRING,
.help = "Name of the NBD export to open",
},
{
.name = "tls-creds",
.type = QEMU_OPT_STRING,
.help = "ID of the TLS credentials to use",
},
},
};
static int nbd_open(BlockDriverState *bs, QDict *options, int flags, static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp) Error **errp)
{ {
BDRVNBDState *s = bs->opaque; BDRVNBDState *s = bs->opaque;
QemuOpts *opts = NULL; char *export = NULL;
int result, sock;
Error *local_err = NULL; Error *local_err = NULL;
QIOChannelSocket *sioc = NULL;
SocketAddress *saddr = NULL;
QCryptoTLSCreds *tlscreds = NULL;
const char *hostname = NULL;
int ret = -EINVAL;
opts = qemu_opts_create(&nbd_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto error;
}
/* Pop the config into our state object. Exit if invalid. */ /* Pop the config into our state object. Exit if invalid. */
saddr = nbd_config(s, opts, errp); nbd_config(s, options, &export, &local_err);
if (!saddr) { if (local_err) {
goto error; error_propagate(errp, local_err);
} return -EINVAL;
s->tlscredsid = g_strdup(qemu_opt_get(opts, "tls-creds"));
if (s->tlscredsid) {
tlscreds = nbd_get_tls_creds(s->tlscredsid, errp);
if (!tlscreds) {
goto error;
}
if (saddr->type != SOCKET_ADDRESS_KIND_INET) {
error_setg(errp, "TLS only supported over IP sockets");
goto error;
}
hostname = saddr->u.inet.data->host;
} }
/* establish TCP connection, return error if it fails /* establish TCP connection, return error if it fails
* TODO: Configurable retry-until-timeout behaviour. * TODO: Configurable retry-until-timeout behaviour.
*/ */
sioc = nbd_establish_connection(saddr, errp); sock = nbd_establish_connection(bs, errp);
if (!sioc) { if (sock < 0) {
ret = -ECONNREFUSED; g_free(export);
goto error; return sock;
} }
/* NBD handshake */ /* NBD handshake */
ret = nbd_client_init(bs, sioc, s->export, result = nbd_client_init(bs, sock, export, errp);
tlscreds, hostname, errp); g_free(export);
error: return result;
if (sioc) { }
object_unref(OBJECT(sioc));
} static int nbd_co_readv(BlockDriverState *bs, int64_t sector_num,
if (tlscreds) { int nb_sectors, QEMUIOVector *qiov)
object_unref(OBJECT(tlscreds)); {
} return nbd_client_co_readv(bs, sector_num, nb_sectors, qiov);
if (ret < 0) { }
g_free(s->path);
g_free(s->host); static int nbd_co_writev(BlockDriverState *bs, int64_t sector_num,
g_free(s->port); int nb_sectors, QEMUIOVector *qiov)
g_free(s->export); {
g_free(s->tlscredsid); return nbd_client_co_writev(bs, sector_num, nb_sectors, qiov);
}
qapi_free_SocketAddress(saddr);
qemu_opts_del(opts);
return ret;
} }
static int nbd_co_flush(BlockDriverState *bs) static int nbd_co_flush(BlockDriverState *bs)
@@ -402,21 +303,22 @@ static int nbd_co_flush(BlockDriverState *bs)
static void nbd_refresh_limits(BlockDriverState *bs, Error **errp) static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
{ {
bs->bl.max_pdiscard = NBD_MAX_BUFFER_SIZE; bs->bl.max_discard = UINT32_MAX >> BDRV_SECTOR_BITS;
bs->bl.max_transfer = NBD_MAX_BUFFER_SIZE; bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS;
}
static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
return nbd_client_co_discard(bs, sector_num, nb_sectors);
} }
static void nbd_close(BlockDriverState *bs) static void nbd_close(BlockDriverState *bs)
{ {
BDRVNBDState *s = bs->opaque; BDRVNBDState *s = bs->opaque;
qemu_opts_del(s->socket_opts);
nbd_client_close(bs); nbd_client_close(bs);
g_free(s->path);
g_free(s->host);
g_free(s->port);
g_free(s->export);
g_free(s->tlscredsid);
} }
static int64_t nbd_getlength(BlockDriverState *bs) static int64_t nbd_getlength(BlockDriverState *bs)
@@ -437,47 +339,46 @@ static void nbd_attach_aio_context(BlockDriverState *bs,
nbd_client_attach_aio_context(bs, new_context); nbd_client_attach_aio_context(bs, new_context);
} }
static void nbd_refresh_filename(BlockDriverState *bs, QDict *options) static void nbd_refresh_filename(BlockDriverState *bs)
{ {
BDRVNBDState *s = bs->opaque;
QDict *opts = qdict_new(); QDict *opts = qdict_new();
const char *path = qdict_get_try_str(bs->options, "path");
const char *host = qdict_get_try_str(bs->options, "host");
+    const char *port = qdict_get_try_str(bs->options, "port");
+    const char *export = qdict_get_try_str(bs->options, "export");
     qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));
-    if (s->path && s->export) {
+    if (path && export) {
         snprintf(bs->exact_filename, sizeof(bs->exact_filename),
-                 "nbd+unix:///%s?socket=%s", s->export, s->path);
-    } else if (s->path && !s->export) {
+                 "nbd+unix:///%s?socket=%s", export, path);
+    } else if (path && !export) {
         snprintf(bs->exact_filename, sizeof(bs->exact_filename),
-                 "nbd+unix://?socket=%s", s->path);
-    } else if (!s->path && s->export && s->port) {
+                 "nbd+unix://?socket=%s", path);
+    } else if (!path && export && port) {
         snprintf(bs->exact_filename, sizeof(bs->exact_filename),
-                 "nbd://%s:%s/%s", s->host, s->port, s->export);
-    } else if (!s->path && s->export && !s->port) {
+                 "nbd://%s:%s/%s", host, port, export);
+    } else if (!path && export && !port) {
         snprintf(bs->exact_filename, sizeof(bs->exact_filename),
-                 "nbd://%s/%s", s->host, s->export);
-    } else if (!s->path && !s->export && s->port) {
+                 "nbd://%s/%s", host, export);
+    } else if (!path && !export && port) {
         snprintf(bs->exact_filename, sizeof(bs->exact_filename),
-                 "nbd://%s:%s", s->host, s->port);
-    } else if (!s->path && !s->export && !s->port) {
+                 "nbd://%s:%s", host, port);
+    } else if (!path && !export && !port) {
         snprintf(bs->exact_filename, sizeof(bs->exact_filename),
-                 "nbd://%s", s->host);
+                 "nbd://%s", host);
     }
-    if (s->path) {
-        qdict_put_obj(opts, "path", QOBJECT(qstring_from_str(s->path)));
-    } else if (s->port) {
-        qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(s->host)));
-        qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(s->port)));
+    if (path) {
+        qdict_put_obj(opts, "path", QOBJECT(qstring_from_str(path)));
+    } else if (port) {
+        qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
+        qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
     } else {
-        qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(s->host)));
+        qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
     }
-    if (s->export) {
-        qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(s->export)));
-    }
-    if (s->tlscredsid) {
-        qdict_put_obj(opts, "tls-creds",
-                      QOBJECT(qstring_from_str(s->tlscredsid)));
+    if (export) {
+        qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(export)));
     }
     bs->full_open_options = opts;
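The if/else ladder in the hunk above rebuilds `bs->exact_filename` from the socket path, export name, host and port. As a standalone sketch of that logic (the helper name `nbd_build_filename` and its plain `const char *` parameters are ours for illustration, not QEMU's API):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical standalone version of the exact_filename reconstruction
 * shown above: a non-NULL path selects the nbd+unix form, otherwise
 * host/port/export pick one of the nbd:// TCP forms. */
void nbd_build_filename(char *buf, size_t len,
                        const char *path, const char *export,
                        const char *host, const char *port)
{
    if (path && export) {
        snprintf(buf, len, "nbd+unix:///%s?socket=%s", export, path);
    } else if (path) {
        snprintf(buf, len, "nbd+unix://?socket=%s", path);
    } else if (export && port) {
        snprintf(buf, len, "nbd://%s:%s/%s", host, port, export);
    } else if (export) {
        snprintf(buf, len, "nbd://%s/%s", host, export);
    } else if (port) {
        snprintf(buf, len, "nbd://%s:%s", host, port);
    } else {
        snprintf(buf, len, "nbd://%s", host);
    }
}
```

The branch order matters: a UNIX socket path takes priority over any host/port, mirroring the diff.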
@@ -489,11 +390,11 @@ static BlockDriver bdrv_nbd = {
     .instance_size              = sizeof(BDRVNBDState),
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
-    .bdrv_co_preadv             = nbd_client_co_preadv,
-    .bdrv_co_pwritev            = nbd_client_co_pwritev,
+    .bdrv_co_readv              = nbd_co_readv,
+    .bdrv_co_writev             = nbd_co_writev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
-    .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
+    .bdrv_co_discard            = nbd_co_discard,
     .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
@@ -507,11 +408,11 @@ static BlockDriver bdrv_nbd_tcp = {
     .instance_size              = sizeof(BDRVNBDState),
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
-    .bdrv_co_preadv             = nbd_client_co_preadv,
-    .bdrv_co_pwritev            = nbd_client_co_pwritev,
+    .bdrv_co_readv              = nbd_co_readv,
+    .bdrv_co_writev             = nbd_co_writev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
-    .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
+    .bdrv_co_discard            = nbd_co_discard,
     .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
@@ -525,11 +426,11 @@ static BlockDriver bdrv_nbd_unix = {
     .instance_size              = sizeof(BDRVNBDState),
     .bdrv_parse_filename        = nbd_parse_filename,
     .bdrv_file_open             = nbd_open,
-    .bdrv_co_preadv             = nbd_client_co_preadv,
-    .bdrv_co_pwritev            = nbd_client_co_pwritev,
+    .bdrv_co_readv              = nbd_co_readv,
+    .bdrv_co_writev             = nbd_co_writev,
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
-    .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
+    .bdrv_co_discard            = nbd_co_discard,
     .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_detach_aio_context,


@@ -1,7 +1,7 @@
 /*
  * QEMU Block driver for native access to files on NFS shares
  *
- * Copyright (c) 2014-2016 Peter Lieven <pl@kamp.de>
+ * Copyright (c) 2014 Peter Lieven <pl@kamp.de>
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
@@ -22,24 +22,20 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
+#include "config-host.h"
 #include <poll.h>
 #include "qemu-common.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
-#include "qapi/error.h"
 #include "block/block_int.h"
 #include "trace.h"
 #include "qemu/iov.h"
 #include "qemu/uri.h"
-#include "qemu/cutils.h"
 #include "sysemu/sysemu.h"
 #include <nfsc/libnfs.h>
 #define QEMU_NFS_MAX_READAHEAD_SIZE 1048576
-#define QEMU_NFS_MAX_PAGECACHE_SIZE (8388608 / NFS_BLKSIZE)
-#define QEMU_NFS_MAX_DEBUG_LEVEL 2
 typedef struct NFSClient {
     struct nfs_context *context;
@@ -47,8 +43,6 @@ typedef struct NFSClient {
     int events;
     bool has_zero_init;
     AioContext *aio_context;
-    blkcnt_t st_blocks;
-    bool cache_used;
 } NFSClient;
 typedef struct NFSRPC {
@@ -68,10 +62,11 @@ static void nfs_set_events(NFSClient *client)
 {
     int ev = nfs_which_events(client->context);
     if (ev != client->events) {
-        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
-                           false,
-                           (ev & POLLIN) ? nfs_process_read : NULL,
-                           (ev & POLLOUT) ? nfs_process_write : NULL, client);
+        aio_set_fd_handler(client->aio_context,
+                           nfs_get_fd(client->context),
+                           (ev & POLLIN) ? nfs_process_read : NULL,
+                           (ev & POLLOUT) ? nfs_process_write : NULL,
+                           client);
     }
     client->events = ev;
@@ -104,7 +99,7 @@ static void nfs_co_generic_bh_cb(void *opaque)
     NFSRPC *task = opaque;
     task->complete = 1;
     qemu_bh_delete(task->bh);
-    qemu_coroutine_enter(task->co);
+    qemu_coroutine_enter(task->co, NULL);
 }
 static void
@@ -246,8 +241,9 @@ static void nfs_detach_aio_context(BlockDriverState *bs)
 {
     NFSClient *client = bs->opaque;
-    aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
-                       false, NULL, NULL, NULL);
+    aio_set_fd_handler(client->aio_context,
+                       nfs_get_fd(client->context),
+                       NULL, NULL, NULL);
     client->events = 0;
 }
@@ -266,8 +262,9 @@ static void nfs_client_close(NFSClient *client)
         if (client->fh) {
             nfs_close(client->context, client->fh);
         }
-        aio_set_fd_handler(client->aio_context, nfs_get_fd(client->context),
-                           false, NULL, NULL, NULL);
+        aio_set_fd_handler(client->aio_context,
+                           nfs_get_fd(client->context),
+                           NULL, NULL, NULL);
         nfs_destroy_context(client->context);
     }
     memset(client, 0, sizeof(NFSClient));
@@ -280,7 +277,7 @@ static void nfs_file_close(BlockDriverState *bs)
 }
 static int64_t nfs_client_open(NFSClient *client, const char *filename,
-                               int flags, Error **errp, int open_flags)
+                               int flags, Error **errp)
 {
     int ret = -EINVAL, i;
     struct stat st;
@@ -332,49 +329,12 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename,
             nfs_set_tcp_syncnt(client->context, val);
 #ifdef LIBNFS_FEATURE_READAHEAD
         } else if (!strcmp(qp->p[i].name, "readahead")) {
-            if (open_flags & BDRV_O_NOCACHE) {
-                error_setg(errp, "Cannot enable NFS readahead "
-                                 "if cache.direct = on");
-                goto fail;
-            }
             if (val > QEMU_NFS_MAX_READAHEAD_SIZE) {
                 error_report("NFS Warning: Truncating NFS readahead"
                              " size to %d", QEMU_NFS_MAX_READAHEAD_SIZE);
                 val = QEMU_NFS_MAX_READAHEAD_SIZE;
             }
             nfs_set_readahead(client->context, val);
-#ifdef LIBNFS_FEATURE_PAGECACHE
-            nfs_set_pagecache_ttl(client->context, 0);
-#endif
-            client->cache_used = true;
-#endif
-#ifdef LIBNFS_FEATURE_PAGECACHE
-        } else if (!strcmp(qp->p[i].name, "pagecache")) {
-            if (open_flags & BDRV_O_NOCACHE) {
-                error_setg(errp, "Cannot enable NFS pagecache "
-                                 "if cache.direct = on");
-                goto fail;
-            }
-            if (val > QEMU_NFS_MAX_PAGECACHE_SIZE) {
-                error_report("NFS Warning: Truncating NFS pagecache"
-                             " size to %d pages", QEMU_NFS_MAX_PAGECACHE_SIZE);
-                val = QEMU_NFS_MAX_PAGECACHE_SIZE;
-            }
-            nfs_set_pagecache(client->context, val);
-            nfs_set_pagecache_ttl(client->context, 0);
-            client->cache_used = true;
-#endif
-#ifdef LIBNFS_FEATURE_DEBUG
-        } else if (!strcmp(qp->p[i].name, "debug")) {
-            /* limit the maximum debug level to avoid potential flooding
-             * of our log files. */
-            if (val > QEMU_NFS_MAX_DEBUG_LEVEL) {
-                error_report("NFS Warning: Limiting NFS debug level"
-                             " to %d", QEMU_NFS_MAX_DEBUG_LEVEL);
-                val = QEMU_NFS_MAX_DEBUG_LEVEL;
-            }
-            nfs_set_debug(client->context, val);
 #endif
         } else {
             error_setg(errp, "Unknown NFS parameter name: %s",
@@ -414,7 +374,6 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename,
     }
     ret = DIV_ROUND_UP(st.st_size, BDRV_SECTOR_SIZE);
-    client->st_blocks = st.st_blocks;
     client->has_zero_init = S_ISREG(st.st_mode);
     goto out;
 fail:
@@ -446,7 +405,7 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
     }
     ret = nfs_client_open(client, qemu_opt_get(opts, "filename"),
                           (flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
-                          errp, bs->open_flags);
+                          errp);
     if (ret < 0) {
         goto out;
     }
@@ -482,7 +441,7 @@ static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
     total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
                           BDRV_SECTOR_SIZE);
-    ret = nfs_client_open(client, url, O_CREAT, errp, 0);
+    ret = nfs_client_open(client, url, O_CREAT, errp);
     if (ret < 0) {
         goto out;
     }
@@ -505,11 +464,6 @@ static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
     NFSRPC task = {0};
     struct stat st;
-    if (bdrv_is_read_only(bs) &&
-        !(bs->open_flags & BDRV_O_NOCACHE)) {
-        return client->st_blocks * 512;
-    }
     task.st = &st;
     if (nfs_fstat_async(client->context, client->fh, nfs_co_generic_cb,
                         &task) != 0) {
@@ -530,49 +484,6 @@ static int nfs_file_truncate(BlockDriverState *bs, int64_t offset)
     return nfs_ftruncate(client->context, client->fh, offset);
 }
-/* Note that this will not re-establish a connection with the NFS server
- * - it is effectively a NOP. */
-static int nfs_reopen_prepare(BDRVReopenState *state,
-                              BlockReopenQueue *queue, Error **errp)
-{
-    NFSClient *client = state->bs->opaque;
-    struct stat st;
-    int ret = 0;
-    if (state->flags & BDRV_O_RDWR && bdrv_is_read_only(state->bs)) {
-        error_setg(errp, "Cannot open a read-only mount as read-write");
-        return -EACCES;
-    }
-    if ((state->flags & BDRV_O_NOCACHE) && client->cache_used) {
-        error_setg(errp, "Cannot disable cache if libnfs readahead or"
-                   " pagecache is enabled");
-        return -EINVAL;
-    }
-    /* Update cache for read-only reopens */
-    if (!(state->flags & BDRV_O_RDWR)) {
-        ret = nfs_fstat(client->context, client->fh, &st);
-        if (ret < 0) {
-            error_setg(errp, "Failed to fstat file: %s",
-                       nfs_get_error(client->context));
-            return ret;
-        }
-        client->st_blocks = st.st_blocks;
-    }
-    return 0;
-}
-#ifdef LIBNFS_FEATURE_PAGECACHE
-static void nfs_invalidate_cache(BlockDriverState *bs,
-                                 Error **errp)
-{
-    NFSClient *client = bs->opaque;
-    nfs_pagecache_invalidate(client->context, client->fh);
-}
-#endif
 static BlockDriver bdrv_nfs = {
     .format_name                    = "nfs",
     .protocol_name                  = "nfs",
@@ -588,7 +499,6 @@ static BlockDriver bdrv_nfs = {
     .bdrv_file_open                 = nfs_file_open,
     .bdrv_close                     = nfs_file_close,
     .bdrv_create                    = nfs_file_create,
-    .bdrv_reopen_prepare            = nfs_reopen_prepare,
     .bdrv_co_readv                  = nfs_co_readv,
     .bdrv_co_writev                 = nfs_co_writev,
@@ -596,10 +506,6 @@ static BlockDriver bdrv_nfs = {
     .bdrv_detach_aio_context        = nfs_detach_aio_context,
     .bdrv_attach_aio_context        = nfs_attach_aio_context,
-#ifdef LIBNFS_FEATURE_PAGECACHE
-    .bdrv_invalidate_cache          = nfs_invalidate_cache,
-#endif
 };
 static void nfs_block_init(void)


@@ -10,19 +10,13 @@
  * See the COPYING file in the top-level directory.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
-#include "qapi/qmp/qdict.h"
-#include "qapi/qmp/qstring.h"
 #include "block/block_int.h"
 #define NULL_OPT_LATENCY "latency-ns"
-#define NULL_OPT_ZEROES  "read-zeroes"
 typedef struct {
     int64_t length;
     int64_t latency_ns;
-    bool read_zeroes;
 } BDRVNullState;
 static QemuOptsList runtime_opts = {
@@ -45,11 +39,6 @@ static QemuOptsList runtime_opts = {
             .help = "nanoseconds (approximated) to wait "
                     "before completing request",
         },
-        {
-            .name = NULL_OPT_ZEROES,
-            .type = QEMU_OPT_BOOL,
-            .help = "return zeroes when read",
-        },
         { /* end of list */ }
     },
 };
@@ -71,7 +60,6 @@ static int null_file_open(BlockDriverState *bs, QDict *options, int flags,
         error_setg(errp, "latency-ns is invalid");
         ret = -EINVAL;
     }
-    s->read_zeroes = qemu_opt_get_bool(opts, NULL_OPT_ZEROES, false);
     qemu_opts_del(opts);
     return ret;
 }
@@ -101,12 +89,6 @@ static coroutine_fn int null_co_readv(BlockDriverState *bs,
                                       int64_t sector_num, int nb_sectors,
                                       QEMUIOVector *qiov)
 {
-    BDRVNullState *s = bs->opaque;
-    if (s->read_zeroes) {
-        qemu_iovec_memset(qiov, 0, 0, nb_sectors * BDRV_SECTOR_SIZE);
-    }
     return null_co_common(bs);
 }
@@ -176,12 +158,6 @@ static BlockAIOCB *null_aio_readv(BlockDriverState *bs,
                                   BlockCompletionFunc *cb,
                                   void *opaque)
 {
-    BDRVNullState *s = bs->opaque;
-    if (s->read_zeroes) {
-        qemu_iovec_memset(qiov, 0, 0, nb_sectors * BDRV_SECTOR_SIZE);
-    }
     return null_aio_common(bs, cb, opaque);
 }
@@ -207,38 +183,6 @@ static int null_reopen_prepare(BDRVReopenState *reopen_state,
     return 0;
 }
-static int64_t coroutine_fn null_co_get_block_status(BlockDriverState *bs,
-                                                     int64_t sector_num,
-                                                     int nb_sectors, int *pnum,
-                                                     BlockDriverState **file)
-{
-    BDRVNullState *s = bs->opaque;
-    off_t start = sector_num * BDRV_SECTOR_SIZE;
-    *pnum = nb_sectors;
-    *file = bs;
-    if (s->read_zeroes) {
-        return BDRV_BLOCK_OFFSET_VALID | start | BDRV_BLOCK_ZERO;
-    } else {
-        return BDRV_BLOCK_OFFSET_VALID | start;
-    }
-}
-static void null_refresh_filename(BlockDriverState *bs, QDict *opts)
-{
-    QINCREF(opts);
-    qdict_del(opts, "filename");
-    if (!qdict_size(opts)) {
-        snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s://",
-                 bs->drv->format_name);
-    }
-    qdict_put(opts, "driver", qstring_from_str(bs->drv->format_name));
-    bs->full_open_options = opts;
-}
 static BlockDriver bdrv_null_co = {
     .format_name            = "null-co",
     .protocol_name          = "null-co",
@@ -252,10 +196,6 @@ static BlockDriver bdrv_null_co = {
     .bdrv_co_writev         = null_co_writev,
     .bdrv_co_flush_to_disk  = null_co_flush,
     .bdrv_reopen_prepare    = null_reopen_prepare,
-    .bdrv_co_get_block_status   = null_co_get_block_status,
-    .bdrv_refresh_filename  = null_refresh_filename,
 };
 static BlockDriver bdrv_null_aio = {
@@ -271,10 +211,6 @@ static BlockDriver bdrv_null_aio = {
     .bdrv_aio_writev        = null_aio_writev,
     .bdrv_aio_flush         = null_aio_flush,
     .bdrv_reopen_prepare    = null_reopen_prepare,
-    .bdrv_co_get_block_status   = null_co_get_block_status,
-    .bdrv_refresh_filename  = null_refresh_filename,
 };
 static void bdrv_null_init(void)


@@ -27,13 +27,9 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "qemu-common.h"
 #include "block/block_int.h"
-#include "sysemu/block-backend.h"
 #include "qemu/module.h"
-#include "qemu/bswap.h"
 #include "qemu/bitmap.h"
 #include "qapi/util.h"
@@ -43,7 +39,6 @@
 #define HEADER_MAGIC2 "WithouFreSpacExt"
 #define HEADER_VERSION 2
 #define HEADER_INUSE_MAGIC  (0x746F6E59)
-#define MAX_PARALLELS_IMAGE_FACTOR (1ull << 32)
 #define DEFAULT_CLUSTER_SIZE 1048576        /* 1 MiB */
@@ -66,7 +61,7 @@ typedef struct ParallelsHeader {
 typedef enum ParallelsPreallocMode {
     PRL_PREALLOC_MODE_FALLOCATE = 0,
     PRL_PREALLOC_MODE_TRUNCATE = 1,
-    PRL_PREALLOC_MODE__MAX = 2,
+    PRL_PREALLOC_MODE_MAX = 2,
 } ParallelsPreallocMode;
 static const char *prealloc_mode_lookup[] = {
@@ -205,17 +200,15 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
         return -EINVAL;
     }
-    to_allocate = DIV_ROUND_UP(sector_num + *pnum, s->tracks) - idx;
+    to_allocate = (sector_num + *pnum + s->tracks - 1) / s->tracks - idx;
     space = to_allocate * s->tracks;
-    if (s->data_end + space > bdrv_getlength(bs->file->bs) >> BDRV_SECTOR_BITS) {
+    if (s->data_end + space > bdrv_getlength(bs->file) >> BDRV_SECTOR_BITS) {
         int ret;
         space += s->prealloc_size;
         if (s->prealloc_mode == PRL_PREALLOC_MODE_FALLOCATE) {
-            ret = bdrv_pwrite_zeroes(bs->file,
-                                     s->data_end << BDRV_SECTOR_BITS,
-                                     space << BDRV_SECTOR_BITS, 0);
+            ret = bdrv_write_zeroes(bs->file, s->data_end, space, 0);
         } else {
-            ret = bdrv_truncate(bs->file->bs,
+            ret = bdrv_truncate(bs->file,
                                 (s->data_end + space) << BDRV_SECTOR_BITS);
         }
         if (ret < 0) {
@@ -227,7 +220,7 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
         s->bat_bitmap[idx + i] = cpu_to_le32(s->data_end / s->off_multiplier);
         s->data_end += s->tracks;
         bitmap_set(s->bat_dirty_bmap,
-                   bat_entry_off(idx + i) / s->bat_dirty_block, 1);
+                   bat_entry_off(idx) / s->bat_dirty_block, 1);
     }
     return bat2sect(s, idx) + sector_num % s->tracks;
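The `to_allocate` line in the hunk above swaps the open-coded rounding expression for QEMU's `DIV_ROUND_UP` macro; both compute a ceiling division for the non-negative values used here. A minimal sketch (the macro body matches QEMU's, the helper name `to_allocate` is ours for illustration):

```c
/* Ceiling division for non-negative integers: same result as the
 * open-coded (sector_num + *pnum + s->tracks - 1) / s->tracks form
 * that DIV_ROUND_UP replaces in the hunk above. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper mirroring the to_allocate computation: number of
 * whole tracks needed to cover nb sectors starting at sector_num,
 * counted relative to cluster index idx. */
long long to_allocate(long long sector_num, long long nb,
                      long long tracks, long long idx)
{
    return DIV_ROUND_UP(sector_num + nb, tracks) - idx;
}
```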
@@ -251,8 +244,7 @@ static coroutine_fn int parallels_co_flush_to_os(BlockDriverState *bs)
         if (off + to_write > s->header_size) {
             to_write = s->header_size - off;
         }
-        ret = bdrv_pwrite(bs->file, off, (uint8_t *)s->header + off,
-                          to_write);
+        ret = bdrv_pwrite(bs->file, off, (uint8_t *)s->header + off, to_write);
         if (ret < 0) {
             qemu_co_mutex_unlock(&s->lock);
             return ret;
@@ -267,7 +259,7 @@ static coroutine_fn int parallels_co_flush_to_os(BlockDriverState *bs)
 static int64_t coroutine_fn parallels_co_get_block_status(BlockDriverState *bs,
-        int64_t sector_num, int nb_sectors, int *pnum, BlockDriverState **file)
+        int64_t sector_num, int nb_sectors, int *pnum)
 {
     BDRVParallelsState *s = bs->opaque;
     int64_t offset;
@@ -280,7 +272,6 @@ static int64_t coroutine_fn parallels_co_get_block_status(BlockDriverState *bs,
         return 0;
     }
-    *file = bs->file->bs;
     return (offset << BDRV_SECTOR_BITS) |
            BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
 }
@@ -378,7 +369,7 @@ static int parallels_check(BlockDriverState *bs, BdrvCheckResult *res,
     bool flush_bat = false;
     int cluster_size = s->tracks << BDRV_SECTOR_BITS;
-    size = bdrv_getlength(bs->file->bs);
+    size = bdrv_getlength(bs->file);
     if (size < 0) {
         res->check_errors++;
         return size;
@@ -449,7 +440,7 @@ static int parallels_check(BlockDriverState *bs, BdrvCheckResult *res,
                size - res->image_end_offset);
         res->leaks += count;
         if (fix & BDRV_FIX_LEAKS) {
-            ret = bdrv_truncate(bs->file->bs, res->image_end_offset);
+            ret = bdrv_truncate(bs->file, res->image_end_offset);
             if (ret < 0) {
                 res->check_errors++;
                 return ret;
@@ -467,7 +458,7 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
     int64_t total_size, cl_size;
     uint8_t tmp[BDRV_SECTOR_SIZE];
     Error *local_err = NULL;
-    BlockBackend *file;
+    BlockDriverState *file;
     uint32_t bat_entries, bat_sectors;
     ParallelsHeader header;
     int ret;
@@ -476,10 +467,6 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
                           BDRV_SECTOR_SIZE);
     cl_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
                        DEFAULT_CLUSTER_SIZE), BDRV_SECTOR_SIZE);
-    if (total_size >= MAX_PARALLELS_IMAGE_FACTOR * cl_size) {
-        error_propagate(errp, local_err);
-        return -E2BIG;
-    }
     ret = bdrv_create_file(filename, opts, &local_err);
     if (ret < 0) {
@@ -487,16 +474,14 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
         return ret;
     }
-    file = blk_new_open(filename, NULL, NULL,
-                        BDRV_O_RDWR | BDRV_O_PROTOCOL, &local_err);
-    if (file == NULL) {
+    file = NULL;
+    ret = bdrv_open(&file, filename, NULL, NULL,
+                    BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
+    if (ret < 0) {
         error_propagate(errp, local_err);
-        return -EIO;
+        return ret;
     }
-    blk_set_allow_write_beyond_eof(file, true);
-    ret = blk_truncate(file, 0);
+    ret = bdrv_truncate(file, 0);
     if (ret < 0) {
         goto exit;
     }
@@ -520,19 +505,18 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
     memset(tmp, 0, sizeof(tmp));
     memcpy(tmp, &header, sizeof(header));
-    ret = blk_pwrite(file, 0, tmp, BDRV_SECTOR_SIZE, 0);
+    ret = bdrv_pwrite(file, 0, tmp, BDRV_SECTOR_SIZE);
     if (ret < 0) {
         goto exit;
     }
-    ret = blk_pwrite_zeroes(file, BDRV_SECTOR_SIZE,
-                            (bat_sectors - 1) << BDRV_SECTOR_BITS, 0);
+    ret = bdrv_write_zeroes(file, 1, bat_sectors - 1, 0);
     if (ret < 0) {
         goto exit;
     }
     ret = 0;
 done:
-    blk_unref(file);
+    bdrv_unref(file);
     return ret;
 exit:
@@ -562,8 +546,7 @@ static int parallels_probe(const uint8_t *buf, int buf_size,
 static int parallels_update_header(BlockDriverState *bs)
 {
     BDRVParallelsState *s = bs->opaque;
-    unsigned size = MAX(bdrv_opt_mem_align(bs->file->bs),
-                        sizeof(ParallelsHeader));
+    unsigned size = MAX(bdrv_opt_mem_align(bs->file), sizeof(ParallelsHeader));
     if (size > s->header_size) {
         size = s->header_size;
@@ -620,8 +603,8 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
     }
     size = bat_entry_off(s->bat_size);
-    s->header_size = ROUND_UP(size, bdrv_opt_mem_align(bs->file->bs));
-    s->header = qemu_try_blockalign(bs->file->bs, s->header_size);
+    s->header_size = ROUND_UP(size, bdrv_opt_mem_align(bs->file));
+    s->header = qemu_try_blockalign(bs->file, s->header_size);
     if (s->header == NULL) {
         ret = -ENOMEM;
         goto fail;
@@ -675,13 +658,13 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
     s->prealloc_size = MAX(s->tracks, s->prealloc_size >> BDRV_SECTOR_BITS);
     buf = qemu_opt_get_del(opts, PARALLELS_OPT_PREALLOC_MODE);
     s->prealloc_mode = qapi_enum_parse(prealloc_mode_lookup, buf,
-            PRL_PREALLOC_MODE__MAX, PRL_PREALLOC_MODE_FALLOCATE, &local_err);
+            PRL_PREALLOC_MODE_MAX, PRL_PREALLOC_MODE_FALLOCATE, &local_err);
     g_free(buf);
     if (local_err != NULL) {
         goto fail_options;
     }
-    if (!bdrv_has_zero_init(bs->file->bs) ||
-        bdrv_truncate(bs->file->bs, bdrv_getlength(bs->file->bs)) != 0) {
+    if (!bdrv_has_zero_init(bs->file) ||
+        bdrv_truncate(bs->file, bdrv_getlength(bs->file)) != 0) {
         s->prealloc_mode = PRL_PREALLOC_MODE_FALLOCATE;
     }
@@ -724,7 +707,7 @@ static void parallels_close(BlockDriverState *bs)
     }
     if (bs->open_flags & BDRV_O_RDWR) {
-        bdrv_truncate(bs->file->bs, s->data_end << BDRV_SECTOR_BITS);
+        bdrv_truncate(bs->file, s->data_end << BDRV_SECTOR_BITS);
     }
     g_free(s->bat_dirty_bmap);


@@ -22,7 +22,6 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include "block/qapi.h"
 #include "block/block_int.h"
 #include "block/throttle-groups.h"
@@ -32,10 +31,8 @@
 #include "qapi/qmp-output-visitor.h"
 #include "qapi/qmp/types.h"
 #include "sysemu/block-backend.h"
-#include "qemu/cutils.h"
-BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
-                                        BlockDriverState *bs, Error **errp)
+BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs, Error **errp)
 {
     ImageInfo **p_image_info;
     BlockDriverState *bs0;
@@ -49,7 +46,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
     info->cache = g_new(BlockdevCacheInfo, 1);
     *info->cache = (BlockdevCacheInfo) {
-        .writeback = blk ? blk_enable_write_cache(blk) : true,
+        .writeback = bdrv_enable_write_cache(bs),
         .direct = !!(bs->open_flags & BDRV_O_NOCACHE),
         .no_flush = !!(bs->open_flags & BDRV_O_NO_FLUSH),
     };
@@ -67,10 +64,10 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
     info->backing_file_depth = bdrv_get_backing_file_depth(bs);
     info->detect_zeroes = bs->detect_zeroes;
-    if (blk && blk_get_public(blk)->throttle_state) {
+    if (bs->io_limits_enabled) {
         ThrottleConfig cfg;
-        throttle_group_get_config(blk, &cfg);
+        throttle_group_get_config(bs, &cfg);
         info->bps    = cfg.buckets[THROTTLE_BPS_TOTAL].avg;
         info->bps_rd = cfg.buckets[THROTTLE_BPS_READ].avg;
@@ -94,31 +91,11 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
         info->has_iops_wr_max = cfg.buckets[THROTTLE_OPS_WRITE].max;
         info->iops_wr_max     = cfg.buckets[THROTTLE_OPS_WRITE].max;
-        info->has_bps_max_length     = info->has_bps_max;
-        info->bps_max_length         =
-            cfg.buckets[THROTTLE_BPS_TOTAL].burst_length;
-        info->has_bps_rd_max_length  = info->has_bps_rd_max;
-        info->bps_rd_max_length      =
-            cfg.buckets[THROTTLE_BPS_READ].burst_length;
-        info->has_bps_wr_max_length  = info->has_bps_wr_max;
-        info->bps_wr_max_length      =
-            cfg.buckets[THROTTLE_BPS_WRITE].burst_length;
-        info->has_iops_max_length    = info->has_iops_max;
-        info->iops_max_length        =
-            cfg.buckets[THROTTLE_OPS_TOTAL].burst_length;
-        info->has_iops_rd_max_length = info->has_iops_rd_max;
-        info->iops_rd_max_length     =
-            cfg.buckets[THROTTLE_OPS_READ].burst_length;
-        info->has_iops_wr_max_length = info->has_iops_wr_max;
-        info->iops_wr_max_length     =
-            cfg.buckets[THROTTLE_OPS_WRITE].burst_length;
         info->has_iops_size = cfg.op_size;
         info->iops_size = cfg.op_size;
         info->has_group = true;
-        info->group = g_strdup(throttle_group_get_name(blk));
+        info->group = g_strdup(throttle_group_get_name(bs));
     }
     info->write_threshold = bdrv_write_threshold_get(bs);
@@ -133,8 +110,8 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
             qapi_free_BlockDeviceInfo(info);
             return NULL;
         }
-        if (bs0->drv && bs0->backing) {
-            bs0 = bs0->backing->bs;
+        if (bs0->drv && bs0->backing_hd) {
+            bs0 = bs0->backing_hd;
             (*p_image_info)->has_backing_image = true;
             p_image_info = &((*p_image_info)->backing_image);
         } else {
@@ -233,13 +210,11 @@ void bdrv_query_image_info(BlockDriverState *bs,
Error *err = NULL; Error *err = NULL;
ImageInfo *info; ImageInfo *info;
aio_context_acquire(bdrv_get_aio_context(bs));
size = bdrv_getlength(bs); size = bdrv_getlength(bs);
if (size < 0) { if (size < 0) {
error_setg_errno(errp, -size, "Can't get size of device '%s'", error_setg_errno(errp, -size, "Can't get size of device '%s'",
bdrv_get_device_name(bs)); bdrv_get_device_name(bs));
goto out; return;
} }
info = g_new0(ImageInfo, 1); info = g_new0(ImageInfo, 1);
@@ -270,18 +245,15 @@ void bdrv_query_image_info(BlockDriverState *bs,
info->has_backing_filename = true; info->has_backing_filename = true;
bdrv_get_full_backing_filename(bs, backing_filename2, PATH_MAX, &err); bdrv_get_full_backing_filename(bs, backing_filename2, PATH_MAX, &err);
if (err) { if (err) {
/* Can't reconstruct the full backing filename, so we must omit error_propagate(errp, err);
* this field and apply a Best Effort to this query. */ qapi_free_ImageInfo(info);
g_free(backing_filename2); g_free(backing_filename2);
backing_filename2 = NULL; return;
error_free(err);
err = NULL;
} }
/* Always report the full_backing_filename if present, even if it's the if (strcmp(backing_filename, backing_filename2) != 0) {
* same as backing_filename. That they are same is useful info. */ info->full_backing_filename =
if (backing_filename2) { g_strdup(backing_filename2);
info->full_backing_filename = g_strdup(backing_filename2);
info->has_full_backing_filename = true;
}
@@ -307,13 +279,10 @@ void bdrv_query_image_info(BlockDriverState *bs,
default:
error_propagate(errp, err);
qapi_free_ImageInfo(info);
goto out; return;
}
*p_info = info;
out:
aio_context_release(bdrv_get_aio_context(bs));
}
/* @p_info will be set only on success. */
@@ -327,24 +296,24 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
info->locked = blk_dev_is_medium_locked(blk);
info->removable = blk_dev_has_removable_media(blk);
if (blk_dev_has_tray(blk)) { if (blk_dev_has_removable_media(blk)) {
info->has_tray_open = true;
info->tray_open = blk_dev_is_tray_open(blk);
}
if (blk_iostatus_is_enabled(blk)) { if (bdrv_iostatus_is_enabled(bs)) {
info->has_io_status = true; info->has_io_status = true;
info->io_status = blk_iostatus(blk); info->io_status = bs->iostatus;
} }
if (bs && !QLIST_EMPTY(&bs->dirty_bitmaps)) { if (!QLIST_EMPTY(&bs->dirty_bitmaps)) {
info->has_dirty_bitmaps = true; info->has_dirty_bitmaps = true;
info->dirty_bitmaps = bdrv_query_dirty_bitmaps(bs); info->dirty_bitmaps = bdrv_query_dirty_bitmaps(bs);
} }
if (bs && bs->drv) { if (bs->drv) {
info->has_inserted = true; info->has_inserted = true;
info->inserted = bdrv_block_device_info(blk, bs, errp); info->inserted = bdrv_block_device_info(bs, errp);
if (info->inserted == NULL) { if (info->inserted == NULL) {
goto err; goto err;
} }
@@ -357,115 +326,45 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
qapi_free_BlockInfo(info); qapi_free_BlockInfo(info);
} }
static BlockStats *bdrv_query_stats(BlockBackend *blk, static BlockStats *bdrv_query_stats(const BlockDriverState *bs,
const BlockDriverState *bs,
bool query_backing);
static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
{
BlockAcctStats *stats = blk_get_stats(blk);
BlockAcctTimedStats *ts = NULL;
ds->rd_bytes = stats->nr_bytes[BLOCK_ACCT_READ];
ds->wr_bytes = stats->nr_bytes[BLOCK_ACCT_WRITE];
ds->rd_operations = stats->nr_ops[BLOCK_ACCT_READ];
ds->wr_operations = stats->nr_ops[BLOCK_ACCT_WRITE];
ds->failed_rd_operations = stats->failed_ops[BLOCK_ACCT_READ];
ds->failed_wr_operations = stats->failed_ops[BLOCK_ACCT_WRITE];
ds->failed_flush_operations = stats->failed_ops[BLOCK_ACCT_FLUSH];
ds->invalid_rd_operations = stats->invalid_ops[BLOCK_ACCT_READ];
ds->invalid_wr_operations = stats->invalid_ops[BLOCK_ACCT_WRITE];
ds->invalid_flush_operations =
stats->invalid_ops[BLOCK_ACCT_FLUSH];
ds->rd_merged = stats->merged[BLOCK_ACCT_READ];
ds->wr_merged = stats->merged[BLOCK_ACCT_WRITE];
ds->flush_operations = stats->nr_ops[BLOCK_ACCT_FLUSH];
ds->wr_total_time_ns = stats->total_time_ns[BLOCK_ACCT_WRITE];
ds->rd_total_time_ns = stats->total_time_ns[BLOCK_ACCT_READ];
ds->flush_total_time_ns = stats->total_time_ns[BLOCK_ACCT_FLUSH];
ds->has_idle_time_ns = stats->last_access_time_ns > 0;
if (ds->has_idle_time_ns) {
ds->idle_time_ns = block_acct_idle_time_ns(stats);
}
ds->account_invalid = stats->account_invalid;
ds->account_failed = stats->account_failed;
while ((ts = block_acct_interval_next(stats, ts))) {
BlockDeviceTimedStatsList *timed_stats =
g_malloc0(sizeof(*timed_stats));
BlockDeviceTimedStats *dev_stats = g_malloc0(sizeof(*dev_stats));
timed_stats->next = ds->timed_stats;
timed_stats->value = dev_stats;
ds->timed_stats = timed_stats;
TimedAverage *rd = &ts->latency[BLOCK_ACCT_READ];
TimedAverage *wr = &ts->latency[BLOCK_ACCT_WRITE];
TimedAverage *fl = &ts->latency[BLOCK_ACCT_FLUSH];
dev_stats->interval_length = ts->interval_length;
dev_stats->min_rd_latency_ns = timed_average_min(rd);
dev_stats->max_rd_latency_ns = timed_average_max(rd);
dev_stats->avg_rd_latency_ns = timed_average_avg(rd);
dev_stats->min_wr_latency_ns = timed_average_min(wr);
dev_stats->max_wr_latency_ns = timed_average_max(wr);
dev_stats->avg_wr_latency_ns = timed_average_avg(wr);
dev_stats->min_flush_latency_ns = timed_average_min(fl);
dev_stats->max_flush_latency_ns = timed_average_max(fl);
dev_stats->avg_flush_latency_ns = timed_average_avg(fl);
dev_stats->avg_rd_queue_depth =
block_acct_queue_depth(ts, BLOCK_ACCT_READ);
dev_stats->avg_wr_queue_depth =
block_acct_queue_depth(ts, BLOCK_ACCT_WRITE);
}
}
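The `bdrv_query_blk_stats` hunk above walks each accounting interval and reports min/max/avg latencies via `timed_average_min/max/avg`. As a rough sketch of the underlying idea only — these names and this struct are ours, not QEMU's TimedAverage API — a minimal min/max/avg accumulator looks like:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical minimal accumulator, loosely modeled on the per-interval
 * latency tracking in the hunk above. Not QEMU's actual API. */
typedef struct {
    uint64_t min, max, sum, count;
} AvgAcc;

static void acc_init(AvgAcc *a)
{
    a->min = UINT64_MAX;
    a->max = 0;
    a->sum = 0;
    a->count = 0;
}

static void acc_add(AvgAcc *a, uint64_t v)
{
    if (v < a->min) {
        a->min = v;
    }
    if (v > a->max) {
        a->max = v;
    }
    a->sum += v;
    a->count++;
}

static uint64_t acc_avg(const AvgAcc *a)
{
    /* Avoid dividing by zero when no samples were recorded. */
    return a->count ? a->sum / a->count : 0;
}
```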
static void bdrv_query_bds_stats(BlockStats *s, const BlockDriverState *bs,
bool query_backing)
{
if (bdrv_get_node_name(bs)[0]) {
s->has_node_name = true;
s->node_name = g_strdup(bdrv_get_node_name(bs));
}
s->stats->wr_highest_offset = bs->wr_highest_offset;
if (bs->file) {
s->has_parent = true;
s->parent = bdrv_query_stats(NULL, bs->file->bs, query_backing);
}
if (query_backing && bs->backing) {
s->has_backing = true;
s->backing = bdrv_query_stats(NULL, bs->backing->bs, query_backing);
}
}
static BlockStats *bdrv_query_stats(BlockBackend *blk,
const BlockDriverState *bs,
bool query_backing)
{
BlockStats *s;
s = g_malloc0(sizeof(*s));
s->stats = g_malloc0(sizeof(*s->stats));
if (blk) { if (bdrv_get_device_name(bs)[0]) {
s->has_device = true; s->has_device = true;
s->device = g_strdup(blk_name(blk)); s->device = g_strdup(bdrv_get_device_name(bs));
bdrv_query_blk_stats(s->stats, blk);
} }
if (bs) {
bdrv_query_bds_stats(s, bs, query_backing); if (bdrv_get_node_name(bs)[0]) {
s->has_node_name = true;
s->node_name = g_strdup(bdrv_get_node_name(bs));
}
s->stats = g_malloc0(sizeof(*s->stats));
s->stats->rd_bytes = bs->stats.nr_bytes[BLOCK_ACCT_READ];
s->stats->wr_bytes = bs->stats.nr_bytes[BLOCK_ACCT_WRITE];
s->stats->rd_operations = bs->stats.nr_ops[BLOCK_ACCT_READ];
s->stats->wr_operations = bs->stats.nr_ops[BLOCK_ACCT_WRITE];
s->stats->rd_merged = bs->stats.merged[BLOCK_ACCT_READ];
s->stats->wr_merged = bs->stats.merged[BLOCK_ACCT_WRITE];
s->stats->wr_highest_offset =
bs->stats.wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->stats.nr_ops[BLOCK_ACCT_FLUSH];
s->stats->wr_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_WRITE];
s->stats->rd_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_READ];
s->stats->flush_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_FLUSH];
if (bs->file) {
s->has_parent = true;
s->parent = bdrv_query_stats(bs->file, query_backing);
}
if (query_backing && bs->backing_hd) {
s->has_backing = true;
s->backing = bdrv_query_stats(bs->backing_hd, query_backing);
}
return s;
@@ -482,9 +381,7 @@ BlockInfoList *qmp_query_block(Error **errp)
bdrv_query_info(blk, &info->value, &local_err); bdrv_query_info(blk, &info->value, &local_err);
if (local_err) { if (local_err) {
error_propagate(errp, local_err); error_propagate(errp, local_err);
g_free(info); goto err;
qapi_free_BlockInfoList(head);
return NULL;
} }
*p_next = info; *p_next = info;
@@ -492,20 +389,10 @@ BlockInfoList *qmp_query_block(Error **errp)
} }
return head; return head;
}
static bool next_query_bds(BlockBackend **blk, BlockDriverState **bs, err:
bool query_nodes) qapi_free_BlockInfoList(head);
{ return NULL;
if (query_nodes) {
*bs = bdrv_next_node(*bs);
return !!*bs;
}
*blk = blk_next(*blk);
*bs = *blk ? blk_bs(*blk) : NULL;
return !!*blk;
} }
BlockStatsList *qmp_query_blockstats(bool has_query_nodes, BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
@@ -513,19 +400,17 @@ BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
Error **errp)
{
BlockStatsList *head = NULL, **p_next = &head;
BlockBackend *blk = NULL;
BlockDriverState *bs = NULL; BlockDriverState *bs = NULL;
/* Just to be safe if query_nodes is not always initialized */
query_nodes = has_query_nodes && query_nodes;
while (next_query_bds(&blk, &bs, query_nodes)) { while ((bs = query_nodes ? bdrv_next_node(bs) : bdrv_next(bs))) {
BlockStatsList *info = g_malloc0(sizeof(*info)); BlockStatsList *info = g_malloc0(sizeof(*info));
AioContext *ctx = blk ? blk_get_aio_context(blk) AioContext *ctx = bdrv_get_aio_context(bs);
: bdrv_get_aio_context(bs);
aio_context_acquire(ctx); aio_context_acquire(ctx);
info->value = bdrv_query_stats(blk, bs, !query_nodes); info->value = bdrv_query_stats(bs, !query_nodes);
aio_context_release(ctx); aio_context_release(ctx);
*p_next = info; *p_next = info;
@@ -650,10 +535,11 @@ static void dump_qlist(fprintf_function func_fprintf, void *f, int indentation,
int i = 0; int i = 0;
for (entry = qlist_first(list); entry; entry = qlist_next(entry), i++) { for (entry = qlist_first(list); entry; entry = qlist_next(entry), i++) {
QType type = qobject_type(entry->value); qtype_code type = qobject_type(entry->value);
bool composite = (type == QTYPE_QDICT || type == QTYPE_QLIST); bool composite = (type == QTYPE_QDICT || type == QTYPE_QLIST);
func_fprintf(f, "%*s[%i]:%c", indentation * 4, "", i, const char *format = composite ? "%*s[%i]:\n" : "%*s[%i]: ";
composite ? '\n' : ' ');
func_fprintf(f, format, indentation * 4, "", i);
dump_qobject(func_fprintf, f, indentation + 1, entry->value);
if (!composite) {
func_fprintf(f, "\n");
@@ -667,9 +553,10 @@ static void dump_qdict(fprintf_function func_fprintf, void *f, int indentation,
const QDictEntry *entry;
for (entry = qdict_first(dict); entry; entry = qdict_next(dict, entry)) {
QType type = qobject_type(entry->value); qtype_code type = qobject_type(entry->value);
bool composite = (type == QTYPE_QDICT || type == QTYPE_QLIST); bool composite = (type == QTYPE_QDICT || type == QTYPE_QLIST);
char *key = g_malloc(strlen(entry->key) + 1); const char *format = composite ? "%*s%s:\n" : "%*s%s: ";
char key[strlen(entry->key) + 1];
int i;
/* replace dashes with spaces in key (variable) names */
@@ -677,28 +564,28 @@ static void dump_qdict(fprintf_function func_fprintf, void *f, int indentation,
key[i] = entry->key[i] == '-' ? ' ' : entry->key[i];
}
key[i] = 0;
func_fprintf(f, "%*s%s:%c", indentation * 4, "", key,
composite ? '\n' : ' '); func_fprintf(f, format, indentation * 4, "", key);
dump_qobject(func_fprintf, f, indentation + 1, entry->value);
if (!composite) {
func_fprintf(f, "\n");
}
g_free(key);
}
}
void bdrv_image_info_specific_dump(fprintf_function func_fprintf, void *f,
ImageInfoSpecific *info_spec)
{
QmpOutputVisitor *ov = qmp_output_visitor_new();
QObject *obj, *data; QObject *obj, *data;
Visitor *v = qmp_output_visitor_new(&obj);
visit_type_ImageInfoSpecific(v, NULL, &info_spec, &error_abort); visit_type_ImageInfoSpecific(qmp_output_get_visitor(ov), &info_spec, NULL,
visit_complete(v, &obj); &error_abort);
obj = qmp_output_get_qobject(ov);
assert(qobject_type(obj) == QTYPE_QDICT); assert(qobject_type(obj) == QTYPE_QDICT);
data = qdict_get(qobject_to_qdict(obj), "data"); data = qdict_get(qobject_to_qdict(obj), "data");
dump_qobject(func_fprintf, f, 1, data); dump_qobject(func_fprintf, f, 1, data);
visit_free(v); qmp_output_visitor_cleanup(ov);
} }
void bdrv_image_info_dump(fprintf_function func_fprintf, void *f, void bdrv_image_info_dump(fprintf_function func_fprintf, void *f,
@@ -736,10 +623,7 @@ void bdrv_image_info_dump(fprintf_function func_fprintf, void *f,
if (info->has_backing_filename) { if (info->has_backing_filename) {
func_fprintf(f, "backing file: %s", info->backing_filename); func_fprintf(f, "backing file: %s", info->backing_filename);
if (!info->has_full_backing_filename) { if (info->has_full_backing_filename) {
func_fprintf(f, " (cannot determine actual path)");
} else if (strcmp(info->backing_filename,
info->full_backing_filename) != 0) {
func_fprintf(f, " (actual path: %s)", info->full_backing_filename); func_fprintf(f, " (actual path: %s)", info->full_backing_filename);
} }
func_fprintf(f, "\n"); func_fprintf(f, "\n");
@@ -21,14 +21,9 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h" #include "qemu-common.h"
#include "qemu/error-report.h"
#include "block/block_int.h" #include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h" #include "qemu/module.h"
#include "qemu/bswap.h"
#include <zlib.h> #include <zlib.h>
#include "qapi/qmp/qerror.h" #include "qapi/qmp/qerror.h"
#include "crypto/cipher.h" #include "crypto/cipher.h"
@@ -124,7 +119,11 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
if (header.version != QCOW_VERSION) { if (header.version != QCOW_VERSION) {
error_setg(errp, "Unsupported qcow version %" PRIu32, header.version); char version[64];
snprintf(version, sizeof(version), "QCOW version %" PRIu32,
header.version);
error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bdrv_get_device_or_node_name(bs), "qcow", version);
ret = -ENOTSUP;
goto fail;
}
@@ -160,21 +159,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
} }
s->crypt_method_header = header.crypt_method; s->crypt_method_header = header.crypt_method;
if (s->crypt_method_header) { if (s->crypt_method_header) {
if (bdrv_uses_whitelist() && bs->encrypted = 1;
s->crypt_method_header == QCOW_CRYPT_AES) {
error_setg(errp,
"Use of AES-CBC encrypted qcow images is no longer "
"supported in system emulators");
error_append_hint(errp,
"You can use 'qemu-img convert' to convert your "
"image to an alternative supported format, such "
"as unencrypted qcow, or raw with the LUKS "
"format instead.\n");
ret = -ENOSYS;
goto fail;
}
bs->encrypted = true;
} }
s->cluster_bits = header.cluster_bits;
s->cluster_size = 1 << s->cluster_bits;
@@ -220,7 +205,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
/* alloc L2 cache (max. 64k * 16 * 8 = 8 MB) */ /* alloc L2 cache (max. 64k * 16 * 8 = 8 MB) */
s->l2_cache = s->l2_cache =
qemu_try_blockalign(bs->file->bs, qemu_try_blockalign(bs->file,
s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t)); s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
if (s->l2_cache == NULL) { if (s->l2_cache == NULL) {
error_setg(errp, "Could not allocate L2 table cache"); error_setg(errp, "Could not allocate L2 table cache");
@@ -384,7 +369,7 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
if (!allocate) if (!allocate)
return 0; return 0;
/* allocate a new l2 entry */ /* allocate a new l2 entry */
l2_offset = bdrv_getlength(bs->file->bs); l2_offset = bdrv_getlength(bs->file);
/* round to cluster size */ /* round to cluster size */
l2_offset = (l2_offset + s->cluster_size - 1) & ~(s->cluster_size - 1); l2_offset = (l2_offset + s->cluster_size - 1) & ~(s->cluster_size - 1);
/* update the L1 entry */ /* update the L1 entry */
@@ -424,8 +409,7 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
s->l2_size * sizeof(uint64_t)) < 0) s->l2_size * sizeof(uint64_t)) < 0)
return 0; return 0;
} else { } else {
if (bdrv_pread(bs->file, l2_offset, l2_table, if (bdrv_pread(bs->file, l2_offset, l2_table, s->l2_size * sizeof(uint64_t)) !=
s->l2_size * sizeof(uint64_t)) !=
s->l2_size * sizeof(uint64_t)) s->l2_size * sizeof(uint64_t))
return 0; return 0;
} }
@@ -446,21 +430,20 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
overwritten */ overwritten */
if (decompress_cluster(bs, cluster_offset) < 0) if (decompress_cluster(bs, cluster_offset) < 0)
return 0; return 0;
cluster_offset = bdrv_getlength(bs->file->bs); cluster_offset = bdrv_getlength(bs->file);
cluster_offset = (cluster_offset + s->cluster_size - 1) & cluster_offset = (cluster_offset + s->cluster_size - 1) &
~(s->cluster_size - 1); ~(s->cluster_size - 1);
/* write the cluster content */ /* write the cluster content */
if (bdrv_pwrite(bs->file, cluster_offset, s->cluster_cache, if (bdrv_pwrite(bs->file, cluster_offset, s->cluster_cache, s->cluster_size) !=
s->cluster_size) !=
s->cluster_size) s->cluster_size)
return -1; return -1;
} else { } else {
cluster_offset = bdrv_getlength(bs->file->bs); cluster_offset = bdrv_getlength(bs->file);
if (allocate == 1) { if (allocate == 1) {
/* round to cluster size */ /* round to cluster size */
cluster_offset = (cluster_offset + s->cluster_size - 1) & cluster_offset = (cluster_offset + s->cluster_size - 1) &
~(s->cluster_size - 1); ~(s->cluster_size - 1);
bdrv_truncate(bs->file->bs, cluster_offset + s->cluster_size); bdrv_truncate(bs->file, cluster_offset + s->cluster_size);
/* if encrypted, we must initialize the cluster /* if encrypted, we must initialize the cluster
content which won't be written */ content which won't be written */
if (bs->encrypted && if (bs->encrypted &&
@@ -480,8 +463,7 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
errno = EIO; errno = EIO;
return -1; return -1;
} }
if (bdrv_pwrite(bs->file, if (bdrv_pwrite(bs->file, cluster_offset + i * 512,
cluster_offset + i * 512,
s->cluster_data, 512) != 512) s->cluster_data, 512) != 512)
return -1; return -1;
} }
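Several hunks above round a file offset up to the next cluster boundary with the bit trick `(x + s->cluster_size - 1) & ~(s->cluster_size - 1)`, which only works because the cluster size is a power of two. A minimal self-contained sketch of the idiom (the helper name is ours, not QEMU's):

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of `align`, which must be a power of
 * two. Same bit trick as the cluster_size rounding in the diff above;
 * the function name is illustrative only. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
    return (x + align - 1) & ~(align - 1);
}
```

For example, with a 4 KiB cluster size, an offset of 4097 rounds up to 8192 while 4096 is already aligned and stays put.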
@@ -503,7 +485,7 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
} }
static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs, static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum, BlockDriverState **file) int64_t sector_num, int nb_sectors, int *pnum)
{ {
BDRVQcowState *s = bs->opaque; BDRVQcowState *s = bs->opaque;
int index_in_cluster, n; int index_in_cluster, n;
@@ -524,7 +506,6 @@ static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs,
return BDRV_BLOCK_DATA; return BDRV_BLOCK_DATA;
} }
cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS); cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
*file = bs->file->bs;
return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | cluster_offset; return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | cluster_offset;
} }
@@ -613,13 +594,14 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
} }
if (!cluster_offset) { if (!cluster_offset) {
if (bs->backing) { if (bs->backing_hd) {
/* read from the base image */ /* read from the base image */
hd_iov.iov_base = (void *)buf; hd_iov.iov_base = (void *)buf;
hd_iov.iov_len = n * 512; hd_iov.iov_len = n * 512;
qemu_iovec_init_external(&hd_qiov, &hd_iov, 1); qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
qemu_co_mutex_unlock(&s->lock); qemu_co_mutex_unlock(&s->lock);
ret = bdrv_co_readv(bs->backing, sector_num, n, &hd_qiov); ret = bdrv_co_readv(bs->backing_hd, sector_num,
n, &hd_qiov);
qemu_co_mutex_lock(&s->lock); qemu_co_mutex_lock(&s->lock);
if (ret < 0) { if (ret < 0) {
goto fail; goto fail;
@@ -793,7 +775,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
int flags = 0; int flags = 0;
Error *local_err = NULL; Error *local_err = NULL;
int ret; int ret;
BlockBackend *qcow_blk; BlockDriverState *qcow_bs;
/* Read out options */ /* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0), total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
@@ -809,17 +791,15 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
goto cleanup; goto cleanup;
} }
qcow_blk = blk_new_open(filename, NULL, NULL, qcow_bs = NULL;
BDRV_O_RDWR | BDRV_O_PROTOCOL, &local_err); ret = bdrv_open(&qcow_bs, filename, NULL, NULL,
if (qcow_blk == NULL) { BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err); error_propagate(errp, local_err);
ret = -EIO;
goto cleanup; goto cleanup;
} }
blk_set_allow_write_beyond_eof(qcow_blk, true); ret = bdrv_truncate(qcow_bs, 0);
ret = blk_truncate(qcow_blk, 0);
if (ret < 0) { if (ret < 0) {
goto exit; goto exit;
} }
@@ -859,24 +839,24 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
} }
/* write all the data */ /* write all the data */
ret = blk_pwrite(qcow_blk, 0, &header, sizeof(header), 0); ret = bdrv_pwrite(qcow_bs, 0, &header, sizeof(header));
if (ret != sizeof(header)) { if (ret != sizeof(header)) {
goto exit; goto exit;
} }
if (backing_file) { if (backing_file) {
ret = blk_pwrite(qcow_blk, sizeof(header), ret = bdrv_pwrite(qcow_bs, sizeof(header),
backing_file, backing_filename_len, 0); backing_file, backing_filename_len);
if (ret != backing_filename_len) { if (ret != backing_filename_len) {
goto exit; goto exit;
} }
} }
tmp = g_malloc0(BDRV_SECTOR_SIZE); tmp = g_malloc0(BDRV_SECTOR_SIZE);
for (i = 0; i < DIV_ROUND_UP(sizeof(uint64_t) * l1_size, BDRV_SECTOR_SIZE); for (i = 0; i < ((sizeof(uint64_t)*l1_size + BDRV_SECTOR_SIZE - 1)/
i++) { BDRV_SECTOR_SIZE); i++) {
ret = blk_pwrite(qcow_blk, header_size + BDRV_SECTOR_SIZE * i, ret = bdrv_pwrite(qcow_bs, header_size +
tmp, BDRV_SECTOR_SIZE, 0); BDRV_SECTOR_SIZE*i, tmp, BDRV_SECTOR_SIZE);
if (ret != BDRV_SECTOR_SIZE) { if (ret != BDRV_SECTOR_SIZE) {
g_free(tmp); g_free(tmp);
goto exit; goto exit;
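The loop bound change above swaps the open-coded `(sizeof(uint64_t) * l1_size + BDRV_SECTOR_SIZE - 1) / BDRV_SECTOR_SIZE` for `DIV_ROUND_UP`. The two forms are equivalent; as a sketch, the macro is reproduced here for illustration (QEMU defines its own in its utility headers):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Integer division rounding toward +infinity, as used for the L1 table
 * sector count in the hunk above. Reproduced here only to show the
 * equivalence; QEMU ships its own definition. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
```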
@@ -886,7 +866,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
g_free(tmp); g_free(tmp);
ret = 0; ret = 0;
exit: exit:
blk_unref(qcow_blk); bdrv_unref(qcow_bs);
cleanup: cleanup:
g_free(backing_file); g_free(backing_file);
return ret; return ret;
@@ -902,7 +882,7 @@ static int qcow_make_empty(BlockDriverState *bs)
if (bdrv_pwrite_sync(bs->file, s->l1_table_offset, s->l1_table, if (bdrv_pwrite_sync(bs->file, s->l1_table_offset, s->l1_table,
l1_length) < 0) l1_length) < 0)
return -1; return -1;
ret = bdrv_truncate(bs->file->bs, s->l1_table_offset + l1_length); ret = bdrv_truncate(bs->file, s->l1_table_offset + l1_length);
if (ret < 0) if (ret < 0)
return ret; return ret;
@@ -915,32 +895,32 @@ static int qcow_make_empty(BlockDriverState *bs)
/* XXX: put compressed sectors first, then all the cluster aligned /* XXX: put compressed sectors first, then all the cluster aligned
tables to avoid losing bytes in alignment */ tables to avoid losing bytes in alignment */
static coroutine_fn int static int qcow_write_compressed(BlockDriverState *bs, int64_t sector_num,
qcow_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset, const uint8_t *buf, int nb_sectors)
uint64_t bytes, QEMUIOVector *qiov)
{ {
BDRVQcowState *s = bs->opaque; BDRVQcowState *s = bs->opaque;
QEMUIOVector hd_qiov;
struct iovec iov;
z_stream strm; z_stream strm;
int ret, out_len; int ret, out_len;
uint8_t *buf, *out_buf; uint8_t *out_buf;
uint64_t cluster_offset; uint64_t cluster_offset;
buf = qemu_blockalign(bs, s->cluster_size); if (nb_sectors != s->cluster_sectors) {
if (bytes != s->cluster_size) { ret = -EINVAL;
if (bytes > s->cluster_size ||
offset + bytes != bs->total_sectors << BDRV_SECTOR_BITS)
{
qemu_vfree(buf);
return -EINVAL;
}
/* Zero-pad last write if image size is not cluster aligned */
memset(buf + bytes, 0, s->cluster_size - bytes);
}
qemu_iovec_to_buf(qiov, 0, buf, qiov->size);
out_buf = g_malloc(s->cluster_size); /* Zero-pad last write if image size is not cluster aligned */
if (sector_num + nb_sectors == bs->total_sectors &&
nb_sectors < s->cluster_sectors) {
uint8_t *pad_buf = qemu_blockalign(bs, s->cluster_size);
memset(pad_buf, 0, s->cluster_size);
memcpy(pad_buf, buf, nb_sectors * BDRV_SECTOR_SIZE);
ret = qcow_write_compressed(bs, sector_num,
pad_buf, s->cluster_sectors);
qemu_vfree(pad_buf);
}
return ret;
}
out_buf = g_malloc(s->cluster_size + (s->cluster_size / 1000) + 128);
/* best compression, small window, no zlib header */ /* best compression, small window, no zlib header */
memset(&strm, 0, sizeof(strm)); memset(&strm, 0, sizeof(strm));
@@ -969,35 +949,27 @@ qcow_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset,
if (ret != Z_STREAM_END || out_len >= s->cluster_size) { if (ret != Z_STREAM_END || out_len >= s->cluster_size) {
/* could not compress: write normal cluster */ /* could not compress: write normal cluster */
ret = qcow_co_writev(bs, offset >> BDRV_SECTOR_BITS, ret = bdrv_write(bs, sector_num, buf, s->cluster_sectors);
bytes >> BDRV_SECTOR_BITS, qiov);
if (ret < 0) { if (ret < 0) {
goto fail; goto fail;
} }
goto success; } else {
} cluster_offset = get_cluster_offset(bs, sector_num << 9, 2,
qemu_co_mutex_lock(&s->lock); out_len, 0, 0);
cluster_offset = get_cluster_offset(bs, offset, 2, out_len, 0, 0); if (cluster_offset == 0) {
qemu_co_mutex_unlock(&s->lock); ret = -EIO;
if (cluster_offset == 0) { goto fail;
ret = -EIO; }
goto fail;
}
cluster_offset &= s->cluster_offset_mask;
iov = (struct iovec) { cluster_offset &= s->cluster_offset_mask;
.iov_base = out_buf, ret = bdrv_pwrite(bs->file, cluster_offset, out_buf, out_len);
.iov_len = out_len, if (ret < 0) {
}; goto fail;
qemu_iovec_init_external(&hd_qiov, &iov, 1); }
ret = bdrv_co_pwritev(bs->file, cluster_offset, out_len, &hd_qiov, 0);
if (ret < 0) {
goto fail;
} }
success:
ret = 0; ret = 0;
fail: fail:
qemu_vfree(buf);
g_free(out_buf); g_free(out_buf);
return ret; return ret;
} }
@@ -1050,7 +1022,7 @@ static BlockDriver bdrv_qcow = {
.bdrv_set_key = qcow_set_key, .bdrv_set_key = qcow_set_key,
.bdrv_make_empty = qcow_make_empty, .bdrv_make_empty = qcow_make_empty,
.bdrv_co_pwritev_compressed = qcow_co_pwritev_compressed, .bdrv_write_compressed = qcow_write_compressed,
.bdrv_get_info = qcow_get_info, .bdrv_get_info = qcow_get_info,
.create_opts = &qcow_create_opts, .create_opts = &qcow_create_opts,
@@ -22,8 +22,6 @@
* THE SOFTWARE. * THE SOFTWARE.
*/ */
/* Needed for CONFIG_MADVISE */
#include "qemu/osdep.h"
#include "block/block_int.h" #include "block/block_int.h"
#include "qemu-common.h" #include "qemu-common.h"
#include "qcow2.h" #include "qcow2.h"
@@ -31,9 +29,9 @@
typedef struct Qcow2CachedTable { typedef struct Qcow2CachedTable {
int64_t offset; int64_t offset;
bool dirty;
uint64_t lru_counter; uint64_t lru_counter;
int ref; int ref;
bool dirty;
} Qcow2CachedTable; } Qcow2CachedTable;
struct Qcow2Cache { struct Qcow2Cache {
@@ -43,85 +41,34 @@ struct Qcow2Cache {
bool depends_on_flush; bool depends_on_flush;
void *table_array; void *table_array;
uint64_t lru_counter; uint64_t lru_counter;
uint64_t cache_clean_lru_counter;
}; };
static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs, static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs,
Qcow2Cache *c, int table) Qcow2Cache *c, int table)
{ {
BDRVQcow2State *s = bs->opaque; BDRVQcowState *s = bs->opaque;
return (uint8_t *) c->table_array + (size_t) table * s->cluster_size; return (uint8_t *) c->table_array + (size_t) table * s->cluster_size;
} }
static inline int qcow2_cache_get_table_idx(BlockDriverState *bs, static inline int qcow2_cache_get_table_idx(BlockDriverState *bs,
Qcow2Cache *c, void *table) Qcow2Cache *c, void *table)
{ {
BDRVQcow2State *s = bs->opaque; BDRVQcowState *s = bs->opaque;
ptrdiff_t table_offset = (uint8_t *) table - (uint8_t *) c->table_array;
int idx = table_offset / s->cluster_size;
assert(idx >= 0 && idx < c->size && table_offset % s->cluster_size == 0);
return idx;
}
static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
int i, int num_tables)
{
#if QEMU_MADV_DONTNEED != QEMU_MADV_INVALID
BDRVQcow2State *s = bs->opaque;
void *t = qcow2_cache_get_table_addr(bs, c, i);
int align = getpagesize();
size_t mem_size = (size_t) s->cluster_size * num_tables;
size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
if (length > 0) {
qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);
}
#endif
}
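`qcow2_cache_table_release` above narrows the table buffer to its largest page-aligned subrange before calling `qemu_madvise(..., QEMU_MADV_DONTNEED)`, since madvise operates on whole pages. A hedged sketch of that range computation follows; the helper and macro names here are illustrative, not QEMU's:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALIGN_UP(x, a)   (((x) + (a) - 1) & ~(uintptr_t)((a) - 1))
#define ALIGN_DOWN(x, a) ((x) & ~(uintptr_t)((a) - 1))

/* Compute the largest page-aligned subrange of [addr, addr + size),
 * mirroring the offset/length computation before madvise() in the hunk
 * above. Returns the aligned length; *off receives the leading gap up
 * to the first page boundary. Names are ours, not QEMU's. */
static size_t aligned_subrange(uintptr_t addr, size_t size, size_t page,
                               size_t *off)
{
    *off = ALIGN_UP(addr, page) - addr;
    if (*off >= size) {
        return 0; /* buffer does not span a full page */
    }
    return ALIGN_DOWN(size - *off, page);
}
```

With a 4096-byte page, a buffer starting at address 4100 with size 10000 yields a leading gap of 4092 bytes and a single reclaimable page of 4096 bytes.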
static inline bool can_clean_entry(Qcow2Cache *c, int i)
{
Qcow2CachedTable *t = &c->entries[i];
return t->ref == 0 && !t->dirty && t->offset != 0 &&
t->lru_counter <= c->cache_clean_lru_counter;
}
void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c)
{
int i = 0;
while (i < c->size) {
int to_clean = 0;
/* Skip the entries that we don't need to clean */
while (i < c->size && !can_clean_entry(c, i)) {
i++;
}
/* And count how many we can clean in a row */
while (i < c->size && can_clean_entry(c, i)) {
c->entries[i].offset = 0;
c->entries[i].lru_counter = 0;
i++;
to_clean++;
}
if (to_clean > 0) {
qcow2_cache_table_release(bs, c, i - to_clean, to_clean);
}
}
c->cache_clean_lru_counter = c->lru_counter;
}
Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
{ {
BDRVQcow2State *s = bs->opaque; BDRVQcowState *s = bs->opaque;
Qcow2Cache *c;
c = g_new0(Qcow2Cache, 1);
c->size = num_tables;
c->entries = g_try_new0(Qcow2CachedTable, num_tables);
c->table_array = qemu_try_blockalign(bs->file->bs, c->table_array = qemu_try_blockalign(bs->file,
(size_t) num_tables * s->cluster_size); (size_t) num_tables * s->cluster_size);
if (!c->entries || !c->table_array) { if (!c->entries || !c->table_array) {
@@ -166,7 +113,7 @@ static int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c)
static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i) static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
{ {
BDRVQcow2State *s = bs->opaque; BDRVQcowState *s = bs->opaque;
int ret = 0; int ret = 0;
if (!c->entries[i].dirty || !c->entries[i].offset) { if (!c->entries[i].dirty || !c->entries[i].offset) {
@@ -179,7 +126,7 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
if (c->depends) { if (c->depends) {
ret = qcow2_cache_flush_dependency(bs, c); ret = qcow2_cache_flush_dependency(bs, c);
} else if (c->depends_on_flush) { } else if (c->depends_on_flush) {
ret = bdrv_flush(bs->file->bs); ret = bdrv_flush(bs->file);
if (ret >= 0) { if (ret >= 0) {
c->depends_on_flush = false; c->depends_on_flush = false;
} }
@@ -221,9 +168,9 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
return 0; return 0;
} }
int qcow2_cache_write(BlockDriverState *bs, Qcow2Cache *c) int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
{ {
BDRVQcow2State *s = bs->opaque; BDRVQcowState *s = bs->opaque;
int result = 0; int result = 0;
int ret; int ret;
int i; int i;
@@ -237,15 +184,8 @@ int qcow2_cache_write(BlockDriverState *bs, Qcow2Cache *c)
} }
} }
return result;
}
int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
{
int result = qcow2_cache_write(bs, c);
if (result == 0) { if (result == 0) {
int ret = bdrv_flush(bs->file->bs); ret = bdrv_flush(bs->file);
if (ret < 0) { if (ret < 0) {
result = ret; result = ret;
} }
@@ -297,8 +237,6 @@ int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
c->entries[i].lru_counter = 0; c->entries[i].lru_counter = 0;
} }
qcow2_cache_table_release(bs, c, 0, c->size);
c->lru_counter = 0; c->lru_counter = 0;
return 0; return 0;
@@ -307,7 +245,7 @@ int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c, static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
uint64_t offset, void **table, bool read_from_disk) uint64_t offset, void **table, bool read_from_disk)
{ {
BDRVQcow2State *s = bs->opaque; BDRVQcowState *s = bs->opaque;
int i; int i;
int ret; int ret;
int lookup_index; int lookup_index;
@@ -357,8 +295,7 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD); BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
} }
ret = bdrv_pread(bs->file, offset, ret = bdrv_pread(bs->file, offset, qcow2_cache_get_table_addr(bs, c, i),
qcow2_cache_get_table_addr(bs, c, i),
s->cluster_size); s->cluster_size);
if (ret < 0) { if (ret < 0) {
return ret; return ret;
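The `qcow2_cache_write`/`qcow2_cache_flush` split visible on the left-hand (`-`) side of the hunk above separates writing back dirty cache entries from syncing the underlying file. A minimal standalone sketch of that idea, using hypothetical names rather than the QEMU API:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the write/flush split: "write" pushes all
 * dirty entries back to the file, "flush" additionally issues a sync. */
typedef struct {
    bool dirty[4];
    int writes;   /* entries written back */
    int syncs;    /* file syncs issued */
} MiniCache;

static int mini_cache_write(MiniCache *c)
{
    for (int i = 0; i < 4; i++) {
        if (c->dirty[i]) {
            c->dirty[i] = false;
            c->writes++;
        }
    }
    return 0;
}

static int mini_cache_flush(MiniCache *c)
{
    int ret = mini_cache_write(c);
    if (ret == 0) {
        c->syncs++;  /* stands in for bdrv_flush() on the image file */
    }
    return ret;
}
```

The point of the split is that callers who only need ordering (write-back before a dependent update) can avoid the cost of a full sync.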

@@ -22,20 +22,17 @@
  * THE SOFTWARE.
  */

-#include "qemu/osdep.h"
 #include <zlib.h>

-#include "qapi/error.h"
 #include "qemu-common.h"
 #include "block/block_int.h"
 #include "block/qcow2.h"
-#include "qemu/bswap.h"
 #include "trace.h"

 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int new_l1_size2, ret, i;
     uint64_t *new_l1_table;
     int64_t old_l1_table_offset, old_l1_size;
@@ -65,8 +62,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
         }
     }

-    QEMU_BUILD_BUG_ON(QCOW_MAX_L1_SIZE > INT_MAX);
-    if (new_l1_size > QCOW_MAX_L1_SIZE / sizeof(uint64_t)) {
+    if (new_l1_size > INT_MAX / sizeof(uint64_t)) {
         return -EFBIG;
     }
@@ -76,16 +72,14 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
 #endif

     new_l1_size2 = sizeof(uint64_t) * new_l1_size;
-    new_l1_table = qemu_try_blockalign(bs->file->bs,
+    new_l1_table = qemu_try_blockalign(bs->file,
                                        align_offset(new_l1_size2, 512));
     if (new_l1_table == NULL) {
         return -ENOMEM;
     }
     memset(new_l1_table, 0, align_offset(new_l1_size2, 512));

-    if (s->l1_size) {
-        memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));
-    }
+    memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));

     /* write new table (align to cluster) */
     BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ALLOC_TABLE);
@@ -111,8 +105,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
     BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
     for(i = 0; i < s->l1_size; i++)
         new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
-    ret = bdrv_pwrite_sync(bs->file, new_l1_table_offset,
-                           new_l1_table, new_l1_size2);
+    ret = bdrv_pwrite_sync(bs->file, new_l1_table_offset, new_l1_table, new_l1_size2);
     if (ret < 0)
         goto fail;
     for(i = 0; i < s->l1_size; i++)
@@ -120,10 +113,9 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
     /* set new table */
     BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ACTIVATE_TABLE);
-    stl_be_p(data, new_l1_size);
+    cpu_to_be32w((uint32_t*)data, new_l1_size);
     stq_be_p(data + 4, new_l1_table_offset);
-    ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size),
-                           data, sizeof(data));
+    ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size), data, sizeof(data));
     if (ret < 0) {
         goto fail;
     }
@@ -156,10 +148,12 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
 static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
                    uint64_t **l2_table)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
+    int ret;

-    return qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
-                           (void **)l2_table);
+    ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table);
+
+    return ret;
 }

 /*
@@ -169,7 +163,7 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
 #define L1_ENTRIES_PER_SECTOR (512 / 8)
 int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t buf[L1_ENTRIES_PER_SECTOR] = { 0 };
     int l1_start_index;
     int i, ret;
@@ -188,9 +182,8 @@ int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
     }

     BLKDBG_EVENT(bs->file, BLKDBG_L1_UPDATE);
-    ret = bdrv_pwrite_sync(bs->file,
-                           s->l1_table_offset + 8 * l1_start_index,
-                           buf, sizeof(buf));
+    ret = bdrv_pwrite_sync(bs->file, s->l1_table_offset + 8 * l1_start_index,
+                           buf, sizeof(buf));
     if (ret < 0) {
         return ret;
     }
@@ -210,7 +203,7 @@ int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
 static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t old_l2_offset;
     uint64_t *l2_table = NULL;
     int64_t l2_offset;
@@ -316,7 +309,7 @@ static int count_contiguous_clusters(int nb_clusters, int cluster_size,
     if (!offset)
         return 0;

-    assert(qcow2_get_cluster_type(first_entry) == QCOW2_CLUSTER_NORMAL);
+    assert(qcow2_get_cluster_type(first_entry) != QCOW2_CLUSTER_COMPRESSED);

     for (i = 0; i < nb_clusters; i++) {
         uint64_t l2_entry = be64_to_cpu(l2_table[i]) & mask;
@@ -328,16 +321,14 @@ static int count_contiguous_clusters(int nb_clusters, int cluster_size,
     return i;
 }

-static int count_contiguous_clusters_by_type(int nb_clusters,
-                                             uint64_t *l2_table,
-                                             int wanted_type)
+static int count_contiguous_free_clusters(int nb_clusters, uint64_t *l2_table)
 {
     int i;

     for (i = 0; i < nb_clusters; i++) {
         int type = qcow2_get_cluster_type(be64_to_cpu(l2_table[i]));

-        if (type != wanted_type) {
+        if (type != QCOW2_CLUSTER_UNALLOCATED) {
             break;
         }
     }
@@ -348,7 +339,7 @@ static int count_contiguous_clusters_by_type(int nb_clusters,
 /* The crypt function is compatible with the linux cryptoloop
    algorithm for < 4 GB images. NOTE: out_buf == in_buf is
    supported */
-int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
+int qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
                           uint8_t *out_buf, const uint8_t *in_buf,
                           int nb_sectors, bool enc,
                           Error **errp)
@@ -391,18 +382,22 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
     return 0;
 }

-static int coroutine_fn do_perform_cow(BlockDriverState *bs,
-                                       uint64_t src_cluster_offset,
-                                       uint64_t cluster_offset,
-                                       int offset_in_cluster,
-                                       int bytes)
+static int coroutine_fn copy_sectors(BlockDriverState *bs,
+                                     uint64_t start_sect,
+                                     uint64_t cluster_offset,
+                                     int n_start, int n_end)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QEMUIOVector qiov;
     struct iovec iov;
-    int ret;
+    int n, ret;

-    iov.iov_len = bytes;
+    n = n_end - n_start;
+    if (n <= 0) {
+        return 0;
+    }
+
+    iov.iov_len = n * BDRV_SECTOR_SIZE;
     iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
     if (iov.iov_base == NULL) {
         return -ENOMEM;
@@ -421,21 +416,17 @@ static int coroutine_fn do_perform_cow(BlockDriverState *bs,
      * interface. This avoids double I/O throttling and request tracking,
      * which can lead to deadlock when block layer copy-on-read is enabled.
      */
-    ret = bs->drv->bdrv_co_preadv(bs, src_cluster_offset + offset_in_cluster,
-                                  bytes, &qiov, 0);
+    ret = bs->drv->bdrv_co_readv(bs, start_sect + n_start, n, &qiov);
     if (ret < 0) {
         goto out;
     }

     if (bs->encrypted) {
         Error *err = NULL;
-        int64_t sector = (src_cluster_offset + offset_in_cluster)
-                         >> BDRV_SECTOR_BITS;
         assert(s->cipher);
-        assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
-        assert((bytes & ~BDRV_SECTOR_MASK) == 0);
-        if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
-                                  bytes >> BDRV_SECTOR_BITS, true, &err) < 0) {
+        if (qcow2_encrypt_sectors(s, start_sect + n_start,
+                                  iov.iov_base, iov.iov_base, n,
+                                  true, &err) < 0) {
             ret = -EIO;
             error_free(err);
             goto out;
@@ -443,14 +434,13 @@ static int coroutine_fn do_perform_cow(BlockDriverState *bs,
     }

     ret = qcow2_pre_write_overlap_check(bs, 0,
-            cluster_offset + offset_in_cluster, bytes);
+            cluster_offset + n_start * BDRV_SECTOR_SIZE, n * BDRV_SECTOR_SIZE);
     if (ret < 0) {
         goto out;
     }

     BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
-    ret = bdrv_co_pwritev(bs->file, cluster_offset + offset_in_cluster,
-                          bytes, &qiov, 0);
+    ret = bdrv_co_writev(bs->file, (cluster_offset >> 9) + n_start, n, &qiov);
     if (ret < 0) {
         goto out;
     }
@@ -465,47 +455,51 @@ out:
 /*
  * get_cluster_offset
  *
- * For a given offset of the virtual disk, find the cluster type and offset in
- * the qcow2 file. The offset is stored in *cluster_offset.
+ * For a given offset of the disk image, find the cluster offset in
+ * qcow2 file. The offset is stored in *cluster_offset.
  *
- * On entry, *bytes is the maximum number of contiguous bytes starting at
- * offset that we are interested in.
+ * on entry, *num is the number of contiguous sectors we'd like to
+ * access following offset.
  *
- * On exit, *bytes is the number of bytes starting at offset that have the same
- * cluster type and (if applicable) are stored contiguously in the image file.
- * Compressed clusters are always returned one by one.
+ * on exit, *num is the number of contiguous sectors we can read.
  *
  * Returns the cluster type (QCOW2_CLUSTER_*) on success, -errno in error
  * cases.
  */
 int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
-                             unsigned int *bytes, uint64_t *cluster_offset)
+                             int *num, uint64_t *cluster_offset)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     unsigned int l2_index;
     uint64_t l1_index, l2_offset, *l2_table;
     int l1_bits, c;
-    unsigned int offset_in_cluster;
-    uint64_t bytes_available, bytes_needed, nb_clusters;
+    unsigned int index_in_cluster, nb_clusters;
+    uint64_t nb_available, nb_needed;
     int ret;

-    offset_in_cluster = offset_into_cluster(s, offset);
-    bytes_needed = (uint64_t) *bytes + offset_in_cluster;
+    index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1);
+    nb_needed = *num + index_in_cluster;

     l1_bits = s->l2_bits + s->cluster_bits;

-    /* compute how many bytes there are between the start of the cluster
-     * containing offset and the end of the l1 entry */
-    bytes_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1))
-                    + offset_in_cluster;
+    /* compute how many bytes there are between the offset and
+     * the end of the l1 entry
+     */
+    nb_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1));
+
+    /* compute the number of available sectors */
+    nb_available = (nb_available >> 9) + index_in_cluster;

-    if (bytes_needed > bytes_available) {
-        bytes_needed = bytes_available;
+    if (nb_needed > nb_available) {
+        nb_needed = nb_available;
     }

+    assert(nb_needed <= INT_MAX);
+
     *cluster_offset = 0;

-    /* seek to the l2 offset in the l1 table */
+    /* seek the the l2 offset in the l1 table */
     l1_index = offset >> l1_bits;
     if (l1_index >= s->l1_size) {
@@ -538,11 +532,8 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
     l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
     *cluster_offset = be64_to_cpu(l2_table[l2_index]);

-    nb_clusters = size_to_clusters(s, bytes_needed);
-    /* bytes_needed <= *bytes + offset_in_cluster, both of which are unsigned
-     * integers; the minimum cluster size is 512, so this assertion is always
-     * true */
-    assert(nb_clusters <= INT_MAX);
+    /* nb_needed <= INT_MAX, thus nb_clusters <= INT_MAX, too */
+    nb_clusters = size_to_clusters(s, nb_needed << 9);

     ret = qcow2_get_cluster_type(*cluster_offset);
     switch (ret) {
@@ -559,14 +550,13 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
             ret = -EIO;
             goto fail;
         }
-        c = count_contiguous_clusters_by_type(nb_clusters, &l2_table[l2_index],
-                                              QCOW2_CLUSTER_ZERO);
+        c = count_contiguous_clusters(nb_clusters, s->cluster_size,
+                                      &l2_table[l2_index], QCOW_OFLAG_ZERO);
         *cluster_offset = 0;
         break;
     case QCOW2_CLUSTER_UNALLOCATED:
         /* how many empty clusters ? */
-        c = count_contiguous_clusters_by_type(nb_clusters, &l2_table[l2_index],
-                                              QCOW2_CLUSTER_UNALLOCATED);
+        c = count_contiguous_free_clusters(nb_clusters, &l2_table[l2_index]);
         *cluster_offset = 0;
         break;
     case QCOW2_CLUSTER_NORMAL:
@@ -589,18 +579,13 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
     qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);

-    bytes_available = (int64_t)c * s->cluster_size;
+    nb_available = (c * s->cluster_sectors);

 out:
-    if (bytes_available > bytes_needed) {
-        bytes_available = bytes_needed;
-    }
+    if (nb_available > nb_needed)
+        nb_available = nb_needed;

-    /* bytes_available <= bytes_needed <= *bytes + offset_in_cluster;
-     * subtracting offset_in_cluster will therefore definitely yield something
-     * not exceeding UINT_MAX */
-    assert(bytes_available - offset_in_cluster <= UINT_MAX);
-    *bytes = bytes_available - offset_in_cluster;
+    *num = nb_available - index_in_cluster;

     return ret;
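The sector-based clamping on the right-hand (`+`) side above bounds the request so it never crosses the end of the current L1 entry. The arithmetic can be sketched standalone; `clamp_sectors` and its parameters are illustrative, not QEMU's API, and 512-byte sectors are assumed:

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a sector count the way the sector-based qcow2_get_cluster_offset
 * does: the request may not extend past the end of the L1 entry that
 * covers `offset`. cluster_bits/l2_bits are illustrative parameters. */
static uint64_t clamp_sectors(uint64_t offset, uint64_t num,
                              int cluster_bits, int l2_bits)
{
    int l1_bits = l2_bits + cluster_bits;
    uint64_t cluster_sectors = 1ULL << (cluster_bits - 9);
    uint64_t index_in_cluster = (offset >> 9) & (cluster_sectors - 1);
    uint64_t nb_needed = num + index_in_cluster;

    /* bytes from offset to the end of the L1 entry, expressed as sectors,
     * counted from the start of the containing cluster */
    uint64_t nb_available =
        (((1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1))) >> 9)
        + index_in_cluster;

    if (nb_needed > nb_available) {
        nb_needed = nb_available;
    }
    return nb_needed - index_in_cluster;  /* sectors actually usable */
}
```

With 64 KiB clusters (`cluster_bits = 16`) and `l2_bits = 13`, a request two sectors before an L1-entry boundary is cut down to two sectors, while a request in the middle of the entry passes through unchanged.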
@@ -624,13 +609,13 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
                              uint64_t **new_l2_table,
                              int *new_l2_index)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     unsigned int l2_index;
     uint64_t l1_index, l2_offset;
     uint64_t *l2_table = NULL;
     int ret;

-    /* seek to the l2 offset in the l1 table */
+    /* seek the the l2 offset in the l1 table */
     l1_index = offset >> (s->l2_bits + s->cluster_bits);
     if (l1_index >= s->l1_size) {
@@ -698,7 +683,7 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
                                                uint64_t offset,
                                                int compressed_size)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int l2_index, ret;
     uint64_t *l2_table;
     int64_t cluster_offset;
@@ -743,15 +728,17 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
 static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int ret;

-    if (r->nb_bytes == 0) {
+    if (r->nb_sectors == 0) {
         return 0;
     }

     qemu_co_mutex_unlock(&s->lock);
-    ret = do_perform_cow(bs, m->offset, m->alloc_offset, r->offset, r->nb_bytes);
+    ret = copy_sectors(bs, m->offset / BDRV_SECTOR_SIZE, m->alloc_offset,
+                       r->offset / BDRV_SECTOR_SIZE,
+                       r->offset / BDRV_SECTOR_SIZE + r->nb_sectors);
     qemu_co_mutex_lock(&s->lock);

     if (ret < 0) {
@@ -770,7 +757,7 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
 int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int i, j = 0, l2_index, ret;
     uint64_t *old_cluster, *l2_table;
     uint64_t cluster_offset = m->alloc_offset;
@@ -813,14 +800,13 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
     assert(l2_index + m->nb_clusters <= s->l2_size);
     for (i = 0; i < m->nb_clusters; i++) {
         /* if two concurrent writes happen to the same unallocated cluster
          * each write allocates separate cluster and writes data concurrently.
          * The first one to complete updates l2 table with pointer to its
          * cluster the second one has to do RMW (which is done above by
-         * perform_cow()), update l2 table with its cluster pointer and free
+         * copy_sectors()), update l2 table with its cluster pointer and free
          * old cluster. This is what this loop does */
-        if (l2_table[l2_index + i] != 0) {
+        if(l2_table[l2_index + i] != 0)
             old_cluster[j++] = l2_table[l2_index + i];
-        }

         l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
                     (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
@@ -831,6 +817,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
     /*
      * If this was a COW, we need to decrease the refcount of the old cluster.
+     * Also flush bs->file to get the right order for L2 and refcount update.
      *
      * Don't discard clusters that reach a refcount of 0 (e.g. compressed
      * clusters), the next write will reuse them anyway.
@@ -853,7 +840,7 @@ err:
  * write, but require COW to be performed (this includes yet unallocated space,
  * which must copy from the backing file)
  */
-static int count_cow_clusters(BDRVQcow2State *s, int nb_clusters,
+static int count_cow_clusters(BDRVQcowState *s, int nb_clusters,
     uint64_t *l2_table, int l2_index)
 {
     int i;
@@ -899,7 +886,7 @@ out:
 static int handle_dependencies(BlockDriverState *bs, uint64_t guest_offset,
     uint64_t *cur_bytes, QCowL2Meta **m)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QCowL2Meta *old_alloc;
     uint64_t bytes = *cur_bytes;
@@ -972,7 +959,7 @@ static int handle_dependencies(BlockDriverState *bs, uint64_t guest_offset,
 static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
     uint64_t *host_offset, uint64_t *bytes, QCowL2Meta **m)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int l2_index;
     uint64_t cluster_offset;
     uint64_t *l2_table;
@@ -1080,7 +1067,7 @@ out:
 static int do_alloc_cluster_offset(BlockDriverState *bs, uint64_t guest_offset,
                                    uint64_t *host_offset, uint64_t *nb_clusters)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;

     trace_qcow2_do_alloc_clusters_offset(qemu_coroutine_self(), guest_offset,
                                          *host_offset, *nb_clusters);
@@ -1128,7 +1115,7 @@ static int do_alloc_cluster_offset(BlockDriverState *bs, uint64_t guest_offset,
 static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     uint64_t *host_offset, uint64_t *bytes, QCowL2Meta **m)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int l2_index;
     uint64_t *l2_table;
     uint64_t entry;
@@ -1202,20 +1189,25 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     /*
      * Save info needed for meta data update.
      *
-     * requested_bytes: Number of bytes from the start of the first
+     * requested_sectors: Number of sectors from the start of the first
      * newly allocated cluster to the end of the (possibly shortened
      * before) write request.
      *
-     * avail_bytes: Number of bytes from the start of the first
+     * avail_sectors: Number of sectors from the start of the first
      * newly allocated to the end of the last newly allocated cluster.
      *
-     * nb_bytes: The number of bytes from the start of the first
+     * nb_sectors: The number of sectors from the start of the first
      * newly allocated cluster to the end of the area that the write
      * request actually writes to (excluding COW at the end)
      */
-    uint64_t requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
-    int avail_bytes = MIN(INT_MAX, nb_clusters << s->cluster_bits);
-    int nb_bytes = MIN(requested_bytes, avail_bytes);
+    int requested_sectors =
+        (*bytes + offset_into_cluster(s, guest_offset))
+        >> BDRV_SECTOR_BITS;
+    int avail_sectors = nb_clusters
+                        << (s->cluster_bits - BDRV_SECTOR_BITS);
+    int alloc_n_start = offset_into_cluster(s, guest_offset)
+                        >> BDRV_SECTOR_BITS;
+    int nb_sectors = MIN(requested_sectors, avail_sectors);

     QCowL2Meta *old_m = *m;

     *m = g_malloc0(sizeof(**m));
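The bookkeeping in the comment above (requested/avail/nb counts, with the remainder forming the COW tail) can be re-derived in isolation. The names and `plan_alloc` helper below are illustrative only, following the sector-based (`+`) side's definitions with 512-byte sectors assumed:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS 9

/* Illustrative re-derivation of the handle_alloc() bookkeeping: given the
 * byte count of a write and its offset into the first newly allocated
 * cluster, compute how many sectors it touches and how many trailing
 * sectors of the allocation must be copied (the COW tail). */
typedef struct {
    int nb_sectors;        /* sectors the write actually touches */
    int cow_end_sectors;   /* sectors to copy at the end */
} AllocPlan;

static AllocPlan plan_alloc(uint64_t bytes, uint64_t offset_in_cluster,
                            int nb_clusters, int cluster_bits)
{
    int requested_sectors = (bytes + offset_in_cluster) >> SECTOR_BITS;
    int avail_sectors = nb_clusters << (cluster_bits - SECTOR_BITS);
    int nb_sectors = requested_sectors < avail_sectors
                   ? requested_sectors : avail_sectors;

    return (AllocPlan) {
        .nb_sectors = nb_sectors,
        .cow_end_sectors = avail_sectors - nb_sectors,
    };
}
```

For example, a 4 KiB write at the start of a freshly allocated 64 KiB cluster touches 8 sectors and leaves a 120-sector COW tail, while a full-cluster write leaves no tail at all.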
@@ -1226,21 +1218,23 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
         .alloc_offset   = alloc_cluster_offset,
         .offset         = start_of_cluster(s, guest_offset),
         .nb_clusters    = nb_clusters,
+        .nb_available   = nb_sectors,

         .cow_start = {
             .offset     = 0,
-            .nb_bytes   = offset_into_cluster(s, guest_offset),
+            .nb_sectors = alloc_n_start,
         },
         .cow_end = {
-            .offset     = nb_bytes,
-            .nb_bytes   = avail_bytes - nb_bytes,
+            .offset     = nb_sectors * BDRV_SECTOR_SIZE,
+            .nb_sectors = avail_sectors - nb_sectors,
         },
     };
     qemu_co_queue_init(&(*m)->dependent_requests);
     QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);

     *host_offset = alloc_cluster_offset + offset_into_cluster(s, guest_offset);
-    *bytes = MIN(*bytes, nb_bytes - offset_into_cluster(s, guest_offset));
+    *bytes = MIN(*bytes, (nb_sectors * BDRV_SECTOR_SIZE)
+                         - offset_into_cluster(s, guest_offset));
     assert(*bytes != 0);

     return 1;
@@ -1272,20 +1266,21 @@ fail:
  * Return 0 on success and -errno in error cases
  */
 int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
-                               unsigned int *bytes, uint64_t *host_offset,
-                               QCowL2Meta **m)
+                               int *num, uint64_t *host_offset, QCowL2Meta **m)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t start, remaining;
     uint64_t cluster_offset;
     uint64_t cur_bytes;
     int ret;

-    trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset, *bytes);
+    trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset, *num);
+
+    assert((offset & ~BDRV_SECTOR_MASK) == 0);

 again:
     start = offset;
-    remaining = *bytes;
+    remaining = (uint64_t)*num << BDRV_SECTOR_BITS;
     cluster_offset = 0;
     *host_offset = 0;
     cur_bytes = 0;
@@ -1371,8 +1366,8 @@ again:
         }
     }

-    *bytes -= remaining;
-    assert(*bytes > 0);
+    *num -= remaining >> BDRV_SECTOR_BITS;
+    assert(*num > 0);
     assert(*host_offset != 0);

     return 0;
@@ -1407,7 +1402,7 @@ static int decompress_buffer(uint8_t *out_buf, int out_buf_size,
 int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int ret, csize, nb_csectors, sector_offset;
     uint64_t coffset;
@@ -1417,8 +1412,7 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
         sector_offset = coffset & 511;
         csize = nb_csectors * 512 - sector_offset;
         BLKDBG_EVENT(bs->file, BLKDBG_READ_COMPRESSED);
-        ret = bdrv_read(bs->file, coffset >> 9, s->cluster_data,
-                        nb_csectors);
+        ret = bdrv_read(bs->file, coffset >> 9, s->cluster_data, nb_csectors);
         if (ret < 0) {
             return ret;
         }
@@ -1440,7 +1434,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
                              uint64_t nb_clusters, enum qcow2_discard_type type,
                              bool full_discard)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t *l2_table;
     int l2_index;
     int ret;
@@ -1475,7 +1469,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
          */
         switch (qcow2_get_cluster_type(old_l2_entry)) {
         case QCOW2_CLUSTER_UNALLOCATED:
-            if (full_discard || !bs->backing) {
+            if (full_discard || !bs->backing_hd) {
                 continue;
             }
             break;
@@ -1514,7 +1508,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
 int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
     int nb_sectors, enum qcow2_discard_type type, bool full_discard)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t end_offset;
     uint64_t nb_clusters;
     int ret;
@@ -1560,7 +1554,7 @@ fail:
 static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
                           uint64_t nb_clusters)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t *l2_table;
     int l2_index;
     int ret;
@@ -1597,7 +1591,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
 int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t nb_clusters;
     int ret;
@@ -1640,10 +1634,9 @@ fail:
 static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
                                       int l1_size, int64_t *visited_l1_entries,
                                       int64_t l1_entries,
-                                      BlockDriverAmendStatusCB *status_cb,
-                                      void *cb_opaque)
+                                      BlockDriverAmendStatusCB *status_cb)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     bool is_active_l1 = (l1_table == s->l1_table);
     uint64_t *l2_table = NULL;
     int ret;
@@ -1652,7 +1645,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
     if (!is_active_l1) {
         /* inactive L2 tables require a buffer to be stored in when loading
          * them from disk */
-        l2_table = qemu_try_blockalign(bs->file->bs, s->cluster_size);
+        l2_table = qemu_try_blockalign(bs->file, s->cluster_size);
         if (l2_table == NULL) {
             return -ENOMEM;
         }
@@ -1667,7 +1660,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
/* unallocated */ /* unallocated */
(*visited_l1_entries)++; (*visited_l1_entries)++;
if (status_cb) { if (status_cb) {
status_cb(bs, *visited_l1_entries, l1_entries, cb_opaque); status_cb(bs, *visited_l1_entries, l1_entries);
} }
continue; continue;
} }
@@ -1687,7 +1680,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
} else { } else {
/* load inactive L2 tables from disk */ /* load inactive L2 tables from disk */
ret = bdrv_read(bs->file, l2_offset / BDRV_SECTOR_SIZE, ret = bdrv_read(bs->file, l2_offset / BDRV_SECTOR_SIZE,
(void *)l2_table, s->cluster_sectors); (void *)l2_table, s->cluster_sectors);
} }
if (ret < 0) { if (ret < 0) {
goto fail; goto fail;
@@ -1710,7 +1703,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
} }
if (!preallocated) { if (!preallocated) {
if (!bs->backing) { if (!bs->backing_hd) {
/* not backed; therefore we can simply deallocate the /* not backed; therefore we can simply deallocate the
* cluster */ * cluster */
l2_table[j] = 0; l2_table[j] = 0;
@@ -1761,7 +1754,8 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
                 goto fail;
             }
-            ret = bdrv_pwrite_zeroes(bs->file, offset, s->cluster_size, 0);
+            ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE,
+                                    s->cluster_sectors, 0);
             if (ret < 0) {
                 if (!preallocated) {
                     qcow2_free_clusters(bs, offset, s->cluster_size,
@@ -1794,7 +1788,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
         }
         ret = bdrv_write(bs->file, l2_offset / BDRV_SECTOR_SIZE,
                          (void *)l2_table, s->cluster_sectors);
         if (ret < 0) {
             goto fail;
         }
@@ -1803,7 +1797,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
         (*visited_l1_entries)++;
         if (status_cb) {
-            status_cb(bs, *visited_l1_entries, l1_entries, cb_opaque);
+            status_cb(bs, *visited_l1_entries, l1_entries);
         }
     }
@@ -1827,10 +1821,9 @@ fail:
  * qcow2 version which doesn't yet support metadata zero clusters.
  */
 int qcow2_expand_zero_clusters(BlockDriverState *bs,
-                               BlockDriverAmendStatusCB *status_cb,
-                               void *cb_opaque)
+                               BlockDriverAmendStatusCB *status_cb)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t *l1_table = NULL;
     int64_t l1_entries = 0, visited_l1_entries = 0;
     int ret;
@@ -1845,7 +1838,7 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,
     ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
                                      &visited_l1_entries, l1_entries,
-                                     status_cb, cb_opaque);
+                                     status_cb);
     if (ret < 0) {
         goto fail;
     }
@@ -1863,14 +1856,13 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,
     }
     for (i = 0; i < s->nb_snapshots; i++) {
-        int l1_sectors = DIV_ROUND_UP(s->snapshots[i].l1_size *
-                                      sizeof(uint64_t), BDRV_SECTOR_SIZE);
+        int l1_sectors = (s->snapshots[i].l1_size * sizeof(uint64_t) +
+                          BDRV_SECTOR_SIZE - 1) / BDRV_SECTOR_SIZE;
         l1_table = g_realloc(l1_table, l1_sectors * BDRV_SECTOR_SIZE);
-        ret = bdrv_read(bs->file,
-                        s->snapshots[i].l1_table_offset / BDRV_SECTOR_SIZE,
-                        (void *)l1_table, l1_sectors);
+        ret = bdrv_read(bs->file, s->snapshots[i].l1_table_offset /
+                        BDRV_SECTOR_SIZE, (void *)l1_table, l1_sectors);
         if (ret < 0) {
             goto fail;
         }
@@ -1881,7 +1873,7 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,
     ret = expand_zero_clusters_in_l1(bs, l1_table, s->snapshots[i].l1_size,
                                      &visited_l1_entries, l1_entries,
-                                     status_cb, cb_opaque);
+                                     status_cb);
     if (ret < 0) {
         goto fail;
     }


@@ -22,13 +22,10 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "qemu-common.h"
 #include "block/block_int.h"
 #include "block/qcow2.h"
 #include "qemu/range.h"
-#include "qemu/bswap.h"
 static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
@@ -85,7 +82,7 @@ static Qcow2SetRefcountFunc *const set_refcount_funcs[] = {
 int qcow2_refcount_init(BlockDriverState *bs)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     unsigned int refcount_table_size2, i;
     int ret;
@@ -119,7 +116,7 @@ int qcow2_refcount_init(BlockDriverState *bs)
 void qcow2_refcount_close(BlockDriverState *bs)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     g_free(s->refcount_table);
 }
@@ -217,11 +214,14 @@ static int load_refcount_block(BlockDriverState *bs,
                                int64_t refcount_block_offset,
                                void **refcount_block)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
+    int ret;
     BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_LOAD);
-    return qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
-                           refcount_block);
+    ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
+                          refcount_block);
+    return ret;
 }
 /*
@@ -231,7 +231,7 @@ static int load_refcount_block(BlockDriverState *bs,
 int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index,
                        uint64_t *refcount)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t refcount_table_index, block_index;
     int64_t refcount_block_offset;
     int ret;
@@ -274,7 +274,7 @@ int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index,
  * Rounds the refcount table size up to avoid growing the table for each single
  * refcount block that is allocated.
  */
-static unsigned int next_refcount_table_size(BDRVQcow2State *s,
+static unsigned int next_refcount_table_size(BDRVQcowState *s,
                                              unsigned int min_size)
 {
     unsigned int min_clusters = (min_size >> (s->cluster_bits - 3)) + 1;
@@ -290,7 +290,7 @@ static unsigned int next_refcount_table_size(BDRVQcow2State *s,
 /* Checks if two offsets are described by the same refcount block */
-static int in_same_refcount_block(BDRVQcow2State *s, uint64_t offset_a,
+static int in_same_refcount_block(BDRVQcowState *s, uint64_t offset_a,
                                   uint64_t offset_b)
 {
     uint64_t block_a = offset_a >> (s->cluster_bits + s->refcount_block_bits);
@@ -308,7 +308,7 @@ static int in_same_refcount_block(BDRVQcow2State *s, uint64_t offset_a,
 static int alloc_refcount_block(BlockDriverState *bs,
                                 int64_t cluster_index, void **refcount_block)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     unsigned int refcount_table_index;
     int ret;
@@ -487,12 +487,14 @@ static int alloc_refcount_block(BlockDriverState *bs,
         uint64_t table_clusters =
             size_to_clusters(s, table_size * sizeof(uint64_t));
         blocks_clusters = 1 +
-            DIV_ROUND_UP(table_clusters, s->refcount_block_size);
+            ((table_clusters + s->refcount_block_size - 1)
+             / s->refcount_block_size);
         uint64_t meta_clusters = table_clusters + blocks_clusters;
         last_table_size = table_size;
         table_size = next_refcount_table_size(s, blocks_used +
-            DIV_ROUND_UP(meta_clusters, s->refcount_block_size));
+            ((meta_clusters + s->refcount_block_size - 1)
+             / s->refcount_block_size));
     } while (last_table_size != table_size);
@@ -558,16 +560,12 @@ static int alloc_refcount_block(BlockDriverState *bs,
     }
     /* Hook up the new refcount table in the qcow2 header */
-    struct QEMU_PACKED {
-        uint64_t d64;
-        uint32_t d32;
-    } data;
-    data.d64 = cpu_to_be64(table_offset);
-    data.d32 = cpu_to_be32(table_clusters);
+    uint8_t data[12];
+    cpu_to_be64w((uint64_t*)data, table_offset);
+    cpu_to_be32w((uint32_t*)(data + 8), table_clusters);
     BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_ALLOC_SWITCH_TABLE);
-    ret = bdrv_pwrite_sync(bs->file,
-                           offsetof(QCowHeader, refcount_table_offset),
-                           &data, sizeof(data));
+    ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, refcount_table_offset),
+                           data, sizeof(data));
     if (ret < 0) {
         goto fail_table;
     }
@@ -607,7 +605,7 @@ fail_block:
 void qcow2_process_discards(BlockDriverState *bs, int ret)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     Qcow2DiscardRegion *d, *next;
     QTAILQ_FOREACH_SAFE(d, &s->discards, next, next) {
@@ -615,7 +613,9 @@ void qcow2_process_discards(BlockDriverState *bs, int ret)
         /* Discard is optional, ignore the return value */
         if (ret >= 0) {
-            bdrv_pdiscard(bs->file->bs, d->offset, d->bytes);
+            bdrv_discard(bs->file,
+                         d->offset >> BDRV_SECTOR_BITS,
+                         d->bytes >> BDRV_SECTOR_BITS);
         }
         g_free(d);
@@ -625,7 +625,7 @@ void qcow2_process_discards(BlockDriverState *bs, int ret)
 static void update_refcount_discard(BlockDriverState *bs,
                                     uint64_t offset, uint64_t length)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     Qcow2DiscardRegion *d, *p, *next;
     QTAILQ_FOREACH(d, &s->discards, next) {
@@ -682,7 +682,7 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
                                                    bool decrease,
                                                    enum qcow2_discard_type type)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int64_t start, last, cluster_offset;
     void *refcount_block = NULL;
     int64_t old_table_index = -1;
@@ -793,7 +793,7 @@ int qcow2_update_cluster_refcount(BlockDriverState *bs,
                                   uint64_t addend, bool decrease,
                                   enum qcow2_discard_type type)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int ret;
     ret = update_refcount(bs, cluster_index << s->cluster_bits, 1, addend,
@@ -815,7 +815,7 @@ int qcow2_update_cluster_refcount(BlockDriverState *bs,
 /* return < 0 if error */
 static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t i, nb_clusters, refcount;
     int ret;
@@ -878,7 +878,7 @@ int64_t qcow2_alloc_clusters(BlockDriverState *bs, uint64_t size)
 int64_t qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
                                 int64_t nb_clusters)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t cluster_index, refcount;
     uint64_t i;
     int ret;
@@ -916,7 +916,7 @@ int64_t qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
    contiguous sectors. size must be <= cluster_size */
 int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int64_t offset;
     size_t free_in_cluster;
     int ret;
@@ -949,17 +949,11 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
         if (!offset || ROUND_UP(offset, s->cluster_size) != new_cluster) {
             offset = new_cluster;
-            free_in_cluster = s->cluster_size;
-        } else {
-            free_in_cluster += s->cluster_size;
         }
     }
     assert(offset);
     ret = update_refcount(bs, offset, size, 1, false, QCOW2_DISCARD_NEVER);
-    if (ret < 0) {
-        offset = 0;
-    }
 } while (ret == -EAGAIN);
 if (ret < 0) {
     return ret;
@@ -998,7 +992,7 @@ void qcow2_free_clusters(BlockDriverState *bs,
 void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
                              int nb_clusters, enum qcow2_discard_type type)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     switch (qcow2_get_cluster_type(l2_entry)) {
     case QCOW2_CLUSTER_COMPRESSED:
@@ -1042,7 +1036,7 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
 int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                                    int64_t l1_table_offset, int l1_size, int addend)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t *l1_table, *l2_table, l2_offset, offset, l1_size2, refcount;
     bool l1_allocated = false;
     int64_t old_offset, old_l2_offset;
@@ -1221,8 +1215,7 @@ fail:
             cpu_to_be64s(&l1_table[i]);
         }
-        ret = bdrv_pwrite_sync(bs->file, l1_table_offset,
-                               l1_table, l1_size2);
+        ret = bdrv_pwrite_sync(bs->file, l1_table_offset, l1_table, l1_size2);
         for (i = 0; i < l1_size; i++) {
             be64_to_cpus(&l1_table[i]);
@@ -1240,7 +1233,7 @@ fail:
 /* refcount checking functions */
-static uint64_t refcount_array_byte_size(BDRVQcow2State *s, uint64_t entries)
+static size_t refcount_array_byte_size(BDRVQcowState *s, uint64_t entries)
 {
     /* This assertion holds because there is no way we can address more than
      * 2^(64 - 9) clusters at once (with cluster size 512 = 2^9, and because
@@ -1263,7 +1256,7 @@ static uint64_t refcount_array_byte_size(BDRVQcow2State *s, uint64_t entries)
  * refcount array buffer will be aligned to a cluster boundary, and the newly
  * allocated area will be zeroed.
  */
-static int realloc_refcount_array(BDRVQcow2State *s, void **array,
+static int realloc_refcount_array(BDRVQcowState *s, void **array,
                                   int64_t *size, int64_t new_size)
 {
     int64_t old_byte_size, new_byte_size;
@@ -1305,7 +1298,7 @@ static int realloc_refcount_array(BDRVQcow2State *s, void **array,
 /*
  * Increases the refcount for a range of clusters in a given refcount table.
  * This is used to construct a temporary refcount table out of L1 and L2 tables
- * which can be compared to the refcount table saved in the image.
+ * which can be compared the the refcount table saved in the image.
  *
  * Modifies the number of errors in res.
  */
@@ -1315,7 +1308,7 @@ static int inc_refcounts(BlockDriverState *bs,
                          int64_t *refcount_table_size,
                          int64_t offset, int64_t size)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t start, last, cluster_offset, k, refcount;
     int ret;
@@ -1341,9 +1334,6 @@ static int inc_refcounts(BlockDriverState *bs,
         if (refcount == s->refcount_max) {
             fprintf(stderr, "ERROR: overflow cluster offset=0x%" PRIx64
                     "\n", cluster_offset);
-            fprintf(stderr, "Use qemu-img amend to increase the refcount entry "
-                    "width or qemu-img convert to create a clean copy if the "
-                    "image cannot be opened for writing\n");
             res->corruptions++;
             continue;
         }
@@ -1371,7 +1361,7 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
                               int64_t *refcount_table_size, int64_t l2_offset,
                               int flags)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t *l2_table, l2_entry;
     uint64_t next_contiguous_offset = 0;
     int i, l2_size, nb_csectors, ret;
@@ -1491,7 +1481,7 @@ static int check_refcounts_l1(BlockDriverState *bs,
                               int64_t l1_table_offset, int l1_size,
                               int flags)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t *l1_table = NULL, l2_offset, l1_size2;
     int i, ret;
@@ -1568,7 +1558,7 @@ fail:
 static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
                               BdrvCheckMode fix)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     uint64_t *l2_table = qemu_blockalign(bs, s->cluster_size);
     int ret;
     uint64_t refcount;
@@ -1662,8 +1652,7 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
                 goto fail;
             }
-            ret = bdrv_pwrite(bs->file, l2_offset, l2_table,
-                              s->cluster_size);
+            ret = bdrv_pwrite(bs->file, l2_offset, l2_table, s->cluster_size);
             if (ret < 0) {
                 fprintf(stderr, "ERROR: Could not write L2 table: %s\n",
                         strerror(-ret));
@@ -1688,7 +1677,7 @@ static int check_refblocks(BlockDriverState *bs, BdrvCheckResult *res,
                            BdrvCheckMode fix, bool *rebuild,
                            void **refcount_table, int64_t *nb_clusters)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int64_t i, size;
     int ret;
@@ -1718,11 +1707,11 @@ static int check_refblocks(BlockDriverState *bs, BdrvCheckResult *res,
                 goto resize_fail;
             }
-            ret = bdrv_truncate(bs->file->bs, offset + s->cluster_size);
+            ret = bdrv_truncate(bs->file, offset + s->cluster_size);
             if (ret < 0) {
                 goto resize_fail;
             }
-            size = bdrv_getlength(bs->file->bs);
+            size = bdrv_getlength(bs->file);
             if (size < 0) {
                 ret = size;
                 goto resize_fail;
@@ -1791,7 +1780,7 @@ static int calculate_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
                                BdrvCheckMode fix, bool *rebuild,
                                void **refcount_table, int64_t *nb_clusters)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int64_t i;
     QCowSnapshot *sn;
     int ret;
@@ -1855,7 +1844,7 @@ static void compare_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
                               int64_t *highest_cluster,
                               void *refcount_table, int64_t nb_clusters)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int64_t i;
     uint64_t refcount1, refcount2;
     int ret;
@@ -1932,7 +1921,7 @@ static int64_t alloc_clusters_imrt(BlockDriverState *bs,
                                  int64_t *imrt_nb_clusters,
                                  int64_t *first_free_cluster)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int64_t cluster = *first_free_cluster, i;
     bool first_gap = true;
     int contiguous_free_clusters;
@@ -2002,7 +1991,7 @@ static int rebuild_refcount_structure(BlockDriverState *bs,
                                       void **refcount_table,
                                       int64_t *nb_clusters)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int64_t first_free_cluster = 0, reftable_offset = -1, cluster = 0;
     int64_t refblock_offset, refblock_start, refblock_index;
     uint32_t reftable_size = 0;
@@ -2153,11 +2142,12 @@ write_refblocks:
     }
     /* Enter new reftable into the image header */
-    reftable_offset_and_clusters.reftable_offset = cpu_to_be64(reftable_offset);
-    reftable_offset_and_clusters.reftable_clusters =
-        cpu_to_be32(size_to_clusters(s, reftable_size * sizeof(uint64_t)));
-    ret = bdrv_pwrite_sync(bs->file,
-                           offsetof(QCowHeader, refcount_table_offset),
+    cpu_to_be64w(&reftable_offset_and_clusters.reftable_offset,
+                 reftable_offset);
+    cpu_to_be32w(&reftable_offset_and_clusters.reftable_clusters,
+                 size_to_clusters(s, reftable_size * sizeof(uint64_t)));
+    ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader,
+                                              refcount_table_offset),
                            &reftable_offset_and_clusters,
                            sizeof(reftable_offset_and_clusters));
     if (ret < 0) {
@@ -2188,14 +2178,14 @@ fail:
 int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
                           BdrvCheckMode fix)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     BdrvCheckResult pre_compare_res;
     int64_t size, highest_cluster, nb_clusters;
     void *refcount_table = NULL;
     bool rebuild = false;
     int ret;
-    size = bdrv_getlength(bs->file->bs);
+    size = bdrv_getlength(bs->file);
     if (size < 0) {
         res->check_errors++;
         return size;
@@ -2325,7 +2315,7 @@ fail:
 int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
                                  int64_t size)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int chk = s->overlap_check & ~ign;
     int i, j;
@@ -2465,450 +2455,3 @@ int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
     return 0;
 }
/* A pointer to a function of this type is given to walk_over_reftable(). That
* function will create refblocks and pass them to a RefblockFinishOp once they
* are completed (@refblock). @refblock_empty is set if the refblock is
* completely empty.
*
* Along with the refblock, a corresponding reftable entry is passed, in the
* reftable @reftable (which may be reallocated) at @reftable_index.
*
* @allocated should be set to true if a new cluster has been allocated.
*/
typedef int (RefblockFinishOp)(BlockDriverState *bs, uint64_t **reftable,
uint64_t reftable_index, uint64_t *reftable_size,
void *refblock, bool refblock_empty,
bool *allocated, Error **errp);
/**
* This "operation" for walk_over_reftable() allocates the refblock on disk (if
* it is not empty) and inserts its offset into the new reftable. The size of
* this new reftable is increased as required.
*/
static int alloc_refblock(BlockDriverState *bs, uint64_t **reftable,
uint64_t reftable_index, uint64_t *reftable_size,
void *refblock, bool refblock_empty, bool *allocated,
Error **errp)
{
BDRVQcow2State *s = bs->opaque;
int64_t offset;
if (!refblock_empty && reftable_index >= *reftable_size) {
uint64_t *new_reftable;
uint64_t new_reftable_size;
new_reftable_size = ROUND_UP(reftable_index + 1,
s->cluster_size / sizeof(uint64_t));
if (new_reftable_size > QCOW_MAX_REFTABLE_SIZE / sizeof(uint64_t)) {
error_setg(errp,
"This operation would make the refcount table grow "
"beyond the maximum size supported by QEMU, aborting");
return -ENOTSUP;
}
new_reftable = g_try_realloc(*reftable, new_reftable_size *
sizeof(uint64_t));
if (!new_reftable) {
error_setg(errp, "Failed to increase reftable buffer size");
return -ENOMEM;
}
memset(new_reftable + *reftable_size, 0,
(new_reftable_size - *reftable_size) * sizeof(uint64_t));
*reftable = new_reftable;
*reftable_size = new_reftable_size;
}
if (!refblock_empty && !(*reftable)[reftable_index]) {
offset = qcow2_alloc_clusters(bs, s->cluster_size);
if (offset < 0) {
error_setg_errno(errp, -offset, "Failed to allocate refblock");
return offset;
}
(*reftable)[reftable_index] = offset;
*allocated = true;
}
return 0;
}
/**
* This "operation" for walk_over_reftable() writes the refblock to disk at the
* offset specified by the new reftable's entry. It does not modify the new
* reftable or change any refcounts.
*/
static int flush_refblock(BlockDriverState *bs, uint64_t **reftable,
uint64_t reftable_index, uint64_t *reftable_size,
void *refblock, bool refblock_empty, bool *allocated,
Error **errp)
{
BDRVQcow2State *s = bs->opaque;
int64_t offset;
int ret;
if (reftable_index < *reftable_size && (*reftable)[reftable_index]) {
offset = (*reftable)[reftable_index];
ret = qcow2_pre_write_overlap_check(bs, 0, offset, s->cluster_size);
if (ret < 0) {
error_setg_errno(errp, -ret, "Overlap check failed");
return ret;
}
ret = bdrv_pwrite(bs->file, offset, refblock, s->cluster_size);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to write refblock");
return ret;
}
} else {
assert(refblock_empty);
}
return 0;
}
/**
* This function walks over the existing reftable and every referenced refblock;
* if @new_set_refcount is non-NULL, it is called for every refcount entry to
* create an equal new entry in the passed @new_refblock. Once that
* @new_refblock is completely filled, @operation will be called.
*
* @status_cb and @cb_opaque are used for the amend operation's status callback.
* @index is the index of the walk_over_reftable() calls and @total is the total
* number of walk_over_reftable() calls per amend operation. Both are used for
* calculating the parameters for the status callback.
*
* @allocated is set to true if a new cluster has been allocated.
*/
static int walk_over_reftable(BlockDriverState *bs, uint64_t **new_reftable,
uint64_t *new_reftable_index,
uint64_t *new_reftable_size,
void *new_refblock, int new_refblock_size,
int new_refcount_bits,
RefblockFinishOp *operation, bool *allocated,
Qcow2SetRefcountFunc *new_set_refcount,
BlockDriverAmendStatusCB *status_cb,
void *cb_opaque, int index, int total,
Error **errp)
{
BDRVQcow2State *s = bs->opaque;
uint64_t reftable_index;
bool new_refblock_empty = true;
int refblock_index;
int new_refblock_index = 0;
int ret;
for (reftable_index = 0; reftable_index < s->refcount_table_size;
reftable_index++)
{
uint64_t refblock_offset = s->refcount_table[reftable_index]
& REFT_OFFSET_MASK;
status_cb(bs, (uint64_t)index * s->refcount_table_size + reftable_index,
(uint64_t)total * s->refcount_table_size, cb_opaque);
if (refblock_offset) {
void *refblock;
if (offset_into_cluster(s, refblock_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Refblock offset %#"
PRIx64 " unaligned (reftable index: %#"
PRIx64 ")", refblock_offset,
reftable_index);
error_setg(errp,
"Image is corrupt (unaligned refblock offset)");
return -EIO;
}
ret = qcow2_cache_get(bs, s->refcount_block_cache, refblock_offset,
&refblock);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to retrieve refblock");
return ret;
}
for (refblock_index = 0; refblock_index < s->refcount_block_size;
refblock_index++)
{
uint64_t refcount;
if (new_refblock_index >= new_refblock_size) {
/* new_refblock is now complete */
ret = operation(bs, new_reftable, *new_reftable_index,
new_reftable_size, new_refblock,
new_refblock_empty, allocated, errp);
if (ret < 0) {
qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
return ret;
}
(*new_reftable_index)++;
new_refblock_index = 0;
new_refblock_empty = true;
}
refcount = s->get_refcount(refblock, refblock_index);
if (new_refcount_bits < 64 && refcount >> new_refcount_bits) {
uint64_t offset;
qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
offset = ((reftable_index << s->refcount_block_bits)
+ refblock_index) << s->cluster_bits;
error_setg(errp, "Cannot decrease refcount entry width to "
"%i bits: Cluster at offset %#" PRIx64 " has a "
"refcount of %" PRIu64, new_refcount_bits,
offset, refcount);
return -EINVAL;
}
if (new_set_refcount) {
new_set_refcount(new_refblock, new_refblock_index++,
refcount);
} else {
new_refblock_index++;
}
new_refblock_empty = new_refblock_empty && refcount == 0;
}
qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
} else {
/* No refblock means every refcount is 0 */
for (refblock_index = 0; refblock_index < s->refcount_block_size;
refblock_index++)
{
if (new_refblock_index >= new_refblock_size) {
/* new_refblock is now complete */
ret = operation(bs, new_reftable, *new_reftable_index,
new_reftable_size, new_refblock,
new_refblock_empty, allocated, errp);
if (ret < 0) {
return ret;
}
(*new_reftable_index)++;
new_refblock_index = 0;
new_refblock_empty = true;
}
if (new_set_refcount) {
new_set_refcount(new_refblock, new_refblock_index++, 0);
} else {
new_refblock_index++;
}
}
}
}
if (new_refblock_index > 0) {
/* Complete the potentially existing partially filled final refblock */
if (new_set_refcount) {
for (; new_refblock_index < new_refblock_size;
new_refblock_index++)
{
new_set_refcount(new_refblock, new_refblock_index, 0);
}
}
ret = operation(bs, new_reftable, *new_reftable_index,
new_reftable_size, new_refblock, new_refblock_empty,
allocated, errp);
if (ret < 0) {
return ret;
}
(*new_reftable_index)++;
}
status_cb(bs, (uint64_t)(index + 1) * s->refcount_table_size,
(uint64_t)total * s->refcount_table_size, cb_opaque);
return 0;
}
int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
                                BlockDriverAmendStatusCB *status_cb,
                                void *cb_opaque, Error **errp)
{
    BDRVQcow2State *s = bs->opaque;
    Qcow2GetRefcountFunc *new_get_refcount;
    Qcow2SetRefcountFunc *new_set_refcount;
    void *new_refblock = qemu_blockalign(bs->file->bs, s->cluster_size);
    uint64_t *new_reftable = NULL, new_reftable_size = 0;
    uint64_t *old_reftable, old_reftable_size, old_reftable_offset;
    uint64_t new_reftable_index = 0;
    uint64_t i;
    int64_t new_reftable_offset = 0, allocated_reftable_size = 0;
    int new_refblock_size, new_refcount_bits = 1 << refcount_order;
    int old_refcount_order;
    int walk_index = 0;
    int ret;
    bool new_allocation;

    assert(s->qcow_version >= 3);
    assert(refcount_order >= 0 && refcount_order <= 6);

    /* see qcow2_open() */
    new_refblock_size = 1 << (s->cluster_bits - (refcount_order - 3));

    new_get_refcount = get_refcount_funcs[refcount_order];
    new_set_refcount = set_refcount_funcs[refcount_order];

    do {
        int total_walks;

        new_allocation = false;

        /* At least we have to do this walk and the one which writes the
         * refblocks; also, at least we have to do this loop here at least
         * twice (normally), first to do the allocations, and second to
         * determine that everything is correctly allocated, this then makes
         * three walks in total */
        total_walks = MAX(walk_index + 2, 3);

        /* First, allocate the structures so they are present in the refcount
         * structures */
        ret = walk_over_reftable(bs, &new_reftable, &new_reftable_index,
                                 &new_reftable_size, NULL, new_refblock_size,
                                 new_refcount_bits, &alloc_refblock,
                                 &new_allocation, NULL, status_cb, cb_opaque,
                                 walk_index++, total_walks, errp);
        if (ret < 0) {
            goto done;
        }

        new_reftable_index = 0;

        if (new_allocation) {
            if (new_reftable_offset) {
                qcow2_free_clusters(bs, new_reftable_offset,
                                    allocated_reftable_size * sizeof(uint64_t),
                                    QCOW2_DISCARD_NEVER);
            }

            new_reftable_offset = qcow2_alloc_clusters(bs, new_reftable_size *
                                                           sizeof(uint64_t));
            if (new_reftable_offset < 0) {
                error_setg_errno(errp, -new_reftable_offset,
                                 "Failed to allocate the new reftable");
                ret = new_reftable_offset;
                goto done;
            }
            allocated_reftable_size = new_reftable_size;
        }
    } while (new_allocation);

    /* Second, write the new refblocks */
    ret = walk_over_reftable(bs, &new_reftable, &new_reftable_index,
                             &new_reftable_size, new_refblock,
                             new_refblock_size, new_refcount_bits,
                             &flush_refblock, &new_allocation, new_set_refcount,
                             status_cb, cb_opaque, walk_index, walk_index + 1,
                             errp);
    if (ret < 0) {
        goto done;
    }
    assert(!new_allocation);

    /* Write the new reftable */
    ret = qcow2_pre_write_overlap_check(bs, 0, new_reftable_offset,
                                        new_reftable_size * sizeof(uint64_t));
    if (ret < 0) {
        error_setg_errno(errp, -ret, "Overlap check failed");
        goto done;
    }

    for (i = 0; i < new_reftable_size; i++) {
        cpu_to_be64s(&new_reftable[i]);
    }

    ret = bdrv_pwrite(bs->file, new_reftable_offset, new_reftable,
                      new_reftable_size * sizeof(uint64_t));

    for (i = 0; i < new_reftable_size; i++) {
        be64_to_cpus(&new_reftable[i]);
    }

    if (ret < 0) {
        error_setg_errno(errp, -ret, "Failed to write the new reftable");
        goto done;
    }

    /* Empty the refcount cache */
    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
    if (ret < 0) {
        error_setg_errno(errp, -ret, "Failed to flush the refblock cache");
        goto done;
    }

    /* Update the image header to point to the new reftable; this only updates
     * the fields which are relevant to qcow2_update_header(); other fields
     * such as s->refcount_table or s->refcount_bits stay stale for now
     * (because we have to restore everything if qcow2_update_header() fails) */
    old_refcount_order  = s->refcount_order;
    old_reftable_size   = s->refcount_table_size;
    old_reftable_offset = s->refcount_table_offset;

    s->refcount_order        = refcount_order;
    s->refcount_table_size   = new_reftable_size;
    s->refcount_table_offset = new_reftable_offset;

    ret = qcow2_update_header(bs);
    if (ret < 0) {
        s->refcount_order        = old_refcount_order;
        s->refcount_table_size   = old_reftable_size;
        s->refcount_table_offset = old_reftable_offset;
        error_setg_errno(errp, -ret, "Failed to update the qcow2 header");
        goto done;
    }

    /* Now update the rest of the in-memory information */
    old_reftable = s->refcount_table;
    s->refcount_table = new_reftable;

    s->refcount_bits = 1 << refcount_order;
    s->refcount_max = UINT64_C(1) << (s->refcount_bits - 1);
    s->refcount_max += s->refcount_max - 1;

    s->refcount_block_bits = s->cluster_bits - (refcount_order - 3);
    s->refcount_block_size = 1 << s->refcount_block_bits;

    s->get_refcount = new_get_refcount;
    s->set_refcount = new_set_refcount;

    /* For cleaning up all old refblocks and the old reftable below the "done"
     * label */
    new_reftable = old_reftable;
    new_reftable_size = old_reftable_size;
    new_reftable_offset = old_reftable_offset;

done:
    if (new_reftable) {
        /* On success, new_reftable actually points to the old reftable (and
         * new_reftable_size is the old reftable's size); but that is just
         * fine */
        for (i = 0; i < new_reftable_size; i++) {
            uint64_t offset = new_reftable[i] & REFT_OFFSET_MASK;
            if (offset) {
                qcow2_free_clusters(bs, offset, s->cluster_size,
                                    QCOW2_DISCARD_OTHER);
            }
        }

        g_free(new_reftable);

        if (new_reftable_offset > 0) {
            qcow2_free_clusters(bs, new_reftable_offset,
                                new_reftable_size * sizeof(uint64_t),
                                QCOW2_DISCARD_OTHER);
        }
    }

    qemu_vfree(new_refblock);
    return ret;
}

View File

@@ -22,17 +22,14 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
+#include "qemu-common.h"
 #include "block/block_int.h"
 #include "block/qcow2.h"
-#include "qemu/bswap.h"
 #include "qemu/error-report.h"
-#include "qemu/cutils.h"
 
 void qcow2_free_snapshots(BlockDriverState *bs)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int i;
 
     for(i = 0; i < s->nb_snapshots; i++) {
@@ -46,7 +43,7 @@ void qcow2_free_snapshots(BlockDriverState *bs)
 
 int qcow2_read_snapshots(BlockDriverState *bs)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QCowSnapshotHeader h;
     QCowSnapshotExtraData extra;
     QCowSnapshot *sn;
@@ -139,7 +136,7 @@ fail:
 /* add at the end of the file a new list of snapshots */
 static int qcow2_write_snapshots(BlockDriverState *bs)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QCowSnapshot *sn;
     QCowSnapshotHeader h;
     QCowSnapshotExtraData extra;
@@ -281,7 +278,7 @@ fail:
 static void find_new_snapshot_id(BlockDriverState *bs,
                                  char *id_str, int id_str_size)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QCowSnapshot *sn;
     int i;
     unsigned long id, id_max = 0;
@@ -299,7 +296,7 @@ static int find_snapshot_by_id_and_name(BlockDriverState *bs,
                                         const char *id,
                                         const char *name)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     int i;
 
     if (id && name) {
@@ -341,7 +338,7 @@ static int find_snapshot_by_id_or_name(BlockDriverState *bs,
 /* if no id is provided, a new one is constructed */
 int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QCowSnapshot *new_snapshot_list = NULL;
     QCowSnapshot *old_snapshot_list = NULL;
     QCowSnapshot sn1, *sn = &sn1;
@@ -464,7 +461,7 @@ fail:
 /* copy the snapshot 'snapshot_name' into the current disk image */
 int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QCowSnapshot *sn;
     int i, snapshot_index;
     int cur_l1_bytes, sn_l1_bytes;
@@ -512,8 +509,7 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
         goto fail;
     }
 
-    ret = bdrv_pread(bs->file, sn->l1_table_offset,
-                     sn_l1_table, sn_l1_bytes);
+    ret = bdrv_pread(bs->file, sn->l1_table_offset, sn_l1_table, sn_l1_bytes);
     if (ret < 0) {
         goto fail;
     }
@@ -591,7 +587,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
                           const char *name,
                           Error **errp)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QCowSnapshot sn;
     int snapshot_index, ret;
@@ -654,7 +650,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
 int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
 {
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QEMUSnapshotInfo *sn_tab, *sn_info;
     QCowSnapshot *sn;
     int i;
@@ -687,7 +683,7 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
                             Error **errp)
 {
     int i, snapshot_index;
-    BDRVQcow2State *s = bs->opaque;
+    BDRVQcowState *s = bs->opaque;
     QCowSnapshot *sn;
     uint64_t *new_l1_table;
     int new_l1_bytes;
@@ -710,14 +706,13 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
         return -EFBIG;
     }
     new_l1_bytes = sn->l1_size * sizeof(uint64_t);
-    new_l1_table = qemu_try_blockalign(bs->file->bs,
+    new_l1_table = qemu_try_blockalign(bs->file,
                                        align_offset(new_l1_bytes, 512));
     if (new_l1_table == NULL) {
         return -ENOMEM;
     }
 
-    ret = bdrv_pread(bs->file, sn->l1_table_offset,
-                     new_l1_table, new_l1_bytes);
+    ret = bdrv_pread(bs->file, sn->l1_table_offset, new_l1_table, new_l1_bytes);
     if (ret < 0) {
         error_setg(errp, "Failed to read l1 table for snapshot");
         qemu_vfree(new_l1_table);

File diff suppressed because it is too large

View File

@@ -26,7 +26,7 @@
 #define BLOCK_QCOW2_H
 
 #include "crypto/cipher.h"
-#include "qemu/coroutine.h"
+#include "block/coroutine.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
@@ -96,7 +96,6 @@
 #define QCOW2_OPT_CACHE_SIZE "cache-size"
 #define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size"
 #define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size"
-#define QCOW2_OPT_CACHE_CLEAN_INTERVAL "cache-clean-interval"
 
 typedef struct QCowHeader {
     uint32_t magic;
@@ -222,7 +221,7 @@ typedef uint64_t Qcow2GetRefcountFunc(const void *refcount_array,
 typedef void Qcow2SetRefcountFunc(void *refcount_array,
                                   uint64_t index, uint64_t value);
 
-typedef struct BDRVQcow2State {
+typedef struct BDRVQcowState {
     int cluster_bits;
     int cluster_size;
     int cluster_sectors;
@@ -240,8 +239,6 @@ typedef struct BDRVQcow2State {
     Qcow2Cache* l2_table_cache;
     Qcow2Cache* refcount_block_cache;
-    QEMUTimer *cache_clean_timer;
-    unsigned cache_clean_interval;
 
     uint8_t *cluster_cache;
     uint8_t *cluster_data;
@@ -293,7 +290,9 @@ typedef struct BDRVQcow2State {
      * override) */
     char *image_backing_file;
     char *image_backing_format;
-} BDRVQcow2State;
+} BDRVQcowState;
+
+struct QCowAIOCB;
 
 typedef struct Qcow2COWRegion {
     /**
@@ -302,8 +301,8 @@ typedef struct Qcow2COWRegion {
      */
     uint64_t offset;
 
-    /** Number of bytes to copy */
-    int nb_bytes;
+    /** Number of sectors to copy */
+    int nb_sectors;
 } Qcow2COWRegion;
 
 /**
@@ -318,6 +317,12 @@ typedef struct QCowL2Meta
     /** Host offset of the first newly allocated cluster */
     uint64_t alloc_offset;
 
+    /**
+     * Number of sectors from the start of the first allocated cluster to
+     * the end of the (possibly shortened) request
+     */
+    int nb_available;
+
     /** Number of newly allocated clusters */
     int nb_clusters;
@@ -397,28 +402,28 @@ typedef enum QCow2MetadataOverlap {
 #define REFT_OFFSET_MASK 0xfffffffffffffe00ULL
 
-static inline int64_t start_of_cluster(BDRVQcow2State *s, int64_t offset)
+static inline int64_t start_of_cluster(BDRVQcowState *s, int64_t offset)
 {
     return offset & ~(s->cluster_size - 1);
 }
 
-static inline int64_t offset_into_cluster(BDRVQcow2State *s, int64_t offset)
+static inline int64_t offset_into_cluster(BDRVQcowState *s, int64_t offset)
 {
     return offset & (s->cluster_size - 1);
 }
 
-static inline uint64_t size_to_clusters(BDRVQcow2State *s, uint64_t size)
+static inline uint64_t size_to_clusters(BDRVQcowState *s, uint64_t size)
 {
     return (size + (s->cluster_size - 1)) >> s->cluster_bits;
 }
 
-static inline int64_t size_to_l1(BDRVQcow2State *s, int64_t size)
+static inline int64_t size_to_l1(BDRVQcowState *s, int64_t size)
 {
     int shift = s->cluster_bits + s->l2_bits;
     return (size + (1ULL << shift) - 1) >> shift;
 }
 
-static inline int offset_to_l2_index(BDRVQcow2State *s, int64_t offset)
+static inline int offset_to_l2_index(BDRVQcowState *s, int64_t offset)
 {
     return (offset >> s->cluster_bits) & (s->l2_size - 1);
 }
@@ -429,12 +434,12 @@ static inline int64_t align_offset(int64_t offset, int n)
     return offset;
 }
 
-static inline int64_t qcow2_vm_state_offset(BDRVQcow2State *s)
+static inline int64_t qcow2_vm_state_offset(BDRVQcowState *s)
 {
     return (int64_t)s->l1_vm_state_index << (s->cluster_bits + s->l2_bits);
 }
 
-static inline uint64_t qcow2_max_refcount_clusters(BDRVQcow2State *s)
+static inline uint64_t qcow2_max_refcount_clusters(BDRVQcowState *s)
 {
     return QCOW_MAX_REFTABLE_SIZE >> s->cluster_bits;
 }
@@ -453,7 +458,7 @@ static inline int qcow2_get_cluster_type(uint64_t l2_entry)
 }
 
 /* Check whether refcounts are eager or lazy */
-static inline bool qcow2_need_accurate_refcounts(BDRVQcow2State *s)
+static inline bool qcow2_need_accurate_refcounts(BDRVQcowState *s)
 {
     return !(s->incompatible_features & QCOW2_INCOMPAT_DIRTY);
 }
@@ -465,7 +470,8 @@ static inline uint64_t l2meta_cow_start(QCowL2Meta *m)
 static inline uint64_t l2meta_cow_end(QCowL2Meta *m)
 {
-    return m->offset + m->cow_end.offset + m->cow_end.nb_bytes;
+    return m->offset + m->cow_end.offset
+           + (m->cow_end.nb_sectors << BDRV_SECTOR_BITS);
 }
 
 static inline uint64_t refcount_diff(uint64_t r1, uint64_t r2)
@@ -522,24 +528,20 @@ int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
 int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
                                   int64_t size);
 
-int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
-                                BlockDriverAmendStatusCB *status_cb,
-                                void *cb_opaque, Error **errp);
-
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size);
 int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index);
+void qcow2_l2_cache_reset(BlockDriverState *bs);
 int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
-int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
+int qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
                           uint8_t *out_buf, const uint8_t *in_buf,
                           int nb_sectors, bool enc, Error **errp);
 int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
-                             unsigned int *bytes, uint64_t *cluster_offset);
+                             int *num, uint64_t *cluster_offset);
 int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
-                               unsigned int *bytes, uint64_t *host_offset,
-                               QCowL2Meta **m);
+                               int *num, uint64_t *host_offset, QCowL2Meta **m);
 uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
                                                uint64_t offset,
                                                int compressed_size);
@@ -550,8 +552,7 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
 int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
 int qcow2_expand_zero_clusters(BlockDriverState *bs,
-                               BlockDriverAmendStatusCB *status_cb,
-                               void *cb_opaque);
+                               BlockDriverAmendStatusCB *status_cb);
 
 /* qcow2-snapshot.c functions */
 int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);
@@ -576,12 +577,10 @@ int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
 void qcow2_cache_entry_mark_dirty(BlockDriverState *bs, Qcow2Cache *c,
                                   void *table);
 int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
-int qcow2_cache_write(BlockDriverState *bs, Qcow2Cache *c);
 int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
                                Qcow2Cache *dependency);
 void qcow2_cache_depends_on_flush(Qcow2Cache *c);
-void qcow2_cache_clean_unused(BlockDriverState *bs, Qcow2Cache *c);
 int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c);
 
 int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,

View File

@@ -11,7 +11,6 @@
  *
  */
 
-#include "qemu/osdep.h"
 #include "qed.h"
 
 typedef struct {
@@ -234,7 +233,8 @@ int qed_check(BDRVQEDState *s, BdrvCheckResult *result, bool fix)
     }
 
     check.result->bfi.total_clusters =
-        DIV_ROUND_UP(s->header.image_size, s->header.cluster_size);
+        (s->header.image_size + s->header.cluster_size - 1) /
+        s->header.cluster_size;
     ret = qed_check_l1_table(&check, s->l1_table);
     if (ret == 0) {
         /* Only check for leaks if entire image was scanned successfully */

View File

@@ -12,7 +12,6 @@
  *
  */
 
-#include "qemu/osdep.h"
 #include "qed.h"
 
 /**

View File

@@ -11,7 +11,6 @@
  *
  */
 
-#include "qemu/osdep.h"
 #include "qed.h"
 
 void *gencb_alloc(size_t len, BlockCompletionFunc *cb, void *opaque)

View File

@@ -50,7 +50,6 @@
  * table will be deleted in favor of the existing cache entry.
  */
 
-#include "qemu/osdep.h"
 #include "trace.h"
 #include "qed.h"

View File

@@ -12,11 +12,9 @@
  *
  */
 
-#include "qemu/osdep.h"
 #include "trace.h"
 #include "qemu/sockets.h" /* for EINPROGRESS on Windows */
 #include "qed.h"
-#include "qemu/bswap.h"
 
 typedef struct {
     GenericCB gencb;

View File

@@ -12,15 +12,11 @@
  *
  */
 
-#include "qemu/osdep.h"
-#include "qapi/error.h"
 #include "qemu/timer.h"
-#include "qemu/bswap.h"
 #include "trace.h"
 #include "qed.h"
 #include "qapi/qmp/qerror.h"
 #include "migration/migration.h"
-#include "sysemu/block-backend.h"
 
 static const AIOCBInfo qed_aiocb_info = {
     .aiocb_size = sizeof(QEDAIOCB),
@@ -143,7 +139,8 @@ static void qed_write_header(BDRVQEDState *s, BlockCompletionFunc cb,
      * them, and write back.
      */
-    int nsectors = DIV_ROUND_UP(sizeof(QEDHeader), BDRV_SECTOR_SIZE);
+    int nsectors = (sizeof(QEDHeader) + BDRV_SECTOR_SIZE - 1) /
+                   BDRV_SECTOR_SIZE;
     size_t len = nsectors * BDRV_SECTOR_SIZE;
     QEDWriteHeaderCB *write_header_cb = gencb_alloc(sizeof(*write_header_cb),
                                                     cb, opaque);
@@ -218,7 +215,7 @@ static bool qed_is_image_size_valid(uint64_t image_size, uint32_t cluster_size,
  *
  * The string is NUL-terminated.
  */
-static int qed_read_string(BdrvChild *file, uint64_t offset, size_t n,
+static int qed_read_string(BlockDriverState *file, uint64_t offset, size_t n,
                            char *buf, size_t buflen)
 {
     int ret;
@@ -347,7 +344,7 @@ static void qed_start_need_check_timer(BDRVQEDState *s)
      * migration.
      */
     timer_mod(s->need_check_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
-              NANOSECONDS_PER_SECOND * QED_NEED_CHECK_TIMEOUT);
+              get_ticks_per_sec() * QED_NEED_CHECK_TIMEOUT);
 }
 
 /* It's okay to call this multiple times or when no timer is started */
@@ -357,6 +354,12 @@ static void qed_cancel_need_check_timer(BDRVQEDState *s)
     timer_del(s->need_check_timer);
 }
 
+static void bdrv_qed_rebind(BlockDriverState *bs)
+{
+    BDRVQEDState *s = bs->opaque;
+
+    s->bs = bs;
+}
+
 static void bdrv_qed_detach_aio_context(BlockDriverState *bs)
 {
     BDRVQEDState *s = bs->opaque;
@@ -401,8 +404,11 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
     }
     if (s->header.features & ~QED_FEATURE_MASK) {
         /* image uses unsupported feature bits */
-        error_setg(errp, "Unsupported QED features: %" PRIx64,
-                   s->header.features & ~QED_FEATURE_MASK);
+        char buf[64];
+        snprintf(buf, sizeof(buf), "%" PRIx64,
+                 s->header.features & ~QED_FEATURE_MASK);
+        error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+                   bdrv_get_device_or_node_name(bs), "QED", buf);
         return -ENOTSUP;
     }
     if (!qed_is_cluster_size_valid(s->header.cluster_size)) {
@@ -410,7 +416,7 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     /* Round down file size to the last cluster */
-    file_size = bdrv_getlength(bs->file->bs);
+    file_size = bdrv_getlength(bs->file);
     if (file_size < 0) {
         return file_size;
     }
@@ -465,7 +471,7 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
      * feature is no longer valid.
      */
     if ((s->header.autoclear_features & ~QED_AUTOCLEAR_FEATURE_MASK) != 0 &&
-        !bdrv_is_read_only(bs->file->bs) && !(flags & BDRV_O_INACTIVE)) {
+        !bdrv_is_read_only(bs->file) && !(flags & BDRV_O_INCOMING)) {
         s->header.autoclear_features &= QED_AUTOCLEAR_FEATURE_MASK;
 
         ret = qed_write_header_sync(s);
@@ -474,7 +480,7 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
         }
 
         /* From here on only known autoclear feature bits are valid */
-        bdrv_flush(bs->file->bs);
+        bdrv_flush(bs->file);
     }
 
     s->l1_table = qed_alloc_table(s);
@@ -492,8 +498,8 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
      * potentially inconsistent images to be opened read-only. This can
      * aid data recovery from an otherwise inconsistent image.
      */
-    if (!bdrv_is_read_only(bs->file->bs) &&
-        !(flags & BDRV_O_INACTIVE)) {
+    if (!bdrv_is_read_only(bs->file) &&
+        !(flags & BDRV_O_INCOMING)) {
         BdrvCheckResult result = {0};
 
         ret = qed_check(s, &result, true);
@@ -517,7 +523,7 @@ static void bdrv_qed_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVQEDState *s = bs->opaque;
 
-    bs->bl.pwrite_zeroes_alignment = s->header.cluster_size;
+    bs->bl.write_zeroes_alignment = s->header.cluster_size >> BDRV_SECTOR_BITS;
 }
 
 /* We have nothing to do for QED reopen, stubs just return
@@ -535,7 +541,7 @@ static void bdrv_qed_close(BlockDriverState *bs)
     bdrv_qed_detach_aio_context(bs);
 
     /* Ensure writes reach stable storage */
-    bdrv_flush(bs->file->bs);
+    bdrv_flush(bs->file);
 
     /* Clean shutdown, no check required on next open */
     if (s->header.features & QED_F_NEED_CHECK) {
@@ -567,7 +573,7 @@ static int qed_create(const char *filename, uint32_t cluster_size,
     size_t l1_size = header.cluster_size * header.table_size;
     Error *local_err = NULL;
     int ret = 0;
-    BlockBackend *blk;
+    BlockDriverState *bs;
 
     ret = bdrv_create_file(filename, opts, &local_err);
     if (ret < 0) {
@@ -575,17 +581,17 @@ static int qed_create(const char *filename, uint32_t cluster_size,
         return ret;
     }
 
-    blk = blk_new_open(filename, NULL, NULL,
-                       BDRV_O_RDWR | BDRV_O_PROTOCOL, &local_err);
-    if (blk == NULL) {
+    bs = NULL;
+    ret = bdrv_open(&bs, filename, NULL, NULL,
+                    BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL, NULL,
+                    &local_err);
+    if (ret < 0) {
         error_propagate(errp, local_err);
-        return -EIO;
+        return ret;
     }
 
-    blk_set_allow_write_beyond_eof(blk, true);
-
     /* File must start empty and grow, check truncate is supported */
-    ret = blk_truncate(blk, 0);
+    ret = bdrv_truncate(bs, 0);
     if (ret < 0) {
         goto out;
     }
@@ -601,18 +607,18 @@ static int qed_create(const char *filename, uint32_t cluster_size,
     }
 
     qed_header_cpu_to_le(&header, &le_header);
-    ret = blk_pwrite(blk, 0, &le_header, sizeof(le_header), 0);
+    ret = bdrv_pwrite(bs, 0, &le_header, sizeof(le_header));
     if (ret < 0) {
         goto out;
     }
-    ret = blk_pwrite(blk, sizeof(le_header), backing_file,
-                     header.backing_filename_size, 0);
+    ret = bdrv_pwrite(bs, sizeof(le_header), backing_file,
+                      header.backing_filename_size);
     if (ret < 0) {
         goto out;
     }
 
     l1_table = g_malloc0(l1_size);
-    ret = blk_pwrite(blk, header.l1_table_offset, l1_table, l1_size, 0);
+    ret = bdrv_pwrite(bs, header.l1_table_offset, l1_table, l1_size);
     if (ret < 0) {
         goto out;
     }
@@ -620,7 +626,7 @@ static int qed_create(const char *filename, uint32_t cluster_size,
     ret = 0; /* success */
 out:
     g_free(l1_table);
-    blk_unref(blk);
+    bdrv_unref(bs);
     return ret;
 }
@@ -680,7 +686,6 @@ typedef struct {
     uint64_t pos;
     int64_t status;
     int *pnum;
-    BlockDriverState **file;
 } QEDIsAllocatedCB;
 
 static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t len)
@@ -692,7 +697,6 @@ static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t l
     case QED_CLUSTER_FOUND:
         offset |= qed_offset_into_cluster(s, cb->pos);
         cb->status = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
-        *cb->file = cb->bs->file->bs;
         break;
     case QED_CLUSTER_ZERO:
         cb->status = BDRV_BLOCK_ZERO;
@@ -708,14 +712,13 @@ static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t l
     }
 
     if (cb->co) {
-        qemu_coroutine_enter(cb->co);
+        qemu_coroutine_enter(cb->co, NULL);
     }
 }
 
 static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
                                                  int64_t sector_num,
-                                                 int nb_sectors, int *pnum,
-                                                 BlockDriverState **file)
+                                                 int nb_sectors, int *pnum)
 {
     BDRVQEDState *s = bs->opaque;
     size_t len = (size_t)nb_sectors * BDRV_SECTOR_SIZE;
@@ -724,7 +727,6 @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
         .pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE,
         .status = BDRV_BLOCK_OFFSET_MASK,
         .pnum = pnum,
-        .file = file,
     };
     QEDRequest request = { .l2_table = NULL };
@@ -770,8 +772,8 @@ static void qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
     /* If there is a backing file, get its length.  Treat the absence of a
      * backing file like a zero length backing file.
      */
-    if (s->bs->backing) {
-        int64_t l = bdrv_getlength(s->bs->backing->bs);
+    if (s->bs->backing_hd) {
+        int64_t l = bdrv_getlength(s->bs->backing_hd);
         if (l < 0) {
             cb(opaque, l);
             return;
@@ -800,7 +802,7 @@ static void qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
     qemu_iovec_concat(*backing_qiov, qiov, 0, size);
 
     BLKDBG_EVENT(s->bs->file, BLKDBG_READ_BACKING_AIO);
-    bdrv_aio_readv(s->bs->backing, pos / BDRV_SECTOR_SIZE,
+    bdrv_aio_readv(s->bs->backing_hd, pos / BDRV_SECTOR_SIZE,
                    *backing_qiov, size / BDRV_SECTOR_SIZE, cb, opaque);
 }
@@ -1053,7 +1055,7 @@ static void qed_aio_write_flush_before_l2_update(void *opaque, int ret)
     QEDAIOCB *acb = opaque;
     BDRVQEDState *s = acb_to_s(acb);
 
-    if (!bdrv_aio_flush(s->bs->file->bs, qed_aio_write_l2_update_cb, opaque)) {
+    if (!bdrv_aio_flush(s->bs->file, qed_aio_write_l2_update_cb, opaque)) {
         qed_aio_complete(acb, -EIO);
     }
 }
@@ -1079,7 +1081,7 @@ static void qed_aio_write_main(void *opaque, int ret)
     if (acb->find_cluster_ret == QED_CLUSTER_FOUND) {
         next_fn = qed_aio_next_io;
     } else {
-        if (s->bs->backing) {
+        if (s->bs->backing_hd) {
             next_fn = qed_aio_write_flush_before_l2_update;
         } else {
             next_fn = qed_aio_write_l2_update_cb;
@@ -1137,7 +1139,7 @@ static void qed_aio_write_prefill(void *opaque, int ret)
 static bool qed_should_set_need_check(BDRVQEDState *s)
 {
     /* The flush before L2 update path ensures consistency */
-    if (s->bs->backing) {
+    if (s->bs->backing_hd) {
         return false;
     }
@@ -1418,21 +1420,21 @@ typedef struct {
     bool done;
 } QEDWriteZeroesCB;
 
-static void coroutine_fn qed_co_pwrite_zeroes_cb(void *opaque, int ret)
+static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
 {
     QEDWriteZeroesCB *cb = opaque;
 
     cb->done = true;
     cb->ret = ret;
     if (cb->co) {
-        qemu_coroutine_enter(cb->co);
+        qemu_coroutine_enter(cb->co, NULL);
    }
 }
 
-static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
-                                                  int64_t offset,
-                                                  int count,
-                                                  BdrvRequestFlags flags)
+static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
+                                                 int64_t sector_num,
+                                                 int nb_sectors,
+                                                 BdrvRequestFlags flags)
 {
     BlockAIOCB *blockacb;
     BDRVQEDState *s = bs->opaque;
@@ -1440,22 +1442,25 @@ static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
     QEMUIOVector qiov;
     struct iovec iov;
 
-    /* Fall back if the request is not aligned */
+    /* Refuse if there are untouched backing file sectors */
if (qed_offset_into_cluster(s, offset) || if (bs->backing_hd) {
qed_offset_into_cluster(s, count)) { if (qed_offset_into_cluster(s, sector_num * BDRV_SECTOR_SIZE) != 0) {
return -ENOTSUP; return -ENOTSUP;
}
if (qed_offset_into_cluster(s, nb_sectors * BDRV_SECTOR_SIZE) != 0) {
return -ENOTSUP;
}
} }
/* Zero writes start without an I/O buffer. If a buffer becomes necessary /* Zero writes start without an I/O buffer. If a buffer becomes necessary
* then it will be allocated during request processing. * then it will be allocated during request processing.
*/ */
iov.iov_base = NULL; iov.iov_base = NULL,
iov.iov_len = count; iov.iov_len = nb_sectors * BDRV_SECTOR_SIZE,
qemu_iovec_init_external(&qiov, &iov, 1); qemu_iovec_init_external(&qiov, &iov, 1);
blockacb = qed_aio_setup(bs, offset >> BDRV_SECTOR_BITS, &qiov, blockacb = qed_aio_setup(bs, sector_num, &qiov, nb_sectors,
count >> BDRV_SECTOR_BITS, qed_co_write_zeroes_cb, &cb,
qed_co_pwrite_zeroes_cb, &cb,
QED_AIOCB_WRITE | QED_AIOCB_ZERO); QED_AIOCB_WRITE | QED_AIOCB_ZERO);
if (!blockacb) { if (!blockacb) {
return -EIO; return -EIO;
@@ -1591,11 +1596,18 @@ static void bdrv_qed_invalidate_cache(BlockDriverState *bs, Error **errp)
bdrv_qed_close(bs); bdrv_qed_close(bs);
bdrv_invalidate_cache(bs->file, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
memset(s, 0, sizeof(BDRVQEDState)); memset(s, 0, sizeof(BDRVQEDState));
ret = bdrv_qed_open(bs, NULL, bs->open_flags, &local_err); ret = bdrv_qed_open(bs, NULL, bs->open_flags, &local_err);
if (local_err) { if (local_err) {
error_propagate(errp, local_err); error_setg(errp, "Could not reopen qed layer: %s",
error_prepend(errp, "Could not reopen qed layer: "); error_get_pretty(local_err));
error_free(local_err);
return; return;
} else if (ret < 0) { } else if (ret < 0) {
error_setg_errno(errp, -ret, "Could not reopen qed layer"); error_setg_errno(errp, -ret, "Could not reopen qed layer");
@@ -1652,6 +1664,7 @@ static BlockDriver bdrv_qed = {
.supports_backing = true, .supports_backing = true,
.bdrv_probe = bdrv_qed_probe, .bdrv_probe = bdrv_qed_probe,
.bdrv_rebind = bdrv_qed_rebind,
.bdrv_open = bdrv_qed_open, .bdrv_open = bdrv_qed_open,
.bdrv_close = bdrv_qed_close, .bdrv_close = bdrv_qed_close,
.bdrv_reopen_prepare = bdrv_qed_reopen_prepare, .bdrv_reopen_prepare = bdrv_qed_reopen_prepare,
@@ -1660,7 +1673,7 @@ static BlockDriver bdrv_qed = {
.bdrv_co_get_block_status = bdrv_qed_co_get_block_status, .bdrv_co_get_block_status = bdrv_qed_co_get_block_status,
.bdrv_aio_readv = bdrv_qed_aio_readv, .bdrv_aio_readv = bdrv_qed_aio_readv,
.bdrv_aio_writev = bdrv_qed_aio_writev, .bdrv_aio_writev = bdrv_qed_aio_writev,
.bdrv_co_pwrite_zeroes = bdrv_qed_co_pwrite_zeroes, .bdrv_co_write_zeroes = bdrv_qed_co_write_zeroes,
.bdrv_truncate = bdrv_qed_truncate, .bdrv_truncate = bdrv_qed_truncate,
.bdrv_getlength = bdrv_qed_getlength, .bdrv_getlength = bdrv_qed_getlength,
.bdrv_get_info = bdrv_qed_get_info, .bdrv_get_info = bdrv_qed_get_info,

View File

@@ -16,7 +16,6 @@
 #define BLOCK_QED_H
 #include "block/block_int.h"
-#include "qemu/cutils.h"
 /* The layout of a QED file is as follows:
  *

View File

@@ -13,8 +13,6 @@
  * See the COPYING file in the top-level directory.
  */
-#include "qemu/osdep.h"
-#include "qemu/cutils.h"
 #include "block/block_int.h"
 #include "qapi/qmp/qbool.h"
 #include "qapi/qmp/qdict.h"
@@ -66,11 +64,8 @@ typedef struct QuorumVotes {
 /* the following structure holds the state of one quorum instance */
 typedef struct BDRVQuorumState {
-    BdrvChild **children;  /* children BlockDriverStates */
+    BlockDriverState **bs; /* children BlockDriverStates */
     int num_children;      /* children count */
-    unsigned next_child_index;  /* the index of the next child that should
-                                 * be added
-                                 */
     int threshold;         /* if less than threshold children reads gave the
                             * same result a quorum error occurs.
                             */
@@ -219,16 +214,14 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
     return acb;
 }
-static void quorum_report_bad(QuorumOpType type, uint64_t sector_num,
-                              int nb_sectors, char *node_name, int ret)
+static void quorum_report_bad(QuorumAIOCB *acb, char *node_name, int ret)
 {
     const char *msg = NULL;
     if (ret < 0) {
         msg = strerror(-ret);
     }
-    qapi_event_send_quorum_report_bad(type, !!msg, msg, node_name,
-                                      sector_num, nb_sectors, &error_abort);
+    qapi_event_send_quorum_report_bad(!!msg, msg, node_name,
+                                      acb->sector_num, acb->nb_sectors,
+                                      &error_abort);
 }
 static void quorum_report_failure(QuorumAIOCB *acb)
@@ -290,19 +283,9 @@ static void quorum_aio_cb(void *opaque, int ret)
     BDRVQuorumState *s = acb->common.bs->opaque;
     bool rewrite = false;
-    if (ret == 0) {
-        acb->success_count++;
-    } else {
-        QuorumOpType type;
-        type = acb->is_read ? QUORUM_OP_TYPE_READ : QUORUM_OP_TYPE_WRITE;
-        quorum_report_bad(type, acb->sector_num, acb->nb_sectors,
-                          sacb->aiocb->bs->node_name, ret);
-    }
     if (acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO) {
         /* We try to read next child in FIFO order if we fail to read */
-        if (ret < 0 && (acb->child_iter + 1) < s->num_children) {
-            acb->child_iter++;
+        if (ret < 0 && ++acb->child_iter < s->num_children) {
             read_fifo_child(acb);
             return;
         }
@@ -317,6 +300,11 @@ static void quorum_aio_cb(void *opaque, int ret)
     sacb->ret = ret;
     acb->count++;
+    if (ret == 0) {
+        acb->success_count++;
+    } else {
+        quorum_report_bad(acb, sacb->aiocb->bs->node_name, ret);
+    }
     assert(acb->count <= s->num_children);
     assert(acb->success_count <= s->num_children);
     if (acb->count < s->num_children) {
@@ -348,9 +336,7 @@ static void quorum_report_bad_versions(BDRVQuorumState *s,
             continue;
         }
         QLIST_FOREACH(item, &version->items, next) {
-            quorum_report_bad(QUORUM_OP_TYPE_READ, acb->sector_num,
-                              acb->nb_sectors,
-                              s->children[item->index]->bs->node_name, 0);
+            quorum_report_bad(acb, s->bs[item->index]->node_name, 0);
         }
     }
 }
@@ -383,9 +369,8 @@ static bool quorum_rewrite_bad_versions(BDRVQuorumState *s, QuorumAIOCB *acb,
             continue;
         }
         QLIST_FOREACH(item, &version->items, next) {
-            bdrv_aio_writev(s->children[item->index], acb->sector_num,
-                            acb->qiov, acb->nb_sectors, quorum_rewrite_aio_cb,
-                            acb);
+            bdrv_aio_writev(s->bs[item->index], acb->sector_num, acb->qiov,
                            acb->nb_sectors, quorum_rewrite_aio_cb, acb);
         }
     }
@@ -654,15 +639,14 @@ static BlockAIOCB *read_quorum_children(QuorumAIOCB *acb)
     int i;
     for (i = 0; i < s->num_children; i++) {
-        acb->qcrs[i].buf = qemu_blockalign(s->children[i]->bs, acb->qiov->size);
+        acb->qcrs[i].buf = qemu_blockalign(s->bs[i], acb->qiov->size);
         qemu_iovec_init(&acb->qcrs[i].qiov, acb->qiov->niov);
         qemu_iovec_clone(&acb->qcrs[i].qiov, acb->qiov, acb->qcrs[i].buf);
     }
     for (i = 0; i < s->num_children; i++) {
-        acb->qcrs[i].aiocb = bdrv_aio_readv(s->children[i], acb->sector_num,
-                                            &acb->qcrs[i].qiov, acb->nb_sectors,
-                                            quorum_aio_cb, &acb->qcrs[i]);
+        bdrv_aio_readv(s->bs[i], acb->sector_num, &acb->qcrs[i].qiov,
+                       acb->nb_sectors, quorum_aio_cb, &acb->qcrs[i]);
     }
     return &acb->common;
@@ -672,15 +656,14 @@ static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb)
 {
     BDRVQuorumState *s = acb->common.bs->opaque;
-    acb->qcrs[acb->child_iter].buf =
-        qemu_blockalign(s->children[acb->child_iter]->bs, acb->qiov->size);
+    acb->qcrs[acb->child_iter].buf = qemu_blockalign(s->bs[acb->child_iter],
+                                                     acb->qiov->size);
     qemu_iovec_init(&acb->qcrs[acb->child_iter].qiov, acb->qiov->niov);
     qemu_iovec_clone(&acb->qcrs[acb->child_iter].qiov, acb->qiov,
                      acb->qcrs[acb->child_iter].buf);
-    acb->qcrs[acb->child_iter].aiocb =
-        bdrv_aio_readv(s->children[acb->child_iter], acb->sector_num,
-                       &acb->qcrs[acb->child_iter].qiov, acb->nb_sectors,
-                       quorum_aio_cb, &acb->qcrs[acb->child_iter]);
+    bdrv_aio_readv(s->bs[acb->child_iter], acb->sector_num,
+                   &acb->qcrs[acb->child_iter].qiov, acb->nb_sectors,
+                   quorum_aio_cb, &acb->qcrs[acb->child_iter]);
     return &acb->common;
 }
@@ -719,8 +702,8 @@ static BlockAIOCB *quorum_aio_writev(BlockDriverState *bs,
     int i;
     for (i = 0; i < s->num_children; i++) {
-        acb->qcrs[i].aiocb = bdrv_aio_writev(s->children[i], sector_num,
-                                             qiov, nb_sectors, &quorum_aio_cb,
+        acb->qcrs[i].aiocb = bdrv_aio_writev(s->bs[i], sector_num, qiov,
+                                             nb_sectors, &quorum_aio_cb,
                                              &acb->qcrs[i]);
     }
@@ -734,12 +717,12 @@ static int64_t quorum_getlength(BlockDriverState *bs)
     int i;
     /* check that all file have the same length */
-    result = bdrv_getlength(s->children[0]->bs);
+    result = bdrv_getlength(s->bs[0]);
     if (result < 0) {
         return result;
     }
     for (i = 1; i < s->num_children; i++) {
-        int64_t value = bdrv_getlength(s->children[i]->bs);
+        int64_t value = bdrv_getlength(s->bs[i]);
         if (value < 0) {
             return value;
         }
@@ -751,6 +734,21 @@ static int64_t quorum_getlength(BlockDriverState *bs)
     return result;
 }
+static void quorum_invalidate_cache(BlockDriverState *bs, Error **errp)
+{
+    BDRVQuorumState *s = bs->opaque;
+    Error *local_err = NULL;
+    int i;
+    for (i = 0; i < s->num_children; i++) {
+        bdrv_invalidate_cache(s->bs[i], &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+}
 static coroutine_fn int quorum_co_flush(BlockDriverState *bs)
 {
     BDRVQuorumState *s = bs->opaque;
@@ -759,30 +757,19 @@ static coroutine_fn int quorum_co_flush(BlockDriverState *bs)
     QuorumVoteValue result_value;
     int i;
     int result = 0;
-    int success_count = 0;
     QLIST_INIT(&error_votes.vote_list);
     error_votes.compare = quorum_64bits_compare;
     for (i = 0; i < s->num_children; i++) {
-        result = bdrv_co_flush(s->children[i]->bs);
-        if (result) {
-            quorum_report_bad(QUORUM_OP_TYPE_FLUSH, 0,
-                              bdrv_nb_sectors(s->children[i]->bs),
-                              s->children[i]->bs->node_name, result);
-            result_value.l = result;
-            quorum_count_vote(&error_votes, &result_value, i);
-        } else {
-            success_count++;
-        }
+        result = bdrv_co_flush(s->bs[i]);
+        result_value.l = result;
+        quorum_count_vote(&error_votes, &result_value, i);
     }
-    if (success_count >= s->threshold) {
-        result = 0;
-    } else {
-        winner = quorum_get_vote_winner(&error_votes);
-        result = winner->value.l;
-    }
+    winner = quorum_get_vote_winner(&error_votes);
+    result = winner->value.l;
     quorum_free_vote_list(&error_votes);
     return result;
@@ -795,7 +782,7 @@ static bool quorum_recurse_is_first_non_filter(BlockDriverState *bs,
     int i;
     for (i = 0; i < s->num_children; i++) {
-        bool perm = bdrv_recurse_is_first_non_filter(s->children[i]->bs,
+        bool perm = bdrv_recurse_is_first_non_filter(s->bs[i],
                                                      candidate);
         if (perm) {
             return true;
@@ -859,7 +846,7 @@ static int parse_read_pattern(const char *opt)
         return QUORUM_READ_PATTERN_QUORUM;
     }
-    for (i = 0; i < QUORUM_READ_PATTERN__MAX; i++) {
+    for (i = 0; i < QUORUM_READ_PATTERN_MAX; i++) {
         if (!strcmp(opt, QuorumReadPattern_lookup[i])) {
             return i;
         }
@@ -887,9 +874,9 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
         ret = -EINVAL;
         goto exit;
     }
-    if (s->num_children < 1) {
+    if (s->num_children < 2) {
         error_setg(&local_err,
-                   "Number of provided children must be 1 or more");
+                   "Number of provided children must be greater than 1");
         ret = -EINVAL;
         goto exit;
     }
@@ -902,12 +889,6 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
     }
     s->threshold = qemu_opt_get_number(opts, QUORUM_OPT_VOTE_THRESHOLD, 0);
-    /* and validate it against s->num_children */
-    ret = quorum_valid_threshold(s->threshold, s->num_children, &local_err);
-    if (ret < 0) {
-        goto exit;
-    }
     ret = parse_read_pattern(qemu_opt_get(opts, QUORUM_OPT_READ_PATTERN));
     if (ret < 0) {
         error_setg(&local_err, "Please set read-pattern as fifo or quorum");
@@ -916,6 +897,12 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
     s->read_pattern = ret;
     if (s->read_pattern == QUORUM_READ_PATTERN_QUORUM) {
+        /* and validate it against s->num_children */
+        ret = quorum_valid_threshold(s->threshold, s->num_children, &local_err);
+        if (ret < 0) {
+            goto exit;
+        }
         /* is the driver in blkverify mode */
         if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false) &&
             s->num_children == 2 && s->threshold == 2) {
@@ -935,8 +922,8 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
         }
     }
-    /* allocate the children array */
-    s->children = g_new0(BdrvChild *, s->num_children);
+    /* allocate the children BlockDriverState array */
+    s->bs = g_new0(BlockDriverState *, s->num_children);
     opened = g_new0(bool, s->num_children);
     for (i = 0; i < s->num_children; i++) {
@@ -944,16 +931,14 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
         ret = snprintf(indexstr, 32, "children.%d", i);
         assert(ret < 32);
-        s->children[i] = bdrv_open_child(NULL, options, indexstr, bs,
-                                         &child_format, false, &local_err);
-        if (local_err) {
-            ret = -EINVAL;
+        ret = bdrv_open_image(&s->bs[i], NULL, options, indexstr, bs,
+                              &child_format, false, &local_err);
+        if (ret < 0) {
             goto close_exit;
         }
         opened[i] = true;
     }
-    s->next_child_index = s->num_children;
     g_free(opened);
     goto exit;
@@ -964,14 +949,16 @@ close_exit:
         if (!opened[i]) {
            continue;
         }
-        bdrv_unref_child(bs, s->children[i]);
+        bdrv_unref(s->bs[i]);
     }
-    g_free(s->children);
+    g_free(s->bs);
     g_free(opened);
 exit:
     qemu_opts_del(opts);
     /* propagate error */
-    error_propagate(errp, local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
     return ret;
 }
@@ -981,79 +968,34 @@ static void quorum_close(BlockDriverState *bs)
     int i;
     for (i = 0; i < s->num_children; i++) {
-        bdrv_unref_child(bs, s->children[i]);
+        bdrv_unref(s->bs[i]);
     }
-    g_free(s->children);
+    g_free(s->bs);
 }
-static void quorum_add_child(BlockDriverState *bs, BlockDriverState *child_bs,
-                             Error **errp)
-{
-    BDRVQuorumState *s = bs->opaque;
-    BdrvChild *child;
-    char indexstr[32];
-    int ret;
-    assert(s->num_children <= INT_MAX / sizeof(BdrvChild *));
-    if (s->num_children == INT_MAX / sizeof(BdrvChild *) ||
-        s->next_child_index == UINT_MAX) {
-        error_setg(errp, "Too many children");
-        return;
-    }
-    ret = snprintf(indexstr, 32, "children.%u", s->next_child_index);
-    if (ret < 0 || ret >= 32) {
-        error_setg(errp, "cannot generate child name");
-        return;
-    }
-    s->next_child_index++;
-    bdrv_drained_begin(bs);
-    /* We can safely add the child now */
-    bdrv_ref(child_bs);
-    child = bdrv_attach_child(bs, child_bs, indexstr, &child_format);
-    s->children = g_renew(BdrvChild *, s->children, s->num_children + 1);
-    s->children[s->num_children++] = child;
-    bdrv_drained_end(bs);
-}
-static void quorum_del_child(BlockDriverState *bs, BdrvChild *child,
-                             Error **errp)
+static void quorum_detach_aio_context(BlockDriverState *bs)
 {
     BDRVQuorumState *s = bs->opaque;
     int i;
     for (i = 0; i < s->num_children; i++) {
-        if (s->children[i] == child) {
-            break;
-        }
+        bdrv_detach_aio_context(s->bs[i]);
     }
-    /* we have checked it in bdrv_del_child() */
-    assert(i < s->num_children);
-    if (s->num_children <= s->threshold) {
-        error_setg(errp,
-            "The number of children cannot be lower than the vote threshold %d",
-            s->threshold);
-        return;
-    }
-    bdrv_drained_begin(bs);
-    /* We can safely remove this child now */
-    memmove(&s->children[i], &s->children[i + 1],
-            (s->num_children - i - 1) * sizeof(BdrvChild *));
-    s->children = g_renew(BdrvChild *, s->children, --s->num_children);
-    bdrv_unref_child(bs, child);
-    bdrv_drained_end(bs);
-}
-static void quorum_refresh_filename(BlockDriverState *bs, QDict *options)
+static void quorum_attach_aio_context(BlockDriverState *bs,
+                                      AioContext *new_context)
+{
+    BDRVQuorumState *s = bs->opaque;
+    int i;
+    for (i = 0; i < s->num_children; i++) {
+        bdrv_attach_aio_context(s->bs[i], new_context);
+    }
+}
+static void quorum_refresh_filename(BlockDriverState *bs)
 {
     BDRVQuorumState *s = bs->opaque;
     QDict *opts;
@@ -1061,17 +1003,16 @@ static void quorum_refresh_filename(BlockDriverState *bs, QDict *options)
     int i;
     for (i = 0; i < s->num_children; i++) {
-        bdrv_refresh_filename(s->children[i]->bs);
-        if (!s->children[i]->bs->full_open_options) {
+        bdrv_refresh_filename(s->bs[i]);
+        if (!s->bs[i]->full_open_options) {
            return;
         }
     }
     children = qlist_new();
     for (i = 0; i < s->num_children; i++) {
-        QINCREF(s->children[i]->bs->full_open_options);
-        qlist_append_obj(children,
-                         QOBJECT(s->children[i]->bs->full_open_options));
+        QINCREF(s->bs[i]->full_open_options);
+        qlist_append_obj(children, QOBJECT(s->bs[i]->full_open_options));
     }
     opts = qdict_new();
@@ -1103,9 +1044,10 @@ static BlockDriver bdrv_quorum = {
     .bdrv_aio_readv                     = quorum_aio_readv,
     .bdrv_aio_writev                    = quorum_aio_writev,
+    .bdrv_invalidate_cache              = quorum_invalidate_cache,
-    .bdrv_add_child                     = quorum_add_child,
-    .bdrv_del_child                     = quorum_del_child,
+    .bdrv_detach_aio_context            = quorum_detach_aio_context,
+    .bdrv_attach_aio_context            = quorum_attach_aio_context,
     .is_filter                          = true,
     .bdrv_recurse_is_first_non_filter   = quorum_recurse_is_first_non_filter,

View File

@@ -15,9 +15,6 @@
 #ifndef QEMU_RAW_AIO_H
 #define QEMU_RAW_AIO_H
-#include "qemu/coroutine.h"
-#include "qemu/iov.h"
 /* AIO request types */
 #define QEMU_AIO_READ     0x0001
 #define QEMU_AIO_WRITE    0x0002
@@ -36,18 +33,15 @@
 /* linux-aio.c - Linux native implementation */
 #ifdef CONFIG_LINUX_AIO
-typedef struct LinuxAioState LinuxAioState;
-LinuxAioState *laio_init(void);
-void laio_cleanup(LinuxAioState *s);
-int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
-                                uint64_t offset, QEMUIOVector *qiov, int type);
-BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
+void *laio_init(void);
+void laio_cleanup(void *s);
+BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque, int type);
-void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context);
-void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context);
-void laio_io_plug(BlockDriverState *bs, LinuxAioState *s);
-void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s);
+void laio_detach_aio_context(void *s, AioContext *old_context);
+void laio_attach_aio_context(void *s, AioContext *new_context);
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
+void laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug);
 #endif
 #ifdef _WIN32

File diff suppressed because it is too large

View File

@@ -21,13 +21,11 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
-#include "qemu/cutils.h"
+#include "qemu-common.h"
 #include "qemu/timer.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
-#include "block/raw-aio.h"
+#include "raw-aio.h"
 #include "trace.h"
 #include "block/thread-pool.h"
 #include "qemu/iov.h"
@@ -121,9 +119,9 @@ static int aio_worker(void *arg)
     case QEMU_AIO_WRITE:
         count = handle_aiocb_rw(aiocb);
         if (count == aiocb->aio_nbytes) {
-            ret = 0;
+            count = 0;
         } else {
-            ret = -EINVAL;
+            count = -EINVAL;
         }
         break;
     case QEMU_AIO_FLUSH:
@@ -137,15 +135,15 @@ static int aio_worker(void *arg)
         break;
     }
-    g_free(aiocb);
+    g_slice_free(RawWin32AIOData, aiocb);
     return ret;
 }
 static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
-        int64_t offset, QEMUIOVector *qiov, int count,
+        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque, int type)
 {
-    RawWin32AIOData *acb = g_new(RawWin32AIOData, 1);
+    RawWin32AIOData *acb = g_slice_new(RawWin32AIOData);
     ThreadPool *pool;
     acb->bs = bs;
@@ -155,12 +153,11 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
     if (qiov) {
         acb->aio_iov = qiov->iov;
         acb->aio_niov = qiov->niov;
-        assert(qiov->size == count);
     }
-    acb->aio_nbytes = count;
-    acb->aio_offset = offset;
-    trace_paio_submit(acb, opaque, offset, count, type);
+    acb->aio_nbytes = nb_sectors * 512;
+    acb->aio_offset = sector_num * 512;
+    trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
     pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
     return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
@@ -223,7 +220,7 @@ static void raw_attach_aio_context(BlockDriverState *bs,
     }
 }
-static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
+static void raw_probe_alignment(BlockDriverState *bs)
 {
     BDRVRawState *s = bs->opaque;
     DWORD sectorsPerCluster, freeClusters, totalClusters, count;
@@ -231,14 +228,14 @@ static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
     BOOL status;
     if (s->type == FTYPE_CD) {
-        bs->bl.request_alignment = 2048;
+        bs->request_alignment = 2048;
         return;
     }
     if (s->type == FTYPE_HARDDISK) {
         status = DeviceIoControl(s->hfile, IOCTL_DISK_GET_DRIVE_GEOMETRY_EX,
                                  NULL, 0, &dg, sizeof(dg), &count, NULL);
         if (status != 0) {
-            bs->bl.request_alignment = dg.Geometry.BytesPerSector;
+            bs->request_alignment = dg.Geometry.BytesPerSector;
             return;
         }
         /* try GetDiskFreeSpace too */
@@ -248,7 +245,7 @@ static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
         GetDiskFreeSpace(s->drive_path, &sectorsPerCluster,
                          &dg.Geometry.BytesPerSector,
                          &freeClusters, &totalClusters);
-        bs->bl.request_alignment = dg.Geometry.BytesPerSector;
+        bs->request_alignment = dg.Geometry.BytesPerSector;
     }
 }
@@ -366,6 +363,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
         win32_aio_attach_aio_context(s->aio, bdrv_get_aio_context(bs));
     }
+    raw_probe_alignment(bs);
     ret = 0;
 fail:
     qemu_opts_del(opts);
@@ -379,10 +377,9 @@ static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
     BDRVRawState *s = bs->opaque;
     if (s->aio) {
         return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
                                 nb_sectors, cb, opaque, QEMU_AIO_READ);
     } else {
-        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
-                           nb_sectors << BDRV_SECTOR_BITS,
+        return paio_submit(bs, s->hfile, sector_num, qiov, nb_sectors,
                            cb, opaque, QEMU_AIO_READ);
     }
 }
@@ -394,10 +391,9 @@ static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
     BDRVRawState *s = bs->opaque;
     if (s->aio) {
         return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
                                 nb_sectors, cb, opaque, QEMU_AIO_WRITE);
     } else {
-        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
-                           nb_sectors << BDRV_SECTOR_BITS,
+        return paio_submit(bs, s->hfile, sector_num, qiov, nb_sectors,
                            cb, opaque, QEMU_AIO_WRITE);
     }
 }
@@ -552,7 +548,6 @@ BlockDriver bdrv_file = {
     .bdrv_needs_filename = true,
     .bdrv_parse_filename = raw_parse_filename,
     .bdrv_file_open     = raw_open,
-    .bdrv_refresh_limits = raw_probe_alignment,
     .bdrv_close         = raw_close,
     .bdrv_create        = raw_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,

View File

@@ -1,6 +1,6 @@
 /* BlockDriver implementation for "raw"
  *
- * Copyright (C) 2010-2016 Red Hat, Inc.
+ * Copyright (C) 2010, 2013, Red Hat, Inc.
  * Copyright (C) 2010, Blue Swirl <blauwirbel@gmail.com>
  * Copyright (C) 2009, Anthony Liguori <aliguori@us.ibm.com>
  *
@@ -26,9 +26,7 @@
  * IN THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include "block/block_int.h"
-#include "qapi/error.h"
 #include "qemu/option.h"
 static QemuOptsList raw_create_opts = {
@@ -50,32 +48,34 @@ static int raw_reopen_prepare(BDRVReopenState *reopen_state,
     return 0;
 }
-static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
-                                      uint64_t bytes, QEMUIOVector *qiov,
-                                      int flags)
+static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
+                                     int nb_sectors, QEMUIOVector *qiov)
 {
     BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-    return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
+    return bdrv_co_readv(bs->file, sector_num, nb_sectors, qiov);
 }
-static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
-                                       uint64_t bytes, QEMUIOVector *qiov,
-                                       int flags)
+static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num,
+                                      int nb_sectors, QEMUIOVector *qiov)
 {
     void *buf = NULL;
     BlockDriver *drv;
     QEMUIOVector local_qiov;
     int ret;
-    if (bs->probed && offset < BLOCK_PROBE_BUF_SIZE && bytes) {
-        /* Handling partial writes would be a pain - so we just
-         * require that guests have 512-byte request alignment if
-         * probing occurred */
+    if (bs->probed && sector_num == 0) {
+        /* As long as these conditions are true, we can't get partial writes to
+         * the probe buffer and can just directly check the request. */
         QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512);
         QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512);
-        assert(offset == 0 && bytes >= BLOCK_PROBE_BUF_SIZE);
-        buf = qemu_try_blockalign(bs->file->bs, 512);
+        if (nb_sectors == 0) {
+            /* qemu_iovec_to_buf() would fail, but we want to return success
+             * instead of -EINVAL in this case. */
+            return 0;
+        }
+        buf = qemu_try_blockalign(bs->file, 512);
         if (!buf) {
             ret = -ENOMEM;
             goto fail;
@@ -102,7 +102,7 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
     }
     BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
-    ret = bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+    ret = bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
 fail:
     if (qiov == &local_qiov) {
@@ -114,66 +114,69 @@ fail:
 static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
                                                     int64_t sector_num,
-                                                    int nb_sectors, int *pnum,
-                                                    BlockDriverState **file)
+                                                    int nb_sectors, int *pnum)
 {
     *pnum = nb_sectors;
-    *file = bs->file->bs;
     return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
            (sector_num << BDRV_SECTOR_BITS);
 }
-static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs,
-                                             int64_t offset, int count,
-                                             BdrvRequestFlags flags)
+static int coroutine_fn raw_co_write_zeroes(BlockDriverState *bs,
+                                            int64_t sector_num, int nb_sectors,
+                                            BdrvRequestFlags flags)
 {
-    return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
+    return bdrv_co_write_zeroes(bs->file, sector_num, nb_sectors, flags);
 }
-static int coroutine_fn raw_co_pdiscard(BlockDriverState *bs,
-                                        int64_t offset, int count)
+static int coroutine_fn raw_co_discard(BlockDriverState *bs,
+                                       int64_t sector_num, int nb_sectors)
 {
-    return bdrv_co_pdiscard(bs->file->bs, offset, count);
+    return bdrv_co_discard(bs->file, sector_num, nb_sectors);
 }
 static int64_t raw_getlength(BlockDriverState *bs)
{ {
return bdrv_getlength(bs->file->bs); return bdrv_getlength(bs->file);
} }
static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi) static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{ {
return bdrv_get_info(bs->file->bs, bdi); return bdrv_get_info(bs->file, bdi);
} }
static void raw_refresh_limits(BlockDriverState *bs, Error **errp) static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{ {
if (bs->probed) { bs->bl = bs->file->bl;
/* To make it easier to protect the first sector, any probed
* image is restricted to read-modify-write on sub-sector
* operations. */
bs->bl.request_alignment = BDRV_SECTOR_SIZE;
}
} }
static int raw_truncate(BlockDriverState *bs, int64_t offset) static int raw_truncate(BlockDriverState *bs, int64_t offset)
{ {
return bdrv_truncate(bs->file->bs, offset); return bdrv_truncate(bs->file, offset);
}
static int raw_is_inserted(BlockDriverState *bs)
{
return bdrv_is_inserted(bs->file);
} }
static int raw_media_changed(BlockDriverState *bs) static int raw_media_changed(BlockDriverState *bs)
{ {
return bdrv_media_changed(bs->file->bs); return bdrv_media_changed(bs->file);
} }
static void raw_eject(BlockDriverState *bs, bool eject_flag) static void raw_eject(BlockDriverState *bs, bool eject_flag)
{ {
bdrv_eject(bs->file->bs, eject_flag); bdrv_eject(bs->file, eject_flag);
} }
static void raw_lock_medium(BlockDriverState *bs, bool locked) static void raw_lock_medium(BlockDriverState *bs, bool locked)
{ {
bdrv_lock_medium(bs->file->bs, locked); bdrv_lock_medium(bs->file, locked);
}
static int raw_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
{
return bdrv_ioctl(bs->file, req, buf);
} }
static BlockAIOCB *raw_aio_ioctl(BlockDriverState *bs, static BlockAIOCB *raw_aio_ioctl(BlockDriverState *bs,
@@ -181,27 +184,30 @@ static BlockAIOCB *raw_aio_ioctl(BlockDriverState *bs,
BlockCompletionFunc *cb, BlockCompletionFunc *cb,
void *opaque) void *opaque)
{ {
return bdrv_aio_ioctl(bs->file->bs, req, buf, cb, opaque); return bdrv_aio_ioctl(bs->file, req, buf, cb, opaque);
} }
static int raw_has_zero_init(BlockDriverState *bs) static int raw_has_zero_init(BlockDriverState *bs)
{ {
return bdrv_has_zero_init(bs->file->bs); return bdrv_has_zero_init(bs->file);
} }
static int raw_create(const char *filename, QemuOpts *opts, Error **errp) static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
{ {
return bdrv_create_file(filename, opts, errp); Error *local_err = NULL;
int ret;
ret = bdrv_create_file(filename, opts, &local_err);
if (local_err) {
error_propagate(errp, local_err);
}
return ret;
} }
static int raw_open(BlockDriverState *bs, QDict *options, int flags, static int raw_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp) Error **errp)
{ {
bs->sg = bs->file->bs->sg; bs->sg = bs->file->sg;
bs->supported_write_flags = BDRV_REQ_FUA &
bs->file->bs->supported_write_flags;
bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
bs->file->bs->supported_zero_flags;
if (bs->probed && !bdrv_is_read_only(bs)) { if (bs->probed && !bdrv_is_read_only(bs)) {
fprintf(stderr, fprintf(stderr,
@@ -211,7 +217,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
"raw images, write operations on block 0 will be restricted.\n" "raw images, write operations on block 0 will be restricted.\n"
" Specify the 'raw' format explicitly to remove the " " Specify the 'raw' format explicitly to remove the "
"restrictions.\n", "restrictions.\n",
bs->file->bs->filename); bs->file->filename);
} }
return 0; return 0;
@@ -231,12 +237,12 @@ static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
static int raw_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz) static int raw_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
{ {
return bdrv_probe_blocksizes(bs->file->bs, bsz); return bdrv_probe_blocksizes(bs->file, bsz);
} }
static int raw_probe_geometry(BlockDriverState *bs, HDGeometry *geo) static int raw_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
{ {
return bdrv_probe_geometry(bs->file->bs, geo); return bdrv_probe_geometry(bs->file, geo);
} }
BlockDriver bdrv_raw = { BlockDriver bdrv_raw = {
@@ -246,10 +252,10 @@ BlockDriver bdrv_raw = {
.bdrv_open = &raw_open, .bdrv_open = &raw_open,
.bdrv_close = &raw_close, .bdrv_close = &raw_close,
.bdrv_create = &raw_create, .bdrv_create = &raw_create,
.bdrv_co_preadv = &raw_co_preadv, .bdrv_co_readv = &raw_co_readv,
.bdrv_co_pwritev = &raw_co_pwritev, .bdrv_co_writev = &raw_co_writev,
.bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes, .bdrv_co_write_zeroes = &raw_co_write_zeroes,
.bdrv_co_pdiscard = &raw_co_pdiscard, .bdrv_co_discard = &raw_co_discard,
.bdrv_co_get_block_status = &raw_co_get_block_status, .bdrv_co_get_block_status = &raw_co_get_block_status,
.bdrv_truncate = &raw_truncate, .bdrv_truncate = &raw_truncate,
.bdrv_getlength = &raw_getlength, .bdrv_getlength = &raw_getlength,
@@ -258,9 +264,11 @@ BlockDriver bdrv_raw = {
.bdrv_refresh_limits = &raw_refresh_limits, .bdrv_refresh_limits = &raw_refresh_limits,
.bdrv_probe_blocksizes = &raw_probe_blocksizes, .bdrv_probe_blocksizes = &raw_probe_blocksizes,
.bdrv_probe_geometry = &raw_probe_geometry, .bdrv_probe_geometry = &raw_probe_geometry,
.bdrv_is_inserted = &raw_is_inserted,
.bdrv_media_changed = &raw_media_changed, .bdrv_media_changed = &raw_media_changed,
.bdrv_eject = &raw_eject, .bdrv_eject = &raw_eject,
.bdrv_lock_medium = &raw_lock_medium, .bdrv_lock_medium = &raw_lock_medium,
.bdrv_ioctl = &raw_ioctl,
.bdrv_aio_ioctl = &raw_aio_ioctl, .bdrv_aio_ioctl = &raw_aio_ioctl,
.create_opts = &raw_create_opts, .create_opts = &raw_create_opts,
.bdrv_has_zero_init = &raw_has_zero_init .bdrv_has_zero_init = &raw_has_zero_init


@@ -11,13 +11,11 @@
 * GNU GPL, version 2 or (at your option) any later version.
 */

-#include "qemu/osdep.h"
-#include "qapi/error.h"
+#include <inttypes.h>
+
+#include "qemu-common.h"
 #include "qemu/error-report.h"
 #include "block/block_int.h"
-#include "crypto/secret.h"
-#include "qemu/cutils.h"

 #include <rbd/librbd.h>
@@ -230,27 +228,6 @@ static char *qemu_rbd_parse_clientname(const char *conf, char *clientname)
     return NULL;
 }

-static int qemu_rbd_set_auth(rados_t cluster, const char *secretid,
-                             Error **errp)
-{
-    if (secretid == 0) {
-        return 0;
-    }
-
-    gchar *secret = qcrypto_secret_lookup_as_base64(secretid,
-                                                    errp);
-    if (!secret) {
-        return -1;
-    }
-
-    rados_conf_set(cluster, "key", secret);
-    g_free(secret);
-    return 0;
-}
-
 static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
                              bool only_read_conf_file,
                              Error **errp)
@@ -290,8 +267,7 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
         if (only_read_conf_file) {
             ret = rados_conf_read_file(cluster, value);
             if (ret < 0) {
-                error_setg_errno(errp, -ret, "error reading conf file %s",
-                                 value);
+                error_setg(errp, "error reading conf file %s", value);
                 break;
             }
         }
@@ -300,7 +276,7 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
         } else if (!only_read_conf_file) {
             ret = rados_conf_set(cluster, name, value);
             if (ret < 0) {
-                error_setg_errno(errp, -ret, "invalid conf option %s", name);
+                error_setg(errp, "invalid conf option %s", name);
                 ret = -EINVAL;
                 break;
             }
@@ -323,13 +299,10 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
     char conf[RBD_MAX_CONF_SIZE];
     char clientname_buf[RBD_MAX_CONF_SIZE];
     char *clientname;
-    const char *secretid;
     rados_t cluster;
     rados_ioctx_t io_ctx;
     int ret;

-    secretid = qemu_opt_get(opts, "password-secret");
-
     if (qemu_rbd_parsename(filename, pool, sizeof(pool),
                            snap_buf, sizeof(snap_buf),
                            name, sizeof(name),
@@ -355,10 +328,9 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
     }

     clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
-    ret = rados_create(&cluster, clientname);
-    if (ret < 0) {
-        error_setg_errno(errp, -ret, "error initializing");
-        return ret;
+    if (rados_create(&cluster, clientname) < 0) {
+        error_setg(errp, "error initializing");
+        return -EIO;
     }

     if (strstr(conf, "conf=") == NULL) {
@@ -378,32 +350,21 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
         return -EIO;
     }

-    if (qemu_rbd_set_auth(cluster, secretid, errp) < 0) {
-        rados_shutdown(cluster);
-        return -EIO;
-    }
-
-    ret = rados_connect(cluster);
-    if (ret < 0) {
-        error_setg_errno(errp, -ret, "error connecting");
+    if (rados_connect(cluster) < 0) {
+        error_setg(errp, "error connecting");
         rados_shutdown(cluster);
-        return ret;
+        return -EIO;
     }

-    ret = rados_ioctx_create(cluster, pool, &io_ctx);
-    if (ret < 0) {
-        error_setg_errno(errp, -ret, "error opening pool %s", pool);
+    if (rados_ioctx_create(cluster, pool, &io_ctx) < 0) {
+        error_setg(errp, "error opening pool %s", pool);
         rados_shutdown(cluster);
-        return ret;
+        return -EIO;
     }

     ret = rbd_create(io_ctx, name, bytes, &obj_order);
     rados_ioctx_destroy(io_ctx);
     rados_shutdown(cluster);
-    if (ret < 0) {
-        error_setg_errno(errp, -ret, "error rbd create");
-        return ret;
-    }

     return ret;
 }
@@ -462,11 +423,6 @@ static QemuOptsList runtime_opts = {
             .type = QEMU_OPT_STRING,
             .help = "Specification of the rbd image",
         },
-        {
-            .name = "password-secret",
-            .type = QEMU_OPT_STRING,
-            .help = "ID of secret providing the password",
-        },
         { /* end of list */ }
     },
 };
@@ -480,7 +436,6 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
     char conf[RBD_MAX_CONF_SIZE];
     char clientname_buf[RBD_MAX_CONF_SIZE];
     char *clientname;
-    const char *secretid;
     QemuOpts *opts;
     Error *local_err = NULL;
     const char *filename;
@@ -495,7 +450,6 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
     }

     filename = qemu_opt_get(opts, "filename");
-    secretid = qemu_opt_get(opts, "password-secret");

     if (qemu_rbd_parsename(filename, pool, sizeof(pool),
                            snap_buf, sizeof(snap_buf),
@@ -508,7 +462,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
     clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
     r = rados_create(&s->cluster, clientname);
     if (r < 0) {
-        error_setg_errno(errp, -r, "error initializing");
+        error_setg(errp, "error initializing");
         goto failed_opts;
     }
@@ -534,11 +488,6 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
         }
     }

-    if (qemu_rbd_set_auth(s->cluster, secretid, errp) < 0) {
-        r = -EIO;
-        goto failed_shutdown;
-    }
-
     /*
      * Fallback to more conservative semantics if setting cache
      * options fails. Ignore errors from setting rbd_cache because the
@@ -554,19 +503,19 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
     r = rados_connect(s->cluster);
     if (r < 0) {
-        error_setg_errno(errp, -r, "error connecting");
+        error_setg(errp, "error connecting");
         goto failed_shutdown;
     }

     r = rados_ioctx_create(s->cluster, pool, &s->io_ctx);
     if (r < 0) {
-        error_setg_errno(errp, -r, "error opening pool %s", pool);
+        error_setg(errp, "error opening pool %s", pool);
         goto failed_shutdown;
     }

     r = rbd_open(s->io_ctx, s->name, &s->image, s->snap);
     if (r < 0) {
-        error_setg_errno(errp, -r, "error reading header from %s", s->name);
+        error_setg(errp, "error reading header from %s", s->name);
         goto failed_open;
     }
@@ -649,9 +598,9 @@ static int rbd_aio_flush_wrapper(rbd_image_t image,
 }

 static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
-                                 int64_t off,
+                                 int64_t sector_num,
                                  QEMUIOVector *qiov,
-                                 int64_t size,
+                                 int nb_sectors,
                                  BlockCompletionFunc *cb,
                                  void *opaque,
                                  RBDAIOCmd cmd)
@@ -659,6 +608,7 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
     RBDAIOCB *acb;
     RADOSCB *rcb = NULL;
     rbd_completion_t c;
+    int64_t off, size;
     char *buf;
     int r;
@@ -667,7 +617,6 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
     acb = qemu_aio_get(&rbd_aiocb_info, bs, cb, opaque);
     acb->cmd = cmd;
     acb->qiov = qiov;
-    assert(!qiov || qiov->size == size);
     if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
         acb->bounce = NULL;
     } else {
@@ -687,6 +636,9 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
     buf = acb->bounce;

+    off = sector_num * BDRV_SECTOR_SIZE;
+    size = nb_sectors * BDRV_SECTOR_SIZE;
+
     rcb = g_new(RADOSCB, 1);
     rcb->acb = acb;
     rcb->buf = buf;
@@ -736,8 +688,7 @@ static BlockAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
                                       BlockCompletionFunc *cb,
                                       void *opaque)
 {
-    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
-                         nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
+    return rbd_start_aio(bs, sector_num, qiov, nb_sectors, cb, opaque,
                          RBD_AIO_READ);
 }
@@ -748,8 +699,7 @@ static BlockAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
                                        BlockCompletionFunc *cb,
                                        void *opaque)
 {
-    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
-                         nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
+    return rbd_start_aio(bs, sector_num, qiov, nb_sectors, cb, opaque,
                          RBD_AIO_WRITE);
 }
@@ -882,8 +832,10 @@ static int qemu_rbd_snap_rollback(BlockDriverState *bs,
                                   const char *snapshot_name)
 {
     BDRVRBDState *s = bs->opaque;
+    int r;

-    return rbd_snap_rollback(s->image, snapshot_name);
+    r = rbd_snap_rollback(s->image, snapshot_name);
+    return r;
 }

 static int qemu_rbd_snap_list(BlockDriverState *bs,
@@ -930,13 +882,13 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
 }

 #ifdef LIBRBD_SUPPORTS_DISCARD
-static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs,
-                                         int64_t offset,
-                                         int count,
-                                         BlockCompletionFunc *cb,
-                                         void *opaque)
+static BlockAIOCB* qemu_rbd_aio_discard(BlockDriverState *bs,
+                                        int64_t sector_num,
+                                        int nb_sectors,
+                                        BlockCompletionFunc *cb,
+                                        void *opaque)
 {
-    return rbd_start_aio(bs, offset, NULL, count, cb, opaque,
+    return rbd_start_aio(bs, sector_num, NULL, nb_sectors, cb, opaque,
                          RBD_AIO_DISCARD);
 }
 #endif
@@ -967,11 +919,6 @@ static QemuOptsList qemu_rbd_create_opts = {
             .type = QEMU_OPT_SIZE,
             .help = "RBD object size"
         },
-        {
-            .name = "password-secret",
-            .type = QEMU_OPT_STRING,
-            .help = "ID of secret providing the password",
-        },
         { /* end of list */ }
     }
 };
@@ -1000,7 +947,7 @@ static BlockDriver bdrv_rbd = {
 #endif

 #ifdef LIBRBD_SUPPORTS_DISCARD
-    .bdrv_aio_pdiscard  = qemu_rbd_aio_pdiscard,
+    .bdrv_aio_discard   = qemu_rbd_aio_discard,
 #endif

     .bdrv_snapshot_create = qemu_rbd_snap_create,


@@ -1,659 +0,0 @@
/*
* Replication Block filter
*
* Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
* Copyright (c) 2016 Intel Corporation
* Copyright (c) 2016 FUJITSU LIMITED
*
* Author:
* Wen Congyang <wency@cn.fujitsu.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/nbd.h"
#include "block/blockjob.h"
#include "block/block_int.h"
#include "block/block_backup.h"
#include "sysemu/block-backend.h"
#include "qapi/error.h"
#include "replication.h"
typedef struct BDRVReplicationState {
ReplicationMode mode;
int replication_state;
BdrvChild *active_disk;
BdrvChild *hidden_disk;
BdrvChild *secondary_disk;
char *top_id;
ReplicationState *rs;
Error *blocker;
int orig_hidden_flags;
int orig_secondary_flags;
int error;
} BDRVReplicationState;
enum {
BLOCK_REPLICATION_NONE, /* block replication is not started */
BLOCK_REPLICATION_RUNNING, /* block replication is running */
BLOCK_REPLICATION_FAILOVER, /* failover is running in background */
BLOCK_REPLICATION_FAILOVER_FAILED, /* failover failed */
BLOCK_REPLICATION_DONE, /* block replication is done */
};
static void replication_start(ReplicationState *rs, ReplicationMode mode,
Error **errp);
static void replication_do_checkpoint(ReplicationState *rs, Error **errp);
static void replication_get_error(ReplicationState *rs, Error **errp);
static void replication_stop(ReplicationState *rs, bool failover,
Error **errp);
#define REPLICATION_MODE "mode"
#define REPLICATION_TOP_ID "top-id"
static QemuOptsList replication_runtime_opts = {
.name = "replication",
.head = QTAILQ_HEAD_INITIALIZER(replication_runtime_opts.head),
.desc = {
{
.name = REPLICATION_MODE,
.type = QEMU_OPT_STRING,
},
{
.name = REPLICATION_TOP_ID,
.type = QEMU_OPT_STRING,
},
{ /* end of list */ }
},
};
static ReplicationOps replication_ops = {
.start = replication_start,
.checkpoint = replication_do_checkpoint,
.get_error = replication_get_error,
.stop = replication_stop,
};
static int replication_open(BlockDriverState *bs, QDict *options,
int flags, Error **errp)
{
int ret;
BDRVReplicationState *s = bs->opaque;
Error *local_err = NULL;
QemuOpts *opts = NULL;
const char *mode;
const char *top_id;
ret = -EINVAL;
opts = qemu_opts_create(&replication_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
goto fail;
}
mode = qemu_opt_get(opts, REPLICATION_MODE);
if (!mode) {
error_setg(&local_err, "Missing the option mode");
goto fail;
}
if (!strcmp(mode, "primary")) {
s->mode = REPLICATION_MODE_PRIMARY;
} else if (!strcmp(mode, "secondary")) {
s->mode = REPLICATION_MODE_SECONDARY;
top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
s->top_id = g_strdup(top_id);
if (!s->top_id) {
error_setg(&local_err, "Missing the option top-id");
goto fail;
}
} else {
error_setg(&local_err,
"The option mode's value should be primary or secondary");
goto fail;
}
s->rs = replication_new(bs, &replication_ops);
ret = 0;
fail:
qemu_opts_del(opts);
error_propagate(errp, local_err);
return ret;
}
static void replication_close(BlockDriverState *bs)
{
BDRVReplicationState *s = bs->opaque;
if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
replication_stop(s->rs, false, NULL);
}
if (s->mode == REPLICATION_MODE_SECONDARY) {
g_free(s->top_id);
}
replication_remove(s->rs);
}
static int64_t replication_getlength(BlockDriverState *bs)
{
return bdrv_getlength(bs->file->bs);
}
static int replication_get_io_status(BDRVReplicationState *s)
{
switch (s->replication_state) {
case BLOCK_REPLICATION_NONE:
return -EIO;
case BLOCK_REPLICATION_RUNNING:
return 0;
case BLOCK_REPLICATION_FAILOVER:
return s->mode == REPLICATION_MODE_PRIMARY ? -EIO : 0;
case BLOCK_REPLICATION_FAILOVER_FAILED:
return s->mode == REPLICATION_MODE_PRIMARY ? -EIO : 1;
case BLOCK_REPLICATION_DONE:
/*
* active commit job completes, and active disk and secondary_disk
* is swapped, so we can operate bs->file directly
*/
return s->mode == REPLICATION_MODE_PRIMARY ? -EIO : 0;
default:
abort();
}
}
static int replication_return_value(BDRVReplicationState *s, int ret)
{
if (s->mode == REPLICATION_MODE_SECONDARY) {
return ret;
}
if (ret < 0) {
s->error = ret;
ret = 0;
}
return ret;
}
static coroutine_fn int replication_co_readv(BlockDriverState *bs,
int64_t sector_num,
int remaining_sectors,
QEMUIOVector *qiov)
{
BDRVReplicationState *s = bs->opaque;
BdrvChild *child = s->secondary_disk;
BlockJob *job = NULL;
CowRequest req;
int ret;
if (s->mode == REPLICATION_MODE_PRIMARY) {
/* We only use it to forward primary write requests */
return -EIO;
}
ret = replication_get_io_status(s);
if (ret < 0) {
return ret;
}
if (child && child->bs) {
job = child->bs->job;
}
if (job) {
backup_wait_for_overlapping_requests(child->bs->job, sector_num,
remaining_sectors);
backup_cow_request_begin(&req, child->bs->job, sector_num,
remaining_sectors);
ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors,
qiov);
backup_cow_request_end(&req);
goto out;
}
ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors, qiov);
out:
return replication_return_value(s, ret);
}
static coroutine_fn int replication_co_writev(BlockDriverState *bs,
int64_t sector_num,
int remaining_sectors,
QEMUIOVector *qiov)
{
BDRVReplicationState *s = bs->opaque;
QEMUIOVector hd_qiov;
uint64_t bytes_done = 0;
BdrvChild *top = bs->file;
BdrvChild *base = s->secondary_disk;
BdrvChild *target;
int ret, n;
ret = replication_get_io_status(s);
if (ret < 0) {
goto out;
}
if (ret == 0) {
ret = bdrv_co_writev(top, sector_num,
remaining_sectors, qiov);
return replication_return_value(s, ret);
}
/*
* Failover failed, only write to active disk if the sectors
* have already been allocated in active disk/hidden disk.
*/
qemu_iovec_init(&hd_qiov, qiov->niov);
while (remaining_sectors > 0) {
ret = bdrv_is_allocated_above(top->bs, base->bs, sector_num,
remaining_sectors, &n);
if (ret < 0) {
goto out1;
}
qemu_iovec_reset(&hd_qiov);
qemu_iovec_concat(&hd_qiov, qiov, bytes_done, n * BDRV_SECTOR_SIZE);
target = ret ? top : base;
ret = bdrv_co_writev(target, sector_num, n, &hd_qiov);
if (ret < 0) {
goto out1;
}
remaining_sectors -= n;
sector_num += n;
bytes_done += n * BDRV_SECTOR_SIZE;
}
out1:
qemu_iovec_destroy(&hd_qiov);
out:
return ret;
}
static bool replication_recurse_is_first_non_filter(BlockDriverState *bs,
BlockDriverState *candidate)
{
return bdrv_recurse_is_first_non_filter(bs->file->bs, candidate);
}
static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
{
Error *local_err = NULL;
int ret;
if (!s->secondary_disk->bs->job) {
error_setg(errp, "Backup job was cancelled unexpectedly");
return;
}
backup_do_checkpoint(s->secondary_disk->bs->job, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
ret = s->active_disk->bs->drv->bdrv_make_empty(s->active_disk->bs);
if (ret < 0) {
error_setg(errp, "Cannot make active disk empty");
return;
}
ret = s->hidden_disk->bs->drv->bdrv_make_empty(s->hidden_disk->bs);
if (ret < 0) {
error_setg(errp, "Cannot make hidden disk empty");
return;
}
}
static void reopen_backing_file(BDRVReplicationState *s, bool writable,
Error **errp)
{
BlockReopenQueue *reopen_queue = NULL;
int orig_hidden_flags, orig_secondary_flags;
int new_hidden_flags, new_secondary_flags;
Error *local_err = NULL;
if (writable) {
orig_hidden_flags = s->orig_hidden_flags =
bdrv_get_flags(s->hidden_disk->bs);
new_hidden_flags = (orig_hidden_flags | BDRV_O_RDWR) &
~BDRV_O_INACTIVE;
orig_secondary_flags = s->orig_secondary_flags =
bdrv_get_flags(s->secondary_disk->bs);
new_secondary_flags = (orig_secondary_flags | BDRV_O_RDWR) &
~BDRV_O_INACTIVE;
} else {
orig_hidden_flags = (s->orig_hidden_flags | BDRV_O_RDWR) &
~BDRV_O_INACTIVE;
new_hidden_flags = s->orig_hidden_flags;
orig_secondary_flags = (s->orig_secondary_flags | BDRV_O_RDWR) &
~BDRV_O_INACTIVE;
new_secondary_flags = s->orig_secondary_flags;
}
if (orig_hidden_flags != new_hidden_flags) {
reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs, NULL,
new_hidden_flags);
}
if (!(orig_secondary_flags & BDRV_O_RDWR)) {
reopen_queue = bdrv_reopen_queue(reopen_queue, s->secondary_disk->bs,
NULL, new_secondary_flags);
}
if (reopen_queue) {
bdrv_reopen_multiple(reopen_queue, &local_err);
error_propagate(errp, local_err);
}
}
static void backup_job_cleanup(BDRVReplicationState *s)
{
BlockDriverState *top_bs;
top_bs = bdrv_lookup_bs(s->top_id, s->top_id, NULL);
if (!top_bs) {
return;
}
bdrv_op_unblock_all(top_bs, s->blocker);
error_free(s->blocker);
reopen_backing_file(s, false, NULL);
}
static void backup_job_completed(void *opaque, int ret)
{
BDRVReplicationState *s = opaque;
if (s->replication_state != BLOCK_REPLICATION_FAILOVER) {
/* The backup job is cancelled unexpectedly */
s->error = -EIO;
}
backup_job_cleanup(s);
}
static bool check_top_bs(BlockDriverState *top_bs, BlockDriverState *bs)
{
BdrvChild *child;
/* The bs itself is the top_bs */
if (top_bs == bs) {
return true;
}
/* Iterate over top_bs's children */
QLIST_FOREACH(child, &top_bs->children, next) {
if (child->bs == bs || check_top_bs(child->bs, bs)) {
return true;
}
}
return false;
}
static void replication_start(ReplicationState *rs, ReplicationMode mode,
Error **errp)
{
BlockDriverState *bs = rs->opaque;
BDRVReplicationState *s;
BlockDriverState *top_bs;
int64_t active_length, hidden_length, disk_length;
AioContext *aio_context;
Error *local_err = NULL;
aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
s = bs->opaque;
if (s->replication_state != BLOCK_REPLICATION_NONE) {
error_setg(errp, "Block replication is running or done");
aio_context_release(aio_context);
return;
}
if (s->mode != mode) {
error_setg(errp, "The parameter mode's value is invalid, needs %d,"
" but got %d", s->mode, mode);
aio_context_release(aio_context);
return;
}
switch (s->mode) {
case REPLICATION_MODE_PRIMARY:
break;
case REPLICATION_MODE_SECONDARY:
s->active_disk = bs->file;
if (!s->active_disk || !s->active_disk->bs ||
!s->active_disk->bs->backing) {
error_setg(errp, "Active disk doesn't have backing file");
aio_context_release(aio_context);
return;
}
s->hidden_disk = s->active_disk->bs->backing;
if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
error_setg(errp, "Hidden disk doesn't have backing file");
aio_context_release(aio_context);
return;
}
s->secondary_disk = s->hidden_disk->bs->backing;
if (!s->secondary_disk->bs || !bdrv_has_blk(s->secondary_disk->bs)) {
error_setg(errp, "The secondary disk doesn't have block backend");
aio_context_release(aio_context);
return;
}
/* verify the length */
active_length = bdrv_getlength(s->active_disk->bs);
hidden_length = bdrv_getlength(s->hidden_disk->bs);
disk_length = bdrv_getlength(s->secondary_disk->bs);
if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
active_length != hidden_length || hidden_length != disk_length) {
error_setg(errp, "Active disk, hidden disk, secondary disk's length"
" are not the same");
aio_context_release(aio_context);
return;
}
if (!s->active_disk->bs->drv->bdrv_make_empty ||
!s->hidden_disk->bs->drv->bdrv_make_empty) {
error_setg(errp,
"Active disk or hidden disk doesn't support make_empty");
            aio_context_release(aio_context);
            return;
        }

        /* reopen the backing file in r/w mode */
        reopen_backing_file(s, true, &local_err);
        if (local_err) {
            error_propagate(errp, local_err);
            aio_context_release(aio_context);
            return;
        }

        /* start backup job now */
        error_setg(&s->blocker,
                   "Block device is in use by internal backup job");

        top_bs = bdrv_lookup_bs(s->top_id, s->top_id, NULL);
        if (!top_bs || !bdrv_is_root_node(top_bs) ||
            !check_top_bs(top_bs, bs)) {
            error_setg(errp, "No top_bs or it is invalid");
            reopen_backing_file(s, false, NULL);
            aio_context_release(aio_context);
            return;
        }
        bdrv_op_block_all(top_bs, s->blocker);
        bdrv_op_unblock(top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker);

        backup_start("replication-backup", s->secondary_disk->bs,
                     s->hidden_disk->bs, 0, MIRROR_SYNC_MODE_NONE, NULL, false,
                     BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
                     backup_job_completed, s, NULL, &local_err);
        if (local_err) {
            error_propagate(errp, local_err);
            backup_job_cleanup(s);
            aio_context_release(aio_context);
            return;
        }
        break;
    default:
        aio_context_release(aio_context);
        abort();
    }

    s->replication_state = BLOCK_REPLICATION_RUNNING;

    if (s->mode == REPLICATION_MODE_SECONDARY) {
        secondary_do_checkpoint(s, errp);
    }

    s->error = 0;
    aio_context_release(aio_context);
}

static void replication_do_checkpoint(ReplicationState *rs, Error **errp)
{
    BlockDriverState *bs = rs->opaque;
    BDRVReplicationState *s;
    AioContext *aio_context;

    aio_context = bdrv_get_aio_context(bs);
    aio_context_acquire(aio_context);
    s = bs->opaque;

    if (s->mode == REPLICATION_MODE_SECONDARY) {
        secondary_do_checkpoint(s, errp);
    }
    aio_context_release(aio_context);
}

static void replication_get_error(ReplicationState *rs, Error **errp)
{
    BlockDriverState *bs = rs->opaque;
    BDRVReplicationState *s;
    AioContext *aio_context;

    aio_context = bdrv_get_aio_context(bs);
    aio_context_acquire(aio_context);
    s = bs->opaque;

    if (s->replication_state != BLOCK_REPLICATION_RUNNING) {
        error_setg(errp, "Block replication is not running");
        aio_context_release(aio_context);
        return;
    }

    if (s->error) {
        error_setg(errp, "I/O error occurred");
        aio_context_release(aio_context);
        return;
    }
    aio_context_release(aio_context);
}

static void replication_done(void *opaque, int ret)
{
    BlockDriverState *bs = opaque;
    BDRVReplicationState *s = bs->opaque;

    if (ret == 0) {
        s->replication_state = BLOCK_REPLICATION_DONE;

        /* refresh top bs's filename */
        bdrv_refresh_filename(bs);
        s->active_disk = NULL;
        s->secondary_disk = NULL;
        s->hidden_disk = NULL;
        s->error = 0;
    } else {
        s->replication_state = BLOCK_REPLICATION_FAILOVER_FAILED;
        s->error = -EIO;
    }
}

static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
{
    BlockDriverState *bs = rs->opaque;
    BDRVReplicationState *s;
    AioContext *aio_context;

    aio_context = bdrv_get_aio_context(bs);
    aio_context_acquire(aio_context);
    s = bs->opaque;

    if (s->replication_state != BLOCK_REPLICATION_RUNNING) {
        error_setg(errp, "Block replication is not running");
        aio_context_release(aio_context);
        return;
    }

    switch (s->mode) {
    case REPLICATION_MODE_PRIMARY:
        s->replication_state = BLOCK_REPLICATION_DONE;
        s->error = 0;
        break;
    case REPLICATION_MODE_SECONDARY:
        /*
         * This BDS will be closed, and the job should be completed
         * before the BDS is closed, because we will access hidden
         * disk, secondary disk in backup_job_completed().
         */
        if (s->secondary_disk->bs->job) {
            block_job_cancel_sync(s->secondary_disk->bs->job);
        }

        if (!failover) {
            secondary_do_checkpoint(s, errp);
            s->replication_state = BLOCK_REPLICATION_DONE;
            aio_context_release(aio_context);
            return;
        }

        s->replication_state = BLOCK_REPLICATION_FAILOVER;
        commit_active_start("replication-commit", s->active_disk->bs,
                            s->secondary_disk->bs, 0, BLOCKDEV_ON_ERROR_REPORT,
                            replication_done,
                            bs, errp, true);
        break;
    default:
        aio_context_release(aio_context);
        abort();
    }

    aio_context_release(aio_context);
}

BlockDriver bdrv_replication = {
    .format_name                = "replication",
    .protocol_name              = "replication",
    .instance_size              = sizeof(BDRVReplicationState),

    .bdrv_open                  = replication_open,
    .bdrv_close                 = replication_close,

    .bdrv_getlength             = replication_getlength,
    .bdrv_co_readv              = replication_co_readv,
    .bdrv_co_writev             = replication_co_writev,

    .is_filter                  = true,
    .bdrv_recurse_is_first_non_filter = replication_recurse_is_first_non_filter,

    .has_variable_length        = true,
};

static void bdrv_replication_init(void)
{
    bdrv_register(&bdrv_replication);
}

block_init(bdrv_replication_init);


@@ -12,15 +12,12 @@
  * GNU GPL, version 2 or (at your option) any later version.
  */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
+#include "qemu-common.h"
 #include "qemu/uri.h"
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
 #include "block/block_int.h"
-#include "sysemu/block-backend.h"
 #include "qemu/bitops.h"
-#include "qemu/cutils.h"
 
 #define SD_PROTO_VER 0x01
@@ -31,6 +28,7 @@
 #define SD_OP_READ_OBJ 0x02
 #define SD_OP_WRITE_OBJ 0x03
 /* 0x04 is used internally by Sheepdog */
+#define SD_OP_DISCARD_OBJ 0x05
 
 #define SD_OP_NEW_VDI 0x11
 #define SD_OP_LOCK_VDI 0x12
@@ -286,24 +284,15 @@ static inline bool is_snapshot(struct SheepdogInode *inode)
     return !!inode->snap_ctime;
 }
 
-static inline size_t count_data_objs(const struct SheepdogInode *inode)
-{
-    return DIV_ROUND_UP(inode->vdi_size,
-                        (1UL << inode->block_size_shift));
-}
-
 #undef DPRINTF
 #ifdef DEBUG_SDOG
-#define DEBUG_SDOG_PRINT 1
-#else
-#define DEBUG_SDOG_PRINT 0
-#endif
-#define DPRINTF(fmt, args...)                                           \
-    do {                                                                \
-        if (DEBUG_SDOG_PRINT) {                                         \
-            fprintf(stderr, "%s %d: " fmt, __func__, __LINE__, ##args); \
-        }                                                               \
-    } while (0)
+#define DPRINTF(fmt, args...)                                           \
+    do {                                                                \
+        fprintf(stdout, "%s %d: " fmt, __func__, __LINE__, ##args);     \
+    } while (0)
+#else
+#define DPRINTF(fmt, args...)
+#endif
 
 typedef struct SheepdogAIOCB SheepdogAIOCB;
@@ -329,7 +318,7 @@ enum AIOCBState {
     AIOCB_DISCARD_OBJ,
 };
 
-#define AIOCBOverlapping(x, y)                                 \
+#define AIOCBOverwrapping(x, y)                                \
     (!(x->max_affect_data_idx < y->min_affect_data_idx          \
        || y->max_affect_data_idx < x->min_affect_data_idx))
@@ -353,15 +342,6 @@ struct SheepdogAIOCB {
     uint32_t min_affect_data_idx;
     uint32_t max_affect_data_idx;
 
-    /*
-     * The difference between affect_data_idx and dirty_data_idx:
-     * affect_data_idx represents range of index of all request types.
-     * dirty_data_idx represents range of index updated by COW requests.
-     * dirty_data_idx is used for updating an inode object.
-     */
-    uint32_t min_dirty_data_idx;
-    uint32_t max_dirty_data_idx;
-
     QLIST_ENTRY(SheepdogAIOCB) aiocb_siblings;
 };
@@ -371,6 +351,9 @@ typedef struct BDRVSheepdogState {
     SheepdogInode inode;
 
+    uint32_t min_dirty_data_idx;
+    uint32_t max_dirty_data_idx;
+
     char name[SD_MAX_VDI_LEN];
     bool is_snapshot;
     uint32_t cache_flags;
@@ -390,15 +373,10 @@ typedef struct BDRVSheepdogState {
     QLIST_HEAD(inflight_aio_head, AIOReq) inflight_aio_head;
     QLIST_HEAD(failed_aio_head, AIOReq) failed_aio_head;
 
-    CoQueue overlapping_queue;
+    CoQueue overwrapping_queue;
     QLIST_HEAD(inflight_aiocb_head, SheepdogAIOCB) inflight_aiocb_head;
 } BDRVSheepdogState;
 
-typedef struct BDRVSheepdogReopenState {
-    int fd;
-    int cache_flags;
-} BDRVSheepdogReopenState;
-
 static const char * sd_strerror(int err)
 {
     int i;
@@ -495,7 +473,7 @@ static inline void free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
 
 static void coroutine_fn sd_finish_aiocb(SheepdogAIOCB *acb)
 {
-    qemu_coroutine_enter(acb->coroutine);
+    qemu_coroutine_enter(acb->coroutine, NULL);
     qemu_aio_unref(acb);
 }
@@ -578,9 +556,6 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
     acb->max_affect_data_idx = (acb->sector_num * BDRV_SECTOR_SIZE +
                                 acb->nb_sectors * BDRV_SECTOR_SIZE) / object_size;
 
-    acb->min_dirty_data_idx = UINT32_MAX;
-    acb->max_dirty_data_idx = 0;
-
     return acb;
 }
@@ -620,13 +595,14 @@ static coroutine_fn int send_co_req(int sockfd, SheepdogReq *hdr, void *data,
     ret = qemu_co_send(sockfd, hdr, sizeof(*hdr));
     if (ret != sizeof(*hdr)) {
         error_report("failed to send a req, %s", strerror(errno));
-        return -errno;
+        ret = -socket_error();
+        return ret;
     }
 
     ret = qemu_co_send(sockfd, data, *wlen);
     if (ret != *wlen) {
+        ret = -socket_error();
         error_report("failed to send a req, %s", strerror(errno));
-        return -errno;
     }
 
     return ret;
@@ -636,7 +612,7 @@
 {
     Coroutine *co = opaque;
 
-    qemu_coroutine_enter(co);
+    qemu_coroutine_enter(co, NULL);
 }
 
 typedef struct SheepdogReqCo {
@@ -662,16 +638,14 @@ static coroutine_fn void do_co_req(void *opaque)
     unsigned int *rlen = srco->rlen;
 
     co = qemu_coroutine_self();
-    aio_set_fd_handler(srco->aio_context, sockfd, false,
-                       NULL, restart_co_req, co);
+    aio_set_fd_handler(srco->aio_context, sockfd, NULL, restart_co_req, co);
 
     ret = send_co_req(sockfd, hdr, data, wlen);
     if (ret < 0) {
         goto out;
     }
 
-    aio_set_fd_handler(srco->aio_context, sockfd, false,
-                       restart_co_req, NULL, co);
+    aio_set_fd_handler(srco->aio_context, sockfd, restart_co_req, NULL, co);
 
     ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr));
     if (ret != sizeof(*hdr)) {
@@ -696,8 +670,7 @@ static coroutine_fn void do_co_req(void *opaque)
 out:
     /* there is at most one request for this sockfd, so it is safe to
      * set each handler to NULL. */
-    aio_set_fd_handler(srco->aio_context, sockfd, false,
-                       NULL, NULL, NULL);
+    aio_set_fd_handler(srco->aio_context, sockfd, NULL, NULL, NULL);
 
     srco->ret = ret;
     srco->finished = true;
@@ -726,8 +699,8 @@ static int do_req(int sockfd, AioContext *aio_context, SheepdogReq *hdr,
     if (qemu_in_coroutine()) {
         do_co_req(&srco);
     } else {
-        co = qemu_coroutine_create(do_co_req, &srco);
-        qemu_coroutine_enter(co);
+        co = qemu_coroutine_create(do_co_req);
+        qemu_coroutine_enter(co, &srco);
         while (!srco.finished) {
             aio_poll(aio_context, true);
         }
@@ -749,8 +722,7 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
     BDRVSheepdogState *s = opaque;
     AIOReq *aio_req, *next;
 
-    aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
-                       NULL, NULL);
+    aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
     close(s->fd);
     s->fd = -1;
@@ -847,8 +819,8 @@ static void coroutine_fn aio_read_response(void *opaque)
              */
            if (rsp.result == SD_RES_SUCCESS) {
                 s->inode.data_vdi_id[idx] = s->inode.vdi_id;
-                acb->max_dirty_data_idx = MAX(idx, acb->max_dirty_data_idx);
-                acb->min_dirty_data_idx = MIN(idx, acb->min_dirty_data_idx);
+                s->max_dirty_data_idx = MAX(idx, s->max_dirty_data_idx);
+                s->min_dirty_data_idx = MIN(idx, s->min_dirty_data_idx);
             }
         }
         break;
@@ -875,6 +847,10 @@ static void coroutine_fn aio_read_response(void *opaque)
             rsp.result = SD_RES_SUCCESS;
             s->discard_supported = false;
             break;
+        case SD_RES_SUCCESS:
+            idx = data_oid_to_idx(aio_req->oid);
+            s->inode.data_vdi_id[idx] = 0;
+            break;
         default:
             break;
         }
@@ -925,17 +901,17 @@ static void co_read_response(void *opaque)
     BDRVSheepdogState *s = opaque;
 
     if (!s->co_recv) {
-        s->co_recv = qemu_coroutine_create(aio_read_response, opaque);
+        s->co_recv = qemu_coroutine_create(aio_read_response);
     }
 
-    qemu_coroutine_enter(s->co_recv);
+    qemu_coroutine_enter(s->co_recv, opaque);
 }
 
 static void co_write_request(void *opaque)
 {
     BDRVSheepdogState *s = opaque;
 
-    qemu_coroutine_enter(s->co_send);
+    qemu_coroutine_enter(s->co_send, NULL);
 }
 
 /*
@@ -953,8 +929,7 @@ static int get_sheep_fd(BDRVSheepdogState *s, Error **errp)
         return fd;
     }
 
-    aio_set_fd_handler(s->aio_context, fd, false,
-                       co_read_response, NULL, s);
+    aio_set_fd_handler(s->aio_context, fd, co_read_response, NULL, s);
     return fd;
 }
@@ -1049,7 +1024,7 @@ static int parse_vdiname(BDRVSheepdogState *s, const char *filename,
     const char *host_spec, *vdi_spec;
     int nr_sep, ret;
 
-    strstart(filename, "sheepdog:", &filename);
+    strstart(filename, "sheepdog:", (const char **)&filename);
     p = q = g_strdup(filename);
 
     /* count the number of separators */
@@ -1190,13 +1165,7 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
         hdr.flags = SD_FLAG_CMD_WRITE | flags;
         break;
     case AIOCB_DISCARD_OBJ:
-        hdr.opcode = SD_OP_WRITE_OBJ;
-        hdr.flags = SD_FLAG_CMD_WRITE | flags;
-        s->inode.data_vdi_id[data_oid_to_idx(oid)] = 0;
-        offset = offsetof(SheepdogInode,
-                          data_vdi_id[data_oid_to_idx(oid)]);
-        oid = vid_to_vdi_oid(s->inode.vdi_id);
-        wlen = datalen = sizeof(uint32_t);
+        hdr.opcode = SD_OP_DISCARD_OBJ;
         break;
     }
@@ -1215,7 +1184,7 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
     qemu_co_mutex_lock(&s->lock);
     s->co_send = qemu_coroutine_self();
-    aio_set_fd_handler(s->aio_context, s->fd, false,
+    aio_set_fd_handler(s->aio_context, s->fd,
                        co_read_response, co_write_request, s);
     socket_set_cork(s->fd, 1);
@@ -1234,8 +1203,7 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
     }
 out:
     socket_set_cork(s->fd, 0);
-    aio_set_fd_handler(s->aio_context, s->fd, false,
-                       co_read_response, NULL, s);
+    aio_set_fd_handler(s->aio_context, s->fd, co_read_response, NULL, s);
     s->co_send = NULL;
     qemu_co_mutex_unlock(&s->lock);
 }
@@ -1385,8 +1353,7 @@ static void sd_detach_aio_context(BlockDriverState *bs)
 {
     BDRVSheepdogState *s = bs->opaque;
 
-    aio_set_fd_handler(s->aio_context, s->fd, false, NULL,
-                       NULL, NULL);
+    aio_set_fd_handler(s->aio_context, s->fd, NULL, NULL, NULL);
 }
 
 static void sd_attach_aio_context(BlockDriverState *bs,
@@ -1395,8 +1362,7 @@ static void sd_attach_aio_context(BlockDriverState *bs,
     BDRVSheepdogState *s = bs->opaque;
 
     s->aio_context = new_context;
-    aio_set_fd_handler(new_context, s->fd, false,
-                       co_read_response, NULL, s);
+    aio_set_fd_handler(new_context, s->fd, co_read_response, NULL, s);
 }
 
 /* TODO Convert to fine grained options */
@@ -1500,17 +1466,18 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     memcpy(&s->inode, buf, sizeof(s->inode));
+    s->min_dirty_data_idx = UINT32_MAX;
+    s->max_dirty_data_idx = 0;
 
     bs->total_sectors = s->inode.vdi_size / BDRV_SECTOR_SIZE;
     pstrcpy(s->name, sizeof(s->name), vdi);
     qemu_co_mutex_init(&s->lock);
-    qemu_co_queue_init(&s->overlapping_queue);
+    qemu_co_queue_init(&s->overwrapping_queue);
     qemu_opts_del(opts);
     g_free(buf);
     return 0;
 out:
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
-                       false, NULL, NULL, NULL);
+    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
     if (s->fd >= 0) {
         closesocket(s->fd);
     }
@@ -1519,70 +1486,6 @@ out:
     return ret;
 }
 
-static int sd_reopen_prepare(BDRVReopenState *state, BlockReopenQueue *queue,
-                             Error **errp)
-{
-    BDRVSheepdogState *s = state->bs->opaque;
-    BDRVSheepdogReopenState *re_s;
-    int ret = 0;
-
-    re_s = state->opaque = g_new0(BDRVSheepdogReopenState, 1);
-
-    re_s->cache_flags = SD_FLAG_CMD_CACHE;
-    if (state->flags & BDRV_O_NOCACHE) {
-        re_s->cache_flags = SD_FLAG_CMD_DIRECT;
-    }
-
-    re_s->fd = get_sheep_fd(s, errp);
-    if (re_s->fd < 0) {
-        ret = re_s->fd;
-        return ret;
-    }
-
-    return ret;
-}
-
-static void sd_reopen_commit(BDRVReopenState *state)
-{
-    BDRVSheepdogReopenState *re_s = state->opaque;
-    BDRVSheepdogState *s = state->bs->opaque;
-
-    if (s->fd) {
-        aio_set_fd_handler(s->aio_context, s->fd, false,
-                           NULL, NULL, NULL);
-        closesocket(s->fd);
-    }
-
-    s->fd = re_s->fd;
-    s->cache_flags = re_s->cache_flags;
-
-    g_free(state->opaque);
-    state->opaque = NULL;
-
-    return;
-}
-
-static void sd_reopen_abort(BDRVReopenState *state)
-{
-    BDRVSheepdogReopenState *re_s = state->opaque;
-    BDRVSheepdogState *s = state->bs->opaque;
-
-    if (re_s == NULL) {
-        return;
-    }
-
-    if (re_s->fd) {
-        aio_set_fd_handler(s->aio_context, re_s->fd, false,
-                           NULL, NULL, NULL);
-        closesocket(re_s->fd);
-    }
-
-    g_free(state->opaque);
-    state->opaque = NULL;
-
-    return;
-}
-
 static int do_sd_create(BDRVSheepdogState *s, uint32_t *vdi_id, int snapshot,
                         Error **errp)
 {
@@ -1641,7 +1544,7 @@ static int do_sd_create(BDRVSheepdogState *s, uint32_t *vdi_id, int snapshot,
 
 static int sd_prealloc(const char *filename, Error **errp)
 {
-    BlockBackend *blk = NULL;
+    BlockDriverState *bs = NULL;
     BDRVSheepdogState *base = NULL;
     unsigned long buf_size;
     uint32_t idx, max_idx;
@@ -1650,22 +1553,19 @@ static int sd_prealloc(const char *filename, Error **errp)
     void *buf = NULL;
     int ret;
 
-    blk = blk_new_open(filename, NULL, NULL,
-                       BDRV_O_RDWR | BDRV_O_PROTOCOL, errp);
-    if (blk == NULL) {
-        ret = -EIO;
+    ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
+                    NULL, errp);
+    if (ret < 0) {
         goto out_with_err_set;
     }
 
-    blk_set_allow_write_beyond_eof(blk, true);
-
-    vdi_size = blk_getlength(blk);
+    vdi_size = bdrv_getlength(bs);
     if (vdi_size < 0) {
         ret = vdi_size;
         goto out;
     }
 
-    base = blk_bs(blk)->opaque;
+    base = bs->opaque;
     object_size = (UINT32_C(1) << base->inode.block_size_shift);
     buf_size = MIN(object_size, SD_DATA_OBJ_SIZE);
     buf = g_malloc0(buf_size);
@@ -1677,24 +1577,23 @@ static int sd_prealloc(const char *filename, Error **errp)
          * The created image can be a cloned image, so we need to read
          * a data from the source image.
          */
-        ret = blk_pread(blk, idx * buf_size, buf, buf_size);
+        ret = bdrv_pread(bs, idx * buf_size, buf, buf_size);
         if (ret < 0) {
             goto out;
         }
-        ret = blk_pwrite(blk, idx * buf_size, buf, buf_size, 0);
+        ret = bdrv_pwrite(bs, idx * buf_size, buf, buf_size);
         if (ret < 0) {
             goto out;
         }
     }
-
-    ret = 0;
 out:
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Can't pre-allocate");
     }
 out_with_err_set:
-    if (blk) {
-        blk_unref(blk);
+    if (bs) {
+        bdrv_unref(bs);
     }
     g_free(buf);
@@ -1834,7 +1733,7 @@ static int sd_create(const char *filename, QemuOpts *opts,
     }
 
     if (backing_file) {
-        BlockBackend *blk;
+        BlockDriverState *bs;
         BDRVSheepdogState *base;
         BlockDriver *drv;
@@ -1846,23 +1745,23 @@ static int sd_create(const char *filename, QemuOpts *opts,
             goto out;
         }
 
-        blk = blk_new_open(backing_file, NULL, NULL,
-                           BDRV_O_PROTOCOL, errp);
-        if (blk == NULL) {
-            ret = -EIO;
+        bs = NULL;
+        ret = bdrv_open(&bs, backing_file, NULL, NULL, BDRV_O_PROTOCOL, NULL,
+                        errp);
+        if (ret < 0) {
             goto out;
         }
 
-        base = blk_bs(blk)->opaque;
+        base = bs->opaque;
 
         if (!is_snapshot(&base->inode)) {
             error_setg(errp, "cannot clone from a non snapshot vdi");
-            blk_unref(blk);
+            bdrv_unref(bs);
             ret = -EINVAL;
             goto out;
         }
         s->inode.vdi_id = base->inode.vdi_id;
-        blk_unref(blk);
+        bdrv_unref(bs);
     }
 
     s->aio_context = qemu_get_aio_context();
@@ -1877,7 +1776,8 @@ static int sd_create(const char *filename, QemuOpts *opts,
     fd = connect_to_sdog(s, &local_err);
     if (fd < 0) {
-        error_report_err(local_err);
+        error_report("%s", error_get_pretty(local_err));
+        error_free(local_err);
         ret = -EIO;
         goto out;
     }
@@ -1961,8 +1861,7 @@ static void sd_close(BlockDriverState *bs)
         error_report("%s, %s", sd_strerror(rsp->result), s->name);
     }
 
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
-                       false, NULL, NULL, NULL);
+    aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd, NULL, NULL, NULL);
     closesocket(s->fd);
     g_free(s->host_spec);
 }
@@ -2024,16 +1923,16 @@ static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
     AIOReq *aio_req;
     uint32_t offset, data_len, mn, mx;
 
-    mn = acb->min_dirty_data_idx;
-    mx = acb->max_dirty_data_idx;
+    mn = s->min_dirty_data_idx;
+    mx = s->max_dirty_data_idx;
     if (mn <= mx) {
         /* we need to update the vdi object. */
         offset = sizeof(s->inode) - sizeof(s->inode.data_vdi_id) +
             mn * sizeof(s->inode.data_vdi_id[0]);
         data_len = (mx - mn + 1) * sizeof(s->inode.data_vdi_id[0]);
 
-        acb->min_dirty_data_idx = UINT32_MAX;
-        acb->max_dirty_data_idx = 0;
+        s->min_dirty_data_idx = UINT32_MAX;
+        s->max_dirty_data_idx = 0;
 
         iov.iov_base = &s->inode;
         iov.iov_len = sizeof(s->inode);
@@ -2242,9 +2141,7 @@ static int coroutine_fn sd_co_rw_vector(void *p)
         }
 
         aio_req = alloc_aio_req(s, acb, oid, len, offset, flags, create,
-                                old_oid,
-                                acb->aiocb_type == AIOCB_DISCARD_OBJ ?
-                                0 : done);
+                                old_oid, done);
         QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
 
         add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
@@ -2261,12 +2158,12 @@ out:
     return 1;
 }
 
-static bool check_overlapping_aiocb(BDRVSheepdogState *s, SheepdogAIOCB *aiocb)
+static bool check_overwrapping_aiocb(BDRVSheepdogState *s, SheepdogAIOCB *aiocb)
 {
     SheepdogAIOCB *cb;
 
     QLIST_FOREACH(cb, &s->inflight_aiocb_head, aiocb_siblings) {
-        if (AIOCBOverlapping(aiocb, cb)) {
+        if (AIOCBOverwrapping(aiocb, cb)) {
             return true;
         }
     }
@@ -2295,15 +2192,15 @@ static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
     acb->aiocb_type = AIOCB_WRITE_UDATA;
 
 retry:
-    if (check_overlapping_aiocb(s, acb)) {
-        qemu_co_queue_wait(&s->overlapping_queue);
+    if (check_overwrapping_aiocb(s, acb)) {
+        qemu_co_queue_wait(&s->overwrapping_queue);
         goto retry;
     }
 
     ret = sd_co_rw_vector(acb);
     if (ret <= 0) {
         QLIST_REMOVE(acb, aiocb_siblings);
-        qemu_co_queue_restart_all(&s->overlapping_queue);
+        qemu_co_queue_restart_all(&s->overwrapping_queue);
         qemu_aio_unref(acb);
         return ret;
     }
@@ -2311,7 +2208,7 @@ retry:
     qemu_coroutine_yield();
 
     QLIST_REMOVE(acb, aiocb_siblings);
-    qemu_co_queue_restart_all(&s->overlapping_queue);
+    qemu_co_queue_restart_all(&s->overwrapping_queue);
 
     return acb->ret;
 }
@@ -2328,15 +2225,15 @@ static coroutine_fn int sd_co_readv(BlockDriverState *bs, int64_t sector_num,
     acb->aio_done_func = sd_finish_aiocb;
 
 retry:
-    if (check_overlapping_aiocb(s, acb)) {
-        qemu_co_queue_wait(&s->overlapping_queue);
+    if (check_overwrapping_aiocb(s, acb)) {
+        qemu_co_queue_wait(&s->overwrapping_queue);
         goto retry;
     }
 
     ret = sd_co_rw_vector(acb);
     if (ret <= 0) {
         QLIST_REMOVE(acb, aiocb_siblings);
-        qemu_co_queue_restart_all(&s->overlapping_queue);
+        qemu_co_queue_restart_all(&s->overwrapping_queue);
         qemu_aio_unref(acb);
         return ret;
     }
@@ -2344,7 +2241,7 @@ retry:
     qemu_coroutine_yield();
 
     QLIST_REMOVE(acb, aiocb_siblings);
-    qemu_co_queue_restart_all(&s->overlapping_queue);
+    qemu_co_queue_restart_all(&s->overwrapping_queue);
 
     return acb->ret;
 }
@@ -2421,8 +2318,9 @@ static int sd_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     ret = do_sd_create(s, &new_vid, 1, &local_err);
     if (ret < 0) {
-        error_reportf_err(local_err,
-                          "failed to create inode for snapshot: ");
+        error_report("failed to create inode for snapshot: %s",
+                     error_get_pretty(local_err));
+        error_free(local_err);
         goto cleanup;
     }
@@ -2493,131 +2391,13 @@ out:
     return ret;
 }
 
-#define NR_BATCHED_DISCARD 128
-
-static bool remove_objects(BDRVSheepdogState *s)
-{
-    int fd, i = 0, nr_objs = 0;
-    Error *local_err = NULL;
-    int ret = 0;
-    bool result = true;
-    SheepdogInode *inode = &s->inode;
-
-    fd = connect_to_sdog(s, &local_err);
-    if (fd < 0) {
-        error_report_err(local_err);
-        return false;
-    }
-
-    nr_objs = count_data_objs(inode);
-    while (i < nr_objs) {
-        int start_idx, nr_filled_idx;
-
-        while (i < nr_objs && !inode->data_vdi_id[i]) {
-            i++;
-        }
-        start_idx = i;
-
-        nr_filled_idx = 0;
-        while (i < nr_objs && nr_filled_idx < NR_BATCHED_DISCARD) {
-            if (inode->data_vdi_id[i]) {
-                inode->data_vdi_id[i] = 0;
-                nr_filled_idx++;
-            }
-            i++;
-        }
-
-        ret = write_object(fd, s->aio_context,
-                           (char *)&inode->data_vdi_id[start_idx],
-                           vid_to_vdi_oid(s->inode.vdi_id), inode->nr_copies,
-                           (i - start_idx) * sizeof(uint32_t),
-                           offsetof(struct SheepdogInode,
-                                    data_vdi_id[start_idx]),
-                           false, s->cache_flags);
-        if (ret < 0) {
-            error_report("failed to discard snapshot inode.");
-            result = false;
-            goto out;
-        }
-    }
-
-out:
-    closesocket(fd);
-    return result;
-}
-
 static int sd_snapshot_delete(BlockDriverState *bs,
                               const char *snapshot_id,
                               const char *name,
                               Error **errp)
 {
-    unsigned long snap_id = 0;
-    char snap_tag[SD_MAX_VDI_TAG_LEN];
-    Error *local_err = NULL;
-    int fd, ret;
-    char buf[SD_MAX_VDI_LEN + SD_MAX_VDI_TAG_LEN];
-    BDRVSheepdogState *s = bs->opaque;
-    unsigned int wlen = SD_MAX_VDI_LEN + SD_MAX_VDI_TAG_LEN, rlen = 0;
-    uint32_t vid;
-    SheepdogVdiReq hdr = {
-        .opcode = SD_OP_DEL_VDI,
-        .data_length = wlen,
-        .flags = SD_FLAG_CMD_WRITE,
-    };
-    SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
-
-    if (!remove_objects(s)) {
-        return -1;
-    }
-
-    memset(buf, 0, sizeof(buf));
-    memset(snap_tag, 0, sizeof(snap_tag));
-    pstrcpy(buf, SD_MAX_VDI_LEN, s->name);
-    ret = qemu_strtoul(snapshot_id, NULL, 10, &snap_id);
-    if (ret || snap_id > UINT32_MAX) {
-        error_setg(errp, "Invalid snapshot ID: %s",
-                         snapshot_id ? snapshot_id : "<null>");
-        return -EINVAL;
-    }
-
-    if (snap_id) {
-        hdr.snapid = (uint32_t) snap_id;
-    } else {
-        pstrcpy(snap_tag, sizeof(snap_tag), snapshot_id);
-        pstrcpy(buf + SD_MAX_VDI_LEN, SD_MAX_VDI_TAG_LEN, snap_tag);
-    }
-
-    ret = find_vdi_name(s, s->name, snap_id, snap_tag, &vid, true,
-                        &local_err);
-    if (ret) {
-        return ret;
-    }
-
-    fd = connect_to_sdog(s, &local_err);
-    if (fd < 0) {
-        error_report_err(local_err);
-        return -1;
-    }
-
-    ret = do_req(fd, s->aio_context, (SheepdogReq *)&hdr,
-                 buf, &wlen, &rlen);
-    closesocket(fd);
-    if (ret) {
-        return ret;
-    }
-
-    switch (rsp->result) {
-    case SD_RES_NO_VDI:
-        error_report("%s was already deleted", s->name);
-    case SD_RES_SUCCESS:
-        break;
-    default:
-        error_report("%s, %s", sd_strerror(rsp->result), s->name);
-        return -1;
-    }
-
-    return ret;
+    /* FIXME: Delete specified snapshot id. */
+    return 0;
 }
 static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
@@ -2652,7 +2432,7 @@ static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
     req.opcode = SD_OP_READ_VDIS;
     req.data_length = max;
 
-    ret = do_req(fd, s->aio_context, &req,
+    ret = do_req(fd, s->aio_context, (SheepdogReq *)&req,
                  vdi_inuse, &wlen, &rlen);
 
     closesocket(fd);
@@ -2784,59 +2564,41 @@ static int sd_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
     return ret;
 }
 
-static int sd_load_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
-                           int64_t pos)
+static int sd_load_vmstate(BlockDriverState *bs, uint8_t *data,
+                           int64_t pos, int size)
 {
     BDRVSheepdogState *s = bs->opaque;
-    void *buf;
-    int ret;
 
-    buf = qemu_blockalign(bs, qiov->size);
-    ret = do_load_save_vmstate(s, buf, pos, qiov->size, 1);
-    qemu_iovec_from_buf(qiov, 0, buf, qiov->size);
-    qemu_vfree(buf);
-
-    return ret;
+    return do_load_save_vmstate(s, data, pos, size, 1);
 }
 
-static coroutine_fn int sd_co_pdiscard(BlockDriverState *bs, int64_t offset,
-                                       int count)
+static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num,
+                                      int nb_sectors)
 {
     SheepdogAIOCB *acb;
+    QEMUIOVector dummy;
     BDRVSheepdogState *s = bs->opaque;
     int ret;
-    QEMUIOVector discard_iov;
-    struct iovec iov;
-    uint32_t zero = 0;
 
     if (!s->discard_supported) {
         return 0;
     }
 
-    memset(&discard_iov, 0, sizeof(discard_iov));
-    memset(&iov, 0, sizeof(iov));
-    iov.iov_base = &zero;
-    iov.iov_len = sizeof(zero);
-    discard_iov.iov = &iov;
-    discard_iov.niov = 1;
-    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((count & (BDRV_SECTOR_SIZE - 1)) == 0);
-    acb = sd_aio_setup(bs, &discard_iov, offset >> BDRV_SECTOR_BITS,
-                       count >> BDRV_SECTOR_BITS);
+    acb = sd_aio_setup(bs, &dummy, sector_num, nb_sectors);
     acb->aiocb_type = AIOCB_DISCARD_OBJ;
     acb->aio_done_func = sd_finish_aiocb;
 
retry:
-    if (check_overlapping_aiocb(s, acb)) {
-        qemu_co_queue_wait(&s->overlapping_queue);
+    if (check_overwrapping_aiocb(s, acb)) {
+        qemu_co_queue_wait(&s->overwrapping_queue);
         goto retry;
     }
 
     ret = sd_co_rw_vector(acb);
     if (ret <= 0) {
         QLIST_REMOVE(acb, aiocb_siblings);
-        qemu_co_queue_restart_all(&s->overlapping_queue);
+        qemu_co_queue_restart_all(&s->overwrapping_queue);
         qemu_aio_unref(acb);
         return ret;
     }
@@ -2844,14 +2606,14 @@ retry:
     qemu_coroutine_yield();
 
     QLIST_REMOVE(acb, aiocb_siblings);
-    qemu_co_queue_restart_all(&s->overlapping_queue);
+    qemu_co_queue_restart_all(&s->overwrapping_queue);
 
     return acb->ret;
 }
 
 static coroutine_fn int64_t
 sd_co_get_block_status(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
-                       int *pnum, BlockDriverState **file)
+                       int *pnum)
 {
     BDRVSheepdogState *s = bs->opaque;
     SheepdogInode *inode = &s->inode;
@@ -2882,9 +2644,6 @@ sd_co_get_block_status(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
     if (*pnum > nb_sectors) {
         *pnum = nb_sectors;
     }
-    if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID) {
-        *file = bs;
-    }
     return ret;
 }
@@ -2944,9 +2703,6 @@ static BlockDriver bdrv_sheepdog = {
     .instance_size = sizeof(BDRVSheepdogState),
     .bdrv_needs_filename = true,
     .bdrv_file_open = sd_open,
-    .bdrv_reopen_prepare = sd_reopen_prepare,
-    .bdrv_reopen_commit = sd_reopen_commit,
-    .bdrv_reopen_abort = sd_reopen_abort,
     .bdrv_close = sd_close,
     .bdrv_create = sd_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,
@@ -2957,7 +2713,7 @@ static BlockDriver bdrv_sheepdog = {
.bdrv_co_readv = sd_co_readv, .bdrv_co_readv = sd_co_readv,
.bdrv_co_writev = sd_co_writev, .bdrv_co_writev = sd_co_writev,
.bdrv_co_flush_to_disk = sd_co_flush_to_disk, .bdrv_co_flush_to_disk = sd_co_flush_to_disk,
.bdrv_co_pdiscard = sd_co_pdiscard, .bdrv_co_discard = sd_co_discard,
.bdrv_co_get_block_status = sd_co_get_block_status, .bdrv_co_get_block_status = sd_co_get_block_status,
.bdrv_snapshot_create = sd_snapshot_create, .bdrv_snapshot_create = sd_snapshot_create,
@@ -2980,9 +2736,6 @@ static BlockDriver bdrv_sheepdog_tcp = {
.instance_size = sizeof(BDRVSheepdogState), .instance_size = sizeof(BDRVSheepdogState),
.bdrv_needs_filename = true, .bdrv_needs_filename = true,
.bdrv_file_open = sd_open, .bdrv_file_open = sd_open,
.bdrv_reopen_prepare = sd_reopen_prepare,
.bdrv_reopen_commit = sd_reopen_commit,
.bdrv_reopen_abort = sd_reopen_abort,
.bdrv_close = sd_close, .bdrv_close = sd_close,
.bdrv_create = sd_create, .bdrv_create = sd_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1, .bdrv_has_zero_init = bdrv_has_zero_init_1,
@@ -2993,7 +2746,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
.bdrv_co_readv = sd_co_readv, .bdrv_co_readv = sd_co_readv,
.bdrv_co_writev = sd_co_writev, .bdrv_co_writev = sd_co_writev,
.bdrv_co_flush_to_disk = sd_co_flush_to_disk, .bdrv_co_flush_to_disk = sd_co_flush_to_disk,
.bdrv_co_pdiscard = sd_co_pdiscard, .bdrv_co_discard = sd_co_discard,
.bdrv_co_get_block_status = sd_co_get_block_status, .bdrv_co_get_block_status = sd_co_get_block_status,
.bdrv_snapshot_create = sd_snapshot_create, .bdrv_snapshot_create = sd_snapshot_create,
@@ -3016,9 +2769,6 @@ static BlockDriver bdrv_sheepdog_unix = {
.instance_size = sizeof(BDRVSheepdogState), .instance_size = sizeof(BDRVSheepdogState),
.bdrv_needs_filename = true, .bdrv_needs_filename = true,
.bdrv_file_open = sd_open, .bdrv_file_open = sd_open,
.bdrv_reopen_prepare = sd_reopen_prepare,
.bdrv_reopen_commit = sd_reopen_commit,
.bdrv_reopen_abort = sd_reopen_abort,
.bdrv_close = sd_close, .bdrv_close = sd_close,
.bdrv_create = sd_create, .bdrv_create = sd_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1, .bdrv_has_zero_init = bdrv_has_zero_init_1,
@@ -3029,7 +2779,7 @@ static BlockDriver bdrv_sheepdog_unix = {
.bdrv_co_readv = sd_co_readv, .bdrv_co_readv = sd_co_readv,
.bdrv_co_writev = sd_co_writev, .bdrv_co_writev = sd_co_writev,
.bdrv_co_flush_to_disk = sd_co_flush_to_disk, .bdrv_co_flush_to_disk = sd_co_flush_to_disk,
.bdrv_co_pdiscard = sd_co_pdiscard, .bdrv_co_discard = sd_co_discard,
.bdrv_co_get_block_status = sd_co_get_block_status, .bdrv_co_get_block_status = sd_co_get_block_status,
.bdrv_snapshot_create = sd_snapshot_create, .bdrv_snapshot_create = sd_snapshot_create,


@@ -22,10 +22,8 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
 #include "block/snapshot.h"
 #include "block/block_int.h"
-#include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 QemuOptsList internal_snapshot_opts = {
@@ -151,7 +149,7 @@ int bdrv_can_snapshot(BlockDriverState *bs)
     if (!drv->bdrv_snapshot_create) {
         if (bs->file != NULL) {
-            return bdrv_can_snapshot(bs->file->bs);
+            return bdrv_can_snapshot(bs->file);
         }
         return 0;
     }
@@ -170,7 +168,7 @@ int bdrv_snapshot_create(BlockDriverState *bs,
         return drv->bdrv_snapshot_create(bs, sn_info);
     }
     if (bs->file) {
-        return bdrv_snapshot_create(bs->file->bs, sn_info);
+        return bdrv_snapshot_create(bs->file, sn_info);
     }
     return -ENOTSUP;
 }
@@ -190,10 +188,10 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
     if (bs->file) {
         drv->bdrv_close(bs);
-        ret = bdrv_snapshot_goto(bs->file->bs, snapshot_id);
+        ret = bdrv_snapshot_goto(bs->file, snapshot_id);
         open_ret = drv->bdrv_open(bs, NULL, bs->open_flags, NULL);
         if (open_ret < 0) {
-            bdrv_unref(bs->file->bs);
+            bdrv_unref(bs->file);
             bs->drv = NULL;
             return open_ret;
         }
@@ -231,8 +229,6 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
                          Error **errp)
 {
     BlockDriver *drv = bs->drv;
-    int ret;
     if (!drv) {
         error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, bdrv_get_device_name(bs));
         return -ENOMEDIUM;
@@ -243,26 +239,23 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
     }
     /* drain all pending i/o before deleting snapshot */
-    bdrv_drained_begin(bs);
+    bdrv_drain(bs);
     if (drv->bdrv_snapshot_delete) {
-        ret = drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
-    } else if (bs->file) {
-        ret = bdrv_snapshot_delete(bs->file->bs, snapshot_id, name, errp);
-    } else {
-        error_setg(errp, "Block format '%s' used by device '%s' "
-                   "does not support internal snapshot deletion",
-                   drv->format_name, bdrv_get_device_name(bs));
-        ret = -ENOTSUP;
+        return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
     }
-    bdrv_drained_end(bs);
-    return ret;
+    if (bs->file) {
+        return bdrv_snapshot_delete(bs->file, snapshot_id, name, errp);
+    }
+    error_setg(errp, "Block format '%s' used by device '%s' "
+               "does not support internal snapshot deletion",
+               drv->format_name, bdrv_get_device_name(bs));
+    return -ENOTSUP;
 }
-int bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
+void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
                                        const char *id_or_name,
                                        Error **errp)
 {
     int ret;
     Error *local_err = NULL;
@@ -277,7 +270,6 @@ int bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
     if (ret < 0) {
         error_propagate(errp, local_err);
     }
-    return ret;
 }
 int bdrv_snapshot_list(BlockDriverState *bs,
@@ -291,7 +283,7 @@ int bdrv_snapshot_list(BlockDriverState *bs,
         return drv->bdrv_snapshot_list(bs, psn_info);
     }
     if (bs->file) {
-        return bdrv_snapshot_list(bs->file->bs, psn_info);
+        return bdrv_snapshot_list(bs->file, psn_info);
     }
     return -ENOTSUP;
 }
@@ -358,164 +350,9 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
         ret = bdrv_snapshot_load_tmp(bs, NULL, id_or_name, &local_err);
     }
-    error_propagate(errp, local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
     return ret;
 }
-/* Group operations. All block drivers are involved.
- * These functions will properly handle dataplane (take aio_context_acquire
- * when appropriate for appropriate block drivers) */
-bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
-{
-    bool ok = true;
-    BlockDriverState *bs;
-    BdrvNextIterator it;
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
-        AioContext *ctx = bdrv_get_aio_context(bs);
-        aio_context_acquire(ctx);
-        if (bdrv_is_inserted(bs) && !bdrv_is_read_only(bs)) {
-            ok = bdrv_can_snapshot(bs);
-        }
-        aio_context_release(ctx);
-        if (!ok) {
-            goto fail;
-        }
-    }
-fail:
-    *first_bad_bs = bs;
-    return ok;
-}
-int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
-                             Error **err)
-{
-    int ret = 0;
-    BlockDriverState *bs;
-    BdrvNextIterator it;
-    QEMUSnapshotInfo sn1, *snapshot = &sn1;
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
-        AioContext *ctx = bdrv_get_aio_context(bs);
-        aio_context_acquire(ctx);
-        if (bdrv_can_snapshot(bs) &&
-            bdrv_snapshot_find(bs, snapshot, name) >= 0) {
-            ret = bdrv_snapshot_delete_by_id_or_name(bs, name, err);
-        }
-        aio_context_release(ctx);
-        if (ret < 0) {
-            goto fail;
-        }
-    }
-fail:
-    *first_bad_bs = bs;
-    return ret;
-}
-int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs)
-{
-    int err = 0;
-    BlockDriverState *bs;
-    BdrvNextIterator it;
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
-        AioContext *ctx = bdrv_get_aio_context(bs);
-        aio_context_acquire(ctx);
-        if (bdrv_can_snapshot(bs)) {
-            err = bdrv_snapshot_goto(bs, name);
-        }
-        aio_context_release(ctx);
-        if (err < 0) {
-            goto fail;
-        }
-    }
-fail:
-    *first_bad_bs = bs;
-    return err;
-}
-int bdrv_all_find_snapshot(const char *name, BlockDriverState **first_bad_bs)
-{
-    QEMUSnapshotInfo sn;
-    int err = 0;
-    BlockDriverState *bs;
-    BdrvNextIterator it;
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
-        AioContext *ctx = bdrv_get_aio_context(bs);
-        aio_context_acquire(ctx);
-        if (bdrv_can_snapshot(bs)) {
-            err = bdrv_snapshot_find(bs, &sn, name);
-        }
-        aio_context_release(ctx);
-        if (err < 0) {
-            goto fail;
-        }
-    }
-fail:
-    *first_bad_bs = bs;
-    return err;
-}
-int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
-                             BlockDriverState *vm_state_bs,
-                             uint64_t vm_state_size,
-                             BlockDriverState **first_bad_bs)
-{
-    int err = 0;
-    BlockDriverState *bs;
-    BdrvNextIterator it;
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
-        AioContext *ctx = bdrv_get_aio_context(bs);
-        aio_context_acquire(ctx);
-        if (bs == vm_state_bs) {
-            sn->vm_state_size = vm_state_size;
-            err = bdrv_snapshot_create(bs, sn);
-        } else if (bdrv_can_snapshot(bs)) {
-            sn->vm_state_size = 0;
-            err = bdrv_snapshot_create(bs, sn);
-        }
-        aio_context_release(ctx);
-        if (err < 0) {
-            goto fail;
-        }
-    }
-fail:
-    *first_bad_bs = bs;
-    return err;
-}
-BlockDriverState *bdrv_all_find_vmstate_bs(void)
-{
-    BlockDriverState *bs;
-    BdrvNextIterator it;
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
-        AioContext *ctx = bdrv_get_aio_context(bs);
-        bool found;
-        aio_context_acquire(ctx);
-        found = bdrv_can_snapshot(bs);
-        aio_context_release(ctx);
-        if (found) {
-            break;
-        }
-    }
-    return bs;
-}


@@ -22,13 +22,14 @@
  * THE SOFTWARE.
  */
-#include "qemu/osdep.h"
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
 #include <libssh2.h>
 #include <libssh2_sftp.h>
 #include "block/block_int.h"
-#include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
 #include "qemu/uri.h"
@@ -192,7 +193,7 @@ sftp_error_report(BDRVSSHState *s, const char *fs, ...)
 static int parse_uri(const char *filename, QDict *options, Error **errp)
 {
     URI *uri = NULL;
-    QueryParams *qp;
+    QueryParams *qp = NULL;
     int i;
     uri = uri_parse(filename);
@@ -248,6 +249,9 @@ static int parse_uri(const char *filename, QDict *options, Error **errp)
     return 0;
 err:
+    if (qp) {
+        query_params_free(qp);
+    }
     if (uri) {
         uri_free(uri);
     }
@@ -508,73 +512,36 @@ static int authenticate(BDRVSSHState *s, const char *user, Error **errp)
     return ret;
 }
-static QemuOptsList ssh_runtime_opts = {
-    .name = "ssh",
-    .head = QTAILQ_HEAD_INITIALIZER(ssh_runtime_opts.head),
-    .desc = {
-        {
-            .name = "host",
-            .type = QEMU_OPT_STRING,
-            .help = "Host to connect to",
-        },
-        {
-            .name = "port",
-            .type = QEMU_OPT_NUMBER,
-            .help = "Port to connect to",
-        },
-        {
-            .name = "path",
-            .type = QEMU_OPT_STRING,
-            .help = "Path of the image on the host",
-        },
-        {
-            .name = "user",
-            .type = QEMU_OPT_STRING,
-            .help = "User as which to connect",
-        },
-        {
-            .name = "host_key_check",
-            .type = QEMU_OPT_STRING,
-            .help = "Defines how and what to check the host key against",
-        },
-    },
-};
 static int connect_to_ssh(BDRVSSHState *s, QDict *options,
                           int ssh_flags, int creat_mode, Error **errp)
 {
     int r, ret;
-    QemuOpts *opts = NULL;
-    Error *local_err = NULL;
     const char *host, *user, *path, *host_key_check;
     int port;
-    opts = qemu_opts_create(&ssh_runtime_opts, NULL, 0, &error_abort);
-    qemu_opts_absorb_qdict(opts, options, &local_err);
-    if (local_err) {
-        ret = -EINVAL;
-        error_propagate(errp, local_err);
-        goto err;
-    }
-    host = qemu_opt_get(opts, "host");
-    if (!host) {
+    if (!qdict_haskey(options, "host")) {
         ret = -EINVAL;
         error_setg(errp, "No hostname was specified");
         goto err;
     }
+    host = qdict_get_str(options, "host");
-    port = qemu_opt_get_number(opts, "port", 22);
+    if (qdict_haskey(options, "port")) {
+        port = qdict_get_int(options, "port");
+    } else {
+        port = 22;
+    }
-    path = qemu_opt_get(opts, "path");
-    if (!path) {
+    if (!qdict_haskey(options, "path")) {
         ret = -EINVAL;
         error_setg(errp, "No path was specified");
         goto err;
     }
+    path = qdict_get_str(options, "path");
-    user = qemu_opt_get(opts, "user");
-    if (!user) {
+    if (qdict_haskey(options, "user")) {
+        user = qdict_get_str(options, "user");
+    } else {
         user = g_get_user_name();
         if (!user) {
             error_setg_errno(errp, errno, "Can't get user name");
@@ -583,8 +550,9 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
         }
     }
-    host_key_check = qemu_opt_get(opts, "host_key_check");
-    if (!host_key_check) {
+    if (qdict_haskey(options, "host_key_check")) {
+        host_key_check = qdict_get_str(options, "host_key_check");
+    } else {
         host_key_check = "yes";
     }
@@ -648,14 +616,21 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
         goto err;
     }
-    qemu_opts_del(opts);
     r = libssh2_sftp_fstat(s->sftp_handle, &s->attrs);
     if (r < 0) {
         sftp_error_setg(errp, s, "failed to read file attributes");
         return -EINVAL;
     }
+    /* Delete the options we've used; any not deleted will cause the
+     * block layer to give an error about unused options.
+     */
+    qdict_del(options, "host");
+    qdict_del(options, "port");
+    qdict_del(options, "user");
+    qdict_del(options, "path");
+    qdict_del(options, "host_key_check");
     return 0;
 err:
@@ -675,8 +650,6 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
     }
     s->session = NULL;
-    qemu_opts_del(opts);
     return ret;
 }
@@ -808,7 +781,7 @@ static void restart_coroutine(void *opaque)
     DPRINTF("co=%p", co);
-    qemu_coroutine_enter(co);
+    qemu_coroutine_enter(co, NULL);
 }
 static coroutine_fn void set_fd_handler(BDRVSSHState *s, BlockDriverState *bs)
@@ -830,15 +803,14 @@ static coroutine_fn void set_fd_handler(BDRVSSHState *s, BlockDriverState *bs)
             rd_handler, wr_handler);
     aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
-                       false, rd_handler, wr_handler, co);
+                       rd_handler, wr_handler, co);
 }
 static coroutine_fn void clear_fd_handler(BDRVSSHState *s,
                                           BlockDriverState *bs)
 {
     DPRINTF("s->sock=%d", s->sock);
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
-                       false, NULL, NULL, NULL);
+    aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, NULL, NULL, NULL);
 }
 /* A non-blocking call returned EAGAIN, so yield, ensuring the


@@ -11,14 +11,11 @@
  *
  */
-#include "qemu/osdep.h"
 #include "trace.h"
 #include "block/block_int.h"
 #include "block/blockjob.h"
-#include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/ratelimit.h"
-#include "sysemu/block-backend.h"
 enum {
     /*
@@ -39,7 +36,7 @@ typedef struct StreamBlockJob {
     char *backing_file_str;
 } StreamBlockJob;
-static int coroutine_fn stream_populate(BlockBackend *blk,
+static int coroutine_fn stream_populate(BlockDriverState *bs,
                                         int64_t sector_num, int nb_sectors,
                                         void *buf)
 {
@@ -52,8 +49,35 @@ static int coroutine_fn stream_populate(BlockBackend *blk,
     qemu_iovec_init_external(&qiov, &iov, 1);
     /* Copy-on-read the unallocated clusters */
-    return blk_co_preadv(blk, sector_num * BDRV_SECTOR_SIZE, qiov.size, &qiov,
-                         BDRV_REQ_COPY_ON_READ);
+    return bdrv_co_copy_on_readv(bs, sector_num, nb_sectors, &qiov);
 }
+static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
+                                const char *base_id)
+{
+    BlockDriverState *intermediate;
+    intermediate = top->backing_hd;
+    /* Must assign before bdrv_delete() to prevent traversing dangling pointer
+     * while we delete backing image instances.
+     */
+    bdrv_set_backing_hd(top, base);
+    while (intermediate) {
+        BlockDriverState *unused;
+        /* reached base */
+        if (intermediate == base) {
+            break;
+        }
+        unused = intermediate;
+        intermediate = intermediate->backing_hd;
+        bdrv_set_backing_hd(unused, NULL);
+        bdrv_unref(unused);
+    }
+    bdrv_refresh_limits(top, NULL);
+}
 typedef struct {
@@ -65,7 +89,6 @@ static void stream_complete(BlockJob *job, void *opaque)
 {
     StreamBlockJob *s = container_of(job, StreamBlockJob, common);
     StreamCompleteData *data = opaque;
-    BlockDriverState *bs = blk_bs(job->blk);
     BlockDriverState *base = s->base;
     if (!block_job_is_cancelled(&s->common) && data->reached_end &&
@@ -77,8 +100,8 @@ static void stream_complete(BlockJob *job, void *opaque)
             base_fmt = base->drv->format_name;
         }
     }
-    data->ret = bdrv_change_backing_file(bs, base_id, base_fmt);
-    bdrv_set_backing_hd(bs, base);
+    data->ret = bdrv_change_backing_file(job->bs, base_id, base_fmt);
+    close_unused_images(job->bs, base, base_id);
 }
     g_free(s->backing_file_str);
@@ -90,25 +113,23 @@ static void coroutine_fn stream_run(void *opaque)
 {
     StreamBlockJob *s = opaque;
     StreamCompleteData *data;
-    BlockBackend *blk = s->common.blk;
-    BlockDriverState *bs = blk_bs(blk);
+    BlockDriverState *bs = s->common.bs;
     BlockDriverState *base = s->base;
-    int64_t sector_num = 0;
-    int64_t end = -1;
-    uint64_t delay_ns = 0;
+    int64_t sector_num, end;
     int error = 0;
     int ret = 0;
     int n = 0;
     void *buf;
-    if (!bs->backing) {
-        goto out;
+    if (!bs->backing_hd) {
+        block_job_completed(&s->common, 0);
+        return;
     }
     s->common.len = bdrv_getlength(bs);
     if (s->common.len < 0) {
-        ret = s->common.len;
-        goto out;
+        block_job_completed(&s->common, s->common.len);
+        return;
     }
     end = s->common.len >> BDRV_SECTOR_BITS;
@@ -124,8 +145,10 @@ static void coroutine_fn stream_run(void *opaque)
     }
     for (sector_num = 0; sector_num < end; sector_num += n) {
+        uint64_t delay_ns = 0;
         bool copy;
+wait:
         /* Note that even when no rate limit is applied we need to yield
          * with no pending I/O here so that bdrv_drain_all() returns.
          */
@@ -143,7 +166,7 @@ static void coroutine_fn stream_run(void *opaque)
         } else if (ret >= 0) {
             /* Copy if allocated in the intermediate images. Limit to the
              * known-unallocated area [sector_num, sector_num+n). */
-            ret = bdrv_is_allocated_above(backing_bs(bs), base,
+            ret = bdrv_is_allocated_above(bs->backing_hd, base,
                                           sector_num, n, &n);
             /* Finish early if end of backing file has been reached */
@@ -155,11 +178,18 @@ static void coroutine_fn stream_run(void *opaque)
         }
         trace_stream_one_iteration(s, sector_num, n, ret);
         if (copy) {
-            ret = stream_populate(blk, sector_num, n, buf);
+            if (s->common.speed) {
+                delay_ns = ratelimit_calculate_delay(&s->limit, n);
+                if (delay_ns > 0) {
+                    goto wait;
+                }
+            }
+            ret = stream_populate(bs, sector_num, n, buf);
         }
         if (ret < 0) {
             BlockErrorAction action =
-                block_job_error_action(&s->common, s->on_error, true, -ret);
+                block_job_error_action(&s->common, s->common.bs, s->on_error,
+                                       true, -ret);
             if (action == BLOCK_ERROR_ACTION_STOP) {
                 n = 0;
                 continue;
@@ -175,9 +205,6 @@ static void coroutine_fn stream_run(void *opaque)
         /* Publish progress */
         s->common.offset += n * BDRV_SECTOR_SIZE;
-        if (copy && s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit, n);
-        }
     }
     if (!base) {
@@ -189,7 +216,6 @@ static void coroutine_fn stream_run(void *opaque)
     qemu_vfree(buf);
-out:
     /* Modify backing chain and close BDSes in main loop */
     data = g_malloc(sizeof(*data));
     data->ret = ret;
@@ -214,15 +240,22 @@ static const BlockJobDriver stream_job_driver = {
     .set_speed = stream_set_speed,
 };
-void stream_start(const char *job_id, BlockDriverState *bs,
-                  BlockDriverState *base, const char *backing_file_str,
-                  int64_t speed, BlockdevOnError on_error,
-                  BlockCompletionFunc *cb, void *opaque, Error **errp)
+void stream_start(BlockDriverState *bs, BlockDriverState *base,
+                  const char *backing_file_str, int64_t speed,
+                  BlockdevOnError on_error,
+                  BlockCompletionFunc *cb,
+                  void *opaque, Error **errp)
 {
     StreamBlockJob *s;
-    s = block_job_create(job_id, &stream_job_driver, bs, speed,
-                         cb, opaque, errp);
+    if ((on_error == BLOCKDEV_ON_ERROR_STOP ||
+         on_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
+        !bdrv_iostatus_is_enabled(bs)) {
+        error_setg(errp, QERR_INVALID_PARAMETER, "on-error");
+        return;
+    }
+    s = block_job_create(&stream_job_driver, bs, speed, cb, opaque, errp);
     if (!s) {
         return;
     }
@@ -231,7 +264,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
     s->backing_file_str = g_strdup(backing_file_str);
     s->on_error = on_error;
-    s->common.co = qemu_coroutine_create(stream_run, s);
+    s->common.co = qemu_coroutine_create(stream_run);
     trace_stream_start(bs, base, s, s->common.co, opaque);
-    qemu_coroutine_enter(s->common.co);
+    qemu_coroutine_enter(s->common.co, s);
 }


@@ -22,44 +22,43 @@
  * along with this program; if not, see <http://www.gnu.org/licenses/>.
  */
-#include "qemu/osdep.h"
-#include "sysemu/block-backend.h"
 #include "block/throttle-groups.h"
 #include "qemu/queue.h"
 #include "qemu/thread.h"
 #include "sysemu/qtest.h"
 /* The ThrottleGroup structure (with its ThrottleState) is shared
- * among different BlockBackends and it's independent from
+ * among different BlockDriverState and it's independent from
  * AioContext, so in order to use it from different threads it needs
  * its own locking.
  *
  * This locking is however handled internally in this file, so it's
- * transparent to outside users.
+ * mostly transparent to outside users (but see the documentation in
+ * throttle_groups_lock()).
  *
  * The whole ThrottleGroup structure is private and invisible to
  * outside users, that only use it through its ThrottleState.
  *
- * In addition to the ThrottleGroup structure, BlockBackendPublic has
+ * In addition to the ThrottleGroup structure, BlockDriverState has
  * fields that need to be accessed by other members of the group and
- * therefore also need to be protected by this lock. Once a
- * BlockBackend is registered in a group those fields can be accessed
- * by other threads any time.
+ * therefore also need to be protected by this lock. Once a BDS is
+ * registered in a group those fields can be accessed by other threads
+ * any time.
  *
  * Again, all this is handled internally and is mostly transparent to
  * the outside. The 'throttle_timers' field however has an additional
  * constraint because it may be temporarily invalid (see for example
  * bdrv_set_aio_context()). Therefore in this file a thread will
- * access some other BlockBackend's timers only after verifying that
- * that BlockBackend has throttled requests in the queue.
+ * access some other BDS's timers only after verifying that that BDS
+ * has throttled requests in the queue.
  */
 typedef struct ThrottleGroup {
     char *name; /* This is constant during the lifetime of the group */
     QemuMutex lock; /* This lock protects the following four fields */
     ThrottleState ts;
-    QLIST_HEAD(, BlockBackendPublic) head;
-    BlockBackend *tokens[2];
+    QLIST_HEAD(, BlockDriverState) head;
+    BlockDriverState *tokens[2];
     bool any_timer_armed[2];
     /* These two are protected by the global throttle_groups_lock */
@@ -77,9 +76,9 @@ static QTAILQ_HEAD(, ThrottleGroup) throttle_groups =
  * created.
  *
  * @name: the name of the ThrottleGroup
- * @ret: the ThrottleState member of the ThrottleGroup
+ * @ret: the ThrottleGroup
  */
-ThrottleState *throttle_group_incref(const char *name)
+static ThrottleGroup *throttle_group_incref(const char *name)
 {
     ThrottleGroup *tg = NULL;
     ThrottleGroup *iter;
@@ -109,7 +108,7 @@ ThrottleState *throttle_group_incref(const char *name)
     qemu_mutex_unlock(&throttle_groups_lock);
-    return &tg->ts;
+    return tg;
 }
 /* Decrease the reference count of a ThrottleGroup.
@@ -117,12 +116,10 @@ ThrottleState *throttle_group_incref(const char *name)
  * When the reference count reaches zero the ThrottleGroup is
  * destroyed.
  *
- * @ts: The ThrottleGroup to unref, given by its ThrottleState member
+ * @tg: The ThrottleGroup to unref
  */
-void throttle_group_unref(ThrottleState *ts)
+static void throttle_group_unref(ThrottleGroup *tg)
 {
-    ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
     qemu_mutex_lock(&throttle_groups_lock);
     if (--tg->refcount == 0) {
         QTAILQ_REMOVE(&throttle_groups, tg, list);
@@ -133,98 +130,93 @@ void throttle_group_unref(ThrottleState *ts)
     qemu_mutex_unlock(&throttle_groups_lock);
 }
-/* Get the name from a BlockBackend's ThrottleGroup. The name (and the pointer)
- * is guaranteed to remain constant during the lifetime of the group.
+/* Get the name from a BlockDriverState's ThrottleGroup. The name (and
+ * the pointer) is guaranteed to remain constant during the lifetime
+ * of the group.
  *
- * @blk: a BlockBackend that is member of a throttling group
+ * @bs: a BlockDriverState that is member of a throttling group
  * @ret: the name of the group.
  */
-const char *throttle_group_get_name(BlockBackend *blk)
+const char *throttle_group_get_name(BlockDriverState *bs)
 {
-    BlockBackendPublic *blkp = blk_get_public(blk);
-    ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
+    ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
     return tg->name;
 }
-/* Return the next BlockBackend in the round-robin sequence, simulating a
- * circular list.
+/* Return the next BlockDriverState in the round-robin sequence,
+ * simulating a circular list.
  *
  * This assumes that tg->lock is held.
  *
- * @blk: the current BlockBackend
- * @ret: the next BlockBackend in the sequence
+ * @bs: the current BlockDriverState
+ * @ret: the next BlockDriverState in the sequence
 */
-static BlockBackend *throttle_group_next_blk(BlockBackend *blk)
+static BlockDriverState *throttle_group_next_bs(BlockDriverState *bs)
 {
-    BlockBackendPublic *blkp = blk_get_public(blk);
-    ThrottleState *ts = blkp->throttle_state;
+    ThrottleState *ts = bs->throttle_state;
     ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
-    BlockBackendPublic *next = QLIST_NEXT(blkp, round_robin);
+    BlockDriverState *next = QLIST_NEXT(bs, round_robin);
     if (!next) {
-        next = QLIST_FIRST(&tg->head);
+        return QLIST_FIRST(&tg->head);
     }
-    return blk_by_public(next);
+    return next;
 }
-/* Return the next BlockBackend in the round-robin sequence with pending I/O
- * requests.
+/* Return the next BlockDriverState in the round-robin sequence with
+ * pending I/O requests.
 *
 * This assumes that tg->lock is held.
 *
- * @blk: the current BlockBackend
+ * @bs: the current BlockDriverState
 * @is_write: the type of operation (read/write)
- * @ret: the next BlockBackend with pending requests, or blk if there is
- *       none.
+ * @ret: the next BlockDriverState with pending requests, or bs
+ *       if there is none.
 */
-static BlockBackend *next_throttle_token(BlockBackend *blk, bool is_write)
+static BlockDriverState *next_throttle_token(BlockDriverState *bs,
+                                             bool is_write)
 {
-    BlockBackendPublic *blkp = blk_get_public(blk);
-    ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
-    BlockBackend *token, *start;
+    ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
+    BlockDriverState *token, *start;
     start = token = tg->tokens[is_write];
     /* get next bs round in round robin style */
-    token = throttle_group_next_blk(token);
-    while (token != start && !blkp->pending_reqs[is_write]) {
-        token = throttle_group_next_blk(token);
+    token = throttle_group_next_bs(token);
+    while (token != start && !token->pending_reqs[is_write]) {
+        token = throttle_group_next_bs(token);
     }
     /* If no IO are queued for scheduling on the next round robin token
      * then decide the token is the current bs because chances are
      * the current bs get the current request queued.
      */
-    if (token == start && !blkp->pending_reqs[is_write]) {
-        token = blk;
+    if (token == start && !token->pending_reqs[is_write]) {
+        token = bs;
     }
     return token;
 }
-/* Check if the next I/O request for a BlockBackend needs to be throttled or
- * not. If there's no timer set in this group, set one and update the token
- * accordingly.
+/* Check if the next I/O request for a BlockDriverState needs to be
+ * throttled or not. If there's no timer set in this group, set one
+ * and update the token accordingly.
 *
 * This assumes that tg->lock is held.
 *
- * @blk: the current BlockBackend
+ * @bs: the current BlockDriverState
 * @is_write: the type of operation (read/write)
 * @ret: whether the I/O request needs to be throttled or not
 */
-static bool throttle_group_schedule_timer(BlockBackend *blk, bool is_write)
+static bool throttle_group_schedule_timer(BlockDriverState *bs,
+                                          bool is_write)
 {
-    BlockBackendPublic *blkp = blk_get_public(blk);
-    ThrottleState *ts = blkp->throttle_state;
-    ThrottleTimers *tt = &blkp->throttle_timers;
+    ThrottleState *ts = bs->throttle_state;
+    ThrottleTimers *tt = &bs->throttle_timers;
     ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
     bool must_wait;
-    if (blkp->io_limits_disabled) {
-        return false;
-    }
     /* Check if any of the timers in this group is already armed */
     if (tg->any_timer_armed[is_write]) {
         return true;
@@ -232,9 +224,9 @@ static bool throttle_group_schedule_timer(BlockBackend *blk, bool is_write)
     must_wait = throttle_schedule_timer(ts, tt, is_write);
-    /* If a timer just got armed, set blk as the current token */
+    /* If a timer just got armed, set bs as the current token */
if (must_wait) { if (must_wait) {
tg->tokens[is_write] = blk; tg->tokens[is_write] = bs;
tg->any_timer_armed[is_write] = true; tg->any_timer_armed[is_write] = true;
} }
@@ -245,19 +237,18 @@ static bool throttle_group_schedule_timer(BlockBackend *blk, bool is_write)
* *
* This assumes that tg->lock is held. * This assumes that tg->lock is held.
* *
* @blk: the current BlockBackend * @bs: the current BlockDriverState
* @is_write: the type of operation (read/write) * @is_write: the type of operation (read/write)
*/ */
static void schedule_next_request(BlockBackend *blk, bool is_write) static void schedule_next_request(BlockDriverState *bs, bool is_write)
{ {
BlockBackendPublic *blkp = blk_get_public(blk); ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
bool must_wait; bool must_wait;
BlockBackend *token; BlockDriverState *token;
/* Check if there's any pending request to schedule next */ /* Check if there's any pending request to schedule next */
token = next_throttle_token(blk, is_write); token = next_throttle_token(bs, is_write);
if (!blkp->pending_reqs[is_write]) { if (!token->pending_reqs[is_write]) {
return; return;
} }
@@ -266,12 +257,12 @@ static void schedule_next_request(BlockBackend *blk, bool is_write)
/* If it doesn't have to wait, queue it for immediate execution */ /* If it doesn't have to wait, queue it for immediate execution */
if (!must_wait) { if (!must_wait) {
/* Give preference to requests from the current blk */ /* Give preference to requests from the current bs */
if (qemu_in_coroutine() && if (qemu_in_coroutine() &&
qemu_co_queue_next(&blkp->throttled_reqs[is_write])) { qemu_co_queue_next(&bs->throttled_reqs[is_write])) {
token = blk; token = bs;
} else { } else {
ThrottleTimers *tt = &blkp->throttle_timers; ThrottleTimers *tt = &token->throttle_timers;
int64_t now = qemu_clock_get_ns(tt->clock_type); int64_t now = qemu_clock_get_ns(tt->clock_type);
timer_mod(tt->timers[is_write], now + 1); timer_mod(tt->timers[is_write], now + 1);
tg->any_timer_armed[is_write] = true; tg->any_timer_armed[is_write] = true;
@@ -284,67 +275,53 @@ static void schedule_next_request(BlockBackend *blk, bool is_write)
* if necessary, and schedule the next request using a round robin * if necessary, and schedule the next request using a round robin
* algorithm. * algorithm.
* *
* @blk: the current BlockBackend * @bs: the current BlockDriverState
* @bytes: the number of bytes for this I/O * @bytes: the number of bytes for this I/O
* @is_write: the type of operation (read/write) * @is_write: the type of operation (read/write)
*/ */
void coroutine_fn throttle_group_co_io_limits_intercept(BlockBackend *blk, void coroutine_fn throttle_group_co_io_limits_intercept(BlockDriverState *bs,
unsigned int bytes, unsigned int bytes,
bool is_write) bool is_write)
{ {
bool must_wait; bool must_wait;
BlockBackend *token; BlockDriverState *token;
BlockBackendPublic *blkp = blk_get_public(blk); ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock); qemu_mutex_lock(&tg->lock);
/* First we check if this I/O has to be throttled. */ /* First we check if this I/O has to be throttled. */
token = next_throttle_token(blk, is_write); token = next_throttle_token(bs, is_write);
must_wait = throttle_group_schedule_timer(token, is_write); must_wait = throttle_group_schedule_timer(token, is_write);
/* Wait if there's a timer set or queued requests of this type */ /* Wait if there's a timer set or queued requests of this type */
if (must_wait || blkp->pending_reqs[is_write]) { if (must_wait || bs->pending_reqs[is_write]) {
blkp->pending_reqs[is_write]++; bs->pending_reqs[is_write]++;
qemu_mutex_unlock(&tg->lock); qemu_mutex_unlock(&tg->lock);
qemu_co_queue_wait(&blkp->throttled_reqs[is_write]); qemu_co_queue_wait(&bs->throttled_reqs[is_write]);
qemu_mutex_lock(&tg->lock); qemu_mutex_lock(&tg->lock);
blkp->pending_reqs[is_write]--; bs->pending_reqs[is_write]--;
} }
/* The I/O will be executed, so do the accounting */ /* The I/O will be executed, so do the accounting */
throttle_account(blkp->throttle_state, is_write, bytes); throttle_account(bs->throttle_state, is_write, bytes);
/* Schedule the next request */ /* Schedule the next request */
schedule_next_request(blk, is_write); schedule_next_request(bs, is_write);
qemu_mutex_unlock(&tg->lock); qemu_mutex_unlock(&tg->lock);
} }
void throttle_group_restart_blk(BlockBackend *blk)
{
BlockBackendPublic *blkp = blk_get_public(blk);
int i;
for (i = 0; i < 2; i++) {
while (qemu_co_enter_next(&blkp->throttled_reqs[i])) {
;
}
}
}
/* Update the throttle configuration for a particular group. Similar /* Update the throttle configuration for a particular group. Similar
* to throttle_config(), but guarantees atomicity within the * to throttle_config(), but guarantees atomicity within the
* throttling group. * throttling group.
* *
* @blk: a BlockBackend that is a member of the group * @bs: a BlockDriverState that is member of the group
* @cfg: the configuration to set * @cfg: the configuration to set
*/ */
void throttle_group_config(BlockBackend *blk, ThrottleConfig *cfg) void throttle_group_config(BlockDriverState *bs, ThrottleConfig *cfg)
{ {
BlockBackendPublic *blkp = blk_get_public(blk); ThrottleTimers *tt = &bs->throttle_timers;
ThrottleTimers *tt = &blkp->throttle_timers; ThrottleState *ts = bs->throttle_state;
ThrottleState *ts = blkp->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts); ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock); qemu_mutex_lock(&tg->lock);
/* throttle_config() cancels the timers */ /* throttle_config() cancels the timers */
@@ -356,22 +333,18 @@ void throttle_group_config(BlockBackend *blk, ThrottleConfig *cfg)
} }
throttle_config(ts, tt, cfg); throttle_config(ts, tt, cfg);
qemu_mutex_unlock(&tg->lock); qemu_mutex_unlock(&tg->lock);
qemu_co_enter_next(&blkp->throttled_reqs[0]);
qemu_co_enter_next(&blkp->throttled_reqs[1]);
} }
/* Get the throttle configuration from a particular group. Similar to /* Get the throttle configuration from a particular group. Similar to
* throttle_get_config(), but guarantees atomicity within the * throttle_get_config(), but guarantees atomicity within the
* throttling group. * throttling group.
* *
* @blk: a BlockBackend that is a member of the group * @bs: a BlockDriverState that is member of the group
* @cfg: the configuration will be written here * @cfg: the configuration will be written here
*/ */
void throttle_group_get_config(BlockBackend *blk, ThrottleConfig *cfg) void throttle_group_get_config(BlockDriverState *bs, ThrottleConfig *cfg)
{ {
BlockBackendPublic *blkp = blk_get_public(blk); ThrottleState *ts = bs->throttle_state;
ThrottleState *ts = blkp->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts); ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock); qemu_mutex_lock(&tg->lock);
throttle_get_config(ts, cfg); throttle_get_config(ts, cfg);
@@ -381,13 +354,12 @@ void throttle_group_get_config(BlockBackend *blk, ThrottleConfig *cfg)
/* ThrottleTimers callback. This wakes up a request that was waiting /* ThrottleTimers callback. This wakes up a request that was waiting
* because it had been throttled. * because it had been throttled.
* *
* @blk: the BlockBackend whose request had been throttled * @bs: the BlockDriverState whose request had been throttled
* @is_write: the type of operation (read/write) * @is_write: the type of operation (read/write)
*/ */
static void timer_cb(BlockBackend *blk, bool is_write) static void timer_cb(BlockDriverState *bs, bool is_write)
{ {
BlockBackendPublic *blkp = blk_get_public(blk); ThrottleState *ts = bs->throttle_state;
ThrottleState *ts = blkp->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts); ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
bool empty_queue; bool empty_queue;
@@ -397,13 +369,13 @@ static void timer_cb(BlockBackend *blk, bool is_write)
qemu_mutex_unlock(&tg->lock); qemu_mutex_unlock(&tg->lock);
/* Run the request that was waiting for this timer */ /* Run the request that was waiting for this timer */
empty_queue = !qemu_co_enter_next(&blkp->throttled_reqs[is_write]); empty_queue = !qemu_co_enter_next(&bs->throttled_reqs[is_write]);
/* If the request queue was empty then we have to take care of /* If the request queue was empty then we have to take care of
* scheduling the next one */ * scheduling the next one */
if (empty_queue) { if (empty_queue) {
qemu_mutex_lock(&tg->lock); qemu_mutex_lock(&tg->lock);
schedule_next_request(blk, is_write); schedule_next_request(bs, is_write);
qemu_mutex_unlock(&tg->lock); qemu_mutex_unlock(&tg->lock);
} }
} }
@@ -418,19 +390,18 @@ static void write_timer_cb(void *opaque)
timer_cb(opaque, true); timer_cb(opaque, true);
} }
/* Register a BlockBackend in the throttling group, also initializing its /* Register a BlockDriverState in the throttling group, also
* timers and updating its throttle_state pointer to point to it. If a * initializing its timers and updating its throttle_state pointer to
* throttling group with that name does not exist yet, it will be created. * point to it. If a throttling group with that name does not exist
* yet, it will be created.
* *
* @blk: the BlockBackend to insert * @bs: the BlockDriverState to insert
* @groupname: the name of the group * @groupname: the name of the group
*/ */
void throttle_group_register_blk(BlockBackend *blk, const char *groupname) void throttle_group_register_bs(BlockDriverState *bs, const char *groupname)
{ {
int i; int i;
BlockBackendPublic *blkp = blk_get_public(blk); ThrottleGroup *tg = throttle_group_incref(groupname);
ThrottleState *ts = throttle_group_incref(groupname);
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
int clock_type = QEMU_CLOCK_REALTIME; int clock_type = QEMU_CLOCK_REALTIME;
if (qtest_enabled()) { if (qtest_enabled()) {
@@ -438,67 +409,88 @@ void throttle_group_register_blk(BlockBackend *blk, const char *groupname)
clock_type = QEMU_CLOCK_VIRTUAL; clock_type = QEMU_CLOCK_VIRTUAL;
} }
blkp->throttle_state = ts; bs->throttle_state = &tg->ts;
qemu_mutex_lock(&tg->lock); qemu_mutex_lock(&tg->lock);
/* If the ThrottleGroup is new set this BlockBackend as the token */ /* If the ThrottleGroup is new set this BlockDriverState as the token */
for (i = 0; i < 2; i++) { for (i = 0; i < 2; i++) {
if (!tg->tokens[i]) { if (!tg->tokens[i]) {
tg->tokens[i] = blk; tg->tokens[i] = bs;
} }
} }
QLIST_INSERT_HEAD(&tg->head, blkp, round_robin); QLIST_INSERT_HEAD(&tg->head, bs, round_robin);
throttle_timers_init(&blkp->throttle_timers, throttle_timers_init(&bs->throttle_timers,
blk_get_aio_context(blk), bdrv_get_aio_context(bs),
clock_type, clock_type,
read_timer_cb, read_timer_cb,
write_timer_cb, write_timer_cb,
blk); bs);
qemu_mutex_unlock(&tg->lock); qemu_mutex_unlock(&tg->lock);
} }
/* Unregister a BlockBackend from its group, removing it from the list, /* Unregister a BlockDriverState from its group, removing it from the
* destroying the timers and setting the throttle_state pointer to NULL. * list, destroying the timers and setting the throttle_state pointer
* * to NULL.
* The BlockBackend must not have pending throttled requests, so the caller has
* to drain them first.
* *
* The group will be destroyed if it's empty after this operation. * The group will be destroyed if it's empty after this operation.
* *
* @blk: the BlockBackend to remove * @bs: the BlockDriverState to remove
*/ */
void throttle_group_unregister_blk(BlockBackend *blk) void throttle_group_unregister_bs(BlockDriverState *bs)
{ {
BlockBackendPublic *blkp = blk_get_public(blk); ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
int i; int i;
assert(blkp->pending_reqs[0] == 0 && blkp->pending_reqs[1] == 0);
assert(qemu_co_queue_empty(&blkp->throttled_reqs[0]));
assert(qemu_co_queue_empty(&blkp->throttled_reqs[1]));
qemu_mutex_lock(&tg->lock); qemu_mutex_lock(&tg->lock);
for (i = 0; i < 2; i++) { for (i = 0; i < 2; i++) {
if (tg->tokens[i] == blk) { if (tg->tokens[i] == bs) {
BlockBackend *token = throttle_group_next_blk(blk); BlockDriverState *token = throttle_group_next_bs(bs);
/* Take care of the case where this is the last blk in the group */ /* Take care of the case where this is the last bs in the group */
if (token == blk) { if (token == bs) {
token = NULL; token = NULL;
} }
tg->tokens[i] = token; tg->tokens[i] = token;
} }
} }
/* remove the current blk from the list */ /* remove the current bs from the list */
QLIST_REMOVE(blkp, round_robin); QLIST_REMOVE(bs, round_robin);
throttle_timers_destroy(&blkp->throttle_timers); throttle_timers_destroy(&bs->throttle_timers);
qemu_mutex_unlock(&tg->lock); qemu_mutex_unlock(&tg->lock);
throttle_group_unref(&tg->ts); throttle_group_unref(tg);
blkp->throttle_state = NULL; bs->throttle_state = NULL;
}
/* Acquire the lock of this throttling group.
*
* You won't normally need to use this. None of the functions from the
* ThrottleGroup API require you to acquire the lock since all of them
* deal with it internally.
*
* This should only be used in exceptional cases when you want to
* access the protected fields of a BlockDriverState directly
* (e.g. bdrv_swap()).
*
* @bs: a BlockDriverState that is member of the group
*/
void throttle_group_lock(BlockDriverState *bs)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock);
}
/* Release the lock of this throttling group.
*
* See the comments in throttle_group_lock().
*/
void throttle_group_unlock(BlockDriverState *bs)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
qemu_mutex_unlock(&tg->lock);
} }
static void throttle_groups_init(void) static void throttle_groups_init(void)
