Compare commits

..

219 Commits

Author SHA1 Message Date
Michael Roth
adba377ea7 Update VERSION for 1.7.2 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-21 17:42:15 -05:00
Dr. David Alan Gilbert
8fde73e138 Allow mismatched virtio config-len
Commit 'virtio: validate config_len on load' restricted config_len
loaded from the wire to match the config_len that the device had.

Unfortunately, there are cases where this isn't true, the one
we found it on was the wce addition in virtio-blk.

Allow mismatched config-lengths:
   *) If the version on the wire is shorter then fine
   *) If the version on the wire is longer, load what we have space
      for and skip the rest.

(This is mst@redhat.com's rework of what I originally posted)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2f5732e964)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:02 -05:00
Le Tan
14d9fb02c2 pci: assign devfn to pci_dev before calling pci_device_iommu_address_space()
In function do_pci_register_device() in file hw/pci/pci.c, move the assignment
of pci_dev->devfn to the position before the call to
pci_device_iommu_address_space(pci_dev) which will use the value of
pci_dev->devfn.

Fixes: 9eda7d373e
    pci: Introduce helper to retrieve a PCI device's DMA address space

Cc: qemu-stable@nongnu.org
Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit efc8188e93)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:02 -05:00
Andreas Färber
53e4895c98 hw: Fix qemu_allocate_irqs() leaks
Replace qemu_allocate_irqs(foo, bar, 1)[0]
with qemu_allocate_irq(foo, bar, 0).

This avoids leaking the dereferenced qemu_irq *.

Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[PC Changes:
 * Applied change to instance in sh4/sh7750.c
]
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Kirill Batuzov <batuzovk@ispras.ru>
[AF: Fix IRQ index in sh4/sh7750.c]
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>

(cherry picked from commit f3c7d0389f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:02 -05:00
Andreas Färber
bb485bf2e8 sdhci: Fix misuse of qemu_free_irqs()
It does a g_free() on the pointer, so don't pass a local &foo reference.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 127a4e1a51)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:02 -05:00
Markus Armbruster
02835d5744 vnc: Fix tight_detect_smooth_image() for lossless case
VncTight member uint8_t quality is either (uint8_t)-1 for lossless or
less than 10 for lossy.

tight_detect_smooth_image() first promotes it to int, then compares
with -1.  Always unequal, so we always execute the lossy code.  Reads
beyond tight_conf[] and returns crap when quality is actually
lossless.

Compare to (uint8_t)-1 instead, like we do elsewhere.

Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2e7bcdb99a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:02 -05:00
Michael Roth
41ee91810e qapi: zero-initialize all QMP command parameters
In general QMP command parameter values are specified by consumers of the
QMP/HMP interface, but in the case of optional parameters these values may
be left uninitialized.

It is considered a bug for code to make use of optional parameters that have
not been flagged as being present by the marshalling code (via corresponding
has_<parameter> parameter), however our marshalling code will still pass
these uninitialized values on to the corresponding QMP function (to then
be ignored). Some compilers (clang in particular) consider this unsafe
however, and generate warnings as a result. As reported by Peter Maydell:

  This is something clang's -fsanitize=undefined spotted. The
  code generated by qapi-commands.py in qmp-marshal.c for
  qmp_marshal_* functions where there are some optional
  arguments looks like this:

      bool has_force = false;
      bool force;

      mi = qmp_input_visitor_new_strict(QOBJECT(args));
      v = qmp_input_get_visitor(mi);
      visit_type_str(v, &device, "device", errp);
      visit_start_optional(v, &has_force, "force", errp);
      if (has_force) {
          visit_type_bool(v, &force, "force", errp);
      }
      visit_end_optional(v, errp);
      qmp_input_visitor_cleanup(mi);

      if (error_is_set(errp)) {
          goto out;
      }
      qmp_eject(device, has_force, force, errp);

  In the case where has_force is false, we never initialize
  force, but then we use it by passing it to qmp_eject.
  I imagine we don't then actually use the value, but clang
  complains in particular for 'bool' variables because the value
  that ends up being loaded from memory for 'force' is not either
  0 or 1 (being uninitialized stack contents).

Fix this by initializing all QMP command parameters to {0} in the
marshalling code prior to passing them on to the QMP functions.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit fc13d93726)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:02 -05:00
Hani Benhabiles
0c60b74a0c nbd: Shutdown socket before closing.
This forces finishing data sending to client before closing the socket like in
exports listing or replying with NBD_REP_ERR_UNSUP cases.

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 27e5eae457)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:02 -05:00
Hani Benhabiles
25351f6a9a nbd: Close socket on negotiation failure.
Otherwise, the nbd client may hang waiting for the server response.

Signed-off-by: Hani Benhabiles <hani@linux.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 36af599417)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Hani Benhabiles
cf392d2c7c nbd: Don't validate from and len in NBD_CMD_DISC.
These values aren't used in this case.

Currently, the from field in the request sent by the nbd kernel module leading
to a false error message when ending the connection with the client.

$ qemu-nbd some.img -v
// After nbd-client -d /dev/nbd0
nbd.c:nbd_trip():L1031: From: 18446744073709551104, Len: 0, Size: 20971520,
Offset: 0
nbd.c:nbd_trip():L1032: requested operation past EOF--bad client?
nbd.c:nbd_receive_request():L638: read failed

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8c5d1abbb7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Hani Benhabiles
3c3d8c6d19 nbd: Don't export a block device with no medium.
The device is exported with erroneous values and can't be read.

Before the patch:
$ sudo nbd-client localhost -p 10809 /dev/nbd0 -name floppy0
Negotiation: ..size = 17592186044415MB
bs=1024, sz=18446744073709547520 bytes

$ sudo mount /dev/nbd0 /mnt/tmp/
mount: block device /dev/nbd0 is write-protected, mounting read-only
mount: /dev/nbd0: can't read superblock

After the patch:
(qemu) nbd_server_add ide0-hd0
(qemu) nbd_server_add floppy0
Device 'floppy0' has no medium

Signed-off-by: Hani Benhabiles <kroosec@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 60fe4fac22)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Alexander Graf
62c754e67c virtio-serial: don't migrate the config space
The device configuration is set at realize time and never changes. It
should not be migrated as it is done today. For the sake of compatibility,
let's just skip them at load time.

Signed-off-by: Alexander Graf <agraf@suse.de>
[ added missing casts to uint16_t *,
  added From, SoB and commit message,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit e38e943a1f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Cédric Le Goater
0fd14a5564 virtio-net: byteswap virtio-net header
TCP connectivity fails when the guest has a different endianness.
The packets are silently dropped on the host by the tap backend
when they are read from user space because the endianness of the
virtio-net header is in the wrong order. These lines may appear
in the guest console:

[  454.709327] skbuff: bad partial csum: csum=8704/4096 len=74
[  455.702554] skbuff: bad partial csum: csum=8704/4096 len=74

The issue that got first spotted with a ppc64le PowerKVM guest,
but it also exists for the less common case of a x86_64 guest run
by a big-endian ppc64 TCG hypervisor.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[ Ported from PowerKVM,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 032a74a1c0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Eduardo Habkost
7a3cd5ab40 target-i386: Filter FEAT_7_0_EBX TCG features too
The TCG_7_0_EBX_FEATURES macro was defined but never used (it even had a
typo that was never noticed). Make the existing TCG feature filtering
code use it.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit d0a70f46fa)

Conflicts:
	target-i386/cpu.c

*fixed simple context mismatch

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Peter Maydell
8a93721d04 coroutine-win32.c: Add noinline attribute to work around gcc bug
A gcc codegen bug in x86_64-w64-mingw32-gcc (GCC) 4.6.3 means that
non-debug builds of QEMU for Windows tend to assert when using
coroutines. Work around this by marking qemu_coroutine_switch
as noinline.

If we allow gcc to inline qemu_coroutine_switch into
coroutine_trampoline, then it hoists the code to get the
address of the TLS variable "current" out of the while() loop.
This is an invalid transformation because the SwitchToFiber()
call may be called when running thread A but return in thread B,
and so we might be in a different thread context each time
round the loop. This can happen quite often.  Typically.
a coroutine is started when a VCPU thread does bdrv_aio_readv:

     VCPU thread

     main VCPU thread coroutine      I/O coroutine
        bdrv_aio_readv ----->
                                     start I/O operation
                                       thread_pool_submit_co
                       <------------ yields
        back to emulation

Then I/O finishes and the thread-pool.c event notifier triggers in
the I/O thread.  event_notifier_ready calls thread_pool_co_cb, and
the I/O coroutine now restarts *in another thread*:

     iothread

     main iothread coroutine         I/O coroutine (formerly in VCPU thread)
        event_notifier_ready
          thread_pool_co_cb ----->   current = I/O coroutine;
                                     call AIO callback

But on Win32, because of the bug, the "current" being set here the
current coroutine of the VCPU thread, not the iothread.

noinline is a good-enough workaround, and quite unlikely to break in
the future.

(Thanks to Paolo Bonzini for assistance in diagnosing the problem
and providing the detailed example/ascii art quoted above.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403535303-14939-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit ff4873cb8c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Alexander Graf
b47506f55c KVM: Fix GSI number space limit
KVM tells us the number of GSIs it can handle inside the kernel. That value is
basically KVM_MAX_IRQ_ROUTES. However when we try to set the GSI mapping table,
it checks for

    r = -EINVAL;
    if (routing.nr >= KVM_MAX_IRQ_ROUTES)
        goto out;

erroring out even when we're only using all of the GSIs. To make sure we never
hit that limit, let's reduce the number of GSIs we get from KVM by one.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 00008418aa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Hani Benhabiles
f0c609dede usb: Fix usb-bt-dongle initialization.
Due to an incomplete initialization, adding a usb-bt-dongle device through HMP
or QMP will cause a segmentation fault.

Signed-off-by: Hani Benhabiles <hani@linux.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c340a284f3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Michael S. Tsirkin
79bd7781dd vhost: fix resource leak in error handling
vhost_verify_ring_mappings leaks mappings on error.
Fix this up.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 8617343faa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Ulrich Obergfell
36afdba00a scsi-disk: fix bug in scsi_block_new_request() introduced by commit 137745c
This patch fixes a bug in scsi_block_new_request() that was introduced
by commit 137745c5c6. If the host cache
is used - i.e. if BDRV_O_NOCACHE is _not_ set - the 'break' statement
needs to be executed to 'fall back' to SG_IO.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2fe5a9f73b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:01 -05:00
Michael R. Hines
63bf1e0ea5 rdma: bug fixes
1. Fix small memory leak in parsing inet address from command line in data_init()
2. Fix ibv_post_send() return value check and pass error code back up correctly.
3. Fix rdma_destroy_qp() segfault after failure to connect to destination.

Reported-by: frank.yangjie@gmail.com
Reported-by: dgilbert@redhat.com
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit e325b49a32)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:00 -05:00
Gonglei
23dbc56d22 qga: Fix handle fd leak in acquire_privilege()
token should be closed in all conditions.
So move CloseHandle(token) to "out" branch.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 374044f08f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-15 19:28:00 -05:00
Stefan Hajnoczi
4041945624 aio: fix qemu_bh_schedule() bh->ctx race condition
qemu_bh_schedule() is supposed to be thread-safe at least the first time
it is called.  Unfortunately this is not quite true:

  bh->scheduled = 1;
  aio_notify(bh->ctx);

Since another thread may run the BH callback once it has been scheduled,
there is a race condition if the callback frees the BH before
aio_notify(bh->ctx) has a chance to run.

Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
(cherry picked from commit 924fe1293c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:30 -05:00
Cornelia Huck
5019106862 s390x/css: handle emw correctly for tsch
We should not try to store the emw portion of the irb if extended
measurements are not applicable. In particular, we should not surprise
the guest by storing a larger irb if it did not enable extended
measurements.

Cc: qemu-stable@nongnu.org
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit f068d320de)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:30 -05:00
Peter Maydell
f784615221 target-arm: Fix errors in writes to generic timer control registers
The code for handling writes to the generic timer control registers
had several bugs:
 * ISTATUS (bit 2) is read-only but we forced it to zero on any write
 * the check for "was IMASK (bit 1) toggled?" incorrectly used '&' where
   it should be '^'
 * the handling of IMASK was inverted: we should set the IRQ if
   ISTATUS is set and IMASK is clear, not if both are set

The combination of these bugs meant that when running a Linux guest
that uses the generic timers we would fairly quickly end up either
forgetting that the timer output should be asserted, or failing to
set the IRQ when the timer was unmasked. The result is that the guest
never gets any more timer interrupts.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401803208-1281-1-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
(cherry picked from commit d3afacc726)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:30 -05:00
Richard Henderson
e34feec264 tcg-i386: Fix win64 qemu store
The first non-register argument isn't placed at offset 0.

Cc: qemu-stable@nongnu.org
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit 0b91966730)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:30 -05:00
Peter Maydell
ccb08f53d5 linux-user: Don't overrun guest buffer in sched_getaffinity
If the guest's "long" type is smaller than the host's, then
our sched_getaffinity wrapper needs to round the buffer size
up to a multiple of the host sizeof(long). This means that when
we copy the data back from the host buffer to the guest's
buffer there might be more than we can fit. Rather than
overflowing the guest's buffer, handle this case by returning
EINVAL or ignoring the unused extra space, as appropriate.

Note that only guests using the syscall interface directly might
run into this bug -- the glibc wrappers around it will always
use a buffer whose size is a multiple of 8 regardless of guest
architecture.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit be3bd286bc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:30 -05:00
Markus Armbruster
cb34d1e9e9 qemu-img: Plug memory leak in convert command
Introduced in commit 661a0f7.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit bb9cd2ee99)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Markus Armbruster
df9c108acd block/sheepdog: Plug memory leak in sd_snapshot_create()
Has always been leaky.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2df5fee2db)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Markus Armbruster
d3cd48a85f block/vvfat: Plug memory leak in read_directory()
Has always been leaky.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b122c3b6d0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Markus Armbruster
501da9369c block/vvfat: Plug memory leak in check_directory_consistency()
On error path.  Introduced in commit a046433a.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6262bbd363)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Markus Armbruster
7267e51b32 block/qapi: Plug memory leak in dump_qobject() case QTYPE_QERROR
Introduced in commit a8d8ecb.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f25391c2a6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Markus Armbruster
d1775fe94a blockdev: Plug memory leak in drive_init()
bs_opts is leaked on all paths from its qdev_new() that don't got
through blockdev_init().  Add the missing QDECREF(), and zap bs_opts
after blockdev_init(), so the new QDECREF() does nothing when we go
through blockdev_init().

Leak introduced in commit f298d07.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 3cb0e25c4b)

Conflicts:
	blockdev.c

*fixed trivial context mismatch due to blockdev_init signature change

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Markus Armbruster
d2b987479a blockdev: Plug memory leak in blockdev_init()
blockdev_init() leaks bs_opts when qemu_opts_create() fails, i.e. when
the ID is bad.  Missed in commit ec9c10d.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6376f95223)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Stefan Weil
c2fb0f2870 cputlb: Fix regression with TCG interpreter (bug 1310324)
Commit 0f842f8a24 replaced GETPC_EXT() which
was derived from GETPC() by GETRA_EXT() without fixing cputlb.c. A later
patch replaced GETRA_EXT() by GETRA() in exec/softmmu_template.h which
is included in cputlb.c.

The TCG interpreter failed because the values returned by GETRA() were no
longer explicitly set to 0. The redefinition of GETRA() introduced here
fixes this.

In addition, GETPC_ADJ which is also used in exec/softmmu_template.h is
set to 0. Both changes reduce the compiled code size for cputlb.c by more
than 100 bytes, so the normal TCG without interpreter also profits from
the reduced code size and slightly faster code.

Cc: qemu-stable@nongnu.org
Reported-by: Giovanni Mascellani <gio@debian.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7e4e88656c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Max Filippov
26b51027f9 target-xtensa: fix cross-page jumps/calls at the end of TB
Use tb->pc instead of dc->pc to check for cross-page jumps.
When TB translation stops at the page boundary dc->pc points to the next
page allowing chaining to TBs in it, which is wrong.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit 433d33c555)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Markus Armbruster
44564f8226 virtio-scsi: Plug memory leak on virtio_scsi_push_event() error path
Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 91e7fcca47)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:29 -05:00
Kevin Wolf
2f1eb049df qcow1: Stricter backing file length check
Like qcow2 since commit 6d33e8e7, error out on invalid lengths instead
of silently truncating them to 1023.

Also don't rely on bdrv_pread() catching integer overflows that make len
negative, but use unsigned variables in the first place.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit d66e5cee00)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:28 -05:00
Kevin Wolf
b53d8665a2 qcow1: Validate image size (CVE-2014-0223)
A huge image size could cause s->l1_size to overflow. Make sure that
images never require a L1 table larger than what fits in s->l1_size.

This cannot only cause unbounded allocations, but also the allocation of
a too small L1 table, resulting in out-of-bounds array accesses (both
reads and writes).

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 46485de0cb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:28 -05:00
Kevin Wolf
8b17eb6e6c qcow1: Validate L2 table size (CVE-2014-0222)
Too large L2 table sizes cause unbounded allocations. Images actually
created by qemu-img only have 512 byte or 4k L2 tables.

To keep things consistent with cluster sizes, allow ranges between 512
bytes and 64k (in fact, down to 1 entry = 8 bytes is technically
working, but L2 table sizes smaller than a cluster don't make a lot of
sense).

This also means that the number of bytes on the virtual disk that are
described by the same L2 table is limited to at most 8k * 64k or 2^29,
preventively avoiding any integer overflows.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit 42eb58179b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:28 -05:00
Kevin Wolf
e6c55cf7c2 qcow1: Check maximum cluster size
Huge values for header.cluster_bits cause unbounded allocations (e.g.
for s->cluster_cache) and crash qemu this way. Less huge values may
survive those allocations, but can cause integer overflows later on.

The only cluster sizes that qemu can create are 4k (for standalone
images) and 512 (for images with backing files), so we can limit it
to 64k.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit 7159a45b2b)

Conflicts:
	block/qcow.c
	tests/qemu-iotests/group

*removed mismatch due to error msgs from upstream's b6d5066d
*removed context from upstream block tests

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:28 -05:00
Kevin Wolf
41819e90af qcow1: Make padding in the header explicit
We were relying on all compilers inserting the same padding in the
header struct that is used for the on-disk format. Let's not do that.
Mark the struct as packed and insert an explicit padding field for
compatibility.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit ea54feff58)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:28 -05:00
Kevin Wolf
97a0e27e71 parallels: Sanity check for s->tracks (CVE-2014-0142)
This avoids a possible division by zero.

Convert s->tracks to unsigned as well because it feels better than
surviving just because the results of calculations with s->tracks are
converted to unsigned anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9302e863aa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:28 -05:00
Kevin Wolf
750336bc90 parallels: Fix catalog size integer overflow (CVE-2014-0143)
The first test case would cause a huge memory allocation, leading to a
qemu abort; the second one to a too small malloc() for the catalog
(smaller than s->catalog_size), which causes a read-only out-of-bounds
array access and on big endian hosts an endianess conversion for an
undefined memory area.

The sample image used here is not an original Parallels image. It was
created using an hexeditor on the basis of the struct that qemu uses.
Good enough for trying to crash the driver, but not for ensuring
compatibility.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit afbcc40bee)

Conflicts:
	tests/qemu-iotests/group

*fixed mismatches in group file

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:28 -05:00
Kevin Wolf
cfa8008cc0 qcow2: Check maximum L1 size in qcow2_snapshot_load_tmp() (CVE-2014-0143)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6a83f8b5be)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:31:28 -05:00
Kevin Wolf
d99c4e2d85 qcow2: Fix L1 allocation size in qcow2_snapshot_load_tmp() (CVE-2014-0145)
For the L1 table to loaded for an internal snapshot, the code allocated
only enough memory to hold the currently active L1 table. If the
snapshot's L1 table is actually larger than the current one, this leads
to a buffer overflow.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c05e4667be)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:14 -05:00
Kevin Wolf
641c3ec442 qcow2: Fix copy_sectors() with VM state
bs->total_sectors is not the highest possible sector number that could
be involved in a copy on write operation: VM state is after the end of
the virtual disk. This resulted in wrong values for the number of
sectors to be copied (n).

The code that checks for the end of the image isn't required any more
because the code hasn't been calling the block layer's bdrv_read() for a
long time; instead, it directly calls qcow2_readv(), which doesn't error
out on VM state sector numbers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6b7d4c5558)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:14 -05:00
Kevin Wolf
c2c52728f5 qcow2: Fix NULL dereference in qcow2_open() error path (CVE-2014-0146)
The qcow2 code assumes that s->snapshots is non-NULL if s->nb_snapshots
!= 0. By having the initialisation of both fields separated in
qcow2_open(), any error occuring in between would cause the error path
to dereference NULL in qcow2_free_snapshots() if the image had any
snapshots.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 11b128f406)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:14 -05:00
Kevin Wolf
759d38652a block: Limit request size (CVE-2014-0143)
Limiting the size of a single request to INT_MAX not only fixes a
direct integer overflow in bdrv_check_request() (which would only
trigger bad behaviour with ridiculously huge images, as in close to
2^64 bytes), but can also prevent overflows in all block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8f4754ede5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Stefan Hajnoczi
b6f7fbdd1d dmg: prevent chunk buffer overflow (CVE-2014-0145)
Both compressed and uncompressed I/O is buffered.  dmg_open() calculates
the maximum buffer size needed from the metadata in the image file.

There is currently a buffer overflow since ->lengths[] is accounted
against the maximum compressed buffer size but actually uses the
uncompressed buffer:

  switch (s->types[chunk]) {
  case 1: /* copy */
      ret = bdrv_pread(bs->file, s->offsets[chunk],
                       s->uncompressed_chunk, s->lengths[chunk]);

We must account against the maximum uncompressed buffer size for type=1
chunks.

This patch fixes the maximum buffer size calculation to take into
account the chunk type.  It is critical that we update the correct
maximum since there are two buffers ->compressed_chunk and
->uncompressed_chunk.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f0dce23475)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Stefan Hajnoczi
d400b5dc4a dmg: use uint64_t consistently for sectors and lengths
The DMG metadata is stored as uint64_t, so use the same type for
sector_num.  int was a particularly poor choice since it is only 32-bit
and would truncate large values.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 686d7148ec)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Stefan Hajnoczi
758c4840c6 dmg: sanitize chunk length and sectorcount (CVE-2014-0145)
Chunk length and sectorcount are used for decompression buffers as well
as the bdrv_pread() count argument.  Ensure that they have reasonable
values so neither memory allocation nor conversion from uint64_t to int
will cause problems.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c165f77580)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Stefan Hajnoczi
4b50bd7357 dmg: use appropriate types when reading chunks
Use the right types instead of signed int:

  size_t new_size;

  This is a byte count for g_realloc() that is calculated from uint32_t
  and size_t values.

  uint32_t chunk_count;

  Use the same type as s->n_chunks, which is used together with
  chunk_count.

This patch is a cleanup and does not fix bugs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit eb71803b04)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Stefan Hajnoczi
4ee5b9c8cb dmg: drop broken bdrv_pread() loop
It is not necessary to check errno for EINTR and the block layer does
not produce short reads.  Therefore we can drop the loop that attempts
to read a compressed chunk.

The loop is buggy because it incorrectly adds the transferred bytes
twice:

  do {
      ret = bdrv_pread(...);
      i += ret;
  } while (ret >= 0 && ret + i < s->lengths[chunk]);

Luckily we can drop the loop completely and perform a single
bdrv_pread().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit b404bf8542)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Stefan Hajnoczi
ad08cae75c dmg: prevent out-of-bounds array access on terminator
When a terminator is reached the base for offsets and sectors is stored.
The following records that are processed will use this base value.

If the first record we encounter is a terminator, then calculating the
base values would result in out-of-bounds array accesses.  Don't do
that.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 73ed27ec28)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Stefan Hajnoczi
dedf4a5f79 dmg: coding style and indentation cleanup
Clean up the mix of tabs and spaces, as well as the coding style
violations in block/dmg.c.  There are no semantic changes since this
patch simply reformats the code.

This patch is necessary before we can make meaningful changes to this
file, due to the inconsistent formatting and confusing indentation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2c1885adcf)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Kevin Wolf
3c6347ce8c qcow2: Fix new L1 table size check (CVE-2014-0143)
The size in bytes is assigned to an int later, so check that instead of
the number of entries.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit cab60de930)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Kevin Wolf
e1c8770f56 qcow2: Protect against some integer overflows in bdrv_check
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 0abe740f1d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Kevin Wolf
c874837475 qcow2: Fix types in qcow2_alloc_clusters and alloc_clusters_noref
In order to avoid integer overflows.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit bb572aefbd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Kevin Wolf
610ab7bd3d qcow2: Check new refcount table size on growth
If the size becomes larger than what qcow2_open() would accept, fail the
growing operation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2b5d5953ee)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:13 -05:00
Kevin Wolf
7a6088c870 qcow2: Avoid integer overflow in get_refcount (CVE-2014-0143)
This ensures that the checks catch all invalid cluster indexes
instead of returning the refcount of a wrong cluster.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit db8a31d11d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Kevin Wolf
ffa3ab0217 qcow2: Don't rely on free_cluster_index in alloc_refcount_block() (CVE-2014-0147)
free_cluster_index is only correct if update_refcount() was called from
an allocation function, and even there it's brittle because it's used to
protect unfinished allocations which still have a refcount of 0 - if it
moves in the wrong place, the unfinished allocation can be corrupted.

So not using it any more seems to be a good idea. Instead, use the
first requested cluster to do the calculations. Return -EAGAIN if
unfinished allocations could become invalid and let the caller restart
its search for some free clusters.

The context of creating a snapsnot is one situation where
update_refcount() is called outside of a cluster allocation. For this
case, the change fixes a buffer overflow if a cluster is referenced in
an L2 table that cannot be represented by an existing refcount block.
(new_table[refcount_table_index] was out of bounds)

[Bump the qemu-iotests 026 refblock_alloc.write leak count from 10 to
11.
--Stefan]

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit b106ad9185)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Kevin Wolf
aeba41549d qcow2: Zero-initialise first cluster for new images
Strictly speaking, this is only required for has_zero_init() == false,
but it's easy enough to just do a cluster-aligned write that is padded
with zeros after the header.

This fixes that after 'qemu-img create' header extensions are attempted
to be parsed that are really just random leftover data.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f8413b3c23)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Hu Tao
2f59c95f16 qcow2: fix offset overflow in qcow2_alloc_clusters_at()
When cluster size is big enough it can lead to an offset overflow
in qcow2_alloc_clusters_at(). This patch fixes it.

The allocation is stopped each time at L2 table boundary
(see handle_alloc()), so the possible maximum bytes could be

  2^(cluster_bits - 3 + cluster_bits)

cluster_bits - 3 is used to compute the number of entry by L2
and the additional cluster_bits is to take into account each
clusters referenced by the L2 entries.

so int is safe for cluster_bits<=17, unsafe otherwise.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 33304ec9fa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Kevin Wolf
5ba151f4dc qcow2: Fix backing file name length check
len could become negative and would pass the check then. Nothing bad
happened because bdrv_pread() happens to return an error for negative
length values, but make variables for sizes unsigned anyway.

This patch also changes the behaviour to error out on invalid lengths
instead of silently truncating it to 1023.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6d33e8e7dc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Kevin Wolf
cd598d4161 qcow2: Validate active L1 table offset and size (CVE-2014-0144)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2d51c32c4b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Kevin Wolf
04bc6981ca qcow2: Validate snapshot table offset/size (CVE-2014-0144)
This avoid unbounded memory allocation and fixes a potential buffer
overflow on 32 bit hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ce48f2f441)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Kevin Wolf
818ce8487e qcow2: Validate refcount table offset
The end of the refcount table must not exceed INT64_MAX so that integer
overflows are avoided.

Also check for misaligned refcount table. Such images are invalid and
probably the result of data corruption. Error out to avoid further
corruption.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8c7de28305)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Kevin Wolf
f6027f805b qcow2: Check refcount table size (CVE-2014-0144)
Limit the in-memory reference count table size to 8 MB, it's enough in
practice. This fixes an unbounded allocation as well as a buffer
overflow in qcow2_refcount_init().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5dab2faddc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Kevin Wolf
6f6db0c7af qcow2: Check backing_file_offset (CVE-2014-0144)
Header, header extension and the backing file name must all be stored in
the first cluster. Setting the backing file to a much higher value
allowed header extensions to become much bigger than we want them to be
(unbounded allocation).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a1b3955c94)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Kevin Wolf
665f3ad58f qcow2: Check header_length (CVE-2014-0144)
This fixes an unbounded allocation for s->unknown_header_fields.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 24342f2cae)

Conflicts:
	tests/qemu-iotests/group

*fixed context mismatches in group file

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:12 -05:00
Fam Zheng
4854971ac1 curl: check data size before memcpy to local buffer. (CVE-2014-0144)
curl_read_cb is callback function for libcurl when data arrives. The
data size passed in here is not guaranteed to be within the range of
request we submitted, so we may overflow the guest IO buffer. Check the
real size we have before memcpy to buffer to avoid overflow.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6d4b9e55fc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Jeff Cody
1786c4225d vhdx: Bounds checking for block_size and logical_sector_size (CVE-2014-0148)
Other variables (e.g. sectors_per_block) are calculated using these
variables, and if not range-checked illegal values could be obtained
causing infinite loops and other potential issues when calculating
BAT entries.

The 1.00 VHDX spec requires BlockSize to be min 1MB, max 256MB.
LogicalSectorSize is required to be either 512 or 4096 bytes.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 1d7678dec4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Jeff Cody
37173f54b7 vdi: add bounds checks for blocks_in_image and disk_size header fields (CVE-2014-0144)
The maximum blocks_in_image is 0xffffffff / 4, which also limits the
maximum disk_size for a VDI image to 1024TB.  Note that this is the maximum
size that QEMU will currently support with this driver, not necessarily the
maximum size allowed by the image format.

This also fixes an incorrect error message, a bug introduced by commit
5b7aa9b56d (Reported by Stefan Weil)

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 63fa06dc97)

Conflicts:
	block/vdi.c

*modified to retain 1.7's usage of logout() over error_setg()

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Kevin Wolf
76d1eddbe5 vpc: Validate block size (CVE-2014-0142)
This fixes some cases of division by zero crashes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5e71dfad76)

Conflicts:
	tests/qemu-iotests/group

*fixed context mismatches in group file

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Jeff Cody
b2390c7008 vpc/vhd: add bounds check for max_table_entries and block_size (CVE-2014-0144)
This adds checks to make sure that max_table_entries and block_size
are in sane ranges.  Memory is allocated based on max_table_entries,
and block_size is used to calculate indices into that allocated
memory, so if these values are incorrect that can lead to potential
unbounded memory allocation, or invalid memory accesses.

Also, the allocation of the pagetable is changed from g_malloc0()
to qemu_blockalign().

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 97f1c45c6f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Kevin Wolf
6ee0d5fdc7 bochs: Fix bitmap offset calculation
32 bit truncation could let us access the wrong offset in the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a9ba36a45d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Kevin Wolf
b0a7517c24 bochs: Check extent_size header field (CVE-2014-0142)
This fixes two possible division by zero crashes: In bochs_open() and in
seek_to_sector().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8e53abbc20)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Kevin Wolf
6b94cfeca8 bochs: Check catalog_size header field (CVE-2014-0143)
It should neither become negative nor allow unbounded memory
allocations. This fixes aborts in g_malloc() and an s->catalog_bitmap
buffer overflow on big endian hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e3737b820b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Kevin Wolf
0e748624bd bochs: Use unsigned variables for offsets and sizes (CVE-2014-0147)
Gets us rid of integer overflows resulting in negative sizes which
aren't correctly checked.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 246f65838d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Kevin Wolf
bb8b201815 bochs: Unify header structs and make them QEMU_PACKED
This is an on-disk structure, so offsets must be accurate.

Before this patch, sizeof(bochs) != sizeof(header_v1), which makes the
memcpy() between both invalid. We're lucky enough that the destination
buffer happened to be the larger one, and the memcpy size to be taken
from the smaller one, so we didn't get a buffer overflow in practice.

This patch unifies the both structures, eliminating the need to do a
memcpy in the first place. The common fields are extracted to the top
level of the struct and the actually differing part gets a union of the
two versions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3dd8a6763b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Kevin Wolf
ae9b5df877 qemu-iotests: Support for bochs format
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 24f3078a04)

Conflicts:
	tests/qemu-iotests/group

*fix context mismatches in group file

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:11 -05:00
Stefan Hajnoczi
dbd3e4a75c block/cloop: fix offsets[] size off-by-one
cloop stores the number of compressed blocks in the n_blocks header
field.  The file actually contains n_blocks + 1 offsets, where the extra
offset is the end-of-file offset.

The following line in cloop_read_block() results in an out-of-bounds
offsets[] access:

    uint32_t bytes = s->offsets[block_num + 1] - s->offsets[block_num];

This patch allocates and loads the extra offset so that
cloop_read_block() works correctly when the last block is accessed.

Notice that we must free s->offsets[] unconditionally now since there is
always an end-of-file offset.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 42d43d35d9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:10 -05:00
Stefan Hajnoczi
0fda3e2d63 block/cloop: refuse images with bogus offsets (CVE-2014-0144)
The offsets[] array allows efficient seeking and tells us the maximum
compressed data size.  If the offsets are bogus the maximum compressed
data size will be unrealistic.

This could cause g_malloc() to abort and bogus offsets mean the image is
broken anyway.  Therefore we should refuse such images.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f56b9bc3ae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:10 -05:00
Stefan Hajnoczi
7dcffbb2bf block/cloop: refuse images with huge offsets arrays (CVE-2014-0144)
Limit offsets_size to 512 MB so that:

1. g_malloc() does not abort due to an unreasonable size argument.

2. offsets_size does not overflow the bdrv_pread() int size argument.

This limit imposes a maximum image size of 16 TB at 256 KB block size.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7b103b36d6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:10 -05:00
Stefan Hajnoczi
d723971b5d block/cloop: prevent offsets_size integer overflow (CVE-2014-0143)
The following integer overflow in offsets_size can lead to out-of-bounds
memory stores when n_blocks has a huge value:

    uint32_t n_blocks, offsets_size;
    [...]
    ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
    [...]
    s->n_blocks = be32_to_cpu(s->n_blocks);

    /* read offsets */
    offsets_size = s->n_blocks * sizeof(uint64_t);
    s->offsets = g_malloc(offsets_size);

    [...]

    for(i=0;i<s->n_blocks;i++) {
        s->offsets[i] = be64_to_cpu(s->offsets[i]);

offsets_size can be smaller than n_blocks due to integer overflow.
Therefore s->offsets[] is too small when the for loop byteswaps offsets.

This patch refuses to open files if offsets_size would overflow.

Note that changing the type of offsets_size is not a fix since 32-bit
hosts still only have 32-bit size_t.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 509a41bab5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:10 -05:00
Stefan Hajnoczi
1f6bda9301 block/cloop: validate block_size header field (CVE-2014-0144)
Avoid unbounded s->uncompressed_block memory allocation by checking that
the block_size header field has a reasonable value.  Also enforce the
assumption that the value is a non-zero multiple of 512.

These constraints conform to cloop 2.639's code so we accept existing
image files.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d65f97a82c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:10 -05:00
Stefan Hajnoczi
46c5cacbb4 qemu-iotests: add cloop input validation tests
Add a cloop format-specific test case.  Later patches add tests for
input validation to the script.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 05560fcebb)

Conflicts:
	tests/qemu-iotests/group

*fixed context mismatches in group file

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:10 -05:00
Stefan Hajnoczi
95139b786a qemu-iotests: add ./check -cloop support
Add the cloop block driver to qemu-iotests.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 47f73da0a7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-03 16:18:03 -05:00
Peter Lieven
69b7aacc01 migration: catch unknown flags in ram_load
if a saved vm has unknown flags in the memory data qemu
currently simply ignores this flag and continues which
yields in an unpredictable result.

This patch catches all unknown flags and aborts the
loading of the vm. Additionally error reports are thrown
if the migration aborts abnormally.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit db80facefa)

Conflicts:
	arch_init.c

*removed unecessary context from 4798fe55

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-01 11:11:36 -05:00
ChenLiang
3102b1a221 migration: remove duplicate code
version_id is checked twice in the ram_load.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 21a246a43b)

*prereq for db80fac backport
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-01 11:07:12 -05:00
Michael S. Tsirkin
84321ba2b6 virtio: allow mapping up to max queue size
It's a loop from i < num_sg  and the array is VIRTQUEUE_MAX_SIZE - so
it's OK if the value read is VIRTQUEUE_MAX_SIZE.

Not a big problem in practice as people don't use
such big queues, but it's inelegant.

Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9372514080)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-01 11:03:24 -05:00
Michael S. Tsirkin
9fbc298a47 pci-assign: limit # of msix vectors
KVM only supports MSIX table size up to 256 vectors,
but some assigned devices support more vectors,
at the moment attempts to assign them fail with EINVAL.

Tweak the MSIX capability exposed to guest to limit table size
to a supported value.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 639973a474)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-07-01 10:57:03 -05:00
Alexey Kardashevskiy
74dd27cecb spapr_pci: Fix number of returned vectors in ibm, change-msi
Current guest kernels try allocating as many vectors as the quota is.
For example, in the case of virtio-net (which has just 3 vectors)
the guest requests 4 vectors (that is the quota in the test) and
the existing ibm,change-msi handler returns 4. But before it returns,
it calls msix_set_message() in a loop and corrupts memory behind
the end of msix_table.

This limits the number of vectors returned by ibm,change-msi to
the maximum supported by the actual device.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: qemu-stable@nongnu.org
[agraf: squash in bugfix from aik]
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit b26696b519)

*s/error_report/fprintf/ to reflect v1.7.x error reporting style

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-29 15:58:43 -05:00
Peter Maydell
b6760b6203 linux-user/elfload.c: Fix A64 code which was incorrectly acting like A32
The ARM target-specific code in elfload.c was incorrectly allowing
the 64-bit ARM target to use most of the existing 32-bit definitions:
most noticably this meant that our HWCAP bits passed to the guest
were wrong, and register handling when dumping core was totally
broken. Fix this by properly separating the 64 and 32 bit code,
since they have more differences than similarities.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit 24e76ff06b)

Conflicts:
	linux-user/elfload.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 15:43:30 -05:00
Peter Maydell
64b210d4d5 linux-user/elfload.c: Update ARM HWCAP bits
The kernel has added support for a number of new ARM HWCAP bits;
add them to QEMU, including support for setting them where we have
a corresponding CPU feature bit.

We were also incorrectly setting the VFPv3D16 HWCAP -- this means
"only 16 D registers", not "supports 16-bit floating point format";
since QEMU always has 32 D registers for VFPv3, we can just remove
the line that incorrectly set this bit.

The kernel does not set the HWCAP_FPA even if it is providing FPA
emulation via nwfpe, so don't set this bit in QEMU either.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit 2468265465)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 15:42:08 -05:00
Peter Maydell
f6de3526a0 linux-user/elfload.c: Fix incorrect ARM HWCAP bits
The ELF HWCAP bits for ARM features THUMBEE, NEON, VFPv3 and VFPv3D16 are
all off by one compared to the kernel definitions. Fix this discrepancy
and add in the missing CRUNCH bit which was the cause of the off-by-one
error. (We don't emulate any of the CPUs which have that weird hardware,
so it's otherwise uninteresting to us.)

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit 43ce393ee5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 15:41:56 -05:00
Edgar E. Iglesias
7c56952183 target-arm: Make vbar_write 64bit friendly on 32bit hosts
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1398926097-28097-2-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit fed3ffb9f1)

Conflicts:
	target-arm/helper.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 15:39:34 -05:00
Paolo Bonzini
3c1162e471 target-i386: fix set of registers zeroed on reset
BND0-3, BNDCFGU, BNDCFGS, BNDSTATUS were not zeroed on reset, but they
should be (Intel Instruction Set Extensions Programming Reference
319433-015, pages 9-4 and 9-6).  Same for YMM.

XCR0 should be reset to 1.

TSC and TSC_RESET were zeroed already by the memset, remove the explicit
assignments.

Cc: Andreas Faerber <afaerber@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 05e7e819d7)

Conflicts:
	target-i386/cpu.c
	target-i386/cpu.h

*removed dependency on 79e9ebeb

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:57:46 -05:00
Michael S. Tsirkin
73d8965bcc stellaris_enet: block migration
Incoming migration with stellaris_enet is unsafe.
It's being reworked, but for now, simply block it
since noone is using it anyway.
Block outgoing migration for good measure.

CVE-2013-4532

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:36:01 -05:00
Michael S. Tsirkin
2003205fd2 virtio: validate config_len on load
Malformed input can have config_len in migration stream
exceed the array size allocated on destination, the
result will be heap overflow.

To fix, that config_len matches on both sides.

CVE-2014-0182

Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

--

v2: use %ix and %zx to print config_len values
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit a890a2f913)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:31:48 -05:00
Peter Maydell
7abee6c988 savevm: Ignore minimum_version_id_old if there is no load_state_old
At the moment we require vmstate definitions to set minimum_version_id_old
to the same value as minimum_version_id if they do not provide a
load_state_old handler. Since the load_state_old functionality is
required only for a handful of devices that need to retain migration
compatibility with a pre-vmstate implementation, this means the bulk
of devices have pointless boilerplate. Relax the definition so that
minimum_version_id_old is ignored if there is no load_state_old handler.

Note that under the old scheme we would segfault if the vmstate
specified a minimum_version_id_old that was less than minimum_version_id
but did not provide a load_state_old function, and the incoming state
specified a version number between minimum_version_id_old and
minimum_version_id. Under the new scheme this will just result in
our failing the migration.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 767adce2d9)

Conflicts:
	vmstate.c

*removed dependency on b6fcfa59 (Move VMState code to vmstate.c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:26:28 -05:00
Michael S. Tsirkin
c4bd2e4cb0 usb: sanity check setup_index+setup_len in post_load
CVE-2013-4541

s->setup_len and s->setup_index are fed into usb_packet_copy as
size/offset into s->data_buf, it's possible for invalid state to exploit
this to load arbitrary data.

setup_len and setup_index should be checked to make sure
they are not negative.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 9f8e9895c5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:22:06 -05:00
Michael S. Tsirkin
0776525e77 vmstate: s/VMSTATE_INT32_LE/VMSTATE_INT32_POSITIVE_LE/
As the macro verifies the value is positive, rename it
to make the function clearer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3476436a44)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:21:46 -05:00
Michael S. Tsirkin
a7fcb4c5e0 virtio-scsi: fix buffer overrun on invalid state load
CVE-2013-4542

hw/scsi/scsi-bus.c invokes load_request.

 virtio_scsi_load_request does:
    qemu_get_buffer(f, (unsigned char *)&req->elem, sizeof(req->elem));

this probably can make elem invalid, for example,
make in_num or out_num huge, then:

    virtio_scsi_parse_req(s, vs->cmd_vqs[n], req);

will do:

    if (req->elem.out_num > 1) {
        qemu_sgl_init_external(req, &req->elem.out_sg[1],
                               &req->elem.out_addr[1],
                               req->elem.out_num - 1);
    } else {
        qemu_sgl_init_external(req, &req->elem.in_sg[1],
                               &req->elem.in_addr[1],
                               req->elem.in_num - 1);
    }

and this will access out of array bounds.

Note: this adds security checks within assert calls since
SCSIBusInfo's load_request cannot fail.
For now simply disable builds with NDEBUG - there seems
to be little value in supporting these.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3c3ce98142)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:21:30 -05:00
Michael S. Tsirkin
8d948a000d zaurus: fix buffer overrun on invalid state load
CVE-2013-4540

Within scoop_gpio_handler_update, if prev_level has a high bit set, then
we get bit > 16 and that causes a buffer overrun.

Since prev_level comes from wire indirectly, this can
happen on invalid state load.

Similarly for gpio_level and gpio_dir.

To fix, limit to 16 bit.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 52f91c3723)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:21:17 -05:00
Michael S. Tsirkin
c75e43b871 tsc210x: fix buffer overrun on invalid state load
CVE-2013-4539

s->precision, nextprecision, function and nextfunction
come from wire and are used
as idx into resolution[] in TSC_CUT_RESOLUTION.

Validate after load to avoid buffer overrun.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5193be3be3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:21:02 -05:00
Michael S. Tsirkin
af443645c3 ssd0323: fix buffer overun on invalid state load
CVE-2013-4538

s->cmd_len used as index in ssd0323_transfer() to store 32-bit field.
Possible this field might then be supplied by guest to overwrite a
return addr somewhere. Same for row/col fields, which are indicies into
framebuffer array.

To fix validate after load.

Additionally, validate that the row/col_start/end are within bounds;
otherwise the guest can provoke an overrun by either setting the _end
field so large that the row++ increments just walk off the end of the
array, or by setting the _start value to something bogus and then
letting the "we hit end of row" logic reset row to row_start.

For completeness, validate mode as well.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ead7a57df3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:20:52 -05:00
Michael S. Tsirkin
45edb0ca7a ssi-sd: fix buffer overrun on invalid state load
CVE-2013-4537

s->arglen is taken from wire and used as idx
in ssi_sd_transfer().

Validate it before access.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit a9c380db3b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:19:48 -05:00
Michael S. Tsirkin
d92a7683e8 pxa2xx: avoid buffer overrun on incoming migration
CVE-2013-4533

s->rx_level is read from the wire and used to determine how many bytes
to subsequently read into s->rx_fifo[]. If s->rx_level exceeds the
length of s->rx_fifo[] the buffer can be overrun with arbitrary data
from the wire.

Fix this by validating rx_level against the size of s->rx_fifo.

Cc: Don Koch <dkoch@verizon.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit caa881abe0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:19:05 -05:00
Michael S. Tsirkin
68801b7be1 virtio: validate num_sg when mapping
CVE-2013-4535
CVE-2013-4536

Both virtio-block and virtio-serial read,
VirtQueueElements are read in as buffers, and passed to
virtqueue_map_sg(), where num_sg is taken from the wire and can force
writes to indicies beyond VIRTQUEUE_MAX_SIZE.

To fix, validate num_sg.

Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 36cf2a3713)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:18:51 -05:00
Michael Roth
609f5bf6fe openpic: avoid buffer overrun on incoming migration
CVE-2013-4534

opp->nb_cpus is read from the wire and used to determine how many
IRQDest elements to read into opp->dst[]. If the value exceeds the
length of opp->dst[], MAX_CPU, opp->dst[] can be overrun with arbitrary
data from the wire.

Fix this by failing migration if the value read from the wire exceeds
MAX_CPU.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 73d963c0a7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:18:27 -05:00
Michael Roth
8f0e369a52 virtio: avoid buffer overrun on incoming migration
CVE-2013-6399

vdev->queue_sel is read from the wire, and later used in the
emulation code as an index into vdev->vq[]. If the value of
vdev->queue_sel exceeds the length of vdev->vq[], currently
allocated to be VIRTIO_PCI_QUEUE_MAX elements, subsequent PIO
operations such as VIRTIO_PCI_QUEUE_PFN can be used to overrun
the buffer with arbitrary data originating from the source.

Fix this by failing migration if the value from the wire exceeds
VIRTIO_PCI_QUEUE_MAX.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 4b53c2c72c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:16:13 -05:00
Michael S. Tsirkin
630ebeffb4 vmstate: fix buffer overflow in target-arm/machine.c
CVE-2013-4531

cpreg_vmstate_indexes is a VARRAY_INT32. A negative value for
cpreg_vmstate_array_len will cause a buffer overflow.

VMSTATE_INT32_LE was supposed to protect against this
but doesn't because it doesn't validate that input is
non-negative.

Fix this macro to valide the value appropriately.

The only other user of VMSTATE_INT32_LE doesn't
ever use negative numbers so it doesn't care.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit d2ef4b61fe)

Conflicts:
	vmstate.c

*removed dependency on b6fcfa59 (Move VMState code to vmstate.c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:15:02 -05:00
Dr. David Alan Gilbert
a2b4e846b3 Fix vmstate_info_int32_le comparison/assign
Fix comparison of vmstate_info_int32_le so that it succeeds if loaded
value is (l)ess than or (e)qual

When the comparison succeeds, assign the value loaded
  This is a change in behaviour but I think the original intent, since
  the idea is to check if the version/size of the thing you're loading is
  less than some limit, but you might well want to do something based on
  the actual version/size in the file

Fix up comment and name text

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 24a370ef23)

Conflicts:
	vmstate.c

*removed dependency on b6fcfa59 (Move VMState code to vmstate.c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:12:33 -05:00
Michael S. Tsirkin
f217f379a8 pl022: fix buffer overun on invalid state load
CVE-2013-4530

pl022.c did not bounds check tx_fifo_head and
rx_fifo_head after loading them from file and
before they are used to dereference array.

Reported-by: Michael S. Tsirkin <mst@redhat.com
Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit d8d0a0bc7e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:02:16 -05:00
Michael S. Tsirkin
e83444f71e hw/pci/pcie_aer.c: fix buffer overruns on invalid state load
4) CVE-2013-4529
hw/pci/pcie_aer.c    pcie aer log can overrun the buffer if log_num is
                     too large

There are two issues in this file:
1. log_max from remote can be larger than on local
then buffer will overrun with data coming from state file.
2. log_num can be larger then we get data corruption
again with an overflow but not adversary controlled.

Fix both issues.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5f691ff91d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:01:51 -05:00
Michael S. Tsirkin
d8aba740f2 hpet: fix buffer overrun on invalid state load
CVE-2013-4527 hw/timer/hpet.c buffer overrun

hpet is a VARRAY with a uint8 size but static array of 32

To fix, make sure num_timers is valid using VMSTATE_VALID hook.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3f1c49e213)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:01:11 -05:00
Michael S. Tsirkin
d34e6f7960 ahci: fix buffer overrun on invalid state load
CVE-2013-4526

Within hw/ide/ahci.c, VARRAY refers to ports which is also loaded.  So
we use the old version of ports to read the array but then allow any
value for ports.  This can cause the code to overflow.

There's no reason to migrate ports - it never changes.
So just make sure it matches.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ae2158ad6c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:00:54 -05:00
Michael S. Tsirkin
5544b7e419 virtio: out-of-bounds buffer write on invalid state load
CVE-2013-4151 QEMU 1.0 out-of-bounds buffer write in
virtio_load@hw/virtio/virtio.c

So we have this code since way back when:

    num = qemu_get_be32(f);

    for (i = 0; i < num; i++) {
        vdev->vq[i].vring.num = qemu_get_be32(f);

array of vqs has size VIRTIO_PCI_QUEUE_MAX, so
on invalid input this will write beyond end of buffer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit cc45995294)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 14:00:35 -05:00
Michael S. Tsirkin
7b6444a2e4 virtio-net: out-of-bounds buffer write on load
CVE-2013-4149 QEMU 1.3.0 out-of-bounds buffer write in
virtio_net_load()@hw/net/virtio-net.c

>         } else if (n->mac_table.in_use) {
>             uint8_t *buf = g_malloc0(n->mac_table.in_use);

We are allocating buffer of size n->mac_table.in_use

>             qemu_get_buffer(f, buf, n->mac_table.in_use * ETH_ALEN);

and read to the n->mac_table.in_use size buffer n->mac_table.in_use *
ETH_ALEN bytes, corrupting memory.

If adversary controls state then memory written there is controlled
by adversary.

Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 98f93ddd84)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 13:59:56 -05:00
Michael S. Tsirkin
2b15f410bd virtio-net: out-of-bounds buffer write on invalid state load
CVE-2013-4150 QEMU 1.5.0 out-of-bounds buffer write in
virtio_net_load()@hw/net/virtio-net.c

This code is in hw/net/virtio-net.c:

    if (n->max_queues > 1) {
        if (n->max_queues != qemu_get_be16(f)) {
            error_report("virtio-net: different max_queues ");
            return -1;
        }

        n->curr_queues = qemu_get_be16(f);
        for (i = 1; i < n->curr_queues; i++) {
            n->vqs[i].tx_waiting = qemu_get_be32(f);
        }
    }

Number of vqs is max_queues, so if we get invalid input here,
for example if max_queues = 2, curr_queues = 3, we get
write beyond end of the buffer, with data that comes from
wire.

This might be used to corrupt qemu memory in hard to predict ways.
Since we have lots of function pointers around, RCE might be possible.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit eea750a562)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 13:59:50 -05:00
Michael S. Tsirkin
95f118fa82 virtio-net: fix buffer overflow on invalid state load
CVE-2013-4148 QEMU 1.0 integer conversion in
virtio_net_load()@hw/net/virtio-net.c

Deals with loading a corrupted savevm image.

>         n->mac_table.in_use = qemu_get_be32(f);

in_use is int so it can get negative when assigned 32bit unsigned value.

>         /* MAC_TABLE_ENTRIES may be different from the saved image */
>         if (n->mac_table.in_use <= MAC_TABLE_ENTRIES) {

passing this check ^^^

>             qemu_get_buffer(f, n->mac_table.macs,
>                             n->mac_table.in_use * ETH_ALEN);

with good in_use value, "n->mac_table.in_use * ETH_ALEN" can get
positive and bigger than mac_table.macs. For example 0x81000000
satisfies this condition when ETH_ALEN is 6.

Fix it by making the value unsigned.
For consistency, change first_multi as well.

Note: all call sites were audited to confirm that
making them unsigned didn't cause any issues:
it turns out we actually never do math on them,
so it's easy to validate because both values are
always <= MAC_TABLE_ENTRIES.

Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 71f7fe48e1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 13:57:15 -05:00
Michael S. Tsirkin
29e2bbef19 vmstate: add VMSTATE_VALIDATE
Validate state using VMS_ARRAY with num = 0 and VMS_MUST_EXIST

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 4082f0889b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 13:55:15 -05:00
Michael S. Tsirkin
a075a3a27e vmstate: add VMS_MUST_EXIST
Can be used to verify a required field exists or validate
state in some other way.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5bf81c8d63)

Conflicts:
	vmstate.c

*removed dependency on b6fcfa59 (Move VMState code to vmstate.c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 13:53:45 -05:00
Michael S. Tsirkin
25062a7521 vmstate: reduce code duplication
move size offset and number of elements math out
to functions, to reduce code duplication.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 35fc1f7189)

Conflicts:
	vmstate.c

*removed dependency on b6fcfa59 (Move VMState code to vmstate.c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-26 13:48:59 -05:00
Dmitry Fleytman
f93614c936 vmxnet3: validate queues configuration read on migration
CVE-2013-4544

Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1396604722-11902-5-git-send-email-dmitry@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit f12d048a52)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 17:13:49 -05:00
Dmitry Fleytman
709cc04345 vmxnet3: validate interrupt indices read on migration
CVE-2013-4544

Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1396604722-11902-4-git-send-email-dmitry@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 3c99afc779)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 17:13:33 -05:00
Dmitry Fleytman
ed995c6c2f vmxnet3: validate queues configuration coming from guest
CVE-2013-4544

Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1396604722-11902-3-git-send-email-dmitry@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 9878d173f5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 17:13:22 -05:00
Dmitry Fleytman
6bbbb937aa vmxnet3: validate interrupt indices coming from guest
CVE-2013-4544

Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1396604722-11902-2-git-send-email-dmitry@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 8c6c047899)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 17:13:08 -05:00
Michael S. Tsirkin
636fa8aec3 acpi: fix tables for no-hpet configuration
acpi build tried to add offset of hpet table to rsdt even when hpet was
disabled.  If no tables follow hpet, this could lead to a malformed
rsdt.

Fix it up.

To avoid such errors in the future, rearrange code slightly to make it
clear that acpi_add_table stores the offset of the following table - not
of the previous one.

Reported-by: TeLeMan <geleman@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 9ac1c4c07e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 17:10:19 -05:00
Michael Tokarev
1a6ea31052 po/Makefile: fix $SRC_PATH reference
The rule for messages.po appears to be slightly wrong.
Move the `cd' command within parens.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Stefan Weil <sw@weilnetz.de>
(cherry picked from commit b920cad669)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 17:09:33 -05:00
David Hildenbrand
012d778c07 s390x: empty function stubs in preparation for __KVM_HAVE_GUEST_DEBUG
This patch creates empty function stubs (used by the gdbserver) in preparation
for the hw debugging support by kvm on s390, which will enable the
__KVM_HAVE_GUEST_DEBUG define in the linux headers and require these methods on
the qemu side.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 8c0124490b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 16:43:05 -05:00
Thomas Huth
dd8f80b83c s390x/helper: Added format control bit to MMU translation
With the EDAT-1 facility, the MMU translation can stop at the
segment table already, pointing to a 1 MB block. And while we're
at it, move the page table entry handling to a separate function,
too, as suggested by Alexander Graf.

Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit c4400206d4)

Conflicts:
	target-s390x/helper.c

*removed unecessary context

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 16:39:52 -05:00
Kevin Wolf
b1a86eb532 block: Use BDRV_O_NO_BACKING where appropriate
If you open an image temporarily just because you want to check its size
or get it flushed, there's no real reason to open the whole backing file
chain.

This is a backport of c9fbb99d41 to
qemu 1.7.1.

The backport was done to fix a bug where QEMU 1.7.1 would crash or freeze
when the user take around 80 consecutives snapshots in a row.

git bisect would lead to commit: ba2ab2f2ca
and it was clear that BDRV_NO_BACKING was missing.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 16:33:46 -05:00
Benoît Canet
792a40384f block: Prevent coroutine stack overflow when recursing in bdrv_open_backing_file.
In 1.7.1 qcow2_create2 reopen the file for flushing without the BDRV_O_NO_BACKING
flags.

As a consequence the code would recursively open the whole backing chain.

These three stack arrays would pile up through the recursion and lead to a coroutine
stack overflow.

Convert these array to malloced buffers in order to streamline the coroutine
footprint.

Symptoms where freezes or segfaults on production machines while taking QMP externals
snapshots. The overflow disturbed coroutine switching.

Signed-off-by: Benoit Canet <benoit.canet@gmail.com>

*note: backport of upstream's 1ba4b6a

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 16:28:44 -05:00
Peter Crosthwaite
0655eeed18 arm: translate.c: Fix smlald Instruction
The smlald (and probably smlsld) instruction was doing incorrect sign
extensions of the operands amongst 64bit result calculation. The
instruction psuedo-code is:

 operand2 = if m_swap then ROR(R[m],16) else R[m];
 product1 = SInt(R[n]<15:0>) * SInt(operand2<15:0>);
 product2 = SInt(R[n]<31:16>) * SInt(operand2<31:16>);
 result = product1 + product2 + SInt(R[dHi]:R[dLo]);
 R[dHi] = result<63:32>;
 R[dLo] = result<31:0>;

The result calculation should be done in 64 bit arithmetic, and hence
product1 and product2 should be sign extended to 64b before calculation.

The current implementation was adding product1 and product2 together
then sign-extending the intermediate result leading to false negatives.

E.G. if product1 = product2 = 0x4000000, their sum = 0x80000000, which
will be incorrectly interpreted as -ve on sign extension.

We fix by doing the 64b extensions on both product1 and product2 before
any addition/subtraction happens.

We also fix where we were possibly incorrectly setting the Q saturation
flag for SMLSLD, which the ARM ARM specifically says is not set.

Reported-by: Christina Smith <christina.smith@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 2cddb6f5a15be4ab8d2160f3499d128ae93d304d.1397704570.git.peter.crosthwaite@xilinx.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 33bbd75a7c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 16:08:05 -05:00
Hannes Reinecke
5cfd43b79d megasas: Implement LD_LIST_QUERY
Newer firmware implement a LD_LIST_QUERY command, and due to a driver
issue no drives might be detected if this command isn't supported.
So add emulation for this command, too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 34bb4d02e0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 16:02:25 -05:00
Benoît Canet
c5dae2f4c5 ide: Correct improper smart self test counter reset in ide core.
The SMART self test counter was incorrectly being reset to zero,
not 1. This had the effect that on every 21st SMART EXECUTE OFFLINE:
 * We would write off the beginning of a dynamically allocated buffer
 * We forgot the SMART history
Fix this.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1397336390-24664-1-git-send-email-benoit.canet@irqsave.net
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Kevin Wolf <kwolf@redhat.com>
[PMM: tweaked commit message as per suggestions from Markus]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

(cherry picked from commit 940973ae0b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:56:17 -05:00
Max Reitz
3239a20294 block-commit: speed is an optional parameter
As speed is an optional parameter for the QMP block-commit command, it
should be set to 0 if not given (as it is undefined if has_speed is
false), that is, the speed should not be limited.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5450466394)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:39:13 -05:00
Kevin Wolf
a8b7e73901 qcow2: Flush metadata during read-only reopen
If lazy refcounts are enabled for a backing file, committing to this
backing file may leave it in a dirty state even if the commit succeeds.
The reason is that the bdrv_flush() call in bdrv_commit() doesn't flush
refcount updates with lazy refcounts enabled, and qcow2_reopen_prepare()
doesn't take care to flush metadata.

In order to fix this, this patch also fixes qcow2_mark_clean(), which
contains another ineffective bdrv_flush() call beause lazy refcounts are
disabled only afterwards. All existing callers of qcow2_mark_clean()
either don't modify refcounts or already flush manually, so that this
fixes only a latent, but not yet actually triggerable bug.

Another instance of the same problem is live snapshots. Again, a real
corruption is prevented by an explicit flush for non-read-only images in
external_snapshot_prepare(), but images using lazy refcounts stay dirty.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 4c2e5f8f46)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:34:50 -05:00
Peter Maydell
38a55f3070 hw/net/stellaris_enet: Correct handling of packet padding
The PADEN bit in the transmit control register enables padding of short
data packets out to the required minimum length. However a typo here
meant we were adjusting tx_fifo_len rather than tx_frame_len, so the
padding didn't actually happen. Fix this bug.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 7fd5f064d1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:33:46 -05:00
Peter Maydell
7d09facec7 hw/net/stellaris_enet: Restructure tx_fifo code to avoid buffer overrun
The current tx_fifo code has a corner case where the guest can overrun
the fifo buffer: if automatic CRCs are disabled we allow the guest to write
the CRC word even if there isn't actually space for it in the FIFO.
The datasheet is unclear about exactly how the hardware deals with this
situation; the most plausible answer seems to be that the CRC word is
just lost.

Implement this fix by separating the "can we stuff another word in the
FIFO" logic from the "should we transmit the packet now" check. This
also moves us closer to the real hardware, which has a number of ways
it can be configured to trigger sending the packet, some of which we
don't implement.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 5c10495ab1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:33:30 -05:00
Stefan Fritsch
11088abadf virtio-net: Do not filter VLANs without F_CTRL_VLAN
If VIRTIO_NET_F_CTRL_VLAN is not negotiated, do not filter out all
VLAN-tagged packets but send them to the guest.

This fixes VLANs with OpenBSD guests (and probably NetBSD, too, because
the OpenBSD driver started as a port from NetBSD).

Signed-off-by: Stefan Fritsch <sf@sfritsch.de>
Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0b1eaa8803)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:28:50 -05:00
Stefan Hajnoczi
0fd56fb844 mirror: fix early wake from sleep due to aio
The mirror blockjob coroutine rate-limits itself by sleeping.  The
coroutine also performs I/O asynchronously so it's important that the
aio callback doesn't wake the coroutine early as that breaks
rate-limiting.

Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7b770c720b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:26:29 -05:00
Paolo Bonzini
8211eeb7d2 mirror: fix throttling delay calculation
The throttling delay calculation was using an inaccurate sector count to
calculate the time to sleep.  This broke rate-limiting for the block
mirror job.

Move the delay calculation into mirror_iteration() where we know how
many sectors were transferred.  This lets us calculate an accurate delay
time.

Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit cc8c9d6c6f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:26:23 -05:00
Stefan Weil
0414abe04f configure: Don't use __int128_t for clang versions before 3.2
Those versions don't fully support __int128_t.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit a00f66ab9b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:24:19 -05:00
Stefan Weil
151be4f61f tests: Fix 'make test' for i686 hosts (build regression)
'make test' is broken at least since commit
baacf04799. Several source files were moved
to util/, and some of them there split, so add the missing prefix and new
files to fix the compiler and linker errors.

There remain more issues, but these changes allow running the test on a
Linux i686 host.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 6d4adef48d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:23:31 -05:00
Stefan Hajnoczi
a290aeebc4 tap: avoid deadlocking rx
The net subsystem has a control flow mechanism so peer NetClientStates
can tell each other to stop sending packets.  This is used to stop
monitoring the tap file descriptor for incoming packets if the guest rx
ring has no spare buffers.

There is a corner case when tap_can_send() is true at the beginning of
an event loop iteration but becomes false before the tap_send() fd
handler is invoked.

tap_send() will read the packet from the tap file descriptor and attempt
to send it.  The net queue will hold on to the packet and return 0,
indicating that further I/O is not possible.  tap then stops monitoring
the file descriptor for reads.

This is unlike the normal case where tap_can_send() is the same before
and during the event loop iteration.  The event loop would simply not
monitor the file descriptor if tap_can_send() returns true.  Upon next
iteration it would check tap_can_send() again and begin monitoring if we
can send.

The deadlock happens because tap_send() explicitly disabled read_poll.
This is done with the expectation that the peer will call
qemu_net_queue_flush().  But hw/net/virtio-net.c does not monitor
vm_running transitions and issue the flush.  Hence we're left with a
broken tap device.

Cc: qemu-stable@nongnu.org
Reported-by: Neil Skrypuch <neil@tembosocial.com>
Tested-by: Neil Skrypuch <neil@tembosocial.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 68e5ec6400)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:17:15 -05:00
Stefan Hajnoczi
7e42cd6f35 qom: Avoid leaking str and bool properties on failure
When object_property_add_str() and object_property_add_bool() fail, they
leak their internal StringProperty and BoolProperty structs.  Remember
to free the structs on error.

Luckily this is a low-impact memory leak since most QOM properties are
static qdev properties that will never take the error case.
object_property_add() only fails if the property name is already in use.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit a01aedc8d3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:11:17 -05:00
Fam Zheng
4f577e9e69 scsi: Change scsi sense buf size to 252
Current buffer size fails the assersion check in like

    hw/scsi/scsi-bus.c:1655:    assert(req->sense_len <= sizeof(req->sense));

when backend (block/iscsi.c) returns more data then 96.

Exercise the core dump path by booting an Gentoo ISO with scsi-generic
device backed with iscsi (built with libiscsi 1.7.0):

    x86_64-softmmu/qemu-system-x86_64 \
    -drive file=iscsi://localhost:3260/iqn.foobar/0,if=none,id=drive-disk \
    -device virtio-scsi-pci,id=scsi1,bus=pci.0,addr=0x6 \
    -device scsi-generic,drive=drive-disk,bus=scsi1.0,id=iscsi-disk \
    -boot d \
    -cdrom gentoo.iso

    qemu-system-x86_64: hw/scsi/scsi-bus.c:1655: scsi_req_complete:
    Assertion `req->sense_len <= sizeof(req->sense)' failed.

According to SPC-4, section 4.5.2.1, 252 is the limit of sense data. So
increase the value to fix it.

Also remove duplicated define for the macro.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c5f52875b9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 15:05:09 -05:00
Richard Henderson
6be38ee9e7 target-i386: Fix ucomis and comis memory access
We were loading 16 bytes for both single and double-precision
scalar comparisons.

Reported-by: Alexander Bluhm <bluhm@openbsd.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit cb48da7f81)

Conflicts:
	target-i386/translate.c

*removed dependency on 323d1876

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 11:45:00 -05:00
Richard Henderson
2e191f8e54 target-i386: Fix CC_OP_CLR vs PF
Parity should be set for a zero result.

Cc: qemu-stable@nongnu.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit d2fe51bda8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 11:25:09 -05:00
Thomas Huth
91ae1d30ec s390x/virtio-hcall: Add range check for hypervisor call
The handler for diag 500 did not check whether the requested function
was in the supported range, so illegal values could crash QEMU in the
worst case.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: qemu-stable@nongnu.org
(cherry picked from commit f2c55d1735)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 11:11:45 -05:00
Peter Lieven
0a77a92d74 block/iscsi: fix deadlock on scsi check condition
the retry logic was broken because the complete status
of the task structure was not reset. this resulted in
an infinite loop retrying the command over and over.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 837c390137)

Conflicts:
	block/iscsi.c

*only modified retry clauses present before 063c3378

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 11:08:29 -05:00
Markus Armbruster
8b8dd2c4b5 scsi-bus: Fix transfer length for VERIFY with BYTCHK=11b
The transfer length depends on field BYTCHK, which is encoded in byte
1, bits 1..2.  However, the guard for for case BYTCHK=11b doesn't
work, and we get case 01b instead.  Fix it.

Note that since emulated scsi-hd fails the command outright, it takes
SCSI passthrough of a device that actually implements VERIFY with
BYTCHK=11b to make the bug bite.

Screwed up in commit d12ad44.  Spotted by Coverity.

Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7ef8cf9a08)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-25 11:00:29 -05:00
Gal Hammer
248de52cf8 char: restore read callback on a reattached (hotplug) chardev
Fix a bug that was introduced in commit 386a5a1e. A removal of a device
set the chr handlers to NULL. However when the device is plugged back,
its read callback is not restored so data can't be transferred from the
host to the guest (e.g. via the virtio-serial port).

https://bugzilla.redhat.com/show_bug.cgi?id=1027181

Signed-off-by: Gal Hammer <ghammer@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ac1b84dd1e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-06-20 08:19:49 -05:00
Michael Roth
ba014af39c Update VERSION for 1.7.1 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-03-03 16:30:51 -06:00
Alexander Graf
d689974b51 KVM: Use return value for error print
Commit 94ccff13 introduced a more verbose failure message and retry
operations on KVM VM creation. However, it ended up using a variable
for its failure message that hasn't been initialized yet.

Fix it to use the value it meant to set.

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 521f438e36)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-27 10:54:41 -06:00
Christoffer Dall
e50218c269 hw/intc/arm_gic: Fix GIC_SET_LEVEL
The GIC_SET_LEVEL macro unfortunately overwrote the entire level
bitmask instead of just or'ing on the necessary bits, causing active
level PPIs on a core to clear PPIs on other cores.

Cc: qemu-stable@nongnu.org
Reported-by: Rob Herring <rob.herring@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1393031030-8692-1-git-send-email-christoffer.dall@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 6453fa998a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-27 09:38:42 -06:00
Peter Maydell
fa98e47a25 hw/arm/musicpal: Remove nonexistent CDTP2, CDTP3 registers
The ethernet device in the musicpal only has two tx queues,
but we modelled it with four CTDP registers, presumably a
cut and paste from the rx queue registers. Since the tx_queue[]
array is only 2 entries long this allowed a guest to overrun
this buffer. Remove the nonexistent registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1392737293-10073-1-git-send-email-peter.maydell@linaro.org
Acked-by: Jan Kiszka <jan.kiszka@web.de>
Cc: qemu-stable@nongnu.org
(cherry picked from commit cf143ad350)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-27 09:38:31 -06:00
Peter Maydell
ff51a1d589 hw/intc/exynos4210_combiner: Don't overrun output_irq array in init
The Exynos4210 combiner has IIC_NIRQ inputs and IIC_NGRP outputs;
use the correct constant in the loop initializing our output
sysbus IRQs so that we don't overrun the output_irq[] array.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1392659611-8439-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Andreas Färber <afaerber@suse.de>
Cc: qemu-stable@nongnu.org
(cherry picked from commit fce0a82608)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-27 09:38:08 -06:00
Peter Maydell
5444df1581 hw/timer/arm_timer: Avoid array overrun for bad addresses
The integrator's timer read/write functions log an error for
bad addresses in guest accesses, but were falling through and
using an out of bounds array index rather than returning early.
Fix this.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-id: 1392647854-8067-4-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
(cherry picked from commit cba933b225)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-27 09:37:58 -06:00
Peter Maydell
e498311693 hw/misc/arm_sysctl: Fix bad boundary check on mb clock accesses
Fix incorrect use of sizeof() rather than ARRAY_SIZE() to guard
accesses into the mb_clock[] array, which was allowing a malicious
guest to overwrite the end of the array.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-id: 1392647854-8067-2-git-send-email-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
(cherry picked from commit ec1efab957)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-27 09:37:43 -06:00
Markus Armbruster
4736fb34f7 qga: Fix memory allocation pasto
qmp_guest_file_seek() allocates memory for a GuestFileRead object
instead of the GuestFileSeek object it actually uses.  Harmless,
because the GuestFileRead is slightly larger.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 10b7c5dd0d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-25 13:34:15 -06:00
Tomoki Sekiyama
6d0a48acd8 qga: vss-win32: Fix interference with snapshot deletion by other VSS request
When a VSS requester such as vshadow.exe or diskshadow.exe requests to
delete snapshots, qemu-ga VSS provider's DeleteSnapshots() is also called
and returns E_NOTIMPL, that makes the deletion fail.
To avoid this issue, return S_OK and set values that represent no snapshots
are deleted by qemu-ga VSS provider.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Yan Vugenfirer <yvugenfi@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit d9e1f574cb)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-25 13:34:03 -06:00
Tomoki Sekiyama
5e5d4fc68e qga: vss-win32: Fix interference with snapshot creation by other VSS requesters
When a VSS requester such as vshadow.exe or diskshadow.exe requests to
create disk snapshots, Windows may choose qemu-ga VSS provider if it is
only provider registered on the system. However, because it provides only a
function to freeze the filesystem, the snapshotting fails.

This patch adds a check into CQGAVssProvider::IsVolumeSupported() to reject
the request from other VSS requesters, so that the other provider is chosen.

The check of requester is done by confirming event channels between
qemu-ga's requester and provider established. To ensure that the events are
initialized when CQGAVssProvider::IsVolumeSupported() is called, it moves
the initialization earlier.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Yan Vugenfirer <yvugenfi@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit ff8adbcfdb)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-25 13:33:54 -06:00
Tomoki Sekiyama
68e3bb1128 qga: vss-win32: Use NULL as an invalid pointer for OpenEvent and CreateEvent
OpenEvent and CreateEvent WinAPI return NULL when failed to open/create
events handles, instead of INVALID_HANDLE_VALUE (although their return
types are HANDLE).
This replaces INVALID_HANDLE_VALUE related to event handles with NULL.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Yan Vugenfirer <yvugenfi@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 4c1b8f1e83)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-25 13:33:47 -06:00
Paolo Bonzini
c885105bf3 adlib: fix patching of port I/O addresses
Commit 2b21fb5 (adlib: sort offsets in portio registration, 2013-08-14)
fixed the offsets in adlib_portio_list, but forgot the matching indices
in adlib_realizefn.

Reported at http://virtuallyfun.superglobalmegacorp.com/?p=3616 by
"neozeed".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 7f0ba7bb43)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 14:15:35 -06:00
Huw Davies
2cd72adb1c tcg-arm: The shift count of op_rotl_i32 is in args[2] not args[1].
It's this that should be subtracted from 0x20 when converting to a right rotate.

Cc: qemu-stable@nongnu.org
Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit 7a3a00979d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:40:04 -06:00
Paolo Bonzini
819ddf7d1f memory: fix limiting of translation at a page boundary
Commit 360e607 (address_space_translate: do not cross page boundaries,
2014-01-30) broke MMIO accesses in cases where the section is shorter
than the full register width.  This can happen for example with the
Bochs DISPI registers, which are 16 bits wide but have only a 1-byte
long MemoryRegion (if you write to the "second byte" of the register
your access is discarded; it doesn't write only to half of the register).

Restrict the action of commit 360e607 to direct RAM accesses.  This
is enough for Xen, since MMIO will not go through the mapcache.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a87f39543a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:36:00 -06:00
Mark Cave-Ayland
ec6428b598 Update OpenBIOS images
Update OpenBIOS images to SVN r1246 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
(cherry picked from commit fbb9c590ca)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:41 -06:00
Stefan Weil
424388980d linux-user: Fix trampoline code for CRIS
__put_user can write bytes, words (2 bytes) or longwords (4 bytes).
Here obviously words should have been written, but bytes were written,
so values like 0x9c5f were truncated to 0x5f.

Fix this by changing retcode from uint8_t to to uint16_t in
target_signal_frame and also in the unused rt_signal_frame.

This problem was reported by static code analysis (smatch).

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
(cherry picked from commit 8cfc114a2f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:41 -06:00
Stefan Weil
6b579c8c53 i386: Add missing include file for QEMU_PACKED
Instead of packing BiosLinkerLoaderEntry, an unused global variable called
QEMU_PACKED was created (detected by smatch static code analysis).

Including qemu-common.h gets the right definition and also includes some
standard include files which now can be removed here.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit c428c5a21c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:41 -06:00
thomas knych
47c6edce7a KVM: Retry KVM_CREATE_VM on EINTR
Upstreaming this change from Android (https://android-review.googlesource.com/54211).

On heavily loaded machines with many VM instances we see KVM_CREATE_VM
failing with EINTR on this path:

kvm_dev_ioctl_create_vm -> kvm_create_vm -> kvm_init_mmu_notifier -> mmu_notifier_register ->  do_mmu_notifier_register -> mm_take_all_locks

which checks if any signals have been raised while it was attaining locks
and returns EINTR.  Retrying the system call greatly improves reliability.

Cc: qemu-stable@nongnu.org
Signed-off-by: thomas knych <thomaswk@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 94ccff1338)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:41 -06:00
Eric Farman
a5221ee143 virtio-scsi: Prevent assertion on missed events
In some cases, an unplug can cause events to be dropped, which
leads to an assertion failure when preparing to notify the guest
kernel.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 49fb65c7f9)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Eric Farman
30a0fc3607 virtio-scsi: Cleanup of I/Os that never started
There is still a small window that occurs when a cancel I/O affects
an asynchronous I/O operation that hasn't started.  In other words,
when the residual data length equals the expected data length.

Today, the routine virtio_scsi_command_complete fails because the
VirtIOSCSIReq pointer (from the hba_private field in SCSIRequest)
was cleared earlier when virtio_scsi_complete_req was called by
the virtio_scsi_request_cancelled routine.  As a result, the
virtio_scsi_command_complete routine needs to simply return when
it is processing a SCSIRequest block that was marked canceled.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e9c0f0f58a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Paolo Bonzini
ad0a6444ad scsi: Assign cancel_io vector for scsi_disk_emulate_ops
Some emulated disk operations (MODE SELECT, UNMAP, WRITE SAME)
can trigger asynchronous I/Os.  Provide the cancel_io callback
to ensure that AIOCBs are properly cleaned up.

Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
[Tweak commit message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 33325a53f1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Paolo Bonzini
6b7ed87665 scsi: Support TEST UNIT READY in the dummy LUN0
SeaBIOS waits for LUN0 to respond to the TEST UNIT READY command
in order to decide whether it should part of the boot sequence.
If LUN0 does not respond to the command, boot is delayed by up
to 5 seconds.  This currently happens when there is no LUN0 on
a target.  Fix that by adding a trivial implementation of the
command.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1cb27d9233)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Peter Maydell
b54720b5d6 block/curl: Implement the libcurl timer callback interface
libcurl versions 7.16.0 and later have a timer callback interface which
must be implemented in order for libcurl to make forward progress (it
will sometimes rely on being called back on the timeout if there are
no file descriptors registered). Implement the callback, and use a
QEMU AIO timer to ensure we prod libcurl again when it asks us to.

Based on Peter's original patch plus my fix to add curl_multi_timeout_do.
Should compile just fine even on older versions of libcurl.

I also tried copy-on-read and streaming:

    $ ./qemu-img create -f qcow2 -o \
         backing_file=http://download.fedoraproject.org/pub/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso \
         foo.qcow2 1G
    $ x86_64-softmmu/qemu-system-x86_64 \
         -drive if=none,file=foo.qcow2,copy-on-read=on,id=cd \
         -device ide-cd,drive=cd --enable-kvm -m 1024

Direct http usage is probably too slow, but with copy-on-read ultimately
the image does boot!

After some time, streaming gets canceled by an EIO, which needs further
investigation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 031fd1be56)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Alex Williamson
c426a2da12 vfio-pci: Release all MSI-X vectors when disabled
We were relying on msix_unset_vector_notifiers() to release all the
vectors when we disable MSI-X, but this only happens when MSI-X is
still enabled on the device.  Perform further cleanup by releasing
any remaining vectors listed as in-use after this call.  This caused
a leak of IRQ routes on hotplug depending on how the guest OS prepared
the device for removal.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org
(cherry picked from commit 3e40ba0faf)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Luiz Capitulino
15a14f2eeb migration: qmp_migrate(): keep working after syntax error
If a user or QMP client enter a bad syntax for the migrate
command in QMP/HMP, then the migrate command will never succeed
from that point on.

For example, if you enter:

(qemu) migrate tcp;0:4444
migrate: Parameter 'uri' expects a valid migration protocol

Then the migrate command will always fail from now on:

(qemu) migrate tcp:0:4444
migrate: There's a migration process in progress

The problem is that qmp_migrate() sets the migration status to
MIG_STATE_SETUP and doesn't reset it on syntax error. This bug
was introduced by commit 29ae8a4133.

Reviewed-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit c950114286)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Stefan Weil
88d08de7e5 mainstone: Fix duplicate array values for key 'space'
cgcc reported a duplicate initialisation. Mainstone includes a matrix
keyboard where two different positions map to 'space'.

QEMU uses the reversed mapping and does not map 'space' to two different
matrix positions.

Some other keys are either missing or might be mapped wrongly (cf. Linux
kernel code). Don't fix these until someone can test them with real
hardware, but add TODO comments.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 7dbc1158bc)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Corey Bryant
109b2439f0 seccomp: exit if seccomp_init() fails
This fixes a bug where we weren't exiting if seccomp_init() failed.

Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Acked-by: Eduardo Otubo <otubo@linux.vnet.ibm.com>
Acked-by: Paul Moore <pmoore@redhat.com>
(cherry picked from commit 2a13f99112)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Cornelia Huck
c2f6dc66bc s390x/kvm: Fix diagnose handling.
The instruction intercept handler for diagnose used only the displacement
when trying to calculate the function code. This is only correct for base
0, however; we need to perform a complete base/displacement address
calculation and use bits 48-63 as the function code.

Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 638129ff47)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Laszlo Ersek
dc9e1e798c qemu_opts_parse(): always check return value
qemu_opts_parse() can always return NULL, even if the QemuOptsList.desc in
question would be trivial to satisfy (eg. because it's empty). For
example:

qemu_opts_parse()
  opts_parse()
    qemu_opts_create()
      id_wellformed()

In practice:

  $ .../qemu-system-x86_64 -acpitable id=3
  qemu-system-x86_64: -acpitable id=3: Parameter 'id' expects an identifier
  **
  ERROR:vl.c:3491:main: assertion failed: (opts != NULL)
  Aborted (core dumped)

  $ .../qemu-system-x86_64 -smbios id=3
  qemu-system-x86_64: -smbios id=3: Parameter 'id' expects an identifier
  Segmentation fault (core dumped)

I checked all qemu_opts_parse() invocations (and all drive_def()
invocations too, because it blindly forwards the former's retval). Only
the two above examples look problematic.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1385658779-7529-1-git-send-email-lersek@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
(cherry picked from commit f46e720a82)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Peter Lieven
02e1c55ddd block/iscsi: use a bh to schedule co reentrance
this fixes a potential segfault and performance regression.

If the coroutine is reentered directly in the iscsi_co_generic_cb
iscsi_process_{read,write} are interrupted and reentered any
time later. One the one hand this could happen after an iscsi_close
where the iscsi context is already gone (segfault). On the
other hand this limits the number of processed callbacks
in each aio_dispatch to one (potential performance regression).

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8b9dfe9098)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:40 -06:00
Michael S. Tsirkin
9692bad34d hpet: fix build with CONFIG_HPET off
make hpet_find inline so we don't need
to build hpet.c to check if hpet is enabled.

Fixes link error with CONFIG_HPET off.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 142e0950cf)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:39 -06:00
Aurelien Jarno
6ec62b79e3 tcg/optimize: fix known-zero bits for right shift ops
32-bit versions of sar and shr ops should not propagate known-zero bits
from the unused 32 high bits. For sar it could even lead to wrong code
being generated.

Cc: qemu-stable@nongnu.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit e46b225a31)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:39 -06:00
Brad
0e282aca86 Fix QEMU build on OpenBSD on x86 archs
This resolves the build issue with building the ROMs on OpenBSD on x86 archs.
As of OpenBSD 5.3 the compiler builds PIE binaries by default and thus the
whole OS/packages and so forth. The ROMs need to have PIE disabled.
Check in configure whether the compiler supports the flags for disabling
PIE, and if it does then use them for building the ROMs. This fixes the
following buildbot failure:

>From the OpenBSD buildbots..
  Building optionrom/multiboot.img
ld: multiboot.o: relocation R_X86_64_16 can not be used when making a shared object; recompile with -fPIC

Signed-off by: Brad Smith <brad@comstyle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 46eef33b89)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:39 -06:00
Petar Jovanovic
75b4b747a2 linux-user: create target_structs header to place ipc_perm and shmid_ds
Creating target_structs header in linux-user/$arch/ and making
target_ipc_perm and target_shmid_ds its first inhabitants.
The struct defintions may/should be further fine-tuned by arch maintainers.

Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit 55a2b1631f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-21 00:34:39 -06:00
Petar Jovanovic
0bc4142e7f linux-user: pass correct parameter to do_shmctl()
Fix shmctl issue by passing correct parameter buf to do_shmctl().

Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit a29267846a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 22:24:21 -06:00
Petar Jovanovic
b9cabc36a2 target-mips: fix 64-bit FPU config for user-mode emulation
FR bit should be initialized to 1 for MIPS64, under condition that this
bit is writable and that CPU has an FPU unit. It should be initialized to
zero for MIPS32.
This fixes different MIPS32 issues with FPU instructions whose behaviour
defaulted to 64-bit FPU mode.

Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 4d66261f71)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Gerd Hoffmann
03bc4f6628 piix: fix 32bit pci hole
Make the 32bit pci hole start at end of ram, so all possible address
space is covered.

We used to try and make addresses aligned so they are easier to cover
with MTRRs, but since they are cosmetic on KVM, this is probably not
worth worrying about.
Of course the firmware can use less than that.  Leaving space unused is
no problem, mapping pci bars outside the hole causes problems though.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ddaaefb4dd)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Michael S. Tsirkin
8b6d92a565 pc: map PCI address space as catchall region for not mapped addresses
With a help of negative memory region priority PCI address space
is mapped underneath RAM regions effectively catching every access
to addresses not mapped by any other region.
It simplifies PCI address space mapping into system address space.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
(cherry picked from commit 83d08f2673)

*prereq for ddaaefb backport

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Marcel Apfelbaum
44c68b84ae exec: separate sections and nodes per address space
Every address space has its own nodes and sections, but
it uses the same global arrays of nodes/section.

This limits the number of devices that can be attached
to the guest to 20-30 devices. It happens because:
 - The sections array is limited to 2^12 entries.
 - The main memory has at least 100 sections.
 - Each device address space is actually an alias to
   main memory, multiplying its number of nodes/sections.

Remove the limitation by using separate arrays of
nodes and sections for each address space.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 53cb28cbfe)

Conflicts:

	exec.c

*removed dependency on b35ba30

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Michael S. Tsirkin
4c3e00d83f exec: pass hw address to phys_page_find
callers always shift by target page bits so let's just do this
internally.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 97115a8d45)

*prereq for 53cb28c backport

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Michael S. Tsirkin
6a108c4802 exec: replace leaf with skip
In preparation for dynamic radix tree depth support, rename is_leaf
field to skip, telling us how many bits to skip to next level.
Set to 0 for leaf.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9736e55b78)

*prereq for 53cb28c backport

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Paolo Bonzini
e480a1b8ff split definitions for exec.c and translate-all.c radix trees
The exec.c and translate-all.c radix trees are quite different, and
the exec.c one in particular is not limited to the CPU---it can be
used also by devices that do DMA, and in that case the address space
is not limited to TARGET_PHYS_ADDR_SPACE_BITS bits.

We want to make exec.c's radix trees 64-bit wide.  As a first step,
stop sharing the constants between exec.c and translate-all.c.
exec.c gets P_L2_* constants, translate-all.c gets V_L2_*, for
consistency with the existing V_L1_* symbols.  Though actually
in the softmmu case translate-all.c is also indexed by physical
addresses...

This patch has no semantic change.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 03f4995781)

*prereq for 53cb28c backport

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Markus Armbruster
29b0fcc181 qdev-monitor: Avoid device_add crashing on non-device driver name
Watch this:

    $ upstream-qemu -nodefaults -S -display none -monitor stdio
    QEMU 1.7.50 monitor - type 'help' for more information
    (qemu) device_add rng-egd
    /work/armbru/qemu/qdev-monitor.c:491:qdev_device_add: Object 0x2089b00 is not an instance of type device
    Aborted (core dumped)

Crashes because "rng-egd" exists, but isn't a subtype of TYPE_DEVICE.
Broken in commit 18b6dad.

Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 061e84f7a4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Alexander Graf
b8fca09eec x86: only allow real mode to access 32bit without LMA
When we're running in non-64bit mode with qemu-system-x86_64 we can
still end up with virtual addresses that are above the 32bit boundary
if a segment offset is set up.

GNU Hurd does exactly that. It sets the segment offset to 0x80000000 and
puts its EIP value to 0x8xxxxxxx to access low memory.

This doesn't hit us when we enable paging, as there we just mask away the
unused bits. But with real mode, we assume that vaddr == paddr which is
wrong in this case. Real hardware wraps the virtual address around at the
32bit boundary. So let's do the same.

This fixes booting GNU Hurd in qemu-system-x86_64 for me.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 33dfdb56f2)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Paolo Bonzini
50a203c3b9 vl: add missing transition debug->finish_migrate
This fixes an abort if you invoke the "migrate" command while the
guest is being debugged.

Cc: qemu-stable@nongnu.org
Cc: lcapitulino@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit eca01d3a93)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Matthew Garrett
f227ed1842 migration: Fix rate limit
The migration thread appears to want to allow writeout to occur at full
speed rather than being rate limited during completion of state saving,
but sets the limit to INT_MAX when xfer_limit is INT64_MAX. This causes
problems if there's more than 2GB of state left to save at this point. It
probably ought to just be INT64_MAX instead.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 40596834c0)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Peter Crosthwaite
2dc7975300 qom: Split out object and class caches
The object-cast and class-cast caches cannot be shared because class
caching is conditional on the target type not being an interface and
object caching is unconditional. Leads to a bug when a class cast
to an interface follows an object cast to the same interface type:

FooObject = FOO(obj);
FooClass = FOO_GET_CLASS(obj);

Where TYPE_FOO is an interface. The first (object) cast will be
successful and cache the casting result (i.e. TYPE_FOO will be cached).
The second (class) cast will then check the shared cast cache
and register a hit. The issue is, when a class cast hits in the cache
it just returns a pointer cast of the input class (i.e. the concrete
class).

When casting to an interface, the cast itself must return the
interface class, not the concrete class. The implementation of class
cast caching already ensures that the returned cast result is only
a pointer cast before caching. The object cast logic however does
not have this check.

Resolve by just splitting the object and class caches.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Nathan Rossi <nathan.rossi@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 0ab4c94c84)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Marcel Apfelbaum
8fa58fe910 memory.c: bugfix - ref counting mismatch in memory_region_find
'address_space_get_flatview' gets a reference to a FlatView.
If the flatview lookup fails, the code returns without
"unreferencing" the view.

Cc: qemu-stable@nongnu.org

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6307d974f9)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:18 -06:00
Gerd Hoffmann
97f74de48c intel-hda: fix position buffer
Fix position buffer updates to use the correct stream offset.

Without this patch both IN (record) and OUT (playback) streams
will update the IN buffer positions.  The linux kernel notices
and complains:
  hda-intel: Invalid position buffer, using LPIB read method instead.

The bug may also lead to glitches when recording and playing
at the same time:
  https://bugzilla.redhat.com/show_bug.cgi?id=947785

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d58ce68a45)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:17 -06:00
Paolo Bonzini
30a08ab4e1 scsi-disk: fix VERIFY emulation
VERIFY emulation was completely botched (and remained botched through
all the refactorings).  The command must be emulated both in check-medium
mode (BYTCHK=00, which we implement by doing nothing) and in check-bytes
mode (which we do not implement yet).  Unlike WRITE AND VERIFY (which we
treat simply as WRITE with FUA bit set), VERIFY cannot be handled like
READ.  In fact the device is _receiving_ data for VERIFY, not _sending_
it like READ.

Cc: qemu-stable@nongnu.org
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d97e773081)

Conflicts:

	hw/scsi/scsi-disk.c

*fixed up WRITE_SAME_* conflicts due to 84f94a9a not being in 1.7.0

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:17 -06:00
Paolo Bonzini
df3e347891 scsi-bus: fix transfer length and direction for VERIFY command
The amount of bytes to transfer depends on the BYTCHK field.
If any data is transferred, it is sent to the device.

Cc: qemu-stable@nongnu.org
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d12ad44cc4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:59:17 -06:00
Paolo Bonzini
810766d9dd virtio-pci: add device_unplugged callback
This fixes a crash in hot-unplug of virtio-pci devices behind a PCIe
switch.  The crash happens because the ioeventfd is still set whent the
child is destroyed (destruction happens in postorder).  Then the proxy
tries to unset to ioeventfd, but the virtqueue structure that holds the
EventNotifier has been trashed in the meanwhile.  kvm_set_ioeventfd_pio
does not expect failure and aborts.

The fix is simply to move parts of uninitialization to a new
device_unplugged callback, which is called before the child is destroyed.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 06a1307379)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:15 -06:00
Paolo Bonzini
3220207c27 virtio-rng: switch exit callback to VirtioDeviceClass
This ensures hot-unplug is handled properly by the proxy, and avoids
leaking bus_name which is freed by virtio_device_exit.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7bb6edb0e3)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:15 -06:00
Paolo Bonzini
def56d28cf virtio-balloon: switch exit callback to VirtioDeviceClass
This ensures hot-unplug is handled properly by the proxy, and avoids
leaking bus_name which is freed by virtio_device_exit.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit baa61b9870)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:15 -06:00
Paolo Bonzini
478f1f6ccf virtio-scsi: switch exit callback to VirtioDeviceClass
This ensures hot-unplug is handled properly by the proxy, and avoids
leaking bus_name which is freed by virtio_device_exit.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e3c9d76acc)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:15 -06:00
Paolo Bonzini
8f08550ee2 virtio-net: switch exit callback to VirtioDeviceClass
This ensures hot-unplug is handled properly by the proxy, and avoids
leaking bus_name which is freed by virtio_device_exit.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3786cff5eb)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:15 -06:00
Paolo Bonzini
e6c007056c virtio-serial: switch exit callback to VirtioDeviceClass
This ensures hot-unplug is handled properly by the proxy, and avoids
leaking bus_name which is freed by virtio_device_exit.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0e86c13fe2)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:15 -06:00
Paolo Bonzini
e84e23de35 virtio-blk: switch exit callback to VirtioDeviceClass
This ensures hot-unplug is handled properly by the proxy, and avoids
leaking bus_name which is freed by virtio_device_exit.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 40dfc16f5f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:15 -06:00
Paolo Bonzini
40699a469e virtio-bus: cleanup plug/unplug interface
Right now we have these pairs:

- virtio_bus_plug_device/virtio_bus_destroy_device.  The first
  takes a VirtIODevice, the second takes a VirtioBusState

- device_plugged/device_unplug callbacks in the VirtioBusClass
  (here it's just the naming that is inconsistent)

- virtio_bus_destroy_device is not called by anyone (and since
  it calls qdev_free, it would be called by the proxies---but
  then the callback is useless since the proxies can do whatever
  they want before calling virtio_bus_destroy_device)

And there is a k->init but no k->exit, hence virtio_device_exit is
overwritten by subclasses (except virtio-9p).  This cleans it up by:

- renaming the device_unplug callback to device_unplugged

- renaming virtio_bus_plug_device to virtio_bus_device_plugged,
  matching the callback name

- renaming virtio_bus_destroy_device to virtio_bus_device_unplugged,
  removing the qdev_free, making it take a VirtIODevice and calling it
  from virtio_device_exit

- adding a k->exit callback

virtio_device_exit is still overwritten, the next patches will fix that.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5e96f5d2f8)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:15 -06:00
Paolo Bonzini
cbf23fdf21 virtio-pci: remove vdev field
The vdev field is complicated to synchronize.  Just access the
BusState's list of children.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a3fc66d9fd)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:15 -06:00
Paolo Bonzini
a9b9ca7e0e virtio-ccw: remove vdev field
The vdev field is complicated to synchronize.  Just access the
BusState's list of children.

Cc: qemu-stable@nongnu.org
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f24a684073)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:14 -06:00
Paolo Bonzini
d765275bb1 virtio-bus: remove vdev field
The vdev field is complicated to synchronize.  Just access the
BusState's list of children.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 06d3dff072)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:14 -06:00
Paolo Bonzini
f47542925e virtio-ccw: move virtio_ccw_stop_ioeventfd to virtio_ccw_busdev_unplug
Similar to the PCI bug that prompted these patches, virtio-ccw will
segfault after the reworking of hotplug/hot-unplug.  Prepare for
this by moving virtio_ccw_stop_ioeventfd to before the freeing
of the proxy device.

A better place for this could be the device_unplugged callback
for the virtio-ccw bus.  However, we do not yet have a callback
that works: this patch avoids the problem while leaving the tree
bisectable.

Cc: qemu-stable@nongnu.org
Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Suggested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0b81c1ef5c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-02-20 21:36:14 -06:00
2802 changed files with 179510 additions and 411257 deletions

155
.gitignore vendored
View File

@@ -1,71 +1,72 @@
/config-devices.*
/config-all-devices.*
/config-all-disas.*
/config-host.*
/config-target.*
/config.status
/config-temp
/trace/generated-tracers.h
/trace/generated-tracers.c
/trace/generated-tracers-dtrace.h
/trace/generated-tracers.dtrace
/trace/generated-events.h
/trace/generated-events.c
/trace/generated-helpers-wrappers.h
/trace/generated-helpers.h
/trace/generated-helpers.c
/trace/generated-tcg-tracers.h
/trace/generated-ust-provider.h
/trace/generated-ust.c
/ui/shader/texture-blit-frag.h
/ui/shader/texture-blit-vert.h
/libcacard/trace/generated-tracers.c
config-devices.*
config-all-devices.*
config-all-disas.*
config-host.*
config-target.*
trace/generated-tracers.h
trace/generated-tracers.c
trace/generated-tracers-dtrace.h
trace/generated-tracers.dtrace
trace/generated-events.h
trace/generated-events.c
libcacard/trace/generated-tracers.c
*-timestamp
/*-softmmu
/*-darwin-user
/*-linux-user
/*-bsd-user
/libdis*
/libuser
/linux-headers/asm
/qga/qapi-generated
/qapi-generated
/qapi-types.[ch]
/qapi-visit.[ch]
/qapi-event.[ch]
/qmp-commands.h
/qmp-marshal.c
/qemu-doc.html
/qemu-tech.html
/qemu-doc.info
/qemu-tech.info
/qemu-img
/qemu-nbd
/qemu-options.def
/qemu-options.texi
/qemu-img-cmds.texi
/qemu-img-cmds.h
/qemu-io
/qemu-ga
/qemu-bridge-helper
/qemu-monitor.texi
/qmp-commands.txt
/vscclient
/fsdev/virtfs-proxy-helper
*.[1-9]
*-softmmu
*-darwin-user
*-linux-user
*-bsd-user
libdis*
libuser
linux-headers/asm
qapi-generated
qapi-types.[ch]
qapi-visit.[ch]
qmp-commands.h
qmp-marshal.c
qemu-doc.html
qemu-tech.html
qemu-doc.info
qemu-tech.info
qemu.1
qemu.pod
qemu-img.1
qemu-img.pod
qemu-img
qemu-nbd
qemu-nbd.8
qemu-nbd.pod
qemu-options.def
qemu-options.texi
qemu-img-cmds.texi
qemu-img-cmds.h
qemu-io
qemu-ga
qemu-bridge-helper
qemu-monitor.texi
vscclient
qmp-commands.txt
test-bitops
test-coroutine
test-int128
test-opts-visitor
test-qmp-input-visitor
test-qmp-output-visitor
test-string-input-visitor
test-string-output-visitor
test-visitor-serialization
fsdev/virtfs-proxy-helper
fsdev/virtfs-proxy-helper.1
fsdev/virtfs-proxy-helper.pod
.gdbinit
*.a
*.aux
*.cp
*.dvi
*.exe
*.dll
*.so
*.mo
*.fn
*.ky
*.log
*.pdf
*.pod
*.cps
*.fns
*.kys
@@ -75,31 +76,35 @@
*.tp
*.vr
*.d
!/scripts/qemu-guest-agent/fsfreeze-hook.d
!scripts/qemu-guest-agent/fsfreeze-hook.d
*.o
*.lo
*.la
*.pc
.libs
.sdk
*.swp
*.orig
.pc
*.gcda
*.gcno
/pc-bios/bios-pq/status
/pc-bios/vgabios-pq/status
/pc-bios/optionrom/linuxboot.asm
/pc-bios/optionrom/linuxboot.bin
/pc-bios/optionrom/linuxboot.raw
/pc-bios/optionrom/linuxboot.img
/pc-bios/optionrom/multiboot.asm
/pc-bios/optionrom/multiboot.bin
/pc-bios/optionrom/multiboot.raw
/pc-bios/optionrom/multiboot.img
/pc-bios/optionrom/kvmvapic.asm
/pc-bios/optionrom/kvmvapic.bin
/pc-bios/optionrom/kvmvapic.raw
/pc-bios/optionrom/kvmvapic.img
/pc-bios/s390-ccw/s390-ccw.elf
/pc-bios/s390-ccw/s390-ccw.img
patches
pc-bios/bios-pq/status
pc-bios/vgabios-pq/status
pc-bios/optionrom/linuxboot.asm
pc-bios/optionrom/linuxboot.bin
pc-bios/optionrom/linuxboot.raw
pc-bios/optionrom/linuxboot.img
pc-bios/optionrom/multiboot.asm
pc-bios/optionrom/multiboot.bin
pc-bios/optionrom/multiboot.raw
pc-bios/optionrom/multiboot.img
pc-bios/optionrom/kvmvapic.asm
pc-bios/optionrom/kvmvapic.bin
pc-bios/optionrom/kvmvapic.raw
pc-bios/optionrom/kvmvapic.img
pc-bios/s390-ccw/s390-ccw.elf
pc-bios/s390-ccw/s390-ccw.img
.stgit-*
cscope.*
tags

6
.gitmodules vendored
View File

@@ -13,9 +13,6 @@
[submodule "roms/openbios"]
path = roms/openbios
url = git://git.qemu-project.org/openbios.git
[submodule "roms/openhackware"]
path = roms/openhackware
url = git://git.qemu-project.org/openhackware.git
[submodule "roms/qemu-palcode"]
path = roms/qemu-palcode
url = git://github.com/rth7680/qemu-palcode.git
@@ -28,6 +25,3 @@
[submodule "dtc"]
path = dtc
url = git://git.qemu-project.org/dtc.git
[submodule "roms/u-boot"]
path = roms/u-boot
url = git://git.qemu-project.org/u-boot.git

View File

@@ -4,15 +4,9 @@ python:
compiler:
- gcc
- clang
notifications:
irc:
channels:
- "irc.oftc.net#qemu"
on_success: change
on_failure: always
env:
global:
- TEST_CMD=""
- TEST_CMD="make check"
- EXTRA_CONFIG=""
# Development packages, EXTRA_PKGS saved for additional builds
- CORE_PKGS="libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev"
@@ -20,51 +14,30 @@ env:
- GUI_PKGS="libgtk-3-dev libvte-2.90-dev libsdl1.2-dev libpng12-dev libpixman-1-dev"
- EXTRA_PKGS=""
matrix:
# Group major targets together with their linux-user counterparts
- TARGETS=alpha-softmmu,alpha-linux-user
- TARGETS=arm-softmmu,arm-linux-user,armeb-linux-user,aarch64-softmmu,aarch64-linux-user
- TARGETS=cris-softmmu,cris-linux-user
- TARGETS=i386-softmmu,i386-linux-user,x86_64-softmmu,x86_64-linux-user
- TARGETS=m68k-softmmu,m68k-linux-user
- TARGETS=microblaze-softmmu,microblazeel-softmmu,microblaze-linux-user,microblazeel-linux-user
- TARGETS=mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu
- TARGETS=mips-linux-user,mips64-linux-user,mips64el-linux-user,mipsel-linux-user,mipsn32-linux-user,mipsn32el-linux-user
- TARGETS=or32-softmmu,or32-linux-user
- TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,ppc-linux-user,ppc64-linux-user,ppc64abi32-linux-user,ppc64le-linux-user
- TARGETS=s390x-softmmu,s390x-linux-user
- TARGETS=sh4-softmmu,sh4eb-softmmu,sh4-linux-user sh4eb-linux-user
- TARGETS=sparc-softmmu,sparc64-softmmu,sparc-linux-user,sparc32plus-linux-user,sparc64-linux-user
- TARGETS=unicore32-softmmu,unicore32-linux-user
# Group remaining softmmu only targets into one build
- TARGETS=lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
git:
# we want to do this ourselves
submodules: false
- TARGETS=alpha-softmmu,alpha-linux-user
- TARGETS=arm-softmmu,arm-linux-user
- TARGETS=cris-softmmu
- TARGETS=i386-softmmu,x86_64-softmmu
- TARGETS=lm32-softmmu
- TARGETS=m68k-softmmu
- TARGETS=microblaze-softmmu,microblazeel-softmmu
- TARGETS=mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu
- TARGETS=moxie-softmmu
- TARGETS=or32-softmmu,
- TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu
- TARGETS=s390x-softmmu
- TARGETS=sh4-softmmu,sh4eb-softmmu
- TARGETS=sparc-softmmu,sparc64-softmmu
- TARGETS=unicore32-softmmu
- TARGETS=xtensa-softmmu,xtensaeb-softmmu
before_install:
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
- sudo apt-get update -qq
- sudo apt-get install -qq ${CORE_PKGS} ${NET_PKGS} ${GUI_PKGS} ${EXTRA_PKGS}
before_script:
- ./configure --target-list=${TARGETS} --enable-debug-tcg ${EXTRA_CONFIG}
script:
- make -j2 && ${TEST_CMD}
script: "./configure --target-list=${TARGETS} ${EXTRA_CONFIG} && make && ${TEST_CMD}"
matrix:
# We manually include a number of additional build for non-standard bits
include:
# Make check target (we only do this once)
- env:
- TARGETS=alpha-softmmu,arm-softmmu,aarch64-softmmu,cris-softmmu,
i386-softmmu,x86_64-softmmu,m68k-softmmu,microblaze-softmmu,
microblazeel-softmmu,mips-softmmu,mips64-softmmu,
mips64el-softmmu,mipsel-softmmu,or32-softmmu,ppc-softmmu,
ppc64-softmmu,ppcemb-softmmu,s390x-softmmu,sh4-softmmu,
sh4eb-softmmu,sparc-softmmu,sparc64-softmmu,
unicore32-softmmu,unicore32-linux-user,
lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,
xtensaeb-softmmu
TEST_CMD="make check"
compiler: gcc
# Debug related options
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-debug"
@@ -72,10 +45,6 @@ matrix:
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-debug --enable-tcg-interpreter"
compiler: gcc
# All the extra -dev packages
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="libaio-dev libcap-ng-dev libattr1-dev libbrlapi-dev uuid-dev libusb-1.0.0-dev"
compiler: gcc
# Currently configure doesn't force --disable-pie
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-gprof --enable-gcov --disable-pie"
@@ -86,18 +55,17 @@ matrix:
compiler: gcc
# All the trace backends (apart from dtrace)
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=stderr"
EXTRA_CONFIG="--enable-trace-backend=stderr"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=simple"
EXTRA_CONFIG="--enable-trace-backend=simple"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=ftrace"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="liblttng-ust-dev liburcu-dev"
EXTRA_CONFIG="--enable-trace-backends=ust"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-modules"
EXTRA_CONFIG="--enable-trace-backend=ftrace"
TEST_CMD=""
compiler: gcc
# This disabled make check for the ftrace backend which needs more setting up
# Currently broken on 12.04 due to mis-packaged liburcu and changed API, will be pulled.
#- env: TARGETS=i386-softmmu,x86_64-softmmu
# EXTRA_PKGS="liblttng-ust-dev liburcu-dev"
# EXTRA_CONFIG="--enable-trace-backend=ust"

View File

@@ -84,24 +84,3 @@ and clarity it comes on a line by itself:
Rationale: a consistent (except for functions...) bracing style reduces
ambiguity and avoids needless churn when lines are added or removed.
Furthermore, it is the QEMU coding style.
5. Declarations
Mixed declarations (interleaving statements and declarations within blocks)
are not allowed; declarations should be at the beginning of blocks. In other
words, the code should not generate warnings if using GCC's
-Wdeclaration-after-statement option.
6. Conditional statements
When comparing a variable for (in)equality with a constant, list the
constant on the right, as in:
if (a == 1) {
/* Reads like: "If a equals 1" */
do_something();
}
Rationale: Yoda conditions (as in 'if (1 == a)') are awkward to read.
Besides, good compilers already warn users when '==' is mis-typed as '=',
even when the constant is on the right.

View File

@@ -11,7 +11,7 @@ option) any later version.
As of July 2013, contributions under version 2 of the GNU General Public
License (and no later version) are only accepted for the following files
or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*.
or directories: bsd-user/, linux-user/, hw/misc/vfio.c, hw/xen/xen_pt*.
3) The Tiny Code Generator (TCG) is released under the BSD license
(see license headers in files).

View File

@@ -50,33 +50,15 @@ Descriptions of section entries:
General Project Administration
------------------------------
M: Peter Maydell <peter.maydell@linaro.org>
Responsible Disclosure, Reporting Security Issues
------------------------------
W: http://wiki.qemu.org/SecurityProcess
M: Michael S. Tsirkin <mst@redhat.com>
L: secalert@redhat.com
M: Anthony Liguori <aliguori@amazon.com>
Guest CPU cores (TCG):
----------------------
Overall
L: qemu-devel@nongnu.org
S: Odd fixes
F: cpu-exec.c
F: cputlb.c
F: softmmu_template.h
F: translate-all.c
F: include/exec/cpu_ldst.h
F: include/exec/cpu_ldst_template.h
F: include/exec/helper*.h
Alpha
M: Richard Henderson <rth@twiddle.net>
S: Maintained
F: target-alpha/
F: hw/alpha/
F: tests/tcg/alpha/
ARM
M: Peter Maydell <peter.maydell@linaro.org>
@@ -90,19 +72,13 @@ M: Edgar E. Iglesias <edgar.iglesias@gmail.com>
S: Maintained
F: target-cris/
F: hw/cris/
F: tests/tcg/cris/
LM32
M: Michael Walle <michael@walle.cc>
S: Maintained
F: target-lm32/
F: disas/lm32.c
F: hw/lm32/
F: hw/*/lm32_*
F: hw/*/milkymist-*
F: include/hw/char/lm32_juart.h
F: include/hw/lm32/
F: tests/tcg/lm32/
F: hw/char/lm32_*
M68K
S: Orphan
@@ -117,11 +93,9 @@ F: hw/microblaze/
MIPS
M: Aurelien Jarno <aurelien@aurel32.net>
M: Leon Alrae <leon.alrae@imgtec.com>
S: Maintained
S: Odd Fixes
F: target-mips/
F: hw/mips/
F: tests/tcg/mips/
Moxie
M: Anthony Green <green@moxielogic.com>
@@ -133,7 +107,6 @@ M: Jia Liu <proljc@gmail.com>
S: Maintained
F: target-openrisc/
F: hw/openrisc/
F: tests/tcg/openrisc/
PowerPC
M: Alexander Graf <agraf@suse.de>
@@ -157,7 +130,6 @@ F: hw/sh4/
SPARC
M: Blue Swirl <blauwirbel@gmail.com>
M: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
S: Maintained
F: target-sparc/
F: hw/sparc/
@@ -170,10 +142,8 @@ F: target-unicore32/
F: hw/unicore32/
X86
M: Paolo Bonzini <pbonzini@redhat.com>
M: Richard Henderson <rth@twiddle.net>
M: Eduardo Habkost <ehabkost@redhat.com>
S: Maintained
M: qemu-devel@nongnu.org
S: Odd Fixes
F: target-i386/
F: hw/i386/
@@ -183,18 +153,12 @@ W: http://wiki.osll.spb.ru/doku.php?id=etc:users:jcmvbkbc:qemu-target-xtensa
S: Maintained
F: target-xtensa/
F: hw/xtensa/
F: tests/tcg/xtensa/
TriCore
M: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
S: Maintained
F: target-tricore/
F: hw/tricore/
Guest CPU Cores (KVM):
----------------------
Overall
M: Gleb Natapov <gleb@redhat.com>
M: Paolo Bonzini <pbonzini@redhat.com>
L: kvm@vger.kernel.org
S: Supported
@@ -206,28 +170,18 @@ M: Peter Maydell <peter.maydell@linaro.org>
S: Maintained
F: target-arm/kvm.c
MIPS
M: James Hogan <james.hogan@imgtec.com>
S: Maintained
F: target-mips/kvm.c
PPC
M: Alexander Graf <agraf@suse.de>
S: Maintained
F: target-ppc/kvm.c
S390
M: Christian Borntraeger <borntraeger@de.ibm.com>
M: Cornelia Huck <cornelia.huck@de.ibm.com>
M: Alexander Graf <agraf@suse.de>
S: Maintained
F: target-s390x/kvm.c
F: hw/intc/s390_flic.c
F: hw/intc/s390_flic_kvm.c
F: include/hw/s390x/s390_flic.h
X86
M: Paolo Bonzini <pbonzini@redhat.com>
M: Gleb Natapov <gleb@redhat.com>
M: Marcelo Tosatti <mtosatti@redhat.com>
L: kvm@vger.kernel.org
S: Supported
@@ -265,13 +219,6 @@ F: *win32*
ARM Machines
------------
Allwinner-a10
M: Li Guang <lig.fnst@cn.fujitsu.com>
S: Maintained
F: hw/*/allwinner-a10*
F: include/hw/*/allwinner-a10*
F: hw/arm/cubieboard.c
Exynos
M: Evgeny Voevodin <e.voevodin@samsung.com>
M: Maksim Kozlov <m.kozlov@samsung.com>
@@ -281,19 +228,13 @@ S: Maintained
F: hw/*/exynos*
Calxeda Highbank
M: Rob Herring <robh@kernel.org>
S: Maintained
M: Mark Langsdorf <mark.langsdorf@calxeda.com>
S: Supported
F: hw/arm/highbank.c
F: hw/net/xgmac.c
Canon DIGIC
M: Antony Pavlov <antonynpavlov@gmail.com>
S: Maintained
F: include/hw/arm/digic.h
F: hw/*/digic*
Gumstix
L: qemu-devel@nongnu.org
M: qemu-devel@nongnu.org
S: Orphan
F: hw/arm/gumstix.c
@@ -309,7 +250,7 @@ S: Maintained
F: hw/arm/integratorcp.c
Mainstone
L: qemu-devel@nongnu.org
M: qemu-devel@nongnu.org
S: Orphan
F: hw/arm/mainstone.c
@@ -349,20 +290,13 @@ S: Maintained
F: hw/*/versatile*
Xilinx Zynq
M: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
M: Peter Crosthwaite <peter.crosthwaite@petalogix.com>
S: Maintained
F: hw/arm/xilinx_zynq.c
F: hw/misc/zynq_slcr.c
F: hw/*/cadence_*
F: hw/ssi/xilinx_spips.c
ARM ACPI Subsystem
M: Shannon Zhao <zhaoshenglong@huawei.com>
M: Shannon Zhao <shannon.zhao@linaro.org>
S: Maintained
F: hw/arm/virt-acpi-build.c
F: include/hw/arm/virt-acpi-build.h
CRIS Machines
-------------
Axis Dev88
@@ -405,7 +339,7 @@ S: Maintained
F: hw/microblaze/petalogix_s3adsp1800_mmu.c
petalogix_ml605
M: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
M: Peter Crosthwaite <peter.crosthwaite@petalogix.com>
S: Maintained
F: hw/microblaze/petalogix_ml605_mmu.c
@@ -422,7 +356,7 @@ S: Maintained
F: hw/mips/mips_malta.c
Mipssim
L: qemu-devel@nongnu.org
M: qemu-devel@nongnu.org
S: Orphan
F: hw/mips/mips_mipssim.c
@@ -493,8 +427,7 @@ F: hw/ppc/prep.c
F: hw/pci-host/prep.[hc]
F: hw/isa/pc87312.[hc]
sPAPR (pseries)
M: David Gibson <david@gibson.dropbear.id.au>
sPAPR
M: Alexander Graf <agraf@suse.de>
L: qemu-ppc@nongnu.org
S: Supported
@@ -526,13 +459,11 @@ SPARC Machines
--------------
Sun4m
M: Blue Swirl <blauwirbel@gmail.com>
M: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
S: Maintained
F: hw/sparc/sun4m.c
Sun4u
M: Blue Swirl <blauwirbel@gmail.com>
M: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
S: Maintained
F: hw/sparc64/sun4u.c
@@ -548,20 +479,13 @@ S390 Virtio
M: Alexander Graf <agraf@suse.de>
S: Maintained
F: hw/s390x/s390-*.c
X: hw/s390x/*pci*.[hc]
S390 Virtio-ccw
M: Cornelia Huck <cornelia.huck@de.ibm.com>
M: Christian Borntraeger <borntraeger@de.ibm.com>
M: Alexander Graf <agraf@suse.de>
S: Supported
F: hw/s390x/s390-virtio-ccw.c
F: hw/s390x/css.[hc]
F: hw/s390x/sclp*.[hc]
F: hw/s390x/ipl*.[hc]
F: hw/s390x/*pci*.[hc]
F: include/hw/s390x/
F: pc-bios/s390-ccw/
T: git git://github.com/cohuck/qemu virtio-ccw-upstr
UniCore32 Machines
@@ -575,64 +499,30 @@ F: hw/unicore32/
X86 Machines
------------
PC
M: Michael S. Tsirkin <mst@redhat.com>
M: Anthony Liguori <aliguori@amazon.com>
S: Supported
F: include/hw/i386/
F: hw/i386/
F: hw/pci-host/piix.c
F: hw/pci-host/q35.c
F: hw/pci-host/pam.c
F: include/hw/pci-host/q35.h
F: include/hw/pci-host/pam.h
F: hw/isa/piix4.c
F: hw/isa/lpc_ich9.c
F: hw/i2c/smbus_ich9.c
F: hw/acpi/piix4.c
F: hw/acpi/ich9.c
F: include/hw/acpi/ich9.h
F: include/hw/acpi/piix.h
F: hw/i386/pc.[ch]
F: hw/i386/pc_piix.c
Xtensa Machines
---------------
sim
M: Max Filippov <jcmvbkbc@gmail.com>
S: Maintained
F: hw/xtensa/sim.c
F: hw/xtensa/xtensa_sim.c
XTFPGA (LX60, LX200, ML605, KC705)
Avnet LX60
M: Max Filippov <jcmvbkbc@gmail.com>
S: Maintained
F: hw/xtensa/xtfpga.c
F: hw/net/opencores_eth.c
F: hw/xtensa/xtensa_lx60.c
Devices
-------
EDU
M: Jiri Slaby <jslaby@suse.cz>
S: Maintained
F: hw/misc/edu.c
IDE
M: John Snow <jsnow@redhat.com>
L: qemu-block@nongnu.org
S: Supported
M: Kevin Wolf <kwolf@redhat.com>
S: Odd Fixes
F: include/hw/ide.h
F: hw/ide/
F: hw/block/block.c
F: hw/block/cdrom.c
F: hw/block/hd-geometry.c
F: tests/ide-test.c
F: tests/ahci-test.c
T: git git://github.com/jnsnow/qemu.git ide
Floppy
M: John Snow <jsnow@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: hw/block/fdc.c
F: include/hw/block/fdc.h
T: git git://github.com/jnsnow/qemu.git ide
OMAP
M: Peter Maydell <peter.maydell@linaro.org>
@@ -644,19 +534,7 @@ M: Michael S. Tsirkin <mst@redhat.com>
S: Supported
F: include/hw/pci/*
F: hw/pci/*
ACPI
M: Michael S. Tsirkin <mst@redhat.com>
M: Igor Mammedov <imammedo@redhat.com>
S: Supported
F: include/hw/acpi/*
F: hw/mem/*
F: hw/acpi/*
F: hw/i386/acpi-build.[hc]
F: hw/i386/*dsl
F: hw/arm/virt-acpi-build.c
F: include/hw/arm/virt-acpi-build.h
F: scripts/acpi*py
ppc4xx
M: Alexander Graf <agraf@suse.de>
@@ -683,7 +561,7 @@ S: Orphan
F: hw/scsi/lsi53c895a.c
SSI
M: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
M: Peter Crosthwaite <peter.crosthwaite@petalogix.com>
S: Maintained
F: hw/ssi/*
F: hw/block/m25p80.c
@@ -692,18 +570,11 @@ USB
M: Gerd Hoffmann <kraxel@redhat.com>
S: Maintained
F: hw/usb/*
F: tests/usb-*-test.c
USB (serial adapter)
M: Gerd Hoffmann <kraxel@redhat.com>
M: Samuel Thibault <samuel.thibault@ens-lyon.org>
S: Maintained
F: hw/usb/dev-serial.c
VFIO
M: Alex Williamson <alex.williamson@redhat.com>
S: Supported
F: hw/vfio/*
F: hw/misc/vfio.c
vhost
M: Michael S. Tsirkin <mst@redhat.com>
@@ -711,158 +582,70 @@ S: Supported
F: hw/*/*vhost*
virtio
M: Michael S. Tsirkin <mst@redhat.com>
M: Anthony Liguori <aliguori@amazon.com>
S: Supported
F: hw/*/virtio*
F: net/vhost-user.c
virtio-9p
M: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
S: Supported
F: hw/9pfs/
F: fsdev/
F: tests/virtio-9p-test.c
T: git git://github.com/kvaneesh/QEMU.git
virtio-blk
M: Kevin Wolf <kwolf@redhat.com>
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: hw/block/virtio-blk.c
F: hw/block/dataplane/*
F: hw/virtio/dataplane/*
T: git git://github.com/stefanha/qemu.git block
virtio-ccw
M: Cornelia Huck <cornelia.huck@de.ibm.com>
M: Christian Borntraeger <borntraeger@de.ibm.com>
S: Supported
F: hw/s390x/virtio-ccw.[hc]
T: git git://github.com/cohuck/qemu virtio-ccw-upstr
virtio-input
M: Gerd Hoffmann <kraxel@redhat.com>
S: Maintained
F: hw/input/virtio-input*.c
F: include/hw/virtio/virtio-input.h
virtio-serial
M: Amit Shah <amit.shah@redhat.com>
S: Supported
F: hw/char/virtio-serial-bus.c
F: hw/char/virtio-console.c
F: include/hw/virtio/virtio-serial.h
virtio-rng
M: Amit Shah <amit.shah@redhat.com>
S: Supported
F: hw/virtio/virtio-rng.c
F: include/hw/virtio/virtio-rng.h
F: backends/rng*.c
nvme
M: Keith Busch <keith.busch@intel.com>
L: qemu-block@nongnu.org
S: Supported
F: hw/block/nvme*
F: tests/nvme-test.c
megasas
M: Hannes Reinecke <hare@suse.de>
L: qemu-block@nongnu.org
S: Supported
F: hw/scsi/megasas.c
F: hw/scsi/mfi.h
Xilinx EDK
M: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
M: Peter Crosthwaite <peter.crosthwaite@petalogix.com>
M: Edgar E. Iglesias <edgar.iglesias@gmail.com>
S: Maintained
F: hw/*/xilinx_*
F: include/hw/xilinx.h
Vmware
M: Dmitry Fleytman <dmitry@daynix.com>
S: Maintained
F: hw/net/vmxnet*
F: hw/scsi/vmw_pvscsi*
Rocker
M: Scott Feldman <sfeldma@gmail.com>
M: Jiri Pirko <jiri@resnulli.us>
S: Maintained
F: hw/net/rocker/
Subsystems
----------
Audio
M: Vassili Karpov (malc) <av1474@comtv.ru>
M: Gerd Hoffmann <kraxel@redhat.com>
S: Maintained
F: audio/
F: hw/audio/
F: tests/ac97-test.c
F: tests/es1370-test.c
F: tests/intel-hda-test.c
Block layer core
Block
M: Kevin Wolf <kwolf@redhat.com>
L: qemu-block@nongnu.org
M: Stefan Hajnoczi <stefanha@redhat.com>
S: Supported
F: block*
F: block/
F: hw/block/
F: include/block/
F: qemu-img*
F: qemu-io*
F: tests/qemu-iotests/
T: git git://repo.or.cz/qemu/kevin.git block
Block I/O path
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: async.c
F: aio-*.c
F: block/io.c
F: migration/block*
T: git git://github.com/stefanha/qemu.git block
Block Jobs
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: blockjob.c
F: include/block/blockjob.h
F: block/backup.c
F: block/commit.c
F: block/stream.h
F: block/mirror.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
Block QAPI, monitor, command line
M: Markus Armbruster <armbru@redhat.com>
S: Supported
F: blockdev.c
F: block/qapi.c
F: qapi/block*.json
T: git git://repo.or.cz/qemu/armbru.git block-next
Character Devices
M: Paolo Bonzini <pbonzini@redhat.com>
M: Anthony Liguori <aliguori@amazon.com>
S: Maintained
F: qemu-char.c
F: backends/msmouse.c
F: backends/testdev.c
Character Devices (Braille)
M: Samuel Thibault <samuel.thibault@ens-lyon.org>
S: Maintained
F: backends/baum.c
Coverity model
M: Markus Armbruster <armbru@redhat.com>
S: Supported
F: scripts/coverity-model.c
CPU
M: Andreas Färber <afaerber@suse.de>
@@ -878,27 +661,17 @@ F: include/hw/cpu/icc_bus.h
F: hw/cpu/icc_bus.c
Device Tree
M: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
M: Peter Crosthwaite <peter.crosthwaite@petalogix.com>
M: Alexander Graf <agraf@suse.de>
S: Maintained
F: device_tree.[ch]
GDB stub
L: qemu-devel@nongnu.org
M: qemu-devel@nongnu.org
S: Odd Fixes
F: gdbstub*
F: gdb-xml/
Memory API
M: Paolo Bonzini <pbonzini@redhat.com>
S: Supported
F: include/exec/ioport.h
F: ioport.c
F: include/exec/memory.h
F: memory.c
F: include/exec/memory-internal.h
F: exec.c
SPICE
M: Gerd Hoffmann <kraxel@redhat.com>
S: Supported
@@ -908,47 +681,35 @@ F: audio/spiceaudio.c
F: hw/display/qxl*
Graphics
M: Gerd Hoffmann <kraxel@redhat.com>
S: Odd Fixes
M: Anthony Liguori <aliguori@amazon.com>
S: Maintained
F: ui/
Cocoa graphics
M: Andreas Färber <andreas.faerber@web.de>
M: Peter Maydell <peter.maydell@linaro.org>
S: Odd Fixes
F: ui/cocoa.m
Main loop
M: Paolo Bonzini <pbonzini@redhat.com>
S: Maintained
F: cpus.c
F: main-loop.c
F: qemu-timer.c
M: Anthony Liguori <aliguori@amazon.com>
S: Supported
F: vl.c
Human Monitor (HMP)
M: Luiz Capitulino <lcapitulino@redhat.com>
S: Maintained
S: Supported
F: monitor.c
F: hmp.c
F: hmp-commands.hx
T: git git://repo.or.cz/qemu/qmp-unstable.git queue/qmp
Network device layer
M: Anthony Liguori <aliguori@amazon.com>
M: Stefan Hajnoczi <stefanha@redhat.com>
M: Jason Wang <jasowang@redhat.com>
S: Maintained
F: net/
T: git git://github.com/stefanha/qemu.git net
Netmap network backend
M: Luigi Rizzo <rizzo@iet.unipi.it>
M: Giuseppe Lettieri <g.lettieri@iet.unipi.it>
M: Vincenzo Maffione <v.maffione@gmail.com>
W: http://info.iet.unipi.it/~luigi/netmap/
S: Maintained
F: net/netmap.c
Network Block Device (NBD)
M: Paolo Bonzini <pbonzini@redhat.com>
S: Odd Fixes
@@ -957,65 +718,29 @@ F: nbd.*
F: qemu-nbd.c
T: git git://github.com/bonzini/qemu.git nbd-next
NUMA
M: Eduardo Habkost <ehabkost@redhat.com>
S: Maintained
F: numa.c
F: include/sysemu/numa.h
K: numa|NUMA
K: srat|SRAT
T: git git://github.com/ehabkost/qemu.git numa
QAPI
M: Markus Armbruster <armbru@redhat.com>
M: Luiz Capitulino <lcapitulino@redhat.com>
M: Michael Roth <mdroth@linux.vnet.ibm.com>
S: Supported
F: qapi/
X: qapi/*.json
F: tests/qapi-schema/
F: scripts/qapi*
F: docs/qapi*
T: git git://repo.or.cz/qemu/armbru.git qapi-next
T: git git://repo.or.cz/qemu/qmp-unstable.git queue/qmp
QAPI Schema
M: Eric Blake <eblake@redhat.com>
M: Luiz Capitulino <lcapitulino@redhat.com>
M: Markus Armbruster <armbru@redhat.com>
S: Supported
F: qapi-schema.json
F: qapi/*.json
T: git git://repo.or.cz/qemu/armbru.git qapi-next
QObject
M: Luiz Capitulino <lcapitulino@redhat.com>
S: Maintained
F: qobject/
T: git git://repo.or.cz/qemu/qmp-unstable.git queue/qmp
QEMU Guest Agent
M: Michael Roth <mdroth@linux.vnet.ibm.com>
S: Maintained
F: qga/
T: git git://github.com/mdroth/qemu.git qga
QOM
M: Andreas Färber <afaerber@suse.de>
S: Supported
T: git git://github.com/afaerber/qemu-cpu.git qom-next
F: include/qom/
X: include/qom/cpu.h
F: qom/
X: qom/cpu.c
F: tests/qom-test.c
QMP
M: Markus Armbruster <armbru@redhat.com>
M: Luiz Capitulino <lcapitulino@redhat.com>
S: Supported
F: qmp.c
F: monitor.c
F: qmp-commands.hx
F: docs/qmp/
F: scripts/qmp/
T: git git://repo.or.cz/qemu/armbru.git qapi-next
F: QMP/
T: git git://repo.or.cz/qemu/qmp-unstable.git queue/qmp
SLIRP
M: Jan Kiszka <jan.kiszka@siemens.com>
@@ -1037,36 +762,14 @@ M: Blue Swirl <blauwirbel@gmail.com>
S: Odd Fixes
F: scripts/checkpatch.pl
Migration
M: Juan Quintela <quintela@redhat.com>
M: Amit Shah <amit.shah@redhat.com>
S: Maintained
F: include/migration/
F: migration/
F: scripts/vmstate-static-checker.py
F: tests/vmstate-static-checker-data/
Seccomp
M: Eduardo Otubo <eduardo.otubo@profitbricks.com>
M: Eduardo Otubo <otubo@linux.vnet.ibm.com>
S: Supported
F: qemu-seccomp.c
F: include/sysemu/seccomp.h
Cryptography
M: Daniel P. Berrange <berrange@redhat.com>
S: Maintained
F: crypto/
F: include/crypto/
F: tests/test-crypto-*
Usermode Emulation
------------------
Overall
M: Riku Voipio <riku.voipio@iki.fi>
S: Maintained
F: thunk.c
F: user-exec.c
BSD user
M: Blue Swirl <blauwirbel@gmail.com>
S: Maintained
@@ -1080,6 +783,7 @@ F: linux-user/
Tiny Code Generator (TCG)
-------------------------
Common code
M: qemu-devel@nongnu.org
M: Richard Henderson <rth@twiddle.net>
S: Maintained
F: tcg/
@@ -1096,7 +800,7 @@ S: Maintained
F: tcg/arm/
i386 target
L: qemu-devel@nongnu.org
M: qemu-devel@nongnu.org
S: Maintained
F: tcg/i386/
@@ -1164,38 +868,27 @@ Block drivers
-------------
VMDK
M: Fam Zheng <famz@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/vmdk.c
RBD
M: Josh Durgin <jdurgin@redhat.com>
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
M: Josh Durgin <josh.durgin@inktank.com>
S: Supported
F: block/rbd.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
Sheepdog
M: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
M: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
M: Liu Yuan <namei.unix@gmail.com>
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
L: sheepdog@lists.wpkg.org
S: Supported
F: block/sheepdog.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
VHDX
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/vhdx*
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
VDI
M: Stefan Weil <sw@weilnetz.de>
L: qemu-block@nongnu.org
S: Maintained
F: block/vdi.c
@@ -1203,144 +896,10 @@ iSCSI
M: Ronnie Sahlberg <ronniesahlberg@gmail.com>
M: Paolo Bonzini <pbonzini@redhat.com>
M: Peter Lieven <pl@kamp.de>
L: qemu-block@nongnu.org
S: Supported
F: block/iscsi.c
NFS
M: Jeff Cody <jcody@redhat.com>
M: Peter Lieven <pl@kamp.de>
L: qemu-block@nongnu.org
S: Maintained
F: block/nfs.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
SSH
M: Richard W.M. Jones <rjones@redhat.com>
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/ssh.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
ARCHIPELAGO
M: Chrysostomos Nanakos <chris@include.gr>
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
S: Maintained
F: block/archipelago.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
CURL
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/curl.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
GLUSTER
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/gluster.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
Null Block Driver
M: Fam Zheng <famz@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/null.c
Bootdevice
M: Gonglei <arei.gonglei@huawei.com>
S: Maintained
F: bootdevice.c
Quorum
M: Alberto Garcia <berto@igalia.com>
S: Supported
F: block/quorum.c
L: qemu-block@nongnu.org
blkverify
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/blkverify.c
bochs
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/bochs.c
cloop
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/cloop.c
dmg
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/dmg.c
parallels
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/parallels.c
qed
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/qed.c
raw
M: Kevin Wolf <kwolf@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/linux-aio.c
F: block/raw-aio.h
F: block/raw-posix.c
F: block/raw-win32.c
F: block/raw_bsd.c
F: block/win32-aio.c
qcow2
M: Kevin Wolf <kwolf@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/qcow2*
qcow
M: Kevin Wolf <kwolf@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/qcow.c
blkdebug
M: Kevin Wolf <kwolf@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/blkdebug.c
vpc
M: Kevin Wolf <kwolf@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/vpc.c
vvfat
M: Kevin Wolf <kwolf@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: block/vvfat.c
Image format fuzzer
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
S: Supported
F: tests/image-fuzzer/

206
Makefile
View File

@@ -3,11 +3,6 @@
# Always point to the root of the build tree (needs GNU make).
BUILD_DIR=$(CURDIR)
# Before including a proper config-host.mak, assume we are in the source tree
SRC_PATH=.
UNCHECKED_GOALS := %clean TAGS cscope ctags
# All following code might depend on configuration variables
ifneq ($(wildcard config-host.mak),)
# Put the all: rule here so that config-host.mak can contain dependencies.
@@ -43,43 +38,32 @@ config-host.mak: $(SRC_PATH)/configure
fi
else
config-host.mak:
ifneq ($(filter-out $(UNCHECKED_GOALS),$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
ifneq ($(filter-out %clean,$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
@echo "Please call configure before running make!"
@exit 1
endif
endif
GENERATED_HEADERS = config-host.h qemu-options.def
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c
GENERATED_HEADERS += trace/generated-events.h
GENERATED_SOURCES += trace/generated-events.c
GENERATED_HEADERS += trace/generated-tracers.h
ifeq ($(findstring dtrace,$(TRACE_BACKENDS)),dtrace)
ifeq ($(TRACE_BACKEND),dtrace)
GENERATED_HEADERS += trace/generated-tracers-dtrace.h
endif
GENERATED_SOURCES += trace/generated-tracers.c
GENERATED_HEADERS += trace/generated-tcg-tracers.h
GENERATED_HEADERS += trace/generated-helpers-wrappers.h
GENERATED_HEADERS += trace/generated-helpers.h
GENERATED_SOURCES += trace/generated-helpers.c
ifeq ($(findstring ust,$(TRACE_BACKENDS)),ust)
GENERATED_HEADERS += trace/generated-ust-provider.h
GENERATED_SOURCES += trace/generated-ust.c
endif
# Don't try to regenerate Makefile or configure
# We don't generate any of them
Makefile: ;
configure: ;
.PHONY: all clean cscope distclean dvi html info install install-doc \
pdf recurse-all speed test dist msi
pdf recurse-all speed test dist
$(call set-vpath, $(SRC_PATH))
@@ -89,9 +73,6 @@ HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
ifdef BUILD_DOCS
DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 qmp-commands.txt
ifdef CONFIG_LINUX
DOCS+=kvm_stat.1
endif
ifdef CONFIG_VIRTFS
DOCS+=fsdev/virtfs-proxy-helper.1
endif
@@ -117,9 +98,8 @@ endif
-include $(SUBDIR_DEVICES_MAK_DEP)
%/config-devices.mak: default-configs/%.mak
$(call quiet-command, \
$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $< $*-config-devices.mak.d $@ > $@.tmp, " GEN $@.tmp")
$(call quiet-command, if test -f $@; then \
$(call quiet-command,$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $@ $<, " GEN $@")
@if test -f $@; then \
if cmp -s $@.old $@; then \
mv $@.tmp $@; \
cp -p $@ $@.old; \
@@ -135,33 +115,20 @@ endif
else \
mv $@.tmp $@; \
cp -p $@ $@.old; \
fi, " GEN $@");
fi
defconfig:
rm -f config-all-devices.mak $(SUBDIR_DEVICES_MAK)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/Makefile.objs
endif
dummy := $(call unnest-vars,, \
stub-obj-y \
util-obj-y \
qga-obj-y \
qga-vss-dll-obj-y \
block-obj-y \
block-obj-m \
common-obj-y \
common-obj-m)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/tests/Makefile
endif
ifeq ($(CONFIG_SMARTCARD_NSS),y)
include $(SRC_PATH)/libcacard/Makefile
endif
all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all
config-host.h: config-host.h-timestamp
config-host.h-timestamp: config-host.mak
@@ -171,7 +138,6 @@ qemu-options.def: $(SRC_PATH)/qemu-options.hx
SUBDIR_RULES=$(patsubst %,subdir-%, $(TARGET_DIRS))
SOFTMMU_SUBDIR_RULES=$(filter %-softmmu,$(SUBDIR_RULES))
$(SOFTMMU_SUBDIR_RULES): $(block-obj-y)
$(SOFTMMU_SUBDIR_RULES): config-all-devices.mak
subdir-%:
@@ -206,9 +172,11 @@ ALL_SUBDIRS=$(TARGET_DIRS) $(patsubst %,pc-bios/%, $(ROMS))
recurse-all: $(SUBDIR_RULES) $(ROMSUBDIR_RULES)
$(BUILD_DIR)/version.o: $(SRC_PATH)/version.rc config-host.h | $(BUILD_DIR)/version.lo
bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
$(BUILD_DIR)/version.o: $(SRC_PATH)/version.rc $(BUILD_DIR)/config-host.h | $(BUILD_DIR)/version.lo
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.o")
$(BUILD_DIR)/version.lo: $(SRC_PATH)/version.rc config-host.h
$(BUILD_DIR)/version.lo: $(SRC_PATH)/version.rc $(BUILD_DIR)/config-host.h
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.lo")
Makefile: $(version-obj-y) $(version-lobj-y)
@@ -217,10 +185,7 @@ Makefile: $(version-obj-y) $(version-lobj-y)
# Build libraries
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y)
block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL
util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)'
libqemuutil.a: $(util-obj-y) qapi-types.o qapi-visit.o
######################################################################
@@ -247,44 +212,23 @@ qapi-py = $(SRC_PATH)/scripts/qapi.py $(SRC_PATH)/scripts/ordereddict.py
qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
" GEN $@")
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py $(gen-out-type) -o qga/qapi-generated -p "qga-" < $<, " GEN $@")
qga/qapi-generated/qga-qapi-visit.c qga/qapi-generated/qga-qapi-visit.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
" GEN $@")
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py $(gen-out-type) -o qga/qapi-generated -p "qga-" < $<, " GEN $@")
qga/qapi-generated/qga-qmp-commands.h qga/qapi-generated/qga-qmp-marshal.c :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
" GEN $@")
qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
$(SRC_PATH)/qapi/event.json
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -o qga/qapi-generated -p "qga-" < $<, " GEN $@")
qapi-types.c qapi-types.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o "." -b $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py $(gen-out-type) -o "." -b < $<, " GEN $@")
qapi-visit.c qapi-visit.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o "." -b $<, \
" GEN $@")
qapi-event.c qapi-event.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-event.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-event.py \
$(gen-out-type) -o "." $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py $(gen-out-type) -o "." -b < $<, " GEN $@")
qmp-commands.h qmp-marshal.c :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o "." -m $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -m -o "." < $<, " GEN $@")
QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qmp-commands.h)
$(qga-obj-y) qemu-ga.o: $(QGALIB_GEN)
@@ -292,38 +236,16 @@ $(qga-obj-y) qemu-ga.o: $(QGALIB_GEN)
qemu-ga$(EXESUF): $(qga-obj-y) libqemuutil.a libqemustub.a
$(call LINK, $^)
ifdef QEMU_GA_MSI_ENABLED
QEMU_GA_MSI=qemu-ga-$(ARCH).msi
msi: ${QEMU_GA_MSI}
$(QEMU_GA_MSI): qemu-ga.exe
ifdef QEMU_GA_MSI_WITH_VSS
$(QEMU_GA_MSI): qga/vss-win32/qga-vss.dll
endif
$(QEMU_GA_MSI): config-host.mak
$(QEMU_GA_MSI): qga/installer/qemu-ga.wxs
$(call quiet-command,QEMU_GA_VERSION="$(QEMU_GA_VERSION)" QEMU_GA_MANUFACTURER="$(QEMU_GA_MANUFACTURER)" QEMU_GA_DISTRO="$(QEMU_GA_DISTRO)" \
wixl -o $@ $(QEMU_GA_MSI_ARCH) $(QEMU_GA_MSI_WITH_VSS) $(QEMU_GA_MSI_MINGW_DLL_PATH) $<, " WIXL $@")
else
msi:
@echo MSI build not configured or dependency resolution failed (reconfigure with --enable-guest-agent-msi option)
endif
clean:
# avoid old build problems by removing potentially incorrect old files
rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h
rm -f qemu-options.def
rm -f *.msi
find . \( -name '*.l[oa]' -o -name '*.so' -o -name '*.dll' -o -name '*.mo' -o -name '*.[oda]' \) -type f -exec rm {} +
find . -name '*.[oda]' -type f -exec rm -f {} +
find . -name '*.l[oa]' -type f -exec rm -f {} +
rm -f $(filter-out %.tlb,$(TOOLS)) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -f fsdev/*.pod
rm -rf .libs */.libs
rm -f qemu-img-cmds.h
rm -f ui/shader/*-vert.h ui/shader/*-frag.h
@# May not be present in GENERATED_HEADERS
rm -f trace/generated-tracers-dtrace.dtrace*
rm -f trace/generated-tracers-dtrace.h*
@@ -345,8 +267,8 @@ qemu-%.tar.bz2:
distclean: clean
rm -f config-host.mak config-host.h* config-host.ld $(DOCS) qemu-options.texi qemu-img-cmds.texi qemu-monitor.texi
rm -f config-all-devices.mak config-all-disas.mak config.status
rm -f po/*.mo tests/qemu-iotests/common.env
rm -f config-all-devices.mak config-all-disas.mak
rm -f po/*.mo
rm -f roms/seabios/config.mak roms/vgabios/config.mak
rm -f qemu-doc.info qemu-doc.aux qemu-doc.cp qemu-doc.cps qemu-doc.dvi
rm -f qemu-doc.fn qemu-doc.fns qemu-doc.info qemu-doc.ky qemu-doc.kys
@@ -359,8 +281,8 @@ distclean: clean
rm -rf $$d || exit 1 ; \
done
rm -Rf .sdk
if test -f pixman/config.log; then $(MAKE) -C pixman distclean; fi
if test -f dtc/version_gen.h; then $(MAKE) $(DTC_MAKE_ARGS) clean; fi
if test -f pixman/config.log; then make -C pixman distclean; fi
if test -f dtc/version_gen.h; then make $(DTC_MAKE_ARGS) clean; fi
KEYMAPS=da en-gb et fr fr-ch is lt modifiers no pt-br sv \
ar de en-us fi fr-be hr it lv nl pl ru th \
@@ -368,10 +290,10 @@ common de-ch es fo fr-ca hu ja mk nl-be pt sl tr \
bepo cz
ifdef INSTALL_BLOBS
BLOBS=bios.bin bios-256k.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
vgabios-stdvga.bin vgabios-vmware.bin vgabios-qxl.bin vgabios-virtio.bin \
BLOBS=bios.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
vgabios-stdvga.bin vgabios-vmware.bin vgabios-qxl.bin \
acpi-dsdt.aml q35-acpi-dsdt.aml \
ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin QEMU,cgthree.bin \
ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin \
pxe-e1000.rom pxe-eepro100.rom pxe-ne2k_pci.rom \
pxe-pcnet.rom pxe-rtl8139.rom pxe-virtio.rom \
efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
@@ -382,8 +304,7 @@ multiboot.bin linuxboot.bin kvmvapic.bin \
s390-zipl.rom \
s390-ccw.img \
spapr-rtas.bin slof.bin \
palcode-clipper \
u-boot.e500
palcode-clipper
else
BLOBS=
endif
@@ -416,22 +337,21 @@ ifneq (,$(findstring qemu-ga,$(TOOLS)))
endif
endif
install-confdir:
$(INSTALL_DIR) "$(DESTDIR)$(qemu_confdir)"
install: all $(if $(BUILD_DOCS),install-doc) \
install-sysconfig: install-datadir install-confdir
$(INSTALL_DATA) $(SRC_PATH)/sysconfigs/target/target-x86_64.conf "$(DESTDIR)$(qemu_confdir)"
install: all $(if $(BUILD_DOCS),install-doc) install-sysconfig \
install-datadir install-localstatedir
$(INSTALL_DIR) "$(DESTDIR)$(bindir)"
ifneq ($(TOOLS),)
$(call install-prog,$(TOOLS),$(DESTDIR)$(bindir))
endif
ifneq ($(CONFIG_MODULES),)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_moddir)"
for s in $(modules-m:.mo=$(DSOSUF)); do \
t="$(DESTDIR)$(qemu_moddir)/$$(echo $$s | tr / -)"; \
$(INSTALL_LIB) $$s "$$t"; \
test -z "$(STRIP)" || $(STRIP) "$$t"; \
done
$(INSTALL_PROG) $(STRIP_OPT) $(TOOLS) "$(DESTDIR)$(bindir)"
endif
ifneq ($(HELPERS-y),)
$(call install-prog,$(HELPERS-y),$(DESTDIR)$(libexecdir))
$(INSTALL_DIR) "$(DESTDIR)$(libexecdir)"
$(INSTALL_PROG) $(STRIP_OPT) $(HELPERS-y) "$(DESTDIR)$(libexecdir)"
endif
ifneq ($(BLOBS),)
set -e; for x in $(BLOBS); do \
@@ -445,45 +365,23 @@ endif
set -e; for x in $(KEYMAPS); do \
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/keymaps/$$x "$(DESTDIR)$(qemu_datadir)/keymaps"; \
done
$(INSTALL_DATA) $(SRC_PATH)/trace-events "$(DESTDIR)$(qemu_datadir)/trace-events"
for d in $(TARGET_DIRS); do \
$(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \
$(MAKE) -C $$d $@ || exit 1 ; \
done
# various test targets
test speed: all
$(MAKE) -C tests/tcg $@
.PHONY: ctags
ctags:
rm -f $@
find "$(SRC_PATH)" -name '*.[hc]' -exec ctags --append {} +
.PHONY: TAGS
TAGS:
rm -f $@
find "$(SRC_PATH)" -name '*.[hc]' -exec etags --append {} +
cscope:
rm -f "$(SRC_PATH)"/cscope.*
find "$(SRC_PATH)/" -name "*.[chsS]" -print | sed 's,^\./,,' > "$(SRC_PATH)/cscope.files"
cscope -b -i"$(SRC_PATH)/cscope.files"
# opengl shader programs
ui/shader/%-vert.h: $(SRC_PATH)/ui/shader/%.vert $(SRC_PATH)/scripts/shaderinclude.pl
@mkdir -p $(dir $@)
$(call quiet-command,\
perl $(SRC_PATH)/scripts/shaderinclude.pl $< > $@,\
" VERT $@")
ui/shader/%-frag.h: $(SRC_PATH)/ui/shader/%.frag $(SRC_PATH)/scripts/shaderinclude.pl
@mkdir -p $(dir $@)
$(call quiet-command,\
perl $(SRC_PATH)/scripts/shaderinclude.pl $< > $@,\
" FRAG $@")
ui/console-gl.o: $(SRC_PATH)/ui/console-gl.c \
ui/shader/texture-blit-vert.h ui/shader/texture-blit-frag.h
rm -f ./cscope.*
find "$(SRC_PATH)" -name "*.[chsS]" -print | sed 's,^\./,,' > ./cscope.files
cscope -b
# documentation
MAKEINFO=makeinfo
@@ -538,12 +436,6 @@ qemu-nbd.8: qemu-nbd.texi
$(POD2MAN) --section=8 --center=" " --release=" " qemu-nbd.pod > $@, \
" GEN $@")
kvm_stat.1: scripts/kvm/kvm_stat.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< kvm_stat.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " kvm_stat.pod > $@, \
" GEN $@")
dvi: qemu-doc.dvi qemu-tech.dvi
html: qemu-doc.html qemu-tech.html
info: qemu-doc.info qemu-tech.info
@@ -576,7 +468,7 @@ installer: $(INSTALLER)
INSTDIR=/tmp/qemu-nsis
$(INSTALLER): $(SRC_PATH)/qemu.nsi
$(MAKE) install prefix=${INSTDIR}
make install prefix=${INSTDIR}
ifdef SIGNCODE
(cd ${INSTDIR}; \
for i in *.exe; do \
@@ -610,7 +502,7 @@ endif # CONFIG_WIN
# Add a dependency on the generated files, so that they are always
# rebuilt before other object files
ifneq ($(filter-out $(UNCHECKED_GOALS),$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
ifneq ($(filter-out %clean,$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
Makefile: $(GENERATED_HEADERS)
endif

View File

@@ -1,8 +1,7 @@
#######################################################################
# Common libraries for tools and emulators
stub-obj-y = stubs/
util-obj-y = util/ qobject/ qapi/ qapi-types.o qapi-visit.o qapi-event.o
util-obj-y += crypto/
util-obj-y = util/ qobject/ qapi/ trace/
#######################################################################
# block-obj-y is code used by both qemu system emulation and qemu-img
@@ -13,14 +12,18 @@ block-obj-y += main-loop.o iohandler.o qemu-timer.o
block-obj-$(CONFIG_POSIX) += aio-posix.o
block-obj-$(CONFIG_WIN32) += aio-win32.o
block-obj-y += block/
block-obj-y += qapi-types.o qapi-visit.o
block-obj-y += qemu-io-cmds.o
block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
block-obj-y += qemu-coroutine-sleep.o
block-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
block-obj-m = block/
ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
# Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
# only pull in the actual virtio-9p device if we also enabled virtio.
CONFIG_REALLY_VIRTFS=y
endif
######################################################################
# smartcard
@@ -31,8 +34,6 @@ libcacard-y += libcacard/vcard_emul_nss.o
libcacard-y += libcacard/vcard_emul_type.o
libcacard-y += libcacard/card_7816.o
libcacard-y += libcacard/vcardt.o
libcacard/vcard_emul_nss.o-cflags := $(NSS_CFLAGS)
libcacard/vcard_emul_nss.o-libs := $(NSS_LIBS)
######################################################################
# Target independent part of system emulation. The long term path is to
@@ -40,33 +41,33 @@ libcacard/vcard_emul_nss.o-libs := $(NSS_LIBS)
# single QEMU executable should support all CPUs and machines.
ifeq ($(CONFIG_SOFTMMU),y)
common-obj-y = blockdev.o blockdev-nbd.o block/
common-obj-y += iothread.o
common-obj-y = $(block-obj-y) blockdev.o blockdev-nbd.o block/
common-obj-y += net/
common-obj-y += readline.o
common-obj-y += qdev-monitor.o device-hotplug.o
common-obj-$(CONFIG_WIN32) += os-win32.o
common-obj-$(CONFIG_POSIX) += os-posix.o
common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration/
common-obj-y += migration.o migration-tcp.o
common-obj-$(CONFIG_RDMA) += migration-rdma.o
common-obj-y += qemu-char.o #aio.o
common-obj-y += page_cache.o
common-obj-y += qjson.o
common-obj-y += block-migration.o
common-obj-y += page_cache.o xbzrle.o
common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-y += hw/
common-obj-y += accel.o
common-obj-y += ui/
common-obj-y += bt-host.o bt-vhci.o
bt-host.o-cflags := $(BLUEZ_CFLAGS)
common-obj-y += dma-helpers.o
common-obj-y += vl.o
vl.o-cflags := $(GPROF_CFLAGS) $(SDL_CFLAGS)
common-obj-y += tpm.o
common-obj-$(CONFIG_SLIRP) += slirp/
@@ -77,8 +78,6 @@ common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
common-obj-$(CONFIG_SMARTCARD_NSS) += $(libcacard-y)
common-obj-$(CONFIG_FDT) += device_tree.o
######################################################################
# qapi
@@ -86,6 +85,11 @@ common-obj-y += qmp-marshal.o
common-obj-y += qmp.o hmp.o
endif
######################################################################
# some qapi visitors are used by both system and user emulation:
common-obj-y += qapi-visit.o qapi-types.o
#######################################################################
# Target-independent parts used in system and user emulation
common-obj-y += qemu-log.o
@@ -99,15 +103,25 @@ common-obj-y += disas/
version-obj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.o
version-lobj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.lo
######################################################################
# tracing
util-obj-y += trace/
target-obj-y += trace/
######################################################################
# guest agent
# FIXME: a few definitions from qapi-types.o/qapi-visit.o are needed
# by libqemuutil.a. These should be moved to a separate .json schema.
qga-obj-y = qga/
qga-obj-y = qga/ qapi-types.o qapi-visit.o
qga-vss-dll-obj-y = qga/
vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
vl.o: QEMU_CFLAGS+=$(SDL_CFLAGS)
QEMU_CFLAGS+=$(GLIB_CFLAGS)
nested-vars += \
stub-obj-y \
util-obj-y \
qga-obj-y \
qga-vss-dll-obj-y \
block-obj-y \
common-obj-y
dummy := $(call unnest-vars)

View File

@@ -1,7 +1,5 @@
# -*- Mode: makefile -*-
BUILD_DIR?=$(CURDIR)/..
include ../config-host.mak
include config-target.mak
include config-devices.mak
@@ -18,29 +16,26 @@ QEMU_CFLAGS+=-I$(SRC_PATH)/include
ifdef CONFIG_USER_ONLY
# user emulator name
QEMU_PROG=qemu-$(TARGET_NAME)
QEMU_PROG_BUILD = $(QEMU_PROG)
else
# system emulator name
QEMU_PROG=qemu-system-$(TARGET_NAME)$(EXESUF)
ifneq (,$(findstring -mwindows,$(libs_softmmu)))
# Terminate program name with a 'w' because the linker builds a windows executable.
QEMU_PROGW=qemu-system-$(TARGET_NAME)w$(EXESUF)
$(QEMU_PROG): $(QEMU_PROGW)
$(call quiet-command,$(OBJCOPY) --subsystem console $(QEMU_PROGW) $(QEMU_PROG)," GEN $(TARGET_DIR)$(QEMU_PROG)")
QEMU_PROG_BUILD = $(QEMU_PROGW)
else
QEMU_PROG_BUILD = $(QEMU_PROG)
endif
endif # windows executable
QEMU_PROG=qemu-system-$(TARGET_NAME)$(EXESUF)
endif
PROGS=$(QEMU_PROG) $(QEMU_PROGW)
PROGS=$(QEMU_PROG)
ifdef QEMU_PROGW
PROGS+=$(QEMU_PROGW)
endif
STPFILES=
config-target.h: config-target.h-timestamp
config-target.h-timestamp: config-target.mak
ifdef CONFIG_TRACE_SYSTEMTAP
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp $(QEMU_PROG)-simpletrace.stp
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp
ifdef CONFIG_USER_ONLY
TARGET_TYPE=user
@@ -51,7 +46,7 @@ endif
$(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backends=$(TRACE_BACKENDS) \
--backend=$(TRACE_BACKEND) \
--binary=$(bindir)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--target-type=$(TARGET_TYPE) \
@@ -60,19 +55,12 @@ $(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
$(QEMU_PROG).stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backends=$(TRACE_BACKENDS) \
--backend=$(TRACE_BACKEND) \
--binary=$(realpath .)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp")
$(QEMU_PROG)-simpletrace.stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=simpletrace-stap \
--backends=$(TRACE_BACKENDS) \
--probe-prefix=qemu.$(TARGET_TYPE).$(TARGET_NAME) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG)-simpletrace.stp")
else
stap:
endif
@@ -85,7 +73,7 @@ all: $(PROGS) stap
#########################################################
# cpu emulator library
obj-y = exec.o translate-all.o cpu-exec.o
obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
obj-y += tcg/tcg.o tcg/optimize.o
obj-$(CONFIG_TCG_INTERPRETER) += tci.o
obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
obj-y += fpu/softfloat.o
@@ -94,12 +82,6 @@ obj-y += disas.o
obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decContext.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decNumber.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal32.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal64.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal128.o
#########################################################
# Linux user emulator target
@@ -117,8 +99,7 @@ endif #CONFIG_LINUX_USER
ifdef CONFIG_BSD_USER
QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ABI_DIR) \
-I$(SRC_PATH)/bsd-user/$(HOST_VARIANT_DIR)
QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ABI_DIR)
obj-y += bsd-user/
obj-y += gdbstub.o user-exec.o
@@ -128,21 +109,19 @@ endif #CONFIG_BSD_USER
#########################################################
# System emulator target
ifdef CONFIG_SOFTMMU
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
obj-y += qtest.o bootdevice.o
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
obj-y += qtest.o
obj-y += hw/
obj-$(CONFIG_FDT) += device_tree.o
obj-$(CONFIG_KVM) += kvm-all.o
obj-y += memory.o cputlb.o
obj-y += memory.o savevm.o cputlb.o
obj-y += memory_mapping.o
obj-y += dump.o
obj-y += migration/ram.o migration/savevm.o
LIBS := $(libs_softmmu) $(LIBS)
LIBS+=$(libs_softmmu)
# xen support
obj-$(CONFIG_XEN) += xen-common.o
obj-$(CONFIG_XEN_I386) += xen-hvm.o xen-mapcache.o
obj-$(call lnot,$(CONFIG_XEN)) += xen-common-stub.o
obj-$(call lnot,$(CONFIG_XEN_I386)) += xen-hvm-stub.o
obj-$(CONFIG_XEN) += xen-all.o xen-mapcache.o
obj-$(call lnot,$(CONFIG_XEN)) += xen-stub.o
# Hardware support
ifeq ($(TARGET_NAME), sparc64)
@@ -151,6 +130,8 @@ else
obj-y += hw/$(TARGET_BASE_ARCH)/
endif
main.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
GENERATED_HEADERS += hmp-commands.h qmp-commands-old.h
endif # CONFIG_SOFTMMU
@@ -158,33 +139,27 @@ endif # CONFIG_SOFTMMU
# Workaround for http://gcc.gnu.org/PR55489, see configure.
%/translate.o: QEMU_CFLAGS += $(TRANSLATE_OPT_CFLAGS)
dummy := $(call unnest-vars,,obj-y)
all-obj-y := $(obj-y)
nested-vars += obj-y
target-obj-y :=
block-obj-y :=
common-obj-y :=
# This resolves all nested paths, so it must come last
include $(SRC_PATH)/Makefile.objs
dummy := $(call unnest-vars,,target-obj-y)
target-obj-y-save := $(target-obj-y)
dummy := $(call unnest-vars,.., \
block-obj-y \
block-obj-m \
common-obj-y \
common-obj-m)
target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
$(QEMU_PROG_BUILD): config-devices.mak
all-obj-y = $(obj-y)
all-obj-y += $(addprefix ../, $(common-obj-y))
# build either PROG or PROGW
$(QEMU_PROG_BUILD): $(all-obj-y) ../libqemuutil.a ../libqemustub.a
$(call LINK, $(filter-out %.mak, $^))
ifdef CONFIG_DARWIN
$(call quiet-command,Rez -append $(SRC_PATH)/pc-bios/qemu.rsrc -o $@," REZ $(TARGET_DIR)$@")
$(call quiet-command,SetFile -a C $@," SETFILE $(TARGET_DIR)$@")
ifndef CONFIG_HAIKU
LIBS+=-lm
endif
ifdef QEMU_PROGW
# The linker builds a windows executable. Make also a console executable.
$(QEMU_PROGW): $(all-obj-y) ../libqemuutil.a ../libqemustub.a
$(call LINK,$^)
$(QEMU_PROG): $(QEMU_PROGW)
$(call quiet-command,$(OBJCOPY) --subsystem console $(QEMU_PROGW) $(QEMU_PROG)," GEN $(TARGET_DIR)$(QEMU_PROG)")
else
$(QEMU_PROG): $(all-obj-y) ../libqemuutil.a ../libqemustub.a
$(call LINK,$^)
endif
gdbstub-xml.c: $(TARGET_XML_FILES) $(SRC_PATH)/scripts/feature_to_c.sh
@@ -206,12 +181,14 @@ endif
install: all
ifneq ($(PROGS),)
$(call install-prog,$(PROGS),$(DESTDIR)$(bindir))
$(INSTALL) -m 755 $(PROGS) "$(DESTDIR)$(bindir)"
ifneq ($(STRIP),)
$(STRIP) $(patsubst %,"$(DESTDIR)$(bindir)/%",$(PROGS))
endif
endif
ifdef CONFIG_TRACE_SYSTEMTAP
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset"
$(INSTALL_DATA) $(QEMU_PROG).stp-installed "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG).stp"
$(INSTALL_DATA) $(QEMU_PROG)-simpletrace.stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG)-simpletrace.stp"
endif
GENERATED_HEADERS += config-target.h

View File

@@ -1 +1 @@
2.4.0.1
1.7.2

157
accel.c
View File

@@ -1,157 +0,0 @@
/*
* QEMU System Emulator, accelerator interfaces
*
* Copyright (c) 2003-2008 Fabrice Bellard
* Copyright (c) 2014 Red Hat Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "sysemu/accel.h"
#include "hw/boards.h"
#include "qemu-common.h"
#include "sysemu/arch_init.h"
#include "sysemu/sysemu.h"
#include "sysemu/kvm.h"
#include "sysemu/qtest.h"
#include "hw/xen/xen.h"
#include "qom/object.h"
#include "hw/boards.h"
int tcg_tb_size;
static bool tcg_allowed = true;
static int tcg_init(MachineState *ms)
{
tcg_exec_init(tcg_tb_size * 1024 * 1024);
return 0;
}
static const TypeInfo accel_type = {
.name = TYPE_ACCEL,
.parent = TYPE_OBJECT,
.class_size = sizeof(AccelClass),
.instance_size = sizeof(AccelState),
};
/* Lookup AccelClass from opt_name. Returns NULL if not found */
static AccelClass *accel_find(const char *opt_name)
{
char *class_name = g_strdup_printf(ACCEL_CLASS_NAME("%s"), opt_name);
AccelClass *ac = ACCEL_CLASS(object_class_by_name(class_name));
g_free(class_name);
return ac;
}
static int accel_init_machine(AccelClass *acc, MachineState *ms)
{
ObjectClass *oc = OBJECT_CLASS(acc);
const char *cname = object_class_get_name(oc);
AccelState *accel = ACCEL(object_new(cname));
int ret;
ms->accelerator = accel;
*(acc->allowed) = true;
ret = acc->init_machine(ms);
if (ret < 0) {
ms->accelerator = NULL;
*(acc->allowed) = false;
object_unref(OBJECT(accel));
}
return ret;
}
int configure_accelerator(MachineState *ms)
{
const char *p;
char buf[10];
int ret;
bool accel_initialised = false;
bool init_failed = false;
AccelClass *acc = NULL;
p = qemu_opt_get(qemu_get_machine_opts(), "accel");
if (p == NULL) {
/* Use the default "accelerator", tcg */
p = "tcg";
}
while (!accel_initialised && *p != '\0') {
if (*p == ':') {
p++;
}
p = get_opt_name(buf, sizeof(buf), p, ':');
acc = accel_find(buf);
if (!acc) {
fprintf(stderr, "\"%s\" accelerator not found.\n", buf);
continue;
}
if (acc->available && !acc->available()) {
printf("%s not supported for this target\n",
acc->name);
continue;
}
ret = accel_init_machine(acc, ms);
if (ret < 0) {
init_failed = true;
fprintf(stderr, "failed to initialize %s: %s\n",
acc->name,
strerror(-ret));
} else {
accel_initialised = true;
}
}
if (!accel_initialised) {
if (!init_failed) {
fprintf(stderr, "No accelerator found!\n");
}
exit(1);
}
if (init_failed) {
fprintf(stderr, "Back to %s accelerator.\n", acc->name);
}
return !accel_initialised;
}
static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
ac->init_machine = tcg_init;
ac->allowed = &tcg_allowed;
}
#define TYPE_TCG_ACCEL ACCEL_CLASS_NAME("tcg")
static const TypeInfo tcg_accel_type = {
.name = TYPE_TCG_ACCEL,
.parent = TYPE_ACCEL,
.class_init = tcg_accel_class_init,
};
static void register_accel_types(void)
{
type_register_static(&accel_type);
type_register_static(&tcg_accel_type);
}
type_init(register_accel_types);

View File

@@ -24,6 +24,7 @@ struct AioHandler
IOHandler *io_read;
IOHandler *io_write;
int deleted;
int pollfds_idx;
void *opaque;
QLIST_ENTRY(AioHandler) node;
};
@@ -72,7 +73,7 @@ void aio_set_fd_handler(AioContext *ctx,
} else {
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node = g_malloc0(sizeof(AioHandler));
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
@@ -82,6 +83,7 @@ void aio_set_fd_handler(AioContext *ctx,
node->io_read = io_read;
node->io_write = io_write;
node->opaque = opaque;
node->pollfds_idx = -1;
node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0);
node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
@@ -98,11 +100,6 @@ void aio_set_event_notifier(AioContext *ctx,
(IOHandler *)io_read, NULL, notifier);
}
bool aio_prepare(AioContext *ctx)
{
return false;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
@@ -122,22 +119,13 @@ bool aio_pending(AioContext *ctx)
return false;
}
bool aio_dispatch(AioContext *ctx)
static bool aio_dispatch(AioContext *ctx)
{
AioHandler *node;
bool progress = false;
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
progress = true;
}
/*
* We have to walk very carefully in case aio_set_fd_handler is
* We have to walk very carefully in case qemu_aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
@@ -184,116 +172,76 @@ bool aio_dispatch(AioContext *ctx)
return progress;
}
/* These thread-local variables are used only in a small part of aio_poll
* around the call to the poll() system call. In particular they are not
* used while aio_poll is performing callbacks, which makes it much easier
* to think about reentrancy!
*
* Stack-allocated arrays would be perfect but they have size limitations;
* heap allocation is expensive enough that we want to reuse arrays across
* calls to aio_poll(). And because poll() has to be called without holding
* any lock, the arrays cannot be stored in AioContext. Thread-local data
* has none of the disadvantages of these three options.
*/
static __thread GPollFD *pollfds;
static __thread AioHandler **nodes;
static __thread unsigned npfd, nalloc;
static __thread Notifier pollfds_cleanup_notifier;
static void pollfds_cleanup(Notifier *n, void *unused)
{
g_assert(npfd == 0);
g_free(pollfds);
g_free(nodes);
nalloc = 0;
}
static void add_pollfd(AioHandler *node)
{
if (npfd == nalloc) {
if (nalloc == 0) {
pollfds_cleanup_notifier.notify = pollfds_cleanup;
qemu_thread_atexit_add(&pollfds_cleanup_notifier);
nalloc = 8;
} else {
g_assert(nalloc <= INT_MAX);
nalloc *= 2;
}
pollfds = g_renew(GPollFD, pollfds, nalloc);
nodes = g_renew(AioHandler *, nodes, nalloc);
}
nodes[npfd] = node;
pollfds[npfd] = (GPollFD) {
.fd = node->pfd.fd,
.events = node->pfd.events,
};
npfd++;
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
int i, ret;
int ret;
bool progress;
int64_t timeout;
aio_context_acquire(ctx);
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns,
* so disable the optimization now.
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for qemu_aio_wait loops).
*/
if (blocking) {
atomic_add(&ctx->notify_me, 2);
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
if (aio_dispatch(ctx)) {
progress = true;
}
if (progress && !blocking) {
return true;
}
ctx->walking_handlers++;
assert(npfd == 0);
g_array_set_size(ctx->pollfds, 0);
/* fill pollfds */
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
node->pollfds_idx = -1;
if (!node->deleted && node->pfd.events) {
add_pollfd(node);
GPollFD pfd = {
.fd = node->pfd.fd,
.events = node->pfd.events,
};
node->pollfds_idx = ctx->pollfds->len;
g_array_append_val(ctx->pollfds, pfd);
}
}
timeout = blocking ? aio_compute_timeout(ctx) : 0;
ctx->walking_handlers--;
/* early return if we only have the aio_notify() fd */
if (ctx->pollfds->len == 1) {
return progress;
}
/* wait until next event */
if (timeout) {
aio_context_release(ctx);
}
ret = qemu_poll_ns((GPollFD *)pollfds, npfd, timeout);
if (blocking) {
atomic_sub(&ctx->notify_me, 2);
}
if (timeout) {
aio_context_acquire(ctx);
}
aio_notify_accept(ctx);
ret = qemu_poll_ns((GPollFD *)ctx->pollfds->data,
ctx->pollfds->len,
blocking ? timerlistgroup_deadline_ns(&ctx->tlg) : 0);
/* if we have any readable fds, dispatch event */
if (ret > 0) {
for (i = 0; i < npfd; i++) {
nodes[i]->pfd.revents = pollfds[i].revents;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pollfds_idx != -1) {
GPollFD *pfd = &g_array_index(ctx->pollfds, GPollFD,
node->pollfds_idx);
node->pfd.revents = pfd->revents;
}
}
}
npfd = 0;
ctx->walking_handlers--;
/* Run dispatch even if there were no readable fds to run timers */
if (aio_dispatch(ctx)) {
progress = true;
}
aio_context_release(ctx);
return progress;
}

View File

@@ -22,80 +22,12 @@
struct AioHandler {
EventNotifier *e;
IOHandler *io_read;
IOHandler *io_write;
EventNotifierHandler *io_notify;
GPollFD pfd;
int deleted;
void *opaque;
QLIST_ENTRY(AioHandler) node;
};
void aio_set_fd_handler(AioContext *ctx,
int fd,
IOHandler *io_read,
IOHandler *io_write,
void *opaque)
{
/* fd is a SOCKET in our case */
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.fd == fd && !node->deleted) {
break;
}
}
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
}
}
} else {
HANDLE event;
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
}
node->pfd.events = 0;
if (node->io_read) {
node->pfd.events |= G_IO_IN;
}
if (node->io_write) {
node->pfd.events |= G_IO_OUT;
}
node->e = &ctx->notifier;
/* Update handler with latest information */
node->opaque = opaque;
node->io_read = io_read;
node->io_write = io_write;
event = event_notifier_get_handle(&ctx->notifier);
WSAEventSelect(node->pfd.fd, event,
FD_READ | FD_ACCEPT | FD_CLOSE |
FD_CONNECT | FD_WRITE | FD_OOB);
}
aio_notify(ctx);
}
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *e,
EventNotifierHandler *io_notify)
@@ -129,7 +61,7 @@ void aio_set_event_notifier(AioContext *ctx,
} else {
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node = g_malloc0(sizeof(AioHandler));
node->e = e;
node->pfd.fd = (uintptr_t)event_notifier_get_handle(e);
node->pfd.events = G_IO_IN;
@@ -144,43 +76,6 @@ void aio_set_event_notifier(AioContext *ctx,
aio_notify(ctx);
}
bool aio_prepare(AioContext *ctx)
{
static struct timeval tv0;
AioHandler *node;
bool have_select_revents = false;
fd_set rfds, wfds;
/* fill fd sets */
FD_ZERO(&rfds);
FD_ZERO(&wfds);
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->io_read) {
FD_SET ((SOCKET)node->pfd.fd, &rfds);
}
if (node->io_write) {
FD_SET ((SOCKET)node->pfd.fd, &wfds);
}
}
if (select(0, &rfds, &wfds, NULL, &tv0) > 0) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
node->pfd.revents = 0;
if (FD_ISSET(node->pfd.fd, &rfds)) {
node->pfd.revents |= G_IO_IN;
have_select_revents = true;
}
if (FD_ISSET(node->pfd.fd, &wfds)) {
node->pfd.revents |= G_IO_OUT;
have_select_revents = true;
}
}
}
return have_select_revents;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
@@ -189,37 +84,47 @@ bool aio_pending(AioContext *ctx)
if (node->pfd.revents && node->io_notify) {
return true;
}
if ((node->pfd.revents & G_IO_IN) && node->io_read) {
return true;
}
if ((node->pfd.revents & G_IO_OUT) && node->io_write) {
return true;
}
}
return false;
}
static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
bool progress = false;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool progress;
int count;
int timeout;
progress = false;
/*
* We have to walk very carefully in case aio_set_fd_handler is
* If there are callbacks left that have been queued, we need to call then.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for qemu_aio_wait loops).
*/
if (aio_bh_poll(ctx)) {
blocking = false;
progress = true;
}
/* Run timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
/*
* Then dispatch any pending callbacks from the GSource.
*
* We have to walk very carefully in case qemu_aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
int revents = node->pfd.revents;
ctx->walking_handlers++;
if (!node->deleted &&
(revents || event_notifier_get_handle(node->e) == event) &&
node->io_notify) {
if (node->pfd.revents && node->io_notify) {
node->pfd.revents = 0;
node->io_notify(node->e);
@@ -229,28 +134,6 @@ static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
}
}
if (!node->deleted &&
(node->io_read || node->io_write)) {
node->pfd.revents = 0;
if ((revents & G_IO_IN) && node->io_read) {
node->io_read(node->opaque);
progress = true;
}
if ((revents & G_IO_OUT) && node->io_write) {
node->io_write(node->opaque);
progress = true;
}
/* if the next select() will return an event, we have progressed */
if (event == event_notifier_get_handle(&ctx->notifier)) {
WSANETWORKEVENTS ev;
WSAEnumNetworkEvents(node->pfd.fd, event, &ev);
if (ev.lNetworkEvents) {
progress = true;
}
}
}
tmp = node;
node = QLIST_NEXT(node, node);
@@ -262,43 +145,10 @@ static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
}
}
return progress;
}
bool aio_dispatch(AioContext *ctx)
{
bool progress;
progress = aio_bh_poll(ctx);
progress |= aio_dispatch_handlers(ctx, INVALID_HANDLE_VALUE);
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool progress, have_select_revents, first;
int count;
int timeout;
aio_context_acquire(ctx);
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns,
* so disable the optimization now.
*/
if (blocking) {
atomic_add(&ctx->notify_me, 2);
if (progress && !blocking) {
return true;
}
have_select_revents = aio_prepare(ctx);
ctx->walking_handlers++;
/* fill fd sets */
@@ -310,56 +160,69 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
ctx->walking_handlers--;
first = true;
/* ctx->notifier is always registered. */
assert(count > 0);
/* early return if we only have the aio_notify() fd */
if (count == 1) {
return progress;
}
/* Multiple iterations, all of them non-blocking except the first,
* may be necessary to process all pending events. After the first
* WaitForMultipleObjects call ctx->notify_me will be decremented.
*/
do {
HANDLE event;
/* wait until next event */
while (count > 0) {
int ret;
timeout = blocking && !have_select_revents
? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
if (timeout) {
aio_context_release(ctx);
}
timeout = blocking ?
qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg)) : 0;
ret = WaitForMultipleObjects(count, events, FALSE, timeout);
if (blocking) {
assert(first);
atomic_sub(&ctx->notify_me, 2);
}
if (timeout) {
aio_context_acquire(ctx);
}
if (first) {
aio_notify_accept(ctx);
progress |= aio_bh_poll(ctx);
first = false;
}
/* if we have any signaled events, dispatch event */
event = NULL;
if ((DWORD) (ret - WAIT_OBJECT_0) < count) {
event = events[ret - WAIT_OBJECT_0];
events[ret - WAIT_OBJECT_0] = events[--count];
} else if (!have_select_revents) {
if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
break;
}
have_select_revents = false;
blocking = false;
progress |= aio_dispatch_handlers(ctx, event);
} while (count > 0);
/* we have to walk very carefully in case
* qemu_aio_set_fd_handler is called while we're walking */
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
progress |= timerlistgroup_run_timers(&ctx->tlg);
ctx->walking_handlers++;
if (!node->deleted &&
event_notifier_get_handle(node->e) == events[ret - WAIT_OBJECT_0] &&
node->io_notify) {
node->io_notify(node->e);
/* aio_notify() does not count as progress */
if (node->e != &ctx->notifier) {
progress = true;
}
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
/* Try again, but only call each handler once. */
events[ret - WAIT_OBJECT_0] = events[--count];
}
if (blocking) {
/* Run the timers a second time. We do this because otherwise aio_wait
* will not note progress - and will stop a drain early - if we have
* a timer that was not ready to run entering g_poll but is ready
* after g_poll. This will only do anything if a timer has expired.
*/
progress |= timerlistgroup_run_timers(&ctx->tlg);
}
aio_context_release(ctx);
return progress;
}

View File

@@ -22,16 +22,42 @@
* THE SOFTWARE.
*/
#include <stdint.h>
#include <stdarg.h>
#include <stdlib.h>
#ifndef _WIN32
#include <sys/types.h>
#include <sys/mman.h>
#endif
#include "config.h"
#include "monitor/monitor.h"
#include "sysemu/sysemu.h"
#include "qemu/bitops.h"
#include "qemu/bitmap.h"
#include "sysemu/arch_init.h"
#include "audio/audio.h"
#include "hw/i386/pc.h"
#include "hw/pci/pci.h"
#include "hw/audio/audio.h"
#include "sysemu/kvm.h"
#include "migration/migration.h"
#include "hw/i386/smbios.h"
#include "exec/address-spaces.h"
#include "hw/audio/pcspk.h"
#include "migration/page_cache.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
#include "qmp-commands.h"
#include "trace.h"
#include "exec/cpu-all.h"
#include "hw/acpi/acpi.h"
#ifdef DEBUG_ARCH_INIT
#define DPRINTF(fmt, ...) \
do { fprintf(stdout, "arch_init: " fmt, ## __VA_ARGS__); } while (0)
#else
#define DPRINTF(fmt, ...) \
do { } while (0)
#endif
#ifdef TARGET_SPARC
int graphic_width = 1024;
int graphic_height = 768;
@@ -75,11 +101,25 @@ int graphic_depth = 32;
#define QEMU_ARCH QEMU_ARCH_XTENSA
#elif defined(TARGET_UNICORE32)
#define QEMU_ARCH QEMU_ARCH_UNICORE32
#elif defined(TARGET_TRICORE)
#define QEMU_ARCH QEMU_ARCH_TRICORE
#endif
const uint32_t arch_type = QEMU_ARCH;
static bool mig_throttle_on;
static int dirty_rate_high_cnt;
static void check_guest_throttling(void);
/***********************************************************/
/* ram save/restore */
#define RAM_SAVE_FLAG_FULL 0x01 /* Obsolete, not used anymore */
#define RAM_SAVE_FLAG_COMPRESS 0x02
#define RAM_SAVE_FLAG_MEM_SIZE 0x04
#define RAM_SAVE_FLAG_PAGE 0x08
#define RAM_SAVE_FLAG_EOS 0x10
#define RAM_SAVE_FLAG_CONTINUE 0x20
#define RAM_SAVE_FLAG_XBZRLE 0x40
/* 0x80 is reserved in migration.h start with 0x100 next */
static struct defconfig_file {
const char *filename;
@@ -87,9 +127,11 @@ static struct defconfig_file {
bool userconfig;
} default_config_files[] = {
{ CONFIG_QEMU_CONFDIR "/qemu.conf", true },
{ CONFIG_QEMU_CONFDIR "/target-" TARGET_NAME ".conf", true },
{ NULL }, /* end of list */
};
int qemu_read_default_config_files(bool userconfig)
{
int ret;
@@ -108,6 +150,825 @@ int qemu_read_default_config_files(bool userconfig)
return 0;
}
static inline bool is_zero_range(uint8_t *p, uint64_t size)
{
return buffer_find_nonzero_offset(p, size) == size;
}
/* struct contains XBZRLE cache and a static page
used by the compression */
static struct {
/* buffer used for XBZRLE encoding */
uint8_t *encoded_buf;
/* buffer for storing page content */
uint8_t *current_buf;
/* buffer used for XBZRLE decoding */
uint8_t *decoded_buf;
/* Cache for XBZRLE */
PageCache *cache;
} XBZRLE = {
.encoded_buf = NULL,
.current_buf = NULL,
.decoded_buf = NULL,
.cache = NULL,
};
int64_t xbzrle_cache_resize(int64_t new_size)
{
if (XBZRLE.cache != NULL) {
return cache_resize(XBZRLE.cache, new_size / TARGET_PAGE_SIZE) *
TARGET_PAGE_SIZE;
}
return pow2floor(new_size);
}
/* accounting for migration statistics */
typedef struct AccountingInfo {
uint64_t dup_pages;
uint64_t skipped_pages;
uint64_t norm_pages;
uint64_t iterations;
uint64_t xbzrle_bytes;
uint64_t xbzrle_pages;
uint64_t xbzrle_cache_miss;
uint64_t xbzrle_overflows;
} AccountingInfo;
static AccountingInfo acct_info;
static void acct_clear(void)
{
memset(&acct_info, 0, sizeof(acct_info));
}
uint64_t dup_mig_bytes_transferred(void)
{
return acct_info.dup_pages * TARGET_PAGE_SIZE;
}
uint64_t dup_mig_pages_transferred(void)
{
return acct_info.dup_pages;
}
uint64_t skipped_mig_bytes_transferred(void)
{
return acct_info.skipped_pages * TARGET_PAGE_SIZE;
}
uint64_t skipped_mig_pages_transferred(void)
{
return acct_info.skipped_pages;
}
uint64_t norm_mig_bytes_transferred(void)
{
return acct_info.norm_pages * TARGET_PAGE_SIZE;
}
uint64_t norm_mig_pages_transferred(void)
{
return acct_info.norm_pages;
}
uint64_t xbzrle_mig_bytes_transferred(void)
{
return acct_info.xbzrle_bytes;
}
uint64_t xbzrle_mig_pages_transferred(void)
{
return acct_info.xbzrle_pages;
}
uint64_t xbzrle_mig_pages_cache_miss(void)
{
return acct_info.xbzrle_cache_miss;
}
uint64_t xbzrle_mig_pages_overflow(void)
{
return acct_info.xbzrle_overflows;
}
static size_t save_block_hdr(QEMUFile *f, RAMBlock *block, ram_addr_t offset,
int cont, int flag)
{
size_t size;
qemu_put_be64(f, offset | cont | flag);
size = 8;
if (!cont) {
qemu_put_byte(f, strlen(block->idstr));
qemu_put_buffer(f, (uint8_t *)block->idstr,
strlen(block->idstr));
size += 1 + strlen(block->idstr);
}
return size;
}
#define ENCODING_FLAG_XBZRLE 0x1
static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data,
ram_addr_t current_addr, RAMBlock *block,
ram_addr_t offset, int cont, bool last_stage)
{
int encoded_len = 0, bytes_sent = -1;
uint8_t *prev_cached_page;
if (!cache_is_cached(XBZRLE.cache, current_addr)) {
if (!last_stage) {
cache_insert(XBZRLE.cache, current_addr, current_data);
}
acct_info.xbzrle_cache_miss++;
return -1;
}
prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
/* save current buffer into memory */
memcpy(XBZRLE.current_buf, current_data, TARGET_PAGE_SIZE);
/* XBZRLE encoding (if there is no overflow) */
encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
TARGET_PAGE_SIZE);
if (encoded_len == 0) {
DPRINTF("Skipping unmodified page\n");
return 0;
} else if (encoded_len == -1) {
DPRINTF("Overflow\n");
acct_info.xbzrle_overflows++;
/* update data in the cache */
memcpy(prev_cached_page, current_data, TARGET_PAGE_SIZE);
return -1;
}
/* we need to update the data in the cache, in order to get the same data */
if (!last_stage) {
memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
}
/* Send XBZRLE based compressed page */
bytes_sent = save_block_hdr(f, block, offset, cont, RAM_SAVE_FLAG_XBZRLE);
qemu_put_byte(f, ENCODING_FLAG_XBZRLE);
qemu_put_be16(f, encoded_len);
qemu_put_buffer(f, XBZRLE.encoded_buf, encoded_len);
bytes_sent += encoded_len + 1 + 2;
acct_info.xbzrle_pages++;
acct_info.xbzrle_bytes += bytes_sent;
return bytes_sent;
}
/* This is the last block that we have visited serching for dirty pages
*/
static RAMBlock *last_seen_block;
/* This is the last block from where we have sent data */
static RAMBlock *last_sent_block;
static ram_addr_t last_offset;
static unsigned long *migration_bitmap;
static uint64_t migration_dirty_pages;
static uint32_t last_version;
static bool ram_bulk_stage;
static inline
ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
ram_addr_t start)
{
unsigned long base = mr->ram_addr >> TARGET_PAGE_BITS;
unsigned long nr = base + (start >> TARGET_PAGE_BITS);
uint64_t mr_size = TARGET_PAGE_ALIGN(memory_region_size(mr));
unsigned long size = base + (mr_size >> TARGET_PAGE_BITS);
unsigned long next;
if (ram_bulk_stage && nr > base) {
next = nr + 1;
} else {
next = find_next_bit(migration_bitmap, size, nr);
}
if (next < size) {
clear_bit(next, migration_bitmap);
migration_dirty_pages--;
}
return (next - base) << TARGET_PAGE_BITS;
}
static inline bool migration_bitmap_set_dirty(MemoryRegion *mr,
ram_addr_t offset)
{
bool ret;
int nr = (mr->ram_addr + offset) >> TARGET_PAGE_BITS;
ret = test_and_set_bit(nr, migration_bitmap);
if (!ret) {
migration_dirty_pages++;
}
return ret;
}
/* Needs iothread lock! */
static void migration_bitmap_sync(void)
{
RAMBlock *block;
ram_addr_t addr;
uint64_t num_dirty_pages_init = migration_dirty_pages;
MigrationState *s = migrate_get_current();
static int64_t start_time;
static int64_t bytes_xfer_prev;
static int64_t num_dirty_pages_period;
int64_t end_time;
int64_t bytes_xfer_now;
if (!bytes_xfer_prev) {
bytes_xfer_prev = ram_bytes_transferred();
}
if (!start_time) {
start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
}
trace_migration_bitmap_sync_start();
address_space_sync_dirty_bitmap(&address_space_memory);
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
for (addr = 0; addr < block->length; addr += TARGET_PAGE_SIZE) {
if (memory_region_test_and_clear_dirty(block->mr,
addr, TARGET_PAGE_SIZE,
DIRTY_MEMORY_MIGRATION)) {
migration_bitmap_set_dirty(block->mr, addr);
}
}
}
trace_migration_bitmap_sync_end(migration_dirty_pages
- num_dirty_pages_init);
num_dirty_pages_period += migration_dirty_pages - num_dirty_pages_init;
end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
/* more than 1 second = 1000 millisecons */
if (end_time > start_time + 1000) {
if (migrate_auto_converge()) {
/* The following detection logic can be refined later. For now:
Check to see if the dirtied bytes is 50% more than the approx.
amount of bytes that just got transferred since the last time we
were in this routine. If that happens >N times (for now N==4)
we turn on the throttle down logic */
bytes_xfer_now = ram_bytes_transferred();
if (s->dirty_pages_rate &&
(num_dirty_pages_period * TARGET_PAGE_SIZE >
(bytes_xfer_now - bytes_xfer_prev)/2) &&
(dirty_rate_high_cnt++ > 4)) {
trace_migration_throttle();
mig_throttle_on = true;
dirty_rate_high_cnt = 0;
}
bytes_xfer_prev = bytes_xfer_now;
} else {
mig_throttle_on = false;
}
s->dirty_pages_rate = num_dirty_pages_period * 1000
/ (end_time - start_time);
s->dirty_bytes_rate = s->dirty_pages_rate * TARGET_PAGE_SIZE;
start_time = end_time;
num_dirty_pages_period = 0;
}
}
/*
* ram_save_block: Writes a page of memory to the stream f
*
* Returns: The number of bytes written.
* 0 means no dirty pages
*/
static int ram_save_block(QEMUFile *f, bool last_stage)
{
RAMBlock *block = last_seen_block;
ram_addr_t offset = last_offset;
bool complete_round = false;
int bytes_sent = 0;
MemoryRegion *mr;
ram_addr_t current_addr;
if (!block)
block = QTAILQ_FIRST(&ram_list.blocks);
while (true) {
mr = block->mr;
offset = migration_bitmap_find_and_reset_dirty(mr, offset);
if (complete_round && block == last_seen_block &&
offset >= last_offset) {
break;
}
if (offset >= block->length) {
offset = 0;
block = QTAILQ_NEXT(block, next);
if (!block) {
block = QTAILQ_FIRST(&ram_list.blocks);
complete_round = true;
ram_bulk_stage = false;
}
} else {
int ret;
uint8_t *p;
int cont = (block == last_sent_block) ?
RAM_SAVE_FLAG_CONTINUE : 0;
p = memory_region_get_ram_ptr(mr) + offset;
/* In doubt sent page as normal */
bytes_sent = -1;
ret = ram_control_save_page(f, block->offset,
offset, TARGET_PAGE_SIZE, &bytes_sent);
if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
if (ret != RAM_SAVE_CONTROL_DELAYED) {
if (bytes_sent > 0) {
acct_info.norm_pages++;
} else if (bytes_sent == 0) {
acct_info.dup_pages++;
}
}
} else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
acct_info.dup_pages++;
bytes_sent = save_block_hdr(f, block, offset, cont,
RAM_SAVE_FLAG_COMPRESS);
qemu_put_byte(f, 0);
bytes_sent++;
} else if (!ram_bulk_stage && migrate_use_xbzrle()) {
current_addr = block->offset + offset;
bytes_sent = save_xbzrle_page(f, p, current_addr, block,
offset, cont, last_stage);
if (!last_stage) {
p = get_cached_data(XBZRLE.cache, current_addr);
}
}
/* XBZRLE overflow or normal page */
if (bytes_sent == -1) {
bytes_sent = save_block_hdr(f, block, offset, cont, RAM_SAVE_FLAG_PAGE);
qemu_put_buffer_async(f, p, TARGET_PAGE_SIZE);
bytes_sent += TARGET_PAGE_SIZE;
acct_info.norm_pages++;
}
/* if page is unmodified, continue to the next */
if (bytes_sent > 0) {
last_sent_block = block;
break;
}
}
}
last_seen_block = block;
last_offset = offset;
return bytes_sent;
}
static uint64_t bytes_transferred;
void acct_update_position(QEMUFile *f, size_t size, bool zero)
{
uint64_t pages = size / TARGET_PAGE_SIZE;
if (zero) {
acct_info.dup_pages += pages;
} else {
acct_info.norm_pages += pages;
bytes_transferred += size;
qemu_update_position(f, size);
}
}
static ram_addr_t ram_save_remaining(void)
{
return migration_dirty_pages;
}
uint64_t ram_bytes_remaining(void)
{
return ram_save_remaining() * TARGET_PAGE_SIZE;
}
uint64_t ram_bytes_transferred(void)
{
return bytes_transferred;
}
uint64_t ram_bytes_total(void)
{
RAMBlock *block;
uint64_t total = 0;
QTAILQ_FOREACH(block, &ram_list.blocks, next)
total += block->length;
return total;
}
static void migration_end(void)
{
if (migration_bitmap) {
memory_global_dirty_log_stop();
g_free(migration_bitmap);
migration_bitmap = NULL;
}
if (XBZRLE.cache) {
cache_fini(XBZRLE.cache);
g_free(XBZRLE.cache);
g_free(XBZRLE.encoded_buf);
g_free(XBZRLE.current_buf);
g_free(XBZRLE.decoded_buf);
XBZRLE.cache = NULL;
}
}
static void ram_migration_cancel(void *opaque)
{
migration_end();
}
static void reset_ram_globals(void)
{
last_seen_block = NULL;
last_sent_block = NULL;
last_offset = 0;
last_version = ram_list.version;
ram_bulk_stage = true;
}
#define MAX_WAIT 50 /* ms, half buffered_file limit */
static int ram_save_setup(QEMUFile *f, void *opaque)
{
RAMBlock *block;
int64_t ram_pages = last_ram_offset() >> TARGET_PAGE_BITS;
migration_bitmap = bitmap_new(ram_pages);
bitmap_set(migration_bitmap, 0, ram_pages);
migration_dirty_pages = ram_pages;
mig_throttle_on = false;
dirty_rate_high_cnt = 0;
if (migrate_use_xbzrle()) {
XBZRLE.cache = cache_init(migrate_xbzrle_cache_size() /
TARGET_PAGE_SIZE,
TARGET_PAGE_SIZE);
if (!XBZRLE.cache) {
DPRINTF("Error creating cache\n");
return -1;
}
XBZRLE.encoded_buf = g_malloc0(TARGET_PAGE_SIZE);
XBZRLE.current_buf = g_malloc(TARGET_PAGE_SIZE);
acct_clear();
}
qemu_mutex_lock_iothread();
qemu_mutex_lock_ramlist();
bytes_transferred = 0;
reset_ram_globals();
memory_global_dirty_log_start();
migration_bitmap_sync();
qemu_mutex_unlock_iothread();
qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
qemu_put_byte(f, strlen(block->idstr));
qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
qemu_put_be64(f, block->length);
}
qemu_mutex_unlock_ramlist();
ram_control_before_iterate(f, RAM_CONTROL_SETUP);
ram_control_after_iterate(f, RAM_CONTROL_SETUP);
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
return 0;
}
static int ram_save_iterate(QEMUFile *f, void *opaque)
{
int ret;
int i;
int64_t t0;
int total_sent = 0;
qemu_mutex_lock_ramlist();
if (ram_list.version != last_version) {
reset_ram_globals();
}
ram_control_before_iterate(f, RAM_CONTROL_ROUND);
t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
i = 0;
while ((ret = qemu_file_rate_limit(f)) == 0) {
int bytes_sent;
bytes_sent = ram_save_block(f, false);
/* no more blocks to sent */
if (bytes_sent == 0) {
break;
}
total_sent += bytes_sent;
acct_info.iterations++;
check_guest_throttling();
/* we want to check in the 1st loop, just in case it was the 1st time
and we had to sync the dirty bitmap.
qemu_get_clock_ns() is a bit expensive, so we only check each some
iterations
*/
if ((i & 63) == 0) {
uint64_t t1 = (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - t0) / 1000000;
if (t1 > MAX_WAIT) {
DPRINTF("big wait: %" PRIu64 " milliseconds, %d iterations\n",
t1, i);
break;
}
}
i++;
}
qemu_mutex_unlock_ramlist();
/*
* Must occur before EOS (or any QEMUFile operation)
* because of RDMA protocol.
*/
ram_control_after_iterate(f, RAM_CONTROL_ROUND);
bytes_transferred += total_sent;
/*
* Do not count these 8 bytes into total_sent, so that we can
* return 0 if no page had been dirtied.
*/
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
bytes_transferred += 8;
ret = qemu_file_get_error(f);
if (ret < 0) {
return ret;
}
return total_sent;
}
static int ram_save_complete(QEMUFile *f, void *opaque)
{
qemu_mutex_lock_ramlist();
migration_bitmap_sync();
ram_control_before_iterate(f, RAM_CONTROL_FINISH);
/* try transferring iterative blocks of memory */
/* flush all remaining blocks regardless of rate limiting */
while (true) {
int bytes_sent;
bytes_sent = ram_save_block(f, true);
/* no more blocks to sent */
if (bytes_sent == 0) {
break;
}
bytes_transferred += bytes_sent;
}
ram_control_after_iterate(f, RAM_CONTROL_FINISH);
migration_end();
qemu_mutex_unlock_ramlist();
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
return 0;
}
static uint64_t ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size)
{
uint64_t remaining_size;
remaining_size = ram_save_remaining() * TARGET_PAGE_SIZE;
if (remaining_size < max_size) {
qemu_mutex_lock_iothread();
migration_bitmap_sync();
qemu_mutex_unlock_iothread();
remaining_size = ram_save_remaining() * TARGET_PAGE_SIZE;
}
return remaining_size;
}
static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void *host)
{
int ret, rc = 0;
unsigned int xh_len;
int xh_flags;
if (!XBZRLE.decoded_buf) {
XBZRLE.decoded_buf = g_malloc(TARGET_PAGE_SIZE);
}
/* extract RLE header */
xh_flags = qemu_get_byte(f);
xh_len = qemu_get_be16(f);
if (xh_flags != ENCODING_FLAG_XBZRLE) {
fprintf(stderr, "Failed to load XBZRLE page - wrong compression!\n");
return -1;
}
if (xh_len > TARGET_PAGE_SIZE) {
fprintf(stderr, "Failed to load XBZRLE page - len overflow!\n");
return -1;
}
/* load data and decode */
qemu_get_buffer(f, XBZRLE.decoded_buf, xh_len);
/* decode RLE */
ret = xbzrle_decode_buffer(XBZRLE.decoded_buf, xh_len, host,
TARGET_PAGE_SIZE);
if (ret == -1) {
fprintf(stderr, "Failed to load XBZRLE page - decode error!\n");
rc = -1;
} else if (ret > TARGET_PAGE_SIZE) {
fprintf(stderr, "Failed to load XBZRLE page - size %d exceeds %d!\n",
ret, TARGET_PAGE_SIZE);
abort();
}
return rc;
}
static inline void *host_from_stream_offset(QEMUFile *f,
ram_addr_t offset,
int flags)
{
static RAMBlock *block = NULL;
char id[256];
uint8_t len;
if (flags & RAM_SAVE_FLAG_CONTINUE) {
if (!block) {
fprintf(stderr, "Ack, bad migration stream!\n");
return NULL;
}
return memory_region_get_ram_ptr(block->mr) + offset;
}
len = qemu_get_byte(f);
qemu_get_buffer(f, (uint8_t *)id, len);
id[len] = 0;
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (!strncmp(id, block->idstr, sizeof(id)))
return memory_region_get_ram_ptr(block->mr) + offset;
}
fprintf(stderr, "Can't find block %s!\n", id);
return NULL;
}
/*
* If a page (or a whole RDMA chunk) has been
* determined to be zero, then zap it.
*/
void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
{
if (ch != 0 || !is_zero_range(host, size)) {
memset(host, ch, size);
}
}
static int ram_load(QEMUFile *f, void *opaque, int version_id)
{
ram_addr_t addr;
int flags, ret = 0;
static uint64_t seq_iter;
seq_iter++;
if (version_id != 4) {
return -EINVAL;
}
while (!ret) {
addr = qemu_get_be64(f);
flags = addr & ~TARGET_PAGE_MASK;
addr &= TARGET_PAGE_MASK;
if (flags & RAM_SAVE_FLAG_MEM_SIZE) {
/* Synchronize RAM block list */
char id[256];
ram_addr_t length;
ram_addr_t total_ram_bytes = addr;
while (total_ram_bytes) {
RAMBlock *block;
uint8_t len;
len = qemu_get_byte(f);
qemu_get_buffer(f, (uint8_t *)id, len);
id[len] = 0;
length = qemu_get_be64(f);
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (!strncmp(id, block->idstr, sizeof(id))) {
if (block->length != length) {
fprintf(stderr,
"Length mismatch: %s: " RAM_ADDR_FMT
" in != " RAM_ADDR_FMT "\n", id, length,
block->length);
ret = -EINVAL;
}
break;
}
}
if (!block) {
fprintf(stderr, "Unknown ramblock \"%s\", cannot "
"accept migration\n", id);
ret = -EINVAL;
}
if (ret) {
break;
}
total_ram_bytes -= length;
}
} else if (flags & RAM_SAVE_FLAG_COMPRESS) {
void *host;
uint8_t ch;
host = host_from_stream_offset(f, addr, flags);
if (!host) {
return -EINVAL;
}
ch = qemu_get_byte(f);
ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
} else if (flags & RAM_SAVE_FLAG_PAGE) {
void *host;
host = host_from_stream_offset(f, addr, flags);
if (!host) {
return -EINVAL;
}
qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
} else if (flags & RAM_SAVE_FLAG_XBZRLE) {
void *host = host_from_stream_offset(f, addr, flags);
if (!host) {
return -EINVAL;
}
if (load_xbzrle(f, addr, host) < 0) {
error_report("Failed to decompress XBZRLE page at "
RAM_ADDR_FMT, addr);
ret = -EINVAL;
break;
}
} else if (flags & RAM_SAVE_FLAG_HOOK) {
ram_control_load_hook(f, flags);
} else if (flags & RAM_SAVE_FLAG_EOS) {
/* normal exit */
break;
} else {
error_report("Unknown migration flags: %#x", flags);
ret = -EINVAL;
break;
}
ret = qemu_file_get_error(f);
}
DPRINTF("Completed load of VM with exit code %d seq iteration "
"%" PRIu64 "\n", ret, seq_iter);
return ret;
}
SaveVMHandlers savevm_ram_handlers = {
.save_live_setup = ram_save_setup,
.save_live_iterate = ram_save_iterate,
.save_live_complete = ram_save_complete,
.save_live_pending = ram_save_pending,
.load_state = ram_load,
.cancel = ram_migration_cancel,
};
struct soundhw {
const char *name;
const char *descr;
@@ -190,11 +1051,12 @@ void select_soundhw(const char *optarg)
if (!c->name) {
if (l > 80) {
error_report("Unknown sound card name (too big to show)");
fprintf(stderr,
"Unknown sound card name (too big to show)\n");
}
else {
error_report("Unknown sound card name `%.*s'",
(int) l, p);
fprintf(stderr, "Unknown sound card name `%.*s'\n",
(int) l, p);
}
bad_card = 1;
}
@@ -217,13 +1079,13 @@ void audio_init(void)
if (c->enabled) {
if (c->isa) {
if (!isa_bus) {
error_report("ISA bus not available for %s", c->name);
fprintf(stderr, "ISA bus not available for %s\n", c->name);
exit(1);
}
c->init.init_isa(isa_bus);
} else {
if (!pci_bus) {
error_report("PCI bus not available for %s", c->name);
fprintf(stderr, "PCI bus not available for %s\n", c->name);
exit(1);
}
c->init.init_pci(pci_bus);
@@ -280,6 +1142,11 @@ void cpudef_init(void)
#endif
}
int tcg_available(void)
{
return 1;
}
int kvm_available(void)
{
#ifdef CONFIG_KVM
@@ -307,3 +1174,52 @@ TargetInfo *qmp_query_target(Error **errp)
return info;
}
/* Stub function that's gets run on the vcpu when its brought out of the
VM to run inside qemu via async_run_on_cpu()*/
static void mig_sleep_cpu(void *opq)
{
qemu_mutex_unlock_iothread();
g_usleep(30*1000);
qemu_mutex_lock_iothread();
}
/* To reduce the dirty rate explicitly disallow the VCPUs from spending
much time in the VM. The migration thread will try to catchup.
Workload will experience a performance drop.
*/
static void mig_throttle_guest_down(void)
{
CPUState *cpu;
qemu_mutex_lock_iothread();
CPU_FOREACH(cpu) {
async_run_on_cpu(cpu, mig_sleep_cpu, NULL);
}
qemu_mutex_unlock_iothread();
}
static void check_guest_throttling(void)
{
static int64_t t0;
int64_t t1;
if (!mig_throttle_on) {
return;
}
if (!t0) {
t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
return;
}
t1 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
/* If it has been more than 40 ms since the last time the guest
* was throttled then do it again.
*/
if (40 < (t1-t0)/1000000) {
mig_throttle_guest_down();
t0 = t1;
}
}

167
async.c
View File

@@ -26,7 +26,6 @@
#include "block/aio.h"
#include "block/thread-pool.h"
#include "qemu/main-loop.h"
#include "qemu/atomic.h"
/***********************************************************/
/* bottom halves (can be seen as timers which expire ASAP) */
@@ -44,12 +43,10 @@ struct QEMUBH {
QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
{
QEMUBH *bh;
bh = g_new(QEMUBH, 1);
*bh = (QEMUBH){
.ctx = ctx,
.cb = cb,
.opaque = opaque,
};
bh = g_malloc0(sizeof(QEMUBH));
bh->ctx = ctx;
bh->cb = cb;
bh->opaque = opaque;
qemu_mutex_lock(&ctx->bh_lock);
bh->next = ctx->first_bh;
/* Make sure that the members are ready before putting bh into list */
@@ -72,17 +69,14 @@ int aio_bh_poll(AioContext *ctx)
/* Make sure that fetching bh happens before accessing its members */
smp_read_barrier_depends();
next = bh->next;
/* The atomic_xchg is paired with the one in qemu_bh_schedule. The
* implicit memory barrier ensures that the callback sees all writes
* done by the scheduling thread. It also ensures that the scheduling
* thread sees the zero before bh->cb has run, and thus will call
* aio_notify again if necessary.
*/
if (!bh->deleted && atomic_xchg(&bh->scheduled, 0)) {
/* Idle BHs and the notify BH don't count as progress */
if (!bh->idle && bh != ctx->notify_dummy_bh) {
if (!bh->deleted && bh->scheduled) {
bh->scheduled = 0;
/* Paired with write barrier in bh schedule to ensure reading for
* idle & callbacks coming after bh's scheduling.
*/
smp_rmb();
if (!bh->idle)
ret = 1;
}
bh->idle = 0;
bh->cb(bh->opaque);
}
@@ -111,28 +105,33 @@ int aio_bh_poll(AioContext *ctx)
void qemu_bh_schedule_idle(QEMUBH *bh)
{
if (bh->scheduled)
return;
bh->idle = 1;
/* Make sure that idle & any writes needed by the callback are done
* before the locations are read in the aio_bh_poll.
*/
atomic_mb_set(&bh->scheduled, 1);
smp_wmb();
bh->scheduled = 1;
}
void qemu_bh_schedule(QEMUBH *bh)
{
AioContext *ctx;
if (bh->scheduled)
return;
ctx = bh->ctx;
bh->idle = 0;
/* The memory barrier implicit in atomic_xchg makes sure that:
/* Make sure that:
* 1. idle & any writes needed by the callback are done before the
* locations are read in the aio_bh_poll.
* 2. ctx is loaded before scheduled is set and the callback has a chance
* to execute.
*/
if (atomic_xchg(&bh->scheduled, 1) == 0) {
aio_notify(ctx);
}
smp_mb();
bh->scheduled = 1;
aio_notify(ctx);
}
@@ -152,50 +151,39 @@ void qemu_bh_delete(QEMUBH *bh)
bh->deleted = 1;
}
int64_t
aio_compute_timeout(AioContext *ctx)
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
{
int64_t deadline;
int timeout = -1;
AioContext *ctx = (AioContext *) source;
QEMUBH *bh;
int deadline;
/* We assume there is no timeout already supplied */
*timeout = -1;
for (bh = ctx->first_bh; bh; bh = bh->next) {
if (!bh->deleted && bh->scheduled) {
if (bh->idle) {
/* idle bottom halves will be polled at least
* every 10ms */
timeout = 10000000;
*timeout = 10;
} else {
/* non-idle bottom halves will be executed
* immediately */
return 0;
*timeout = 0;
return true;
}
}
}
deadline = timerlistgroup_deadline_ns(&ctx->tlg);
deadline = qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg));
if (deadline == 0) {
return 0;
} else {
return qemu_soonest_timeout(timeout, deadline);
}
}
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
{
AioContext *ctx = (AioContext *) source;
atomic_or(&ctx->notify_me, 1);
/* We assume there is no timeout already supplied */
*timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
if (aio_prepare(ctx)) {
*timeout = 0;
return true;
} else {
*timeout = qemu_soonest_timeout(*timeout, deadline);
}
return *timeout == 0;
return false;
}
static gboolean
@@ -204,9 +192,6 @@ aio_ctx_check(GSource *source)
AioContext *ctx = (AioContext *) source;
QEMUBH *bh;
atomic_and(&ctx->notify_me, ~1);
aio_notify_accept(ctx);
for (bh = ctx->first_bh; bh; bh = bh->next) {
if (!bh->deleted && bh->scheduled) {
return true;
@@ -223,7 +208,7 @@ aio_ctx_dispatch(GSource *source,
AioContext *ctx = (AioContext *) source;
assert(callback == NULL);
aio_dispatch(ctx);
aio_poll(ctx, false);
return true;
}
@@ -232,25 +217,11 @@ aio_ctx_finalize(GSource *source)
{
AioContext *ctx = (AioContext *) source;
qemu_bh_delete(ctx->notify_dummy_bh);
thread_pool_free(ctx->thread_pool);
qemu_mutex_lock(&ctx->bh_lock);
while (ctx->first_bh) {
QEMUBH *next = ctx->first_bh->next;
/* qemu_bh_delete() must have been called on BHs in this AioContext */
assert(ctx->first_bh->deleted);
g_free(ctx->first_bh);
ctx->first_bh = next;
}
qemu_mutex_unlock(&ctx->bh_lock);
aio_set_event_notifier(ctx, &ctx->notifier, NULL);
event_notifier_cleanup(&ctx->notifier);
rfifolock_destroy(&ctx->lock);
qemu_mutex_destroy(&ctx->bh_lock);
g_array_free(ctx->pollfds, TRUE);
timerlistgroup_deinit(&ctx->tlg);
}
@@ -277,21 +248,7 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
void aio_notify(AioContext *ctx)
{
/* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs
* with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
*/
smp_mb();
if (ctx->notify_me) {
event_notifier_set(&ctx->notifier);
atomic_mb_set(&ctx->notified, true);
}
}
void aio_notify_accept(AioContext *ctx)
{
if (atomic_xchg(&ctx->notified, false)) {
event_notifier_test_and_clear(&ctx->notifier);
}
event_notifier_set(&ctx->notifier);
}
static void aio_timerlist_notify(void *opaque)
@@ -299,45 +256,19 @@ static void aio_timerlist_notify(void *opaque)
aio_notify(opaque);
}
static void aio_rfifolock_cb(void *opaque)
AioContext *aio_context_new(void)
{
AioContext *ctx = opaque;
/* Kick owner thread in case they are blocked in aio_poll() */
qemu_bh_schedule(ctx->notify_dummy_bh);
}
static void notify_dummy_bh(void *opaque)
{
/* Do nothing, we were invoked just to force the event loop to iterate */
}
static void event_notifier_dummy_cb(EventNotifier *e)
{
}
AioContext *aio_context_new(Error **errp)
{
int ret;
AioContext *ctx;
ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
ret = event_notifier_init(&ctx->notifier, false);
if (ret < 0) {
g_source_destroy(&ctx->source);
error_setg_errno(errp, -ret, "Failed to initialize event notifier");
return NULL;
}
g_source_set_can_recurse(&ctx->source, true);
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_dummy_cb);
ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
ctx->thread_pool = NULL;
qemu_mutex_init(&ctx->bh_lock);
rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
event_notifier_init(&ctx->notifier, false);
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_test_and_clear);
timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
ctx->notify_dummy_bh = aio_bh_new(ctx, notify_dummy_bh, NULL);
return ctx;
}
@@ -350,13 +281,3 @@ void aio_context_unref(AioContext *ctx)
{
g_source_unref(&ctx->source);
}
void aio_context_acquire(AioContext *ctx)
{
rfifolock_lock(&ctx->lock);
}
void aio_context_release(AioContext *ctx)
{
rfifolock_unlock(&ctx->lock);
}

View File

@@ -5,9 +5,13 @@ common-obj-$(CONFIG_SPICE) += spiceaudio.o
common-obj-$(CONFIG_COREAUDIO) += coreaudio.o
common-obj-$(CONFIG_ALSA) += alsaaudio.o
common-obj-$(CONFIG_DSOUND) += dsoundaudio.o
common-obj-$(CONFIG_FMOD) += fmodaudio.o
common-obj-$(CONFIG_ESD) += esdaudio.o
common-obj-$(CONFIG_PA) += paaudio.o
common-obj-$(CONFIG_WINWAVE) += winwaveaudio.o
common-obj-$(CONFIG_AUDIO_PT_INT) += audio_pt_int.o
common-obj-$(CONFIG_AUDIO_WIN_INT) += audio_win_int.o
common-obj-y += wavcapture.o
sdlaudio.o-cflags := $(SDL_CFLAGS)
$(obj)/audio.o $(obj)/fmodaudio.o: QEMU_CFLAGS += $(FMOD_CFLAGS)
$(obj)/sdlaudio.o: QEMU_CFLAGS += $(SDL_CFLAGS)

View File

@@ -25,7 +25,6 @@
#include "qemu-common.h"
#include "qemu/main-loop.h"
#include "audio.h"
#include "trace.h"
#if QEMU_GNUC_PREREQ(4, 3)
#pragma GCC diagnostic ignored "-Waddress"
@@ -34,28 +33,9 @@
#define AUDIO_CAP "alsa"
#include "audio_int.h"
typedef struct ALSAConf {
int size_in_usec_in;
int size_in_usec_out;
const char *pcm_name_in;
const char *pcm_name_out;
unsigned int buffer_size_in;
unsigned int period_size_in;
unsigned int buffer_size_out;
unsigned int period_size_out;
unsigned int threshold;
int buffer_size_in_overridden;
int period_size_in_overridden;
int buffer_size_out_overridden;
int period_size_out_overridden;
} ALSAConf;
struct pollhlp {
snd_pcm_t *handle;
struct pollfd *pfds;
ALSAConf *conf;
int count;
int mask;
};
@@ -76,6 +56,30 @@ typedef struct ALSAVoiceIn {
struct pollhlp pollhlp;
} ALSAVoiceIn;
static struct {
int size_in_usec_in;
int size_in_usec_out;
const char *pcm_name_in;
const char *pcm_name_out;
unsigned int buffer_size_in;
unsigned int period_size_in;
unsigned int buffer_size_out;
unsigned int period_size_out;
unsigned int threshold;
int buffer_size_in_overridden;
int period_size_in_overridden;
int buffer_size_out_overridden;
int period_size_out_overridden;
int verbose;
} conf = {
.buffer_size_out = 4096,
.period_size_out = 1024,
.pcm_name_out = "default",
.pcm_name_in = "default",
};
struct alsa_params_req {
int freq;
snd_pcm_format_t fmt;
@@ -201,7 +205,9 @@ static void alsa_poll_handler (void *opaque)
}
if (!(revents & hlp->mask)) {
trace_alsa_revents(revents);
if (conf.verbose) {
dolog ("revents = %d\n", revents);
}
return;
}
@@ -260,14 +266,31 @@ static int alsa_poll_helper (snd_pcm_t *handle, struct pollhlp *hlp, int mask)
for (i = 0; i < count; ++i) {
if (pfds[i].events & POLLIN) {
qemu_set_fd_handler (pfds[i].fd, alsa_poll_handler, NULL, hlp);
err = qemu_set_fd_handler (pfds[i].fd, alsa_poll_handler,
NULL, hlp);
}
if (pfds[i].events & POLLOUT) {
trace_alsa_pollout(i, pfds[i].fd);
qemu_set_fd_handler (pfds[i].fd, NULL, alsa_poll_handler, hlp);
if (conf.verbose) {
dolog ("POLLOUT %d %d\n", i, pfds[i].fd);
}
err = qemu_set_fd_handler (pfds[i].fd, NULL,
alsa_poll_handler, hlp);
}
if (conf.verbose) {
dolog ("Set handler events=%#x index=%d fd=%d err=%d\n",
pfds[i].events, i, pfds[i].fd, err);
}
trace_alsa_set_handler(pfds[i].events, i, pfds[i].fd, err);
if (err) {
dolog ("Failed to set handler events=%#x index=%d fd=%d err=%d\n",
pfds[i].events, i, pfds[i].fd, err);
while (i--) {
qemu_set_fd_handler (pfds[i].fd, NULL, NULL, NULL);
}
g_free (pfds);
return -1;
}
}
hlp->pfds = pfds;
hlp->count = count;
@@ -453,15 +476,14 @@ static void alsa_set_threshold (snd_pcm_t *handle, snd_pcm_uframes_t threshold)
}
static int alsa_open (int in, struct alsa_params_req *req,
struct alsa_params_obt *obt, snd_pcm_t **handlep,
ALSAConf *conf)
struct alsa_params_obt *obt, snd_pcm_t **handlep)
{
snd_pcm_t *handle;
snd_pcm_hw_params_t *hw_params;
int err;
int size_in_usec;
unsigned int freq, nchannels;
const char *pcm_name = in ? conf->pcm_name_in : conf->pcm_name_out;
const char *pcm_name = in ? conf.pcm_name_in : conf.pcm_name_out;
snd_pcm_uframes_t obt_buffer_size;
const char *typ = in ? "ADC" : "DAC";
snd_pcm_format_t obtfmt;
@@ -500,7 +522,7 @@ static int alsa_open (int in, struct alsa_params_req *req,
}
err = snd_pcm_hw_params_set_format (handle, hw_params, req->fmt);
if (err < 0) {
if (err < 0 && conf.verbose) {
alsa_logerr2 (err, typ, "Failed to set format %d\n", req->fmt);
}
@@ -632,7 +654,7 @@ static int alsa_open (int in, struct alsa_params_req *req,
goto err;
}
if (!in && conf->threshold) {
if (!in && conf.threshold) {
snd_pcm_uframes_t threshold;
int bytes_per_sec;
@@ -654,7 +676,7 @@ static int alsa_open (int in, struct alsa_params_req *req,
break;
}
threshold = (conf->threshold * bytes_per_sec) / 1000;
threshold = (conf.threshold * bytes_per_sec) / 1000;
alsa_set_threshold (handle, threshold);
}
@@ -664,9 +686,10 @@ static int alsa_open (int in, struct alsa_params_req *req,
*handlep = handle;
if (obtfmt != req->fmt ||
if (conf.verbose &&
(obtfmt != req->fmt ||
obt->nchannels != req->nchannels ||
obt->freq != req->freq) {
obt->freq != req->freq)) {
dolog ("Audio parameters for %s\n", typ);
alsa_dump_info (req, obt, obtfmt);
}
@@ -720,7 +743,9 @@ static void alsa_write_pending (ALSAVoiceOut *alsa)
if (written <= 0) {
switch (written) {
case 0:
trace_alsa_wrote_zero(len);
if (conf.verbose) {
dolog ("Failed to write %d frames (wrote zero)\n", len);
}
return;
case -EPIPE:
@@ -729,7 +754,9 @@ static void alsa_write_pending (ALSAVoiceOut *alsa)
len);
return;
}
trace_alsa_xrun_out();
if (conf.verbose) {
dolog ("Recovering from playback xrun\n");
}
continue;
case -ESTRPIPE:
@@ -740,7 +767,9 @@ static void alsa_write_pending (ALSAVoiceOut *alsa)
len);
return;
}
trace_alsa_resume_out();
if (conf.verbose) {
dolog ("Resuming suspended output stream\n");
}
continue;
case -EAGAIN:
@@ -786,31 +815,31 @@ static void alsa_fini_out (HWVoiceOut *hw)
ldebug ("alsa_fini\n");
alsa_anal_close (&alsa->handle, &alsa->pollhlp);
g_free(alsa->pcm_buf);
alsa->pcm_buf = NULL;
if (alsa->pcm_buf) {
g_free (alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
}
static int alsa_init_out(HWVoiceOut *hw, struct audsettings *as,
void *drv_opaque)
static int alsa_init_out (HWVoiceOut *hw, struct audsettings *as)
{
ALSAVoiceOut *alsa = (ALSAVoiceOut *) hw;
struct alsa_params_req req;
struct alsa_params_obt obt;
snd_pcm_t *handle;
struct audsettings obt_as;
ALSAConf *conf = drv_opaque;
req.fmt = aud_to_alsafmt (as->fmt, as->endianness);
req.freq = as->freq;
req.nchannels = as->nchannels;
req.period_size = conf->period_size_out;
req.buffer_size = conf->buffer_size_out;
req.size_in_usec = conf->size_in_usec_out;
req.period_size = conf.period_size_out;
req.buffer_size = conf.buffer_size_out;
req.size_in_usec = conf.size_in_usec_out;
req.override_mask =
(conf->period_size_out_overridden ? 1 : 0) |
(conf->buffer_size_out_overridden ? 2 : 0);
(conf.period_size_out_overridden ? 1 : 0) |
(conf.buffer_size_out_overridden ? 2 : 0);
if (alsa_open (0, &req, &obt, &handle, conf)) {
if (alsa_open (0, &req, &obt, &handle)) {
return -1;
}
@@ -831,7 +860,6 @@ static int alsa_init_out(HWVoiceOut *hw, struct audsettings *as,
}
alsa->handle = handle;
alsa->pollhlp.conf = conf;
return 0;
}
@@ -902,26 +930,25 @@ static int alsa_ctl_out (HWVoiceOut *hw, int cmd, ...)
return -1;
}
static int alsa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
static int alsa_init_in (HWVoiceIn *hw, struct audsettings *as)
{
ALSAVoiceIn *alsa = (ALSAVoiceIn *) hw;
struct alsa_params_req req;
struct alsa_params_obt obt;
snd_pcm_t *handle;
struct audsettings obt_as;
ALSAConf *conf = drv_opaque;
req.fmt = aud_to_alsafmt (as->fmt, as->endianness);
req.freq = as->freq;
req.nchannels = as->nchannels;
req.period_size = conf->period_size_in;
req.buffer_size = conf->buffer_size_in;
req.size_in_usec = conf->size_in_usec_in;
req.period_size = conf.period_size_in;
req.buffer_size = conf.buffer_size_in;
req.size_in_usec = conf.size_in_usec_in;
req.override_mask =
(conf->period_size_in_overridden ? 1 : 0) |
(conf->buffer_size_in_overridden ? 2 : 0);
(conf.period_size_in_overridden ? 1 : 0) |
(conf.buffer_size_in_overridden ? 2 : 0);
if (alsa_open (1, &req, &obt, &handle, conf)) {
if (alsa_open (1, &req, &obt, &handle)) {
return -1;
}
@@ -942,7 +969,6 @@ static int alsa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
}
alsa->handle = handle;
alsa->pollhlp.conf = conf;
return 0;
}
@@ -952,8 +978,10 @@ static void alsa_fini_in (HWVoiceIn *hw)
alsa_anal_close (&alsa->handle, &alsa->pollhlp);
g_free(alsa->pcm_buf);
alsa->pcm_buf = NULL;
if (alsa->pcm_buf) {
g_free (alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
}
static int alsa_run_in (HWVoiceIn *hw)
@@ -998,10 +1026,14 @@ static int alsa_run_in (HWVoiceIn *hw)
dolog ("Failed to resume suspended input stream\n");
return 0;
}
trace_alsa_resume_in();
if (conf.verbose) {
dolog ("Resuming suspended input stream\n");
}
break;
default:
trace_alsa_no_frames(state);
if (conf.verbose) {
dolog ("No frames available and ALSA state is %d\n", state);
}
return 0;
}
}
@@ -1036,7 +1068,9 @@ static int alsa_run_in (HWVoiceIn *hw)
if (nread <= 0) {
switch (nread) {
case 0:
trace_alsa_read_zero(len);
if (conf.verbose) {
dolog ("Failed to read %ld frames (read zero)\n", len);
}
goto exit;
case -EPIPE:
@@ -1044,7 +1078,9 @@ static int alsa_run_in (HWVoiceIn *hw)
alsa_logerr (nread, "Failed to read %ld frames\n", len);
goto exit;
}
trace_alsa_xrun_in();
if (conf.verbose) {
dolog ("Recovering from capture xrun\n");
}
continue;
case -EAGAIN:
@@ -1116,85 +1152,82 @@ static int alsa_ctl_in (HWVoiceIn *hw, int cmd, ...)
return -1;
}
static ALSAConf glob_conf = {
.buffer_size_out = 4096,
.period_size_out = 1024,
.pcm_name_out = "default",
.pcm_name_in = "default",
};
static void *alsa_audio_init (void)
{
ALSAConf *conf = g_malloc(sizeof(ALSAConf));
*conf = glob_conf;
return conf;
return &conf;
}
static void alsa_audio_fini (void *opaque)
{
g_free(opaque);
(void) opaque;
}
static struct audio_option alsa_options[] = {
{
.name = "DAC_SIZE_IN_USEC",
.tag = AUD_OPT_BOOL,
.valp = &glob_conf.size_in_usec_out,
.valp = &conf.size_in_usec_out,
.descr = "DAC period/buffer size in microseconds (otherwise in frames)"
},
{
.name = "DAC_PERIOD_SIZE",
.tag = AUD_OPT_INT,
.valp = &glob_conf.period_size_out,
.valp = &conf.period_size_out,
.descr = "DAC period size (0 to go with system default)",
.overriddenp = &glob_conf.period_size_out_overridden
.overriddenp = &conf.period_size_out_overridden
},
{
.name = "DAC_BUFFER_SIZE",
.tag = AUD_OPT_INT,
.valp = &glob_conf.buffer_size_out,
.valp = &conf.buffer_size_out,
.descr = "DAC buffer size (0 to go with system default)",
.overriddenp = &glob_conf.buffer_size_out_overridden
.overriddenp = &conf.buffer_size_out_overridden
},
{
.name = "ADC_SIZE_IN_USEC",
.tag = AUD_OPT_BOOL,
.valp = &glob_conf.size_in_usec_in,
.valp = &conf.size_in_usec_in,
.descr =
"ADC period/buffer size in microseconds (otherwise in frames)"
},
{
.name = "ADC_PERIOD_SIZE",
.tag = AUD_OPT_INT,
.valp = &glob_conf.period_size_in,
.valp = &conf.period_size_in,
.descr = "ADC period size (0 to go with system default)",
.overriddenp = &glob_conf.period_size_in_overridden
.overriddenp = &conf.period_size_in_overridden
},
{
.name = "ADC_BUFFER_SIZE",
.tag = AUD_OPT_INT,
.valp = &glob_conf.buffer_size_in,
.valp = &conf.buffer_size_in,
.descr = "ADC buffer size (0 to go with system default)",
.overriddenp = &glob_conf.buffer_size_in_overridden
.overriddenp = &conf.buffer_size_in_overridden
},
{
.name = "THRESHOLD",
.tag = AUD_OPT_INT,
.valp = &glob_conf.threshold,
.valp = &conf.threshold,
.descr = "(undocumented)"
},
{
.name = "DAC_DEV",
.tag = AUD_OPT_STR,
.valp = &glob_conf.pcm_name_out,
.valp = &conf.pcm_name_out,
.descr = "DAC device name (for instance dmix)"
},
{
.name = "ADC_DEV",
.tag = AUD_OPT_STR,
.valp = &glob_conf.pcm_name_in,
.valp = &conf.pcm_name_in,
.descr = "ADC device name"
},
{
.name = "VERBOSE",
.tag = AUD_OPT_BOOL,
.valp = &conf.verbose,
.descr = "Behave in a more verbose way"
},
{ /* End of list */ }
};

View File

@@ -30,6 +30,7 @@
#define AUDIO_CAP "audio"
#include "audio_int.h"
/* #define DEBUG_PLIVE */
/* #define DEBUG_LIVE */
/* #define DEBUG_OUT */
/* #define DEBUG_CAPTURE */
@@ -65,6 +66,8 @@ static struct {
int hertz;
int64_t ticks;
} period;
int plive;
int log_to_monitor;
int try_poll_in;
int try_poll_out;
} conf = {
@@ -92,7 +95,9 @@ static struct {
}
},
.period = { .hertz = 100 },
.period = { .hertz = 250 },
.plive = 0,
.log_to_monitor = 0,
.try_poll_in = 1,
.try_poll_out = 1,
};
@@ -326,11 +331,20 @@ static const char *audio_get_conf_str (const char *key,
void AUD_vlog (const char *cap, const char *fmt, va_list ap)
{
if (cap) {
fprintf(stderr, "%s: ", cap);
}
if (conf.log_to_monitor) {
if (cap) {
monitor_printf(default_mon, "%s: ", cap);
}
vfprintf(stderr, fmt, ap);
monitor_vprintf(default_mon, fmt, ap);
}
else {
if (cap) {
fprintf (stderr, "%s: ", cap);
}
vfprintf (stderr, fmt, ap);
}
}
void AUD_log (const char *cap, const char *fmt, ...)
@@ -1440,6 +1454,9 @@ static void audio_run_out (AudioState *s)
while (sw) {
sw1 = sw->entries.le_next;
if (!sw->active && !sw->callback.fn) {
#ifdef DEBUG_PLIVE
dolog ("Finishing with old voice\n");
#endif
audio_close_out (sw);
}
sw = sw1;
@@ -1631,6 +1648,18 @@ static struct audio_option audio_options[] = {
.valp = &conf.period.hertz,
.descr = "Timer period in HZ (0 - use lowest possible)"
},
{
.name = "PLIVE",
.tag = AUD_OPT_BOOL,
.valp = &conf.plive,
.descr = "(undocumented)"
},
{
.name = "LOG_TO_MONITOR",
.tag = AUD_OPT_BOOL,
.valp = &conf.log_to_monitor,
.descr = "Print logging messages to monitor instead of stderr"
},
{ /* End of list */ }
};
@@ -1783,7 +1812,8 @@ static const VMStateDescription vmstate_audio = {
.name = "audio",
.version_id = 1,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
.minimum_version_id_old = 1,
.fields = (VMStateField []) {
VMSTATE_END_OF_LIST()
}
};

View File

@@ -156,13 +156,13 @@ struct audio_driver {
};
struct audio_pcm_ops {
int (*init_out)(HWVoiceOut *hw, struct audsettings *as, void *drv_opaque);
int (*init_out)(HWVoiceOut *hw, struct audsettings *as);
void (*fini_out)(HWVoiceOut *hw);
int (*run_out) (HWVoiceOut *hw, int live);
int (*write) (SWVoiceOut *sw, void *buf, int size);
int (*ctl_out) (HWVoiceOut *hw, int cmd, ...);
int (*init_in) (HWVoiceIn *hw, struct audsettings *as, void *drv_opaque);
int (*init_in) (HWVoiceIn *hw, struct audsettings *as);
void (*fini_in) (HWVoiceIn *hw);
int (*run_in) (HWVoiceIn *hw);
int (*read) (SWVoiceIn *sw, void *buf, int size);
@@ -206,11 +206,14 @@ extern struct audio_driver no_audio_driver;
extern struct audio_driver oss_audio_driver;
extern struct audio_driver sdl_audio_driver;
extern struct audio_driver wav_audio_driver;
extern struct audio_driver fmod_audio_driver;
extern struct audio_driver alsa_audio_driver;
extern struct audio_driver coreaudio_audio_driver;
extern struct audio_driver dsound_audio_driver;
extern struct audio_driver esd_audio_driver;
extern struct audio_driver pa_audio_driver;
extern struct audio_driver spice_audio_driver;
extern struct audio_driver winwave_audio_driver;
extern const struct mixeng_volume nominal_volume;
void audio_pcm_init_info (struct audio_pcm_info *info, struct audsettings *as);

View File

@@ -71,7 +71,10 @@ static void glue (audio_init_nb_voices_, TYPE) (struct audio_driver *drv)
static void glue (audio_pcm_hw_free_resources_, TYPE) (HW *hw)
{
g_free (HWBUF);
if (HWBUF) {
g_free (HWBUF);
}
HWBUF = NULL;
}
@@ -89,7 +92,9 @@ static int glue (audio_pcm_hw_alloc_resources_, TYPE) (HW *hw)
static void glue (audio_pcm_sw_free_resources_, TYPE) (SW *sw)
{
g_free (sw->buf);
if (sw->buf) {
g_free (sw->buf);
}
if (sw->rate) {
st_rate_stop (sw->rate);
@@ -167,8 +172,10 @@ static int glue (audio_pcm_sw_init_, TYPE) (
static void glue (audio_pcm_sw_fini_, TYPE) (SW *sw)
{
glue (audio_pcm_sw_free_resources_, TYPE) (sw);
g_free (sw->name);
sw->name = NULL;
if (sw->name) {
g_free (sw->name);
sw->name = NULL;
}
}
static void glue (audio_pcm_hw_add_sw_, TYPE) (HW *hw, SW *sw)
@@ -191,9 +198,9 @@ static void glue (audio_pcm_hw_gc_, TYPE) (HW **hwp)
audio_detach_capture (hw);
#endif
QLIST_REMOVE (hw, entries);
glue (hw->pcm_ops->fini_, TYPE) (hw);
glue (s->nb_hw_voices_, TYPE) += 1;
glue (audio_pcm_hw_free_resources_ ,TYPE) (hw);
glue (hw->pcm_ops->fini_, TYPE) (hw);
g_free (hw);
*hwp = NULL;
}
@@ -262,7 +269,7 @@ static HW *glue (audio_pcm_hw_add_new_, TYPE) (struct audsettings *as)
#ifdef DAC
QLIST_INIT (&hw->cap_head);
#endif
if (glue (hw->pcm_ops->init_, TYPE) (hw, as, s->drv_opaque)) {
if (glue (hw->pcm_ops->init_, TYPE) (hw, as)) {
goto err0;
}
@@ -398,6 +405,10 @@ SW *glue (AUD_open_, TYPE) (
)
{
AudioState *s = &glob_audio_state;
#ifdef DAC
int live = 0;
SW *old_sw = NULL;
#endif
if (audio_bug (AUDIO_FUNC, !card || !name || !callback_fn || !as)) {
dolog ("card=%p name=%p callback_fn=%p as=%p\n",
@@ -422,6 +433,29 @@ SW *glue (AUD_open_, TYPE) (
return sw;
}
#ifdef DAC
if (conf.plive && sw && (!sw->active && !sw->empty)) {
live = sw->total_hw_samples_mixed;
#ifdef DEBUG_PLIVE
dolog ("Replacing voice %s with %d live samples\n", SW_NAME (sw), live);
dolog ("Old %s freq %d, bits %d, channels %d\n",
SW_NAME (sw), sw->info.freq, sw->info.bits, sw->info.nchannels);
dolog ("New %s freq %d, bits %d, channels %d\n",
name,
as->freq,
(as->fmt == AUD_FMT_S16 || as->fmt == AUD_FMT_U16) ? 16 : 8,
as->nchannels);
#endif
if (live) {
old_sw = sw;
old_sw->callback.fn = NULL;
sw = NULL;
}
}
#endif
if (!glue (conf.fixed_, TYPE).enabled && sw) {
glue (AUD_close_, TYPE) (card, sw);
sw = NULL;
@@ -454,6 +488,20 @@ SW *glue (AUD_open_, TYPE) (
sw->callback.fn = callback_fn;
sw->callback.opaque = callback_opaque;
#ifdef DAC
if (live) {
int mixed =
(live << old_sw->info.shift)
* old_sw->info.bytes_per_second
/ sw->info.bytes_per_second;
#ifdef DEBUG_PLIVE
dolog ("Silence will be mixed %d\n", mixed);
#endif
sw->total_hw_samples_mixed += mixed;
}
#endif
#ifdef DEBUG_AUDIO
dolog ("%s\n", name);
audio_pcm_print_info ("hw", &sw->hw->info);

View File

@@ -32,16 +32,20 @@
#define AUDIO_CAP "coreaudio"
#include "audio_int.h"
static int isAtexit;
typedef struct {
struct {
int buffer_frames;
int nbuffers;
} CoreaudioConf;
int isAtexit;
} conf = {
.buffer_frames = 512,
.nbuffers = 4,
.isAtexit = 0
};
typedef struct coreaudioVoiceOut {
HWVoiceOut hw;
pthread_mutex_t mutex;
int isAtexit;
AudioDeviceID outputDeviceID;
UInt32 audioDevicePropertyBufferFrameSize;
AudioStreamBasicDescription outputStreamBasicDescription;
@@ -157,7 +161,7 @@ static inline UInt32 isPlaying (AudioDeviceID outputDeviceID)
static void coreaudio_atexit (void)
{
isAtexit = 1;
conf.isAtexit = 1;
}
static int coreaudio_lock (coreaudioVoiceOut *core, const char *fn_name)
@@ -283,8 +287,7 @@ static int coreaudio_write (SWVoiceOut *sw, void *buf, int len)
return audio_pcm_sw_write (sw, buf, len);
}
static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
void *drv_opaque)
static int coreaudio_init_out (HWVoiceOut *hw, struct audsettings *as)
{
OSStatus status;
coreaudioVoiceOut *core = (coreaudioVoiceOut *) hw;
@@ -292,7 +295,6 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
int err;
const char *typ = "playback";
AudioValueRange frameRange;
CoreaudioConf *conf = drv_opaque;
/* create mutex */
err = pthread_mutex_init(&core->mutex, NULL);
@@ -334,16 +336,16 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
return -1;
}
if (frameRange.mMinimum > conf->buffer_frames) {
if (frameRange.mMinimum > conf.buffer_frames) {
core->audioDevicePropertyBufferFrameSize = (UInt32) frameRange.mMinimum;
dolog ("warning: Upsizing Buffer Frames to %f\n", frameRange.mMinimum);
}
else if (frameRange.mMaximum < conf->buffer_frames) {
else if (frameRange.mMaximum < conf.buffer_frames) {
core->audioDevicePropertyBufferFrameSize = (UInt32) frameRange.mMaximum;
dolog ("warning: Downsizing Buffer Frames to %f\n", frameRange.mMaximum);
}
else {
core->audioDevicePropertyBufferFrameSize = conf->buffer_frames;
core->audioDevicePropertyBufferFrameSize = conf.buffer_frames;
}
/* set Buffer Frame Size */
@@ -377,7 +379,7 @@ static int coreaudio_init_out(HWVoiceOut *hw, struct audsettings *as,
"Could not get device buffer frame size\n");
return -1;
}
hw->samples = conf->nbuffers * core->audioDevicePropertyBufferFrameSize;
hw->samples = conf.nbuffers * core->audioDevicePropertyBufferFrameSize;
/* get StreamFormat */
propertySize = sizeof(core->outputStreamBasicDescription);
@@ -441,7 +443,7 @@ static void coreaudio_fini_out (HWVoiceOut *hw)
int err;
coreaudioVoiceOut *core = (coreaudioVoiceOut *) hw;
if (!isAtexit) {
if (!conf.isAtexit) {
/* stop playback */
if (isPlaying(core->outputDeviceID)) {
status = AudioDeviceStop(core->outputDeviceID, audioDeviceIOProc);
@@ -484,7 +486,7 @@ static int coreaudio_ctl_out (HWVoiceOut *hw, int cmd, ...)
case VOICE_DISABLE:
/* stop playback */
if (!isAtexit) {
if (!conf.isAtexit) {
if (isPlaying(core->outputDeviceID)) {
status = AudioDeviceStop(core->outputDeviceID, audioDeviceIOProc);
if (status != kAudioHardwareNoError) {
@@ -497,36 +499,28 @@ static int coreaudio_ctl_out (HWVoiceOut *hw, int cmd, ...)
return 0;
}
static CoreaudioConf glob_conf = {
.buffer_frames = 512,
.nbuffers = 4,
};
static void *coreaudio_audio_init (void)
{
CoreaudioConf *conf = g_malloc(sizeof(CoreaudioConf));
*conf = glob_conf;
atexit(coreaudio_atexit);
return conf;
return &coreaudio_audio_init;
}
static void coreaudio_audio_fini (void *opaque)
{
g_free(opaque);
(void) opaque;
}
static struct audio_option coreaudio_options[] = {
{
.name = "BUFFER_SIZE",
.tag = AUD_OPT_INT,
.valp = &glob_conf.buffer_frames,
.valp = &conf.buffer_frames,
.descr = "Size of the buffer in frames"
},
{
.name = "BUFFER_COUNT",
.tag = AUD_OPT_INT,
.valp = &glob_conf.nbuffers,
.valp = &conf.nbuffers,
.descr = "Number of buffers"
},
{ /* End of list */ }

View File

@@ -67,11 +67,11 @@ static int glue (dsound_lock_, TYPE) (
LPVOID *p2p,
DWORD *blen1p,
DWORD *blen2p,
int entire,
dsound *s
int entire
)
{
HRESULT hr;
int i;
LPVOID p1 = NULL, p2 = NULL;
DWORD blen1 = 0, blen2 = 0;
DWORD flag;
@@ -81,18 +81,37 @@ static int glue (dsound_lock_, TYPE) (
#else
flag = entire ? DSBLOCK_ENTIREBUFFER : 0;
#endif
hr = glue(IFACE, _Lock)(buf, pos, len, &p1, &blen1, &p2, &blen2, flag);
for (i = 0; i < conf.lock_retries; ++i) {
hr = glue (IFACE, _Lock) (
buf,
pos,
len,
&p1,
&blen1,
&p2,
&blen2,
flag
);
if (FAILED (hr)) {
if (FAILED (hr)) {
#ifndef DSBTYPE_IN
if (hr == DSERR_BUFFERLOST) {
if (glue (dsound_restore_, TYPE) (buf, s)) {
dsound_logerr (hr, "Could not lock " NAME "\n");
if (hr == DSERR_BUFFERLOST) {
if (glue (dsound_restore_, TYPE) (buf)) {
dsound_logerr (hr, "Could not lock " NAME "\n");
goto fail;
}
continue;
}
#endif
dsound_logerr (hr, "Could not lock " NAME "\n");
goto fail;
}
#endif
dsound_logerr (hr, "Could not lock " NAME "\n");
break;
}
if (i == conf.lock_retries) {
dolog ("%d attempts to lock " NAME " failed\n", i);
goto fail;
}
@@ -155,19 +174,16 @@ static void dsound_fini_out (HWVoiceOut *hw)
}
#ifdef DSBTYPE_IN
static int dsound_init_in(HWVoiceIn *hw, struct audsettings *as,
void *drv_opaque)
static int dsound_init_in (HWVoiceIn *hw, struct audsettings *as)
#else
static int dsound_init_out(HWVoiceOut *hw, struct audsettings *as,
void *drv_opaque)
static int dsound_init_out (HWVoiceOut *hw, struct audsettings *as)
#endif
{
int err;
HRESULT hr;
dsound *s = drv_opaque;
dsound *s = &glob_dsound;
WAVEFORMATEX wfx;
struct audsettings obt_as;
DSoundConf *conf = &s->conf;
#ifdef DSBTYPE_IN
const char *typ = "ADC";
DSoundVoiceIn *ds = (DSoundVoiceIn *) hw;
@@ -194,7 +210,7 @@ static int dsound_init_out(HWVoiceOut *hw, struct audsettings *as,
bd.dwSize = sizeof (bd);
bd.lpwfxFormat = &wfx;
#ifdef DSBTYPE_IN
bd.dwBufferBytes = conf->bufsize_in;
bd.dwBufferBytes = conf.bufsize_in;
hr = IDirectSoundCapture_CreateCaptureBuffer (
s->dsound_capture,
&bd,
@@ -203,7 +219,7 @@ static int dsound_init_out(HWVoiceOut *hw, struct audsettings *as,
);
#else
bd.dwFlags = DSBCAPS_STICKYFOCUS | DSBCAPS_GETCURRENTPOSITION2;
bd.dwBufferBytes = conf->bufsize_out;
bd.dwBufferBytes = conf.bufsize_out;
hr = IDirectSound_CreateSoundBuffer (
s->dsound,
&bd,
@@ -253,7 +269,6 @@ static int dsound_init_out(HWVoiceOut *hw, struct audsettings *as,
);
}
hw->samples = bc.dwBufferBytes >> hw->info.shift;
ds->s = s;
#ifdef DEBUG_DSOUND
dolog ("caps %ld, desc %ld\n",

View File

@@ -41,25 +41,42 @@
/* #define DEBUG_DSOUND */
typedef struct {
static struct {
int lock_retries;
int restore_retries;
int getstatus_retries;
int set_primary;
int bufsize_in;
int bufsize_out;
struct audsettings settings;
int latency_millis;
} DSoundConf;
} conf = {
.lock_retries = 1,
.restore_retries = 1,
.getstatus_retries = 1,
.set_primary = 0,
.bufsize_in = 16384,
.bufsize_out = 16384,
.settings.freq = 44100,
.settings.nchannels = 2,
.settings.fmt = AUD_FMT_S16,
.latency_millis = 10
};
typedef struct {
LPDIRECTSOUND dsound;
LPDIRECTSOUNDCAPTURE dsound_capture;
LPDIRECTSOUNDBUFFER dsound_primary_buffer;
struct audsettings settings;
DSoundConf conf;
} dsound;
static dsound glob_dsound;
typedef struct {
HWVoiceOut hw;
LPDIRECTSOUNDBUFFER dsound_buffer;
DWORD old_pos;
int first_time;
dsound *s;
#ifdef DEBUG_DSOUND
DWORD old_ppos;
DWORD played;
@@ -71,7 +88,6 @@ typedef struct {
HWVoiceIn hw;
int first_time;
LPDIRECTSOUNDCAPTUREBUFFER dsound_capture_buffer;
dsound *s;
} DSoundVoiceIn;
static void dsound_log_hresult (HRESULT hr)
@@ -265,17 +281,29 @@ static void print_wave_format (WAVEFORMATEX *wfx)
}
#endif
static int dsound_restore_out (LPDIRECTSOUNDBUFFER dsb, dsound *s)
static int dsound_restore_out (LPDIRECTSOUNDBUFFER dsb)
{
HRESULT hr;
int i;
hr = IDirectSoundBuffer_Restore (dsb);
for (i = 0; i < conf.restore_retries; ++i) {
hr = IDirectSoundBuffer_Restore (dsb);
if (hr != DS_OK) {
dsound_logerr (hr, "Could not restore playback buffer\n");
return -1;
switch (hr) {
case DS_OK:
return 0;
case DSERR_BUFFERLOST:
continue;
default:
dsound_logerr (hr, "Could not restore playback buffer\n");
return -1;
}
}
return 0;
dolog ("%d attempts to restore playback buffer failed\n", i);
return -1;
}
#include "dsound_template.h"
@@ -283,20 +311,25 @@ static int dsound_restore_out (LPDIRECTSOUNDBUFFER dsb, dsound *s)
#include "dsound_template.h"
#undef DSBTYPE_IN
static int dsound_get_status_out (LPDIRECTSOUNDBUFFER dsb, DWORD *statusp,
dsound *s)
static int dsound_get_status_out (LPDIRECTSOUNDBUFFER dsb, DWORD *statusp)
{
HRESULT hr;
int i;
hr = IDirectSoundBuffer_GetStatus (dsb, statusp);
if (FAILED (hr)) {
dsound_logerr (hr, "Could not get playback buffer status\n");
return -1;
}
for (i = 0; i < conf.getstatus_retries; ++i) {
hr = IDirectSoundBuffer_GetStatus (dsb, statusp);
if (FAILED (hr)) {
dsound_logerr (hr, "Could not get playback buffer status\n");
return -1;
}
if (*statusp & DSERR_BUFFERLOST) {
dsound_restore_out(dsb, s);
return -1;
if (*statusp & DSERR_BUFFERLOST) {
if (dsound_restore_out (dsb)) {
return -1;
}
continue;
}
break;
}
return 0;
@@ -343,8 +376,7 @@ static void dsound_write_sample (HWVoiceOut *hw, uint8_t *dst, int dst_len)
hw->rpos = pos % hw->samples;
}
static void dsound_clear_sample (HWVoiceOut *hw, LPDIRECTSOUNDBUFFER dsb,
dsound *s)
static void dsound_clear_sample (HWVoiceOut *hw, LPDIRECTSOUNDBUFFER dsb)
{
int err;
LPVOID p1, p2;
@@ -357,8 +389,7 @@ static void dsound_clear_sample (HWVoiceOut *hw, LPDIRECTSOUNDBUFFER dsb,
hw->samples << hw->info.shift,
&p1, &p2,
&blen1, &blen2,
1,
s
1
);
if (err) {
return;
@@ -384,9 +415,25 @@ static void dsound_clear_sample (HWVoiceOut *hw, LPDIRECTSOUNDBUFFER dsb,
dsound_unlock_out (dsb, p1, p2, blen1, blen2);
}
static int dsound_open (dsound *s)
static void dsound_close (dsound *s)
{
HRESULT hr;
if (s->dsound_primary_buffer) {
hr = IDirectSoundBuffer_Release (s->dsound_primary_buffer);
if (FAILED (hr)) {
dsound_logerr (hr, "Could not release primary buffer\n");
}
s->dsound_primary_buffer = NULL;
}
}
static int dsound_open (dsound *s)
{
int err;
HRESULT hr;
WAVEFORMATEX wfx;
DSBUFFERDESC dsbd;
HWND hwnd;
hwnd = GetForegroundWindow ();
@@ -402,7 +449,63 @@ static int dsound_open (dsound *s)
return -1;
}
if (!conf.set_primary) {
return 0;
}
err = waveformat_from_audio_settings (&wfx, &conf.settings);
if (err) {
return -1;
}
memset (&dsbd, 0, sizeof (dsbd));
dsbd.dwSize = sizeof (dsbd);
dsbd.dwFlags = DSBCAPS_PRIMARYBUFFER;
dsbd.dwBufferBytes = 0;
dsbd.lpwfxFormat = NULL;
hr = IDirectSound_CreateSoundBuffer (
s->dsound,
&dsbd,
&s->dsound_primary_buffer,
NULL
);
if (FAILED (hr)) {
dsound_logerr (hr, "Could not create primary playback buffer\n");
return -1;
}
hr = IDirectSoundBuffer_SetFormat (s->dsound_primary_buffer, &wfx);
if (FAILED (hr)) {
dsound_logerr (hr, "Could not set primary playback buffer format\n");
}
hr = IDirectSoundBuffer_GetFormat (
s->dsound_primary_buffer,
&wfx,
sizeof (wfx),
NULL
);
if (FAILED (hr)) {
dsound_logerr (hr, "Could not get primary playback buffer format\n");
goto fail0;
}
#ifdef DEBUG_DSOUND
dolog ("Primary\n");
print_wave_format (&wfx);
#endif
err = waveformat_to_audio_settings (&wfx, &s->settings);
if (err) {
goto fail0;
}
return 0;
fail0:
dsound_close (s);
return -1;
}
static int dsound_ctl_out (HWVoiceOut *hw, int cmd, ...)
@@ -411,7 +514,6 @@ static int dsound_ctl_out (HWVoiceOut *hw, int cmd, ...)
DWORD status;
DSoundVoiceOut *ds = (DSoundVoiceOut *) hw;
LPDIRECTSOUNDBUFFER dsb = ds->dsound_buffer;
dsound *s = ds->s;
if (!dsb) {
dolog ("Attempt to control voice without a buffer\n");
@@ -420,7 +522,7 @@ static int dsound_ctl_out (HWVoiceOut *hw, int cmd, ...)
switch (cmd) {
case VOICE_ENABLE:
if (dsound_get_status_out (dsb, &status, s)) {
if (dsound_get_status_out (dsb, &status)) {
return -1;
}
@@ -429,7 +531,7 @@ static int dsound_ctl_out (HWVoiceOut *hw, int cmd, ...)
return 0;
}
dsound_clear_sample (hw, dsb, s);
dsound_clear_sample (hw, dsb);
hr = IDirectSoundBuffer_Play (dsb, 0, 0, DSBPLAY_LOOPING);
if (FAILED (hr)) {
@@ -439,7 +541,7 @@ static int dsound_ctl_out (HWVoiceOut *hw, int cmd, ...)
break;
case VOICE_DISABLE:
if (dsound_get_status_out (dsb, &status, s)) {
if (dsound_get_status_out (dsb, &status)) {
return -1;
}
@@ -476,8 +578,6 @@ static int dsound_run_out (HWVoiceOut *hw, int live)
DWORD wpos, ppos, old_pos;
LPVOID p1, p2;
int bufsize;
dsound *s = ds->s;
DSoundConf *conf = &s->conf;
if (!dsb) {
dolog ("Attempt to run empty with playback buffer\n");
@@ -500,14 +600,14 @@ static int dsound_run_out (HWVoiceOut *hw, int live)
len = live << hwshift;
if (ds->first_time) {
if (conf->latency_millis) {
if (conf.latency_millis) {
DWORD cur_blat;
cur_blat = audio_ring_dist (wpos, ppos, bufsize);
ds->first_time = 0;
old_pos = wpos;
old_pos +=
millis_to_bytes (&hw->info, conf->latency_millis) - cur_blat;
millis_to_bytes (&hw->info, conf.latency_millis) - cur_blat;
old_pos %= bufsize;
old_pos &= ~hw->info.align;
}
@@ -563,8 +663,7 @@ static int dsound_run_out (HWVoiceOut *hw, int live)
len,
&p1, &p2,
&blen1, &blen2,
0,
s
0
);
if (err) {
return 0;
@@ -667,7 +766,6 @@ static int dsound_run_in (HWVoiceIn *hw)
DWORD cpos, rpos;
LPVOID p1, p2;
int hwshift;
dsound *s = ds->s;
if (!dscb) {
dolog ("Attempt to run without capture buffer\n");
@@ -722,8 +820,7 @@ static int dsound_run_in (HWVoiceIn *hw)
&p2,
&blen1,
&blen2,
0,
s
0
);
if (err) {
return 0;
@@ -746,19 +843,12 @@ static int dsound_run_in (HWVoiceIn *hw)
return decr;
}
static DSoundConf glob_conf = {
.bufsize_in = 16384,
.bufsize_out = 16384,
.latency_millis = 10
};
static void dsound_audio_fini (void *opaque)
{
HRESULT hr;
dsound *s = opaque;
if (!s->dsound) {
g_free(s);
return;
}
@@ -769,7 +859,6 @@ static void dsound_audio_fini (void *opaque)
s->dsound = NULL;
if (!s->dsound_capture) {
g_free(s);
return;
}
@@ -778,21 +867,17 @@ static void dsound_audio_fini (void *opaque)
dsound_logerr (hr, "Could not release DirectSoundCapture\n");
}
s->dsound_capture = NULL;
g_free(s);
}
static void *dsound_audio_init (void)
{
int err;
HRESULT hr;
dsound *s = g_malloc0(sizeof(dsound));
dsound *s = &glob_dsound;
s->conf = glob_conf;
hr = CoInitialize (NULL);
if (FAILED (hr)) {
dsound_logerr (hr, "Could not initialize COM\n");
g_free(s);
return NULL;
}
@@ -805,7 +890,6 @@ static void *dsound_audio_init (void)
);
if (FAILED (hr)) {
dsound_logerr (hr, "Could not create DirectSound instance\n");
g_free(s);
return NULL;
}
@@ -817,7 +901,7 @@ static void *dsound_audio_init (void)
if (FAILED (hr)) {
dsound_logerr (hr, "Could not release DirectSound\n");
}
g_free(s);
s->dsound = NULL;
return NULL;
}
@@ -854,22 +938,64 @@ static void *dsound_audio_init (void)
}
static struct audio_option dsound_options[] = {
{
.name = "LOCK_RETRIES",
.tag = AUD_OPT_INT,
.valp = &conf.lock_retries,
.descr = "Number of times to attempt locking the buffer"
},
{
.name = "RESTOURE_RETRIES",
.tag = AUD_OPT_INT,
.valp = &conf.restore_retries,
.descr = "Number of times to attempt restoring the buffer"
},
{
.name = "GETSTATUS_RETRIES",
.tag = AUD_OPT_INT,
.valp = &conf.getstatus_retries,
.descr = "Number of times to attempt getting status of the buffer"
},
{
.name = "SET_PRIMARY",
.tag = AUD_OPT_BOOL,
.valp = &conf.set_primary,
.descr = "Set the parameters of primary buffer"
},
{
.name = "LATENCY_MILLIS",
.tag = AUD_OPT_INT,
.valp = &glob_conf.latency_millis,
.valp = &conf.latency_millis,
.descr = "(undocumented)"
},
{
.name = "PRIMARY_FREQ",
.tag = AUD_OPT_INT,
.valp = &conf.settings.freq,
.descr = "Primary buffer frequency"
},
{
.name = "PRIMARY_CHANNELS",
.tag = AUD_OPT_INT,
.valp = &conf.settings.nchannels,
.descr = "Primary buffer number of channels (1 - mono, 2 - stereo)"
},
{
.name = "PRIMARY_FMT",
.tag = AUD_OPT_FMT,
.valp = &conf.settings.fmt,
.descr = "Primary buffer format"
},
{
.name = "BUFSIZE_OUT",
.tag = AUD_OPT_INT,
.valp = &glob_conf.bufsize_out,
.valp = &conf.bufsize_out,
.descr = "(undocumented)"
},
{
.name = "BUFSIZE_IN",
.tag = AUD_OPT_INT,
.valp = &glob_conf.bufsize_in,
.valp = &conf.bufsize_in,
.descr = "(undocumented)"
},
{ /* End of list */ }

557
audio/esdaudio.c Normal file
View File

@@ -0,0 +1,557 @@
/*
* QEMU ESD audio driver
*
* Copyright (c) 2006 Frederick Reeve (brushed up by malc)
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include <esd.h>
#include "qemu-common.h"
#include "audio.h"
#define AUDIO_CAP "esd"
#include "audio_int.h"
#include "audio_pt_int.h"
typedef struct {
HWVoiceOut hw;
int done;
int live;
int decr;
int rpos;
void *pcm_buf;
int fd;
struct audio_pt pt;
} ESDVoiceOut;
typedef struct {
HWVoiceIn hw;
int done;
int dead;
int incr;
int wpos;
void *pcm_buf;
int fd;
struct audio_pt pt;
} ESDVoiceIn;
static struct {
int samples;
int divisor;
char *dac_host;
char *adc_host;
} conf = {
.samples = 1024,
.divisor = 2,
};
static void GCC_FMT_ATTR (2, 3) qesd_logerr (int err, const char *fmt, ...)
{
va_list ap;
va_start (ap, fmt);
AUD_vlog (AUDIO_CAP, fmt, ap);
va_end (ap);
AUD_log (AUDIO_CAP, "Reason: %s\n", strerror (err));
}
/* playback */
static void *qesd_thread_out (void *arg)
{
ESDVoiceOut *esd = arg;
HWVoiceOut *hw = &esd->hw;
int threshold;
threshold = conf.divisor ? hw->samples / conf.divisor : 0;
if (audio_pt_lock (&esd->pt, AUDIO_FUNC)) {
return NULL;
}
for (;;) {
int decr, to_mix, rpos;
for (;;) {
if (esd->done) {
goto exit;
}
if (esd->live > threshold) {
break;
}
if (audio_pt_wait (&esd->pt, AUDIO_FUNC)) {
goto exit;
}
}
decr = to_mix = esd->live;
rpos = hw->rpos;
if (audio_pt_unlock (&esd->pt, AUDIO_FUNC)) {
return NULL;
}
while (to_mix) {
ssize_t written;
int chunk = audio_MIN (to_mix, hw->samples - rpos);
struct st_sample *src = hw->mix_buf + rpos;
hw->clip (esd->pcm_buf, src, chunk);
again:
written = write (esd->fd, esd->pcm_buf, chunk << hw->info.shift);
if (written == -1) {
if (errno == EINTR || errno == EAGAIN) {
goto again;
}
qesd_logerr (errno, "write failed\n");
return NULL;
}
if (written != chunk << hw->info.shift) {
int wsamples = written >> hw->info.shift;
int wbytes = wsamples << hw->info.shift;
if (wbytes != written) {
dolog ("warning: Misaligned write %d (requested %zd), "
"alignment %d\n",
wbytes, written, hw->info.align + 1);
}
to_mix -= wsamples;
rpos = (rpos + wsamples) % hw->samples;
break;
}
rpos = (rpos + chunk) % hw->samples;
to_mix -= chunk;
}
if (audio_pt_lock (&esd->pt, AUDIO_FUNC)) {
return NULL;
}
esd->rpos = rpos;
esd->live -= decr;
esd->decr += decr;
}
exit:
audio_pt_unlock (&esd->pt, AUDIO_FUNC);
return NULL;
}
static int qesd_run_out (HWVoiceOut *hw, int live)
{
int decr;
ESDVoiceOut *esd = (ESDVoiceOut *) hw;
if (audio_pt_lock (&esd->pt, AUDIO_FUNC)) {
return 0;
}
decr = audio_MIN (live, esd->decr);
esd->decr -= decr;
esd->live = live - decr;
hw->rpos = esd->rpos;
if (esd->live > 0) {
audio_pt_unlock_and_signal (&esd->pt, AUDIO_FUNC);
}
else {
audio_pt_unlock (&esd->pt, AUDIO_FUNC);
}
return decr;
}
static int qesd_write (SWVoiceOut *sw, void *buf, int len)
{
return audio_pcm_sw_write (sw, buf, len);
}
static int qesd_init_out (HWVoiceOut *hw, struct audsettings *as)
{
ESDVoiceOut *esd = (ESDVoiceOut *) hw;
struct audsettings obt_as = *as;
int esdfmt = ESD_STREAM | ESD_PLAY;
esdfmt |= (as->nchannels == 2) ? ESD_STEREO : ESD_MONO;
switch (as->fmt) {
case AUD_FMT_S8:
case AUD_FMT_U8:
esdfmt |= ESD_BITS8;
obt_as.fmt = AUD_FMT_U8;
break;
case AUD_FMT_S32:
case AUD_FMT_U32:
dolog ("Will use 16 instead of 32 bit samples\n");
/* fall through */
case AUD_FMT_S16:
case AUD_FMT_U16:
deffmt:
esdfmt |= ESD_BITS16;
obt_as.fmt = AUD_FMT_S16;
break;
default:
dolog ("Internal logic error: Bad audio format %d\n", as->fmt);
goto deffmt;
}
obt_as.endianness = AUDIO_HOST_ENDIANNESS;
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = conf.samples;
esd->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!esd->pcm_buf) {
dolog ("Could not allocate buffer (%d bytes)\n",
hw->samples << hw->info.shift);
return -1;
}
esd->fd = esd_play_stream (esdfmt, as->freq, conf.dac_host, NULL);
if (esd->fd < 0) {
qesd_logerr (errno, "esd_play_stream failed\n");
goto fail1;
}
if (audio_pt_init (&esd->pt, qesd_thread_out, esd, AUDIO_CAP, AUDIO_FUNC)) {
goto fail2;
}
return 0;
fail2:
if (close (esd->fd)) {
qesd_logerr (errno, "%s: close on esd socket(%d) failed\n",
AUDIO_FUNC, esd->fd);
}
esd->fd = -1;
fail1:
g_free (esd->pcm_buf);
esd->pcm_buf = NULL;
return -1;
}
static void qesd_fini_out (HWVoiceOut *hw)
{
void *ret;
ESDVoiceOut *esd = (ESDVoiceOut *) hw;
audio_pt_lock (&esd->pt, AUDIO_FUNC);
esd->done = 1;
audio_pt_unlock_and_signal (&esd->pt, AUDIO_FUNC);
audio_pt_join (&esd->pt, &ret, AUDIO_FUNC);
if (esd->fd >= 0) {
if (close (esd->fd)) {
qesd_logerr (errno, "failed to close esd socket\n");
}
esd->fd = -1;
}
audio_pt_fini (&esd->pt, AUDIO_FUNC);
g_free (esd->pcm_buf);
esd->pcm_buf = NULL;
}
static int qesd_ctl_out (HWVoiceOut *hw, int cmd, ...)
{
(void) hw;
(void) cmd;
return 0;
}
/* capture */
static void *qesd_thread_in (void *arg)
{
ESDVoiceIn *esd = arg;
HWVoiceIn *hw = &esd->hw;
int threshold;
threshold = conf.divisor ? hw->samples / conf.divisor : 0;
if (audio_pt_lock (&esd->pt, AUDIO_FUNC)) {
return NULL;
}
for (;;) {
int incr, to_grab, wpos;
for (;;) {
if (esd->done) {
goto exit;
}
if (esd->dead > threshold) {
break;
}
if (audio_pt_wait (&esd->pt, AUDIO_FUNC)) {
goto exit;
}
}
incr = to_grab = esd->dead;
wpos = hw->wpos;
if (audio_pt_unlock (&esd->pt, AUDIO_FUNC)) {
return NULL;
}
while (to_grab) {
ssize_t nread;
int chunk = audio_MIN (to_grab, hw->samples - wpos);
void *buf = advance (esd->pcm_buf, wpos);
again:
nread = read (esd->fd, buf, chunk << hw->info.shift);
if (nread == -1) {
if (errno == EINTR || errno == EAGAIN) {
goto again;
}
qesd_logerr (errno, "read failed\n");
return NULL;
}
if (nread != chunk << hw->info.shift) {
int rsamples = nread >> hw->info.shift;
int rbytes = rsamples << hw->info.shift;
if (rbytes != nread) {
dolog ("warning: Misaligned write %d (requested %zd), "
"alignment %d\n",
rbytes, nread, hw->info.align + 1);
}
to_grab -= rsamples;
wpos = (wpos + rsamples) % hw->samples;
break;
}
hw->conv (hw->conv_buf + wpos, buf, nread >> hw->info.shift);
wpos = (wpos + chunk) % hw->samples;
to_grab -= chunk;
}
if (audio_pt_lock (&esd->pt, AUDIO_FUNC)) {
return NULL;
}
esd->wpos = wpos;
esd->dead -= incr;
esd->incr += incr;
}
exit:
audio_pt_unlock (&esd->pt, AUDIO_FUNC);
return NULL;
}
static int qesd_run_in (HWVoiceIn *hw)
{
int live, incr, dead;
ESDVoiceIn *esd = (ESDVoiceIn *) hw;
if (audio_pt_lock (&esd->pt, AUDIO_FUNC)) {
return 0;
}
live = audio_pcm_hw_get_live_in (hw);
dead = hw->samples - live;
incr = audio_MIN (dead, esd->incr);
esd->incr -= incr;
esd->dead = dead - incr;
hw->wpos = esd->wpos;
if (esd->dead > 0) {
audio_pt_unlock_and_signal (&esd->pt, AUDIO_FUNC);
}
else {
audio_pt_unlock (&esd->pt, AUDIO_FUNC);
}
return incr;
}
static int qesd_read (SWVoiceIn *sw, void *buf, int len)
{
return audio_pcm_sw_read (sw, buf, len);
}
static int qesd_init_in (HWVoiceIn *hw, struct audsettings *as)
{
ESDVoiceIn *esd = (ESDVoiceIn *) hw;
struct audsettings obt_as = *as;
int esdfmt = ESD_STREAM | ESD_RECORD;
esdfmt |= (as->nchannels == 2) ? ESD_STEREO : ESD_MONO;
switch (as->fmt) {
case AUD_FMT_S8:
case AUD_FMT_U8:
esdfmt |= ESD_BITS8;
obt_as.fmt = AUD_FMT_U8;
break;
case AUD_FMT_S16:
case AUD_FMT_U16:
esdfmt |= ESD_BITS16;
obt_as.fmt = AUD_FMT_S16;
break;
case AUD_FMT_S32:
case AUD_FMT_U32:
dolog ("Will use 16 instead of 32 bit samples\n");
esdfmt |= ESD_BITS16;
obt_as.fmt = AUD_FMT_S16;
break;
}
obt_as.endianness = AUDIO_HOST_ENDIANNESS;
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = conf.samples;
esd->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!esd->pcm_buf) {
dolog ("Could not allocate buffer (%d bytes)\n",
hw->samples << hw->info.shift);
return -1;
}
esd->fd = esd_record_stream (esdfmt, as->freq, conf.adc_host, NULL);
if (esd->fd < 0) {
qesd_logerr (errno, "esd_record_stream failed\n");
goto fail1;
}
if (audio_pt_init (&esd->pt, qesd_thread_in, esd, AUDIO_CAP, AUDIO_FUNC)) {
goto fail2;
}
return 0;
fail2:
if (close (esd->fd)) {
qesd_logerr (errno, "%s: close on esd socket(%d) failed\n",
AUDIO_FUNC, esd->fd);
}
esd->fd = -1;
fail1:
g_free (esd->pcm_buf);
esd->pcm_buf = NULL;
return -1;
}
static void qesd_fini_in (HWVoiceIn *hw)
{
void *ret;
ESDVoiceIn *esd = (ESDVoiceIn *) hw;
audio_pt_lock (&esd->pt, AUDIO_FUNC);
esd->done = 1;
audio_pt_unlock_and_signal (&esd->pt, AUDIO_FUNC);
audio_pt_join (&esd->pt, &ret, AUDIO_FUNC);
if (esd->fd >= 0) {
if (close (esd->fd)) {
qesd_logerr (errno, "failed to close esd socket\n");
}
esd->fd = -1;
}
audio_pt_fini (&esd->pt, AUDIO_FUNC);
g_free (esd->pcm_buf);
esd->pcm_buf = NULL;
}
static int qesd_ctl_in (HWVoiceIn *hw, int cmd, ...)
{
(void) hw;
(void) cmd;
return 0;
}
/* common */
static void *qesd_audio_init (void)
{
return &conf;
}
static void qesd_audio_fini (void *opaque)
{
(void) opaque;
ldebug ("esd_fini");
}
struct audio_option qesd_options[] = {
{
.name = "SAMPLES",
.tag = AUD_OPT_INT,
.valp = &conf.samples,
.descr = "buffer size in samples"
},
{
.name = "DIVISOR",
.tag = AUD_OPT_INT,
.valp = &conf.divisor,
.descr = "threshold divisor"
},
{
.name = "DAC_HOST",
.tag = AUD_OPT_STR,
.valp = &conf.dac_host,
.descr = "playback host"
},
{
.name = "ADC_HOST",
.tag = AUD_OPT_STR,
.valp = &conf.adc_host,
.descr = "capture host"
},
{ /* End of list */ }
};
static struct audio_pcm_ops qesd_pcm_ops = {
.init_out = qesd_init_out,
.fini_out = qesd_fini_out,
.run_out = qesd_run_out,
.write = qesd_write,
.ctl_out = qesd_ctl_out,
.init_in = qesd_init_in,
.fini_in = qesd_fini_in,
.run_in = qesd_run_in,
.read = qesd_read,
.ctl_in = qesd_ctl_in,
};
struct audio_driver esd_audio_driver = {
.name = "esd",
.descr = "http://en.wikipedia.org/wiki/Esound",
.options = qesd_options,
.init = qesd_audio_init,
.fini = qesd_audio_fini,
.pcm_ops = &qesd_pcm_ops,
.can_be_default = 0,
.max_voices_out = INT_MAX,
.max_voices_in = INT_MAX,
.voice_size_out = sizeof (ESDVoiceOut),
.voice_size_in = sizeof (ESDVoiceIn)
};

685
audio/fmodaudio.c Normal file
View File

@@ -0,0 +1,685 @@
/*
* QEMU FMOD audio driver
*
* Copyright (c) 2004-2005 Vassili Karpov (malc)
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include <fmod.h>
#include <fmod_errors.h>
#include "qemu-common.h"
#include "audio.h"
#define AUDIO_CAP "fmod"
#include "audio_int.h"
typedef struct FMODVoiceOut {
HWVoiceOut hw;
unsigned int old_pos;
FSOUND_SAMPLE *fmod_sample;
int channel;
} FMODVoiceOut;
typedef struct FMODVoiceIn {
HWVoiceIn hw;
FSOUND_SAMPLE *fmod_sample;
} FMODVoiceIn;
static struct {
const char *drvname;
int nb_samples;
int freq;
int nb_channels;
int bufsize;
int broken_adc;
} conf = {
.nb_samples = 2048 * 2,
.freq = 44100,
.nb_channels = 2,
};
static void GCC_FMT_ATTR (1, 2) fmod_logerr (const char *fmt, ...)
{
va_list ap;
va_start (ap, fmt);
AUD_vlog (AUDIO_CAP, fmt, ap);
va_end (ap);
AUD_log (AUDIO_CAP, "Reason: %s\n",
FMOD_ErrorString (FSOUND_GetError ()));
}
static void GCC_FMT_ATTR (2, 3) fmod_logerr2 (
const char *typ,
const char *fmt,
...
)
{
va_list ap;
AUD_log (AUDIO_CAP, "Could not initialize %s\n", typ);
va_start (ap, fmt);
AUD_vlog (AUDIO_CAP, fmt, ap);
va_end (ap);
AUD_log (AUDIO_CAP, "Reason: %s\n",
FMOD_ErrorString (FSOUND_GetError ()));
}
static int fmod_write (SWVoiceOut *sw, void *buf, int len)
{
return audio_pcm_sw_write (sw, buf, len);
}
static void fmod_clear_sample (FMODVoiceOut *fmd)
{
HWVoiceOut *hw = &fmd->hw;
int status;
void *p1 = 0, *p2 = 0;
unsigned int len1 = 0, len2 = 0;
status = FSOUND_Sample_Lock (
fmd->fmod_sample,
0,
hw->samples << hw->info.shift,
&p1,
&p2,
&len1,
&len2
);
if (!status) {
fmod_logerr ("Failed to lock sample\n");
return;
}
if ((len1 & hw->info.align) || (len2 & hw->info.align)) {
dolog ("Lock returned misaligned length %d, %d, alignment %d\n",
len1, len2, hw->info.align + 1);
goto fail;
}
if ((len1 + len2) - (hw->samples << hw->info.shift)) {
dolog ("Lock returned incomplete length %d, %d\n",
len1 + len2, hw->samples << hw->info.shift);
goto fail;
}
audio_pcm_info_clear_buf (&hw->info, p1, hw->samples);
fail:
status = FSOUND_Sample_Unlock (fmd->fmod_sample, p1, p2, len1, len2);
if (!status) {
fmod_logerr ("Failed to unlock sample\n");
}
}
static void fmod_write_sample (HWVoiceOut *hw, uint8_t *dst, int dst_len)
{
int src_len1 = dst_len;
int src_len2 = 0;
int pos = hw->rpos + dst_len;
struct st_sample *src1 = hw->mix_buf + hw->rpos;
struct st_sample *src2 = NULL;
if (pos > hw->samples) {
src_len1 = hw->samples - hw->rpos;
src2 = hw->mix_buf;
src_len2 = dst_len - src_len1;
pos = src_len2;
}
if (src_len1) {
hw->clip (dst, src1, src_len1);
}
if (src_len2) {
dst = advance (dst, src_len1 << hw->info.shift);
hw->clip (dst, src2, src_len2);
}
hw->rpos = pos % hw->samples;
}
static int fmod_unlock_sample (FSOUND_SAMPLE *sample, void *p1, void *p2,
unsigned int blen1, unsigned int blen2)
{
int status = FSOUND_Sample_Unlock (sample, p1, p2, blen1, blen2);
if (!status) {
fmod_logerr ("Failed to unlock sample\n");
return -1;
}
return 0;
}
static int fmod_lock_sample (
FSOUND_SAMPLE *sample,
struct audio_pcm_info *info,
int pos,
int len,
void **p1,
void **p2,
unsigned int *blen1,
unsigned int *blen2
)
{
int status;
status = FSOUND_Sample_Lock (
sample,
pos << info->shift,
len << info->shift,
p1,
p2,
blen1,
blen2
);
if (!status) {
fmod_logerr ("Failed to lock sample\n");
return -1;
}
if ((*blen1 & info->align) || (*blen2 & info->align)) {
dolog ("Lock returned misaligned length %d, %d, alignment %d\n",
*blen1, *blen2, info->align + 1);
fmod_unlock_sample (sample, *p1, *p2, *blen1, *blen2);
*p1 = NULL - 1;
*p2 = NULL - 1;
*blen1 = ~0U;
*blen2 = ~0U;
return -1;
}
if (!*p1 && *blen1) {
dolog ("warning: !p1 && blen1=%d\n", *blen1);
*blen1 = 0;
}
if (!p2 && *blen2) {
dolog ("warning: !p2 && blen2=%d\n", *blen2);
*blen2 = 0;
}
return 0;
}
static int fmod_run_out (HWVoiceOut *hw, int live)
{
FMODVoiceOut *fmd = (FMODVoiceOut *) hw;
int decr;
void *p1 = 0, *p2 = 0;
unsigned int blen1 = 0, blen2 = 0;
unsigned int len1 = 0, len2 = 0;
if (!hw->pending_disable) {
return 0;
}
decr = live;
if (fmd->channel >= 0) {
int len = decr;
int old_pos = fmd->old_pos;
int ppos = FSOUND_GetCurrentPosition (fmd->channel);
if (ppos == old_pos || !ppos) {
return 0;
}
if ((old_pos < ppos) && ((old_pos + len) > ppos)) {
len = ppos - old_pos;
}
else {
if ((old_pos > ppos) && ((old_pos + len) > (ppos + hw->samples))) {
len = hw->samples - old_pos + ppos;
}
}
decr = len;
if (audio_bug (AUDIO_FUNC, decr < 0)) {
dolog ("decr=%d live=%d ppos=%d old_pos=%d len=%d\n",
decr, live, ppos, old_pos, len);
return 0;
}
}
if (!decr) {
return 0;
}
if (fmod_lock_sample (fmd->fmod_sample, &fmd->hw.info,
fmd->old_pos, decr,
&p1, &p2,
&blen1, &blen2)) {
return 0;
}
len1 = blen1 >> hw->info.shift;
len2 = blen2 >> hw->info.shift;
ldebug ("%p %p %d %d %d %d\n", p1, p2, len1, len2, blen1, blen2);
decr = len1 + len2;
if (p1 && len1) {
fmod_write_sample (hw, p1, len1);
}
if (p2 && len2) {
fmod_write_sample (hw, p2, len2);
}
fmod_unlock_sample (fmd->fmod_sample, p1, p2, blen1, blen2);
fmd->old_pos = (fmd->old_pos + decr) % hw->samples;
return decr;
}
static int aud_to_fmodfmt (audfmt_e fmt, int stereo)
{
int mode = FSOUND_LOOP_NORMAL;
switch (fmt) {
case AUD_FMT_S8:
mode |= FSOUND_SIGNED | FSOUND_8BITS;
break;
case AUD_FMT_U8:
mode |= FSOUND_UNSIGNED | FSOUND_8BITS;
break;
case AUD_FMT_S16:
mode |= FSOUND_SIGNED | FSOUND_16BITS;
break;
case AUD_FMT_U16:
mode |= FSOUND_UNSIGNED | FSOUND_16BITS;
break;
default:
dolog ("Internal logic error: Bad audio format %d\n", fmt);
#ifdef DEBUG_FMOD
abort ();
#endif
mode |= FSOUND_8BITS;
}
mode |= stereo ? FSOUND_STEREO : FSOUND_MONO;
return mode;
}
static void fmod_fini_out (HWVoiceOut *hw)
{
FMODVoiceOut *fmd = (FMODVoiceOut *) hw;
if (fmd->fmod_sample) {
FSOUND_Sample_Free (fmd->fmod_sample);
fmd->fmod_sample = 0;
if (fmd->channel >= 0) {
FSOUND_StopSound (fmd->channel);
}
}
}
static int fmod_init_out (HWVoiceOut *hw, struct audsettings *as)
{
int mode, channel;
FMODVoiceOut *fmd = (FMODVoiceOut *) hw;
struct audsettings obt_as = *as;
mode = aud_to_fmodfmt (as->fmt, as->nchannels == 2 ? 1 : 0);
fmd->fmod_sample = FSOUND_Sample_Alloc (
FSOUND_FREE, /* index */
conf.nb_samples, /* length */
mode, /* mode */
as->freq, /* freq */
255, /* volume */
128, /* pan */
255 /* priority */
);
if (!fmd->fmod_sample) {
fmod_logerr2 ("DAC", "Failed to allocate FMOD sample\n");
return -1;
}
channel = FSOUND_PlaySoundEx (FSOUND_FREE, fmd->fmod_sample, 0, 1);
if (channel < 0) {
fmod_logerr2 ("DAC", "Failed to start playing sound\n");
FSOUND_Sample_Free (fmd->fmod_sample);
return -1;
}
fmd->channel = channel;
/* FMOD always operates on little endian frames? */
obt_as.endianness = 0;
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = conf.nb_samples;
return 0;
}
static int fmod_ctl_out (HWVoiceOut *hw, int cmd, ...)
{
int status;
FMODVoiceOut *fmd = (FMODVoiceOut *) hw;
switch (cmd) {
case VOICE_ENABLE:
fmod_clear_sample (fmd);
status = FSOUND_SetPaused (fmd->channel, 0);
if (!status) {
fmod_logerr ("Failed to resume channel %d\n", fmd->channel);
}
break;
case VOICE_DISABLE:
status = FSOUND_SetPaused (fmd->channel, 1);
if (!status) {
fmod_logerr ("Failed to pause channel %d\n", fmd->channel);
}
break;
}
return 0;
}
static int fmod_init_in (HWVoiceIn *hw, struct audsettings *as)
{
int mode;
FMODVoiceIn *fmd = (FMODVoiceIn *) hw;
struct audsettings obt_as = *as;
if (conf.broken_adc) {
return -1;
}
mode = aud_to_fmodfmt (as->fmt, as->nchannels == 2 ? 1 : 0);
fmd->fmod_sample = FSOUND_Sample_Alloc (
FSOUND_FREE, /* index */
conf.nb_samples, /* length */
mode, /* mode */
as->freq, /* freq */
255, /* volume */
128, /* pan */
255 /* priority */
);
if (!fmd->fmod_sample) {
fmod_logerr2 ("ADC", "Failed to allocate FMOD sample\n");
return -1;
}
/* FMOD always operates on little endian frames? */
obt_as.endianness = 0;
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = conf.nb_samples;
return 0;
}
static void fmod_fini_in (HWVoiceIn *hw)
{
FMODVoiceIn *fmd = (FMODVoiceIn *) hw;
if (fmd->fmod_sample) {
FSOUND_Record_Stop ();
FSOUND_Sample_Free (fmd->fmod_sample);
fmd->fmod_sample = 0;
}
}
static int fmod_run_in (HWVoiceIn *hw)
{
FMODVoiceIn *fmd = (FMODVoiceIn *) hw;
int hwshift = hw->info.shift;
int live, dead, new_pos, len;
unsigned int blen1 = 0, blen2 = 0;
unsigned int len1, len2;
unsigned int decr;
void *p1, *p2;
live = audio_pcm_hw_get_live_in (hw);
dead = hw->samples - live;
if (!dead) {
return 0;
}
new_pos = FSOUND_Record_GetPosition ();
if (new_pos < 0) {
fmod_logerr ("Could not get recording position\n");
return 0;
}
len = audio_ring_dist (new_pos, hw->wpos, hw->samples);
if (!len) {
return 0;
}
len = audio_MIN (len, dead);
if (fmod_lock_sample (fmd->fmod_sample, &fmd->hw.info,
hw->wpos, len,
&p1, &p2,
&blen1, &blen2)) {
return 0;
}
len1 = blen1 >> hwshift;
len2 = blen2 >> hwshift;
decr = len1 + len2;
if (p1 && blen1) {
hw->conv (hw->conv_buf + hw->wpos, p1, len1);
}
if (p2 && len2) {
hw->conv (hw->conv_buf, p2, len2);
}
fmod_unlock_sample (fmd->fmod_sample, p1, p2, blen1, blen2);
hw->wpos = (hw->wpos + decr) % hw->samples;
return decr;
}
static struct {
const char *name;
int type;
} drvtab[] = {
{ .name = "none", .type = FSOUND_OUTPUT_NOSOUND },
#ifdef _WIN32
{ .name = "winmm", .type = FSOUND_OUTPUT_WINMM },
{ .name = "dsound", .type = FSOUND_OUTPUT_DSOUND },
{ .name = "a3d", .type = FSOUND_OUTPUT_A3D },
{ .name = "asio", .type = FSOUND_OUTPUT_ASIO },
#endif
#ifdef __linux__
{ .name = "oss", .type = FSOUND_OUTPUT_OSS },
{ .name = "alsa", .type = FSOUND_OUTPUT_ALSA },
{ .name = "esd", .type = FSOUND_OUTPUT_ESD },
#endif
#ifdef __APPLE__
{ .name = "mac", .type = FSOUND_OUTPUT_MAC },
#endif
#if 0
{ .name = "xbox", .type = FSOUND_OUTPUT_XBOX },
{ .name = "ps2", .type = FSOUND_OUTPUT_PS2 },
{ .name = "gcube", .type = FSOUND_OUTPUT_GC },
#endif
{ .name = "none-realtime", .type = FSOUND_OUTPUT_NOSOUND_NONREALTIME }
};
static void *fmod_audio_init (void)
{
size_t i;
double ver;
int status;
int output_type = -1;
const char *drv = conf.drvname;
ver = FSOUND_GetVersion ();
if (ver < FMOD_VERSION) {
dolog ("Wrong FMOD version %f, need at least %f\n", ver, FMOD_VERSION);
return NULL;
}
#ifdef __linux__
if (ver < 3.75) {
dolog ("FMOD before 3.75 has bug preventing ADC from working\n"
"ADC will be disabled.\n");
conf.broken_adc = 1;
}
#endif
if (drv) {
int found = 0;
for (i = 0; i < ARRAY_SIZE (drvtab); i++) {
if (!strcmp (drv, drvtab[i].name)) {
output_type = drvtab[i].type;
found = 1;
break;
}
}
if (!found) {
dolog ("Unknown FMOD driver `%s'\n", drv);
dolog ("Valid drivers:\n");
for (i = 0; i < ARRAY_SIZE (drvtab); i++) {
dolog (" %s\n", drvtab[i].name);
}
}
}
if (output_type != -1) {
status = FSOUND_SetOutput (output_type);
if (!status) {
fmod_logerr ("FSOUND_SetOutput(%d) failed\n", output_type);
return NULL;
}
}
if (conf.bufsize) {
status = FSOUND_SetBufferSize (conf.bufsize);
if (!status) {
fmod_logerr ("FSOUND_SetBufferSize (%d) failed\n", conf.bufsize);
}
}
status = FSOUND_Init (conf.freq, conf.nb_channels, 0);
if (!status) {
fmod_logerr ("FSOUND_Init failed\n");
return NULL;
}
return &conf;
}
static int fmod_read (SWVoiceIn *sw, void *buf, int size)
{
return audio_pcm_sw_read (sw, buf, size);
}
static int fmod_ctl_in (HWVoiceIn *hw, int cmd, ...)
{
int status;
FMODVoiceIn *fmd = (FMODVoiceIn *) hw;
switch (cmd) {
case VOICE_ENABLE:
status = FSOUND_Record_StartSample (fmd->fmod_sample, 1);
if (!status) {
fmod_logerr ("Failed to start recording\n");
}
break;
case VOICE_DISABLE:
status = FSOUND_Record_Stop ();
if (!status) {
fmod_logerr ("Failed to stop recording\n");
}
break;
}
return 0;
}
static void fmod_audio_fini (void *opaque)
{
(void) opaque;
FSOUND_Close ();
}
static struct audio_option fmod_options[] = {
{
.name = "DRV",
.tag = AUD_OPT_STR,
.valp = &conf.drvname,
.descr = "FMOD driver"
},
{
.name = "FREQ",
.tag = AUD_OPT_INT,
.valp = &conf.freq,
.descr = "Default frequency"
},
{
.name = "SAMPLES",
.tag = AUD_OPT_INT,
.valp = &conf.nb_samples,
.descr = "Buffer size in samples"
},
{
.name = "CHANNELS",
.tag = AUD_OPT_INT,
.valp = &conf.nb_channels,
.descr = "Number of default channels (1 - mono, 2 - stereo)"
},
{
.name = "BUFSIZE",
.tag = AUD_OPT_INT,
.valp = &conf.bufsize,
.descr = "(undocumented)"
},
{ /* End of list */ }
};
static struct audio_pcm_ops fmod_pcm_ops = {
.init_out = fmod_init_out,
.fini_out = fmod_fini_out,
.run_out = fmod_run_out,
.write = fmod_write,
.ctl_out = fmod_ctl_out,
.init_in = fmod_init_in,
.fini_in = fmod_fini_in,
.run_in = fmod_run_in,
.read = fmod_read,
.ctl_in = fmod_ctl_in
};
struct audio_driver fmod_audio_driver = {
.name = "fmod",
.descr = "FMOD 3.xx http://www.fmod.org",
.options = fmod_options,
.init = fmod_audio_init,
.fini = fmod_audio_fini,
.pcm_ops = &fmod_pcm_ops,
.can_be_default = 1,
.max_voices_out = INT_MAX,
.max_voices_in = INT_MAX,
.voice_size_out = sizeof (FMODVoiceOut),
.voice_size_in = sizeof (FMODVoiceIn)
};

View File

@@ -63,7 +63,7 @@ static int no_write (SWVoiceOut *sw, void *buf, int len)
return audio_pcm_sw_write (sw, buf, len);
}
static int no_init_out(HWVoiceOut *hw, struct audsettings *as, void *drv_opaque)
static int no_init_out (HWVoiceOut *hw, struct audsettings *as)
{
audio_pcm_init_info (&hw->info, as);
hw->samples = 1024;
@@ -82,7 +82,7 @@ static int no_ctl_out (HWVoiceOut *hw, int cmd, ...)
return 0;
}
static int no_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
static int no_init_in (HWVoiceIn *hw, struct audsettings *as)
{
audio_pcm_init_info (&hw->info, as);
hw->samples = 1024;

View File

@@ -30,7 +30,6 @@
#include "qemu/main-loop.h"
#include "qemu/host-utils.h"
#include "audio.h"
#include "trace.h"
#define AUDIO_CAP "oss"
#include "audio_int.h"
@@ -39,16 +38,6 @@
#define USE_DSP_POLICY
#endif
typedef struct OSSConf {
int try_mmap;
int nfrags;
int fragsize;
const char *devpath_out;
const char *devpath_in;
int exclusive;
int policy;
} OSSConf;
typedef struct OSSVoiceOut {
HWVoiceOut hw;
void *pcm_buf;
@@ -58,7 +47,6 @@ typedef struct OSSVoiceOut {
int fragsize;
int mmapped;
int pending;
OSSConf *conf;
} OSSVoiceOut;
typedef struct OSSVoiceIn {
@@ -67,9 +55,28 @@ typedef struct OSSVoiceIn {
int fd;
int nfrags;
int fragsize;
OSSConf *conf;
} OSSVoiceIn;
static struct {
int try_mmap;
int nfrags;
int fragsize;
const char *devpath_out;
const char *devpath_in;
int debug;
int exclusive;
int policy;
} conf = {
.try_mmap = 0,
.nfrags = 4,
.fragsize = 4096,
.devpath_out = "/dev/dsp",
.devpath_in = "/dev/dsp",
.debug = 0,
.exclusive = 0,
.policy = 5
};
struct oss_params {
int freq;
audfmt_e fmt;
@@ -131,18 +138,18 @@ static void oss_helper_poll_in (void *opaque)
audio_run ("oss_poll_in");
}
static void oss_poll_out (HWVoiceOut *hw)
static int oss_poll_out (HWVoiceOut *hw)
{
OSSVoiceOut *oss = (OSSVoiceOut *) hw;
qemu_set_fd_handler (oss->fd, NULL, oss_helper_poll_out, NULL);
return qemu_set_fd_handler (oss->fd, NULL, oss_helper_poll_out, NULL);
}
static void oss_poll_in (HWVoiceIn *hw)
static int oss_poll_in (HWVoiceIn *hw)
{
OSSVoiceIn *oss = (OSSVoiceIn *) hw;
qemu_set_fd_handler (oss->fd, oss_helper_poll_in, NULL, NULL);
return qemu_set_fd_handler (oss->fd, oss_helper_poll_in, NULL, NULL);
}
static int oss_write (SWVoiceOut *sw, void *buf, int len)
@@ -265,18 +272,18 @@ static int oss_get_version (int fd, int *version, const char *typ)
#endif
static int oss_open (int in, struct oss_params *req,
struct oss_params *obt, int *pfd, OSSConf* conf)
struct oss_params *obt, int *pfd)
{
int fd;
int oflags = conf->exclusive ? O_EXCL : 0;
int oflags = conf.exclusive ? O_EXCL : 0;
audio_buf_info abinfo;
int fmt, freq, nchannels;
int setfragment = 1;
const char *dspname = in ? conf->devpath_in : conf->devpath_out;
const char *dspname = in ? conf.devpath_in : conf.devpath_out;
const char *typ = in ? "ADC" : "DAC";
/* Kludge needed to have working mmap on Linux */
oflags |= conf->try_mmap ? O_RDWR : (in ? O_RDONLY : O_WRONLY);
oflags |= conf.try_mmap ? O_RDWR : (in ? O_RDONLY : O_WRONLY);
fd = open (dspname, oflags | O_NONBLOCK);
if (-1 == fd) {
@@ -310,18 +317,20 @@ static int oss_open (int in, struct oss_params *req,
}
#ifdef USE_DSP_POLICY
if (conf->policy >= 0) {
if (conf.policy >= 0) {
int version;
if (!oss_get_version (fd, &version, typ)) {
trace_oss_version(version);
if (conf.debug) {
dolog ("OSS version = %#x\n", version);
}
if (version >= 0x040000) {
int policy = conf->policy;
int policy = conf.policy;
if (ioctl (fd, SNDCTL_DSP_POLICY, &policy)) {
oss_logerr2 (errno, typ,
"Failed to set timing policy to %d\n",
conf->policy);
conf.policy);
goto err;
}
setfragment = 0;
@@ -449,12 +458,19 @@ static int oss_run_out (HWVoiceOut *hw, int live)
}
if (abinfo.bytes > bufsize) {
trace_oss_invalid_available_size(abinfo.bytes, bufsize);
if (conf.debug) {
dolog ("warning: Invalid available size, size=%d bufsize=%d\n"
"please report your OS/audio hw to av1474@comtv.ru\n",
abinfo.bytes, bufsize);
}
abinfo.bytes = bufsize;
}
if (abinfo.bytes < 0) {
trace_oss_invalid_available_size(abinfo.bytes, bufsize);
if (conf.debug) {
dolog ("warning: Invalid available size, size=%d bufsize=%d\n",
abinfo.bytes, bufsize);
}
return 0;
}
@@ -494,8 +510,7 @@ static void oss_fini_out (HWVoiceOut *hw)
}
}
static int oss_init_out(HWVoiceOut *hw, struct audsettings *as,
void *drv_opaque)
static int oss_init_out (HWVoiceOut *hw, struct audsettings *as)
{
OSSVoiceOut *oss = (OSSVoiceOut *) hw;
struct oss_params req, obt;
@@ -504,17 +519,16 @@ static int oss_init_out(HWVoiceOut *hw, struct audsettings *as,
int fd;
audfmt_e effective_fmt;
struct audsettings obt_as;
OSSConf *conf = drv_opaque;
oss->fd = -1;
req.fmt = aud_to_ossfmt (as->fmt, as->endianness);
req.freq = as->freq;
req.nchannels = as->nchannels;
req.fragsize = conf->fragsize;
req.nfrags = conf->nfrags;
req.fragsize = conf.fragsize;
req.nfrags = conf.nfrags;
if (oss_open (0, &req, &obt, &fd, conf)) {
if (oss_open (0, &req, &obt, &fd)) {
return -1;
}
@@ -541,7 +555,7 @@ static int oss_init_out(HWVoiceOut *hw, struct audsettings *as,
hw->samples = (obt.nfrags * obt.fragsize) >> hw->info.shift;
oss->mmapped = 0;
if (conf->try_mmap) {
if (conf.try_mmap) {
oss->pcm_buf = mmap (
NULL,
hw->samples << hw->info.shift,
@@ -601,7 +615,6 @@ static int oss_init_out(HWVoiceOut *hw, struct audsettings *as,
}
oss->fd = fd;
oss->conf = conf;
return 0;
}
@@ -621,8 +634,7 @@ static int oss_ctl_out (HWVoiceOut *hw, int cmd, ...)
va_end (ap);
ldebug ("enabling voice\n");
if (poll_mode) {
oss_poll_out (hw);
if (poll_mode && oss_poll_out (hw)) {
poll_mode = 0;
}
hw->poll_mode = poll_mode;
@@ -664,7 +676,7 @@ static int oss_ctl_out (HWVoiceOut *hw, int cmd, ...)
return 0;
}
static int oss_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
static int oss_init_in (HWVoiceIn *hw, struct audsettings *as)
{
OSSVoiceIn *oss = (OSSVoiceIn *) hw;
struct oss_params req, obt;
@@ -673,16 +685,15 @@ static int oss_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
int fd;
audfmt_e effective_fmt;
struct audsettings obt_as;
OSSConf *conf = drv_opaque;
oss->fd = -1;
req.fmt = aud_to_ossfmt (as->fmt, as->endianness);
req.freq = as->freq;
req.nchannels = as->nchannels;
req.fragsize = conf->fragsize;
req.nfrags = conf->nfrags;
if (oss_open (1, &req, &obt, &fd, conf)) {
req.fragsize = conf.fragsize;
req.nfrags = conf.nfrags;
if (oss_open (1, &req, &obt, &fd)) {
return -1;
}
@@ -716,7 +727,6 @@ static int oss_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
}
oss->fd = fd;
oss->conf = conf;
return 0;
}
@@ -726,8 +736,10 @@ static void oss_fini_in (HWVoiceIn *hw)
oss_anal_close (&oss->fd);
g_free(oss->pcm_buf);
oss->pcm_buf = NULL;
if (oss->pcm_buf) {
g_free (oss->pcm_buf);
oss->pcm_buf = NULL;
}
}
static int oss_run_in (HWVoiceIn *hw)
@@ -818,8 +830,7 @@ static int oss_ctl_in (HWVoiceIn *hw, int cmd, ...)
poll_mode = va_arg (ap, int);
va_end (ap);
if (poll_mode) {
oss_poll_in (hw);
if (poll_mode && oss_poll_in (hw)) {
poll_mode = 0;
}
hw->poll_mode = poll_mode;
@@ -836,79 +847,71 @@ static int oss_ctl_in (HWVoiceIn *hw, int cmd, ...)
return 0;
}
static OSSConf glob_conf = {
.try_mmap = 0,
.nfrags = 4,
.fragsize = 4096,
.devpath_out = "/dev/dsp",
.devpath_in = "/dev/dsp",
.exclusive = 0,
.policy = 5
};
static void *oss_audio_init (void)
{
OSSConf *conf = g_malloc(sizeof(OSSConf));
*conf = glob_conf;
if (access(conf->devpath_in, R_OK | W_OK) < 0 ||
access(conf->devpath_out, R_OK | W_OK) < 0) {
g_free(conf);
if (access(conf.devpath_in, R_OK | W_OK) < 0 ||
access(conf.devpath_out, R_OK | W_OK) < 0) {
return NULL;
}
return conf;
return &conf;
}
static void oss_audio_fini (void *opaque)
{
g_free(opaque);
(void) opaque;
}
static struct audio_option oss_options[] = {
{
.name = "FRAGSIZE",
.tag = AUD_OPT_INT,
.valp = &glob_conf.fragsize,
.valp = &conf.fragsize,
.descr = "Fragment size in bytes"
},
{
.name = "NFRAGS",
.tag = AUD_OPT_INT,
.valp = &glob_conf.nfrags,
.valp = &conf.nfrags,
.descr = "Number of fragments"
},
{
.name = "MMAP",
.tag = AUD_OPT_BOOL,
.valp = &glob_conf.try_mmap,
.valp = &conf.try_mmap,
.descr = "Try using memory mapped access"
},
{
.name = "DAC_DEV",
.tag = AUD_OPT_STR,
.valp = &glob_conf.devpath_out,
.valp = &conf.devpath_out,
.descr = "Path to DAC device"
},
{
.name = "ADC_DEV",
.tag = AUD_OPT_STR,
.valp = &glob_conf.devpath_in,
.valp = &conf.devpath_in,
.descr = "Path to ADC device"
},
{
.name = "EXCLUSIVE",
.tag = AUD_OPT_BOOL,
.valp = &glob_conf.exclusive,
.valp = &conf.exclusive,
.descr = "Open device in exclusive mode (vmix wont work)"
},
#ifdef USE_DSP_POLICY
{
.name = "POLICY",
.tag = AUD_OPT_INT,
.valp = &glob_conf.policy,
.valp = &conf.policy,
.descr = "Set the timing policy of the device, -1 to use fragment mode",
},
#endif
{
.name = "DEBUG",
.tag = AUD_OPT_BOOL,
.valp = &conf.debug,
.descr = "Turn on some debugging messages"
},
{ /* End of list */ }
};

View File

@@ -8,19 +8,6 @@
#include "audio_int.h"
#include "audio_pt_int.h"
typedef struct {
int samples;
char *server;
char *sink;
char *source;
} PAConf;
typedef struct {
PAConf conf;
pa_threaded_mainloop *mainloop;
pa_context *context;
} paaudio;
typedef struct {
HWVoiceOut hw;
int done;
@@ -30,7 +17,6 @@ typedef struct {
pa_stream *stream;
void *pcm_buf;
struct audio_pt pt;
paaudio *g;
} PAVoiceOut;
typedef struct {
@@ -44,10 +30,20 @@ typedef struct {
struct audio_pt pt;
const void *read_data;
size_t read_index, read_length;
paaudio *g;
} PAVoiceIn;
static void qpa_audio_fini(void *opaque);
typedef struct {
int samples;
char *server;
char *sink;
char *source;
pa_threaded_mainloop *mainloop;
pa_context *context;
} paaudio;
static paaudio glob_paaudio = {
.samples = 4096,
};
static void GCC_FMT_ATTR (2, 3) qpa_logerr (int err, const char *fmt, ...)
{
@@ -110,7 +106,7 @@ static inline int PA_STREAM_IS_GOOD(pa_stream_state_t x)
static int qpa_simple_read (PAVoiceIn *p, void *data, size_t length, int *rerror)
{
paaudio *g = p->g;
paaudio *g = &glob_paaudio;
pa_threaded_mainloop_lock (g->mainloop);
@@ -164,7 +160,7 @@ unlock_and_fail:
static int qpa_simple_write (PAVoiceOut *p, const void *data, size_t length, int *rerror)
{
paaudio *g = p->g;
paaudio *g = &glob_paaudio;
pa_threaded_mainloop_lock (g->mainloop);
@@ -226,7 +222,7 @@ static void *qpa_thread_out (void *arg)
}
}
decr = to_mix = audio_MIN (pa->live, pa->g->conf.samples >> 2);
decr = to_mix = audio_MIN (pa->live, glob_paaudio.samples >> 2);
rpos = pa->rpos;
if (audio_pt_unlock (&pa->pt, AUDIO_FUNC)) {
@@ -318,7 +314,7 @@ static void *qpa_thread_in (void *arg)
}
}
incr = to_grab = audio_MIN (pa->dead, pa->g->conf.samples >> 2);
incr = to_grab = audio_MIN (pa->dead, glob_paaudio.samples >> 2);
wpos = pa->wpos;
if (audio_pt_unlock (&pa->pt, AUDIO_FUNC)) {
@@ -434,7 +430,7 @@ static audfmt_e pa_to_audfmt (pa_sample_format_t fmt, int *endianness)
static void context_state_cb (pa_context *c, void *userdata)
{
paaudio *g = userdata;
paaudio *g = &glob_paaudio;
switch (pa_context_get_state(c)) {
case PA_CONTEXT_READY:
@@ -453,7 +449,7 @@ static void context_state_cb (pa_context *c, void *userdata)
static void stream_state_cb (pa_stream *s, void * userdata)
{
paaudio *g = userdata;
paaudio *g = &glob_paaudio;
switch (pa_stream_get_state (s)) {
@@ -471,21 +467,23 @@ static void stream_state_cb (pa_stream *s, void * userdata)
static void stream_request_cb (pa_stream *s, size_t length, void *userdata)
{
paaudio *g = userdata;
paaudio *g = &glob_paaudio;
pa_threaded_mainloop_signal (g->mainloop, 0);
}
static pa_stream *qpa_simple_new (
paaudio *g,
const char *server,
const char *name,
pa_stream_direction_t dir,
const char *dev,
const char *stream_name,
const pa_sample_spec *ss,
const pa_channel_map *map,
const pa_buffer_attr *attr,
int *rerror)
{
paaudio *g = &glob_paaudio;
int r;
pa_stream *stream;
@@ -536,36 +534,35 @@ fail:
return NULL;
}
static int qpa_init_out(HWVoiceOut *hw, struct audsettings *as,
void *drv_opaque)
static int qpa_init_out (HWVoiceOut *hw, struct audsettings *as)
{
int error;
pa_sample_spec ss;
pa_buffer_attr ba;
static pa_sample_spec ss;
static pa_buffer_attr ba;
struct audsettings obt_as = *as;
PAVoiceOut *pa = (PAVoiceOut *) hw;
paaudio *g = pa->g = drv_opaque;
ss.format = audfmt_to_pa (as->fmt, as->endianness);
ss.channels = as->nchannels;
ss.rate = as->freq;
/*
* qemu audio tick runs at 100 Hz (by default), so processing
* data chunks worth 10 ms of sound should be a good fit.
* qemu audio tick runs at 250 Hz (by default), so processing
* data chunks worth 4 ms of sound should be a good fit.
*/
ba.tlength = pa_usec_to_bytes (10 * 1000, &ss);
ba.minreq = pa_usec_to_bytes (5 * 1000, &ss);
ba.tlength = pa_usec_to_bytes (4 * 1000, &ss);
ba.minreq = pa_usec_to_bytes (2 * 1000, &ss);
ba.maxlength = -1;
ba.prebuf = -1;
obt_as.fmt = pa_to_audfmt (ss.format, &obt_as.endianness);
pa->stream = qpa_simple_new (
g,
glob_paaudio.server,
"qemu",
PA_STREAM_PLAYBACK,
g->conf.sink,
glob_paaudio.sink,
"pcm.playback",
&ss,
NULL, /* channel map */
&ba, /* buffering attributes */
@@ -577,7 +574,7 @@ static int qpa_init_out(HWVoiceOut *hw, struct audsettings *as,
}
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = g->conf.samples;
hw->samples = glob_paaudio.samples;
pa->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
pa->rpos = hw->rpos;
if (!pa->pcm_buf) {
@@ -604,13 +601,12 @@ static int qpa_init_out(HWVoiceOut *hw, struct audsettings *as,
return -1;
}
static int qpa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
static int qpa_init_in (HWVoiceIn *hw, struct audsettings *as)
{
int error;
pa_sample_spec ss;
static pa_sample_spec ss;
struct audsettings obt_as = *as;
PAVoiceIn *pa = (PAVoiceIn *) hw;
paaudio *g = pa->g = drv_opaque;
ss.format = audfmt_to_pa (as->fmt, as->endianness);
ss.channels = as->nchannels;
@@ -619,10 +615,11 @@ static int qpa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
obt_as.fmt = pa_to_audfmt (ss.format, &obt_as.endianness);
pa->stream = qpa_simple_new (
g,
glob_paaudio.server,
"qemu",
PA_STREAM_RECORD,
g->conf.source,
glob_paaudio.source,
"pcm.capture",
&ss,
NULL, /* channel map */
NULL, /* buffering attributes */
@@ -634,7 +631,7 @@ static int qpa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
}
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = g->conf.samples;
hw->samples = glob_paaudio.samples;
pa->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
pa->wpos = hw->wpos;
if (!pa->pcm_buf) {
@@ -706,7 +703,7 @@ static int qpa_ctl_out (HWVoiceOut *hw, int cmd, ...)
PAVoiceOut *pa = (PAVoiceOut *) hw;
pa_operation *op;
pa_cvolume v;
paaudio *g = pa->g;
paaudio *g = &glob_paaudio;
#ifdef PA_CHECK_VERSION /* macro is present in 0.9.16+ */
pa_cvolume_init (&v); /* function is present in 0.9.13+ */
@@ -758,7 +755,7 @@ static int qpa_ctl_in (HWVoiceIn *hw, int cmd, ...)
PAVoiceIn *pa = (PAVoiceIn *) hw;
pa_operation *op;
pa_cvolume v;
paaudio *g = pa->g;
paaudio *g = &glob_paaudio;
#ifdef PA_CHECK_VERSION
pa_cvolume_init (&v);
@@ -808,31 +805,23 @@ static int qpa_ctl_in (HWVoiceIn *hw, int cmd, ...)
}
/* common */
static PAConf glob_conf = {
.samples = 4096,
};
static void *qpa_audio_init (void)
{
paaudio *g = g_malloc(sizeof(paaudio));
g->conf = glob_conf;
g->mainloop = NULL;
g->context = NULL;
paaudio *g = &glob_paaudio;
g->mainloop = pa_threaded_mainloop_new ();
if (!g->mainloop) {
goto fail;
}
g->context = pa_context_new (pa_threaded_mainloop_get_api (g->mainloop),
g->conf.server);
g->context = pa_context_new (pa_threaded_mainloop_get_api (g->mainloop), glob_paaudio.server);
if (!g->context) {
goto fail;
}
pa_context_set_state_callback (g->context, context_state_cb, g);
if (pa_context_connect (g->context, g->conf.server, 0, NULL) < 0) {
if (pa_context_connect (g->context, glob_paaudio.server, 0, NULL) < 0) {
qpa_logerr (pa_context_errno (g->context),
"pa_context_connect() failed\n");
goto fail;
@@ -865,13 +854,12 @@ static void *qpa_audio_init (void)
pa_threaded_mainloop_unlock (g->mainloop);
return g;
return &glob_paaudio;
unlock_and_fail:
pa_threaded_mainloop_unlock (g->mainloop);
fail:
AUD_log (AUDIO_CAP, "Failed to initialize PA context");
qpa_audio_fini(g);
return NULL;
}
@@ -886,38 +874,39 @@ static void qpa_audio_fini (void *opaque)
if (g->context) {
pa_context_disconnect (g->context);
pa_context_unref (g->context);
g->context = NULL;
}
if (g->mainloop) {
pa_threaded_mainloop_free (g->mainloop);
}
g_free(g);
g->mainloop = NULL;
}
struct audio_option qpa_options[] = {
{
.name = "SAMPLES",
.tag = AUD_OPT_INT,
.valp = &glob_conf.samples,
.valp = &glob_paaudio.samples,
.descr = "buffer size in samples"
},
{
.name = "SERVER",
.tag = AUD_OPT_STR,
.valp = &glob_conf.server,
.valp = &glob_paaudio.server,
.descr = "server address"
},
{
.name = "SINK",
.tag = AUD_OPT_STR,
.valp = &glob_conf.sink,
.valp = &glob_paaudio.sink,
.descr = "sink device name"
},
{
.name = "SOURCE",
.tag = AUD_OPT_STR,
.valp = &glob_conf.source,
.valp = &glob_paaudio.source,
.descr = "source device name"
},
{ /* End of list */ }

View File

@@ -55,7 +55,6 @@ static struct SDLAudioState {
SDL_mutex *mutex;
SDL_sem *sem;
int initialized;
bool driver_created;
} glob_sdl;
typedef struct SDLAudioState SDLAudioState;
@@ -333,8 +332,7 @@ static void sdl_fini_out (HWVoiceOut *hw)
sdl_close (&glob_sdl);
}
static int sdl_init_out(HWVoiceOut *hw, struct audsettings *as,
void *drv_opaque)
static int sdl_init_out (HWVoiceOut *hw, struct audsettings *as)
{
SDLVoiceOut *sdl = (SDLVoiceOut *) hw;
SDLAudioState *s = &glob_sdl;
@@ -394,10 +392,6 @@ static int sdl_ctl_out (HWVoiceOut *hw, int cmd, ...)
static void *sdl_audio_init (void)
{
SDLAudioState *s = &glob_sdl;
if (s->driver_created) {
sdl_logerr("Can't create multiple sdl backends\n");
return NULL;
}
if (SDL_InitSubSystem (SDL_INIT_AUDIO)) {
sdl_logerr ("SDL failed to initialize audio subsystem\n");
@@ -419,7 +413,6 @@ static void *sdl_audio_init (void)
return NULL;
}
s->driver_created = true;
return s;
}
@@ -430,7 +423,6 @@ static void sdl_audio_fini (void *opaque)
SDL_DestroySemaphore (s->sem);
SDL_DestroyMutex (s->mutex);
SDL_QuitSubSystem (SDL_INIT_AUDIO);
s->driver_created = false;
}
static struct audio_option sdl_options[] = {

View File

@@ -18,7 +18,6 @@
*/
#include "hw/hw.h"
#include "qemu/error-report.h"
#include "qemu/timer.h"
#include "ui/qemu-spice.h"
@@ -26,17 +25,8 @@
#include "audio.h"
#include "audio_int.h"
#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
#define LINE_OUT_SAMPLES (480 * 4)
#else
#define LINE_OUT_SAMPLES (256 * 4)
#endif
#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
#define LINE_IN_SAMPLES (480 * 4)
#else
#define LINE_IN_SAMPLES (256 * 4)
#endif
#define LINE_IN_SAMPLES 1024
#define LINE_OUT_SAMPLES 1024
typedef struct SpiceRateCtl {
int64_t start_ticks;
@@ -106,7 +96,7 @@ static int rate_get_samples (struct audio_pcm_info *info, SpiceRateCtl *rate)
bytes = muldiv64 (ticks, info->bytes_per_second, get_ticks_per_sec ());
samples = (bytes - rate->bytes_sent) >> info->shift;
if (samples < 0 || samples > 65536) {
error_report("Resetting rate control (%" PRId64 " samples)", samples);
fprintf (stderr, "Resetting rate control (%" PRId64 " samples)\n", samples);
rate_start (rate);
samples = 0;
}
@@ -116,17 +106,12 @@ static int rate_get_samples (struct audio_pcm_info *info, SpiceRateCtl *rate)
/* playback */
static int line_out_init(HWVoiceOut *hw, struct audsettings *as,
void *drv_opaque)
static int line_out_init (HWVoiceOut *hw, struct audsettings *as)
{
SpiceVoiceOut *out = container_of (hw, SpiceVoiceOut, hw);
struct audsettings settings;
#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
settings.freq = spice_server_get_best_playback_rate(NULL);
#else
settings.freq = SPICE_INTERFACE_PLAYBACK_FREQ;
#endif
settings.nchannels = SPICE_INTERFACE_PLAYBACK_CHAN;
settings.fmt = AUD_FMT_S16;
settings.endianness = AUDIO_HOST_ENDIANNESS;
@@ -137,9 +122,6 @@ static int line_out_init(HWVoiceOut *hw, struct audsettings *as,
out->sin.base.sif = &playback_sif.base;
qemu_spice_add_interface (&out->sin.base);
#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
spice_server_set_playback_rate(&out->sin, settings.freq);
#endif
return 0;
}
@@ -245,16 +227,12 @@ static int line_out_ctl (HWVoiceOut *hw, int cmd, ...)
/* record */
static int line_in_init(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
static int line_in_init (HWVoiceIn *hw, struct audsettings *as)
{
SpiceVoiceIn *in = container_of (hw, SpiceVoiceIn, hw);
struct audsettings settings;
#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
settings.freq = spice_server_get_best_record_rate(NULL);
#else
settings.freq = SPICE_INTERFACE_RECORD_FREQ;
#endif
settings.nchannels = SPICE_INTERFACE_RECORD_CHAN;
settings.fmt = AUD_FMT_S16;
settings.endianness = AUDIO_HOST_ENDIANNESS;
@@ -265,9 +243,6 @@ static int line_in_init(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
in->sin.base.sif = &record_sif.base;
qemu_spice_add_interface (&in->sin.base);
#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
spice_server_set_record_rate(&in->sin, settings.freq);
#endif
return 0;
}

View File

@@ -36,10 +36,15 @@ typedef struct WAVVoiceOut {
int total_samples;
} WAVVoiceOut;
typedef struct {
static struct {
struct audsettings settings;
const char *wav_path;
} WAVConf;
} conf = {
.settings.freq = 44100,
.settings.nchannels = 2,
.settings.fmt = AUD_FMT_S16,
.wav_path = "qemu.wav"
};
static int wav_run_out (HWVoiceOut *hw, int live)
{
@@ -100,8 +105,7 @@ static void le_store (uint8_t *buf, uint32_t val, int len)
}
}
static int wav_init_out(HWVoiceOut *hw, struct audsettings *as,
void *drv_opaque)
static int wav_init_out (HWVoiceOut *hw, struct audsettings *as)
{
WAVVoiceOut *wav = (WAVVoiceOut *) hw;
int bits16 = 0, stereo = 0;
@@ -111,8 +115,9 @@ static int wav_init_out(HWVoiceOut *hw, struct audsettings *as,
0x02, 0x00, 0x44, 0xac, 0x00, 0x00, 0x10, 0xb1, 0x02, 0x00, 0x04,
0x00, 0x10, 0x00, 0x64, 0x61, 0x74, 0x61, 0x00, 0x00, 0x00, 0x00
};
WAVConf *conf = drv_opaque;
struct audsettings wav_as = conf->settings;
struct audsettings wav_as = conf.settings;
(void) as;
stereo = wav_as.nchannels == 2;
switch (wav_as.fmt) {
@@ -150,10 +155,10 @@ static int wav_init_out(HWVoiceOut *hw, struct audsettings *as,
le_store (hdr + 28, hw->info.freq << (bits16 + stereo), 4);
le_store (hdr + 32, 1 << (bits16 + stereo), 2);
wav->f = fopen (conf->wav_path, "wb");
wav->f = fopen (conf.wav_path, "wb");
if (!wav->f) {
dolog ("Failed to open wave file `%s'\nReason: %s\n",
conf->wav_path, strerror (errno));
conf.wav_path, strerror (errno));
g_free (wav->pcm_buf);
wav->pcm_buf = NULL;
return -1;
@@ -221,49 +226,40 @@ static int wav_ctl_out (HWVoiceOut *hw, int cmd, ...)
return 0;
}
static WAVConf glob_conf = {
.settings.freq = 44100,
.settings.nchannels = 2,
.settings.fmt = AUD_FMT_S16,
.wav_path = "qemu.wav"
};
static void *wav_audio_init (void)
{
WAVConf *conf = g_malloc(sizeof(WAVConf));
*conf = glob_conf;
return conf;
return &conf;
}
static void wav_audio_fini (void *opaque)
{
(void) opaque;
ldebug ("wav_fini");
g_free(opaque);
}
static struct audio_option wav_options[] = {
{
.name = "FREQUENCY",
.tag = AUD_OPT_INT,
.valp = &glob_conf.settings.freq,
.valp = &conf.settings.freq,
.descr = "Frequency"
},
{
.name = "FORMAT",
.tag = AUD_OPT_FMT,
.valp = &glob_conf.settings.fmt,
.valp = &conf.settings.fmt,
.descr = "Format"
},
{
.name = "DAC_FIXED_CHANNELS",
.tag = AUD_OPT_INT,
.valp = &glob_conf.settings.nchannels,
.valp = &conf.settings.nchannels,
.descr = "Number of channels (1 - mono, 2 - stereo)"
},
{
.name = "PATH",
.tag = AUD_OPT_STR,
.valp = &glob_conf.wav_path,
.valp = &conf.wav_path,
.descr = "Path to wave file"
},
{ /* End of list */ }

View File

@@ -1,6 +1,5 @@
#include "hw/hw.h"
#include "monitor/monitor.h"
#include "qemu/error-report.h"
#include "audio.h"
typedef struct {
@@ -64,7 +63,8 @@ static void wav_destroy (void *opaque)
}
doclose:
if (fclose (wav->f)) {
error_report("wav_destroy: fclose failed: %s", strerror(errno));
fprintf (stderr, "wav_destroy: fclose failed: %s",
strerror (errno));
}
}

717
audio/winwaveaudio.c Normal file
View File

@@ -0,0 +1,717 @@
/* public domain */
#include "qemu-common.h"
#include "sysemu/sysemu.h"
#include "audio.h"
#define AUDIO_CAP "winwave"
#include "audio_int.h"
#include <windows.h>
#include <mmsystem.h>
#include "audio_win_int.h"
static struct {
int dac_headers;
int dac_samples;
int adc_headers;
int adc_samples;
} conf = {
.dac_headers = 4,
.dac_samples = 1024,
.adc_headers = 4,
.adc_samples = 1024
};
typedef struct {
HWVoiceOut hw;
HWAVEOUT hwo;
WAVEHDR *hdrs;
HANDLE event;
void *pcm_buf;
int avail;
int pending;
int curhdr;
int paused;
CRITICAL_SECTION crit_sect;
} WaveVoiceOut;
typedef struct {
HWVoiceIn hw;
HWAVEIN hwi;
WAVEHDR *hdrs;
HANDLE event;
void *pcm_buf;
int curhdr;
int paused;
int rpos;
int avail;
CRITICAL_SECTION crit_sect;
} WaveVoiceIn;
static void winwave_log_mmresult (MMRESULT mr)
{
const char *str = "BUG";
switch (mr) {
case MMSYSERR_NOERROR:
str = "Success";
break;
case MMSYSERR_INVALHANDLE:
str = "Specified device handle is invalid";
break;
case MMSYSERR_BADDEVICEID:
str = "Specified device id is out of range";
break;
case MMSYSERR_NODRIVER:
str = "No device driver is present";
break;
case MMSYSERR_NOMEM:
str = "Unable to allocate or lock memory";
break;
case WAVERR_SYNC:
str = "Device is synchronous but waveOutOpen was called "
"without using the WINWAVE_ALLOWSYNC flag";
break;
case WAVERR_UNPREPARED:
str = "The data block pointed to by the pwh parameter "
"hasn't been prepared";
break;
case WAVERR_STILLPLAYING:
str = "There are still buffers in the queue";
break;
default:
dolog ("Reason: Unknown (MMRESULT %#x)\n", mr);
return;
}
dolog ("Reason: %s\n", str);
}
static void GCC_FMT_ATTR (2, 3) winwave_logerr (
MMRESULT mr,
const char *fmt,
...
)
{
va_list ap;
va_start (ap, fmt);
AUD_vlog (AUDIO_CAP, fmt, ap);
va_end (ap);
AUD_log (NULL, " failed\n");
winwave_log_mmresult (mr);
}
static void winwave_anal_close_out (WaveVoiceOut *wave)
{
MMRESULT mr;
mr = waveOutClose (wave->hwo);
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveOutClose");
}
wave->hwo = NULL;
}
static void CALLBACK winwave_callback_out (
HWAVEOUT hwo,
UINT msg,
DWORD_PTR dwInstance,
DWORD_PTR dwParam1,
DWORD_PTR dwParam2
)
{
WaveVoiceOut *wave = (WaveVoiceOut *) dwInstance;
switch (msg) {
case WOM_DONE:
{
WAVEHDR *h = (WAVEHDR *) dwParam1;
if (!h->dwUser) {
h->dwUser = 1;
EnterCriticalSection (&wave->crit_sect);
{
wave->avail += conf.dac_samples;
}
LeaveCriticalSection (&wave->crit_sect);
if (wave->hw.poll_mode) {
if (!SetEvent (wave->event)) {
dolog ("DAC SetEvent failed %lx\n", GetLastError ());
}
}
}
}
break;
case WOM_CLOSE:
case WOM_OPEN:
break;
default:
dolog ("unknown wave out callback msg %x\n", msg);
}
}
static int winwave_init_out (HWVoiceOut *hw, struct audsettings *as)
{
int i;
int err;
MMRESULT mr;
WAVEFORMATEX wfx;
WaveVoiceOut *wave;
wave = (WaveVoiceOut *) hw;
InitializeCriticalSection (&wave->crit_sect);
err = waveformat_from_audio_settings (&wfx, as);
if (err) {
goto err0;
}
mr = waveOutOpen (&wave->hwo, WAVE_MAPPER, &wfx,
(DWORD_PTR) winwave_callback_out,
(DWORD_PTR) wave, CALLBACK_FUNCTION);
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveOutOpen");
goto err1;
}
wave->hdrs = audio_calloc (AUDIO_FUNC, conf.dac_headers,
sizeof (*wave->hdrs));
if (!wave->hdrs) {
goto err2;
}
audio_pcm_init_info (&hw->info, as);
hw->samples = conf.dac_samples * conf.dac_headers;
wave->avail = hw->samples;
wave->pcm_buf = audio_calloc (AUDIO_FUNC, conf.dac_samples,
conf.dac_headers << hw->info.shift);
if (!wave->pcm_buf) {
goto err3;
}
for (i = 0; i < conf.dac_headers; ++i) {
WAVEHDR *h = &wave->hdrs[i];
h->dwUser = 0;
h->dwBufferLength = conf.dac_samples << hw->info.shift;
h->lpData = advance (wave->pcm_buf, i * h->dwBufferLength);
h->dwFlags = 0;
mr = waveOutPrepareHeader (wave->hwo, h, sizeof (*h));
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveOutPrepareHeader(%d)", i);
goto err4;
}
}
return 0;
err4:
g_free (wave->pcm_buf);
err3:
g_free (wave->hdrs);
err2:
winwave_anal_close_out (wave);
err1:
err0:
return -1;
}
static int winwave_write (SWVoiceOut *sw, void *buf, int len)
{
return audio_pcm_sw_write (sw, buf, len);
}
static int winwave_run_out (HWVoiceOut *hw, int live)
{
WaveVoiceOut *wave = (WaveVoiceOut *) hw;
int decr;
int doreset;
EnterCriticalSection (&wave->crit_sect);
{
decr = audio_MIN (live, wave->avail);
decr = audio_pcm_hw_clip_out (hw, wave->pcm_buf, decr, wave->pending);
wave->pending += decr;
wave->avail -= decr;
}
LeaveCriticalSection (&wave->crit_sect);
doreset = hw->poll_mode && (wave->pending >= conf.dac_samples);
if (doreset && !ResetEvent (wave->event)) {
dolog ("DAC ResetEvent failed %lx\n", GetLastError ());
}
while (wave->pending >= conf.dac_samples) {
MMRESULT mr;
WAVEHDR *h = &wave->hdrs[wave->curhdr];
h->dwUser = 0;
mr = waveOutWrite (wave->hwo, h, sizeof (*h));
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveOutWrite(%d)", wave->curhdr);
break;
}
wave->pending -= conf.dac_samples;
wave->curhdr = (wave->curhdr + 1) % conf.dac_headers;
}
return decr;
}
static void winwave_poll (void *opaque)
{
(void) opaque;
audio_run ("winwave_poll");
}
static void winwave_fini_out (HWVoiceOut *hw)
{
int i;
MMRESULT mr;
WaveVoiceOut *wave = (WaveVoiceOut *) hw;
mr = waveOutReset (wave->hwo);
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveOutReset");
}
for (i = 0; i < conf.dac_headers; ++i) {
mr = waveOutUnprepareHeader (wave->hwo, &wave->hdrs[i],
sizeof (wave->hdrs[i]));
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveOutUnprepareHeader(%d)", i);
}
}
winwave_anal_close_out (wave);
if (wave->event) {
qemu_del_wait_object (wave->event, winwave_poll, wave);
if (!CloseHandle (wave->event)) {
dolog ("DAC CloseHandle failed %lx\n", GetLastError ());
}
wave->event = NULL;
}
g_free (wave->pcm_buf);
wave->pcm_buf = NULL;
g_free (wave->hdrs);
wave->hdrs = NULL;
}
static int winwave_ctl_out (HWVoiceOut *hw, int cmd, ...)
{
MMRESULT mr;
WaveVoiceOut *wave = (WaveVoiceOut *) hw;
switch (cmd) {
case VOICE_ENABLE:
{
va_list ap;
int poll_mode;
va_start (ap, cmd);
poll_mode = va_arg (ap, int);
va_end (ap);
if (poll_mode && !wave->event) {
wave->event = CreateEvent (NULL, TRUE, TRUE, NULL);
if (!wave->event) {
dolog ("DAC CreateEvent: %lx, poll mode will be disabled\n",
GetLastError ());
}
}
if (wave->event) {
int ret;
ret = qemu_add_wait_object (wave->event, winwave_poll, wave);
hw->poll_mode = (ret == 0);
}
else {
hw->poll_mode = 0;
}
wave->paused = 0;
}
return 0;
case VOICE_DISABLE:
if (!wave->paused) {
mr = waveOutReset (wave->hwo);
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveOutReset");
}
else {
wave->paused = 1;
}
}
if (wave->event) {
qemu_del_wait_object (wave->event, winwave_poll, wave);
}
return 0;
}
return -1;
}
static void winwave_anal_close_in (WaveVoiceIn *wave)
{
MMRESULT mr;
mr = waveInClose (wave->hwi);
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveInClose");
}
wave->hwi = NULL;
}
static void CALLBACK winwave_callback_in (
HWAVEIN *hwi,
UINT msg,
DWORD_PTR dwInstance,
DWORD_PTR dwParam1,
DWORD_PTR dwParam2
)
{
WaveVoiceIn *wave = (WaveVoiceIn *) dwInstance;
switch (msg) {
case WIM_DATA:
{
WAVEHDR *h = (WAVEHDR *) dwParam1;
if (!h->dwUser) {
h->dwUser = 1;
EnterCriticalSection (&wave->crit_sect);
{
wave->avail += conf.adc_samples;
}
LeaveCriticalSection (&wave->crit_sect);
if (wave->hw.poll_mode) {
if (!SetEvent (wave->event)) {
dolog ("ADC SetEvent failed %lx\n", GetLastError ());
}
}
}
}
break;
case WIM_CLOSE:
case WIM_OPEN:
break;
default:
dolog ("unknown wave in callback msg %x\n", msg);
}
}
static void winwave_add_buffers (WaveVoiceIn *wave, int samples)
{
int doreset;
doreset = wave->hw.poll_mode && (samples >= conf.adc_samples);
if (doreset && !ResetEvent (wave->event)) {
dolog ("ADC ResetEvent failed %lx\n", GetLastError ());
}
while (samples >= conf.adc_samples) {
MMRESULT mr;
WAVEHDR *h = &wave->hdrs[wave->curhdr];
h->dwUser = 0;
mr = waveInAddBuffer (wave->hwi, h, sizeof (*h));
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveInAddBuffer(%d)", wave->curhdr);
}
wave->curhdr = (wave->curhdr + 1) % conf.adc_headers;
samples -= conf.adc_samples;
}
}
static int winwave_init_in (HWVoiceIn *hw, struct audsettings *as)
{
int i;
int err;
MMRESULT mr;
WAVEFORMATEX wfx;
WaveVoiceIn *wave;
wave = (WaveVoiceIn *) hw;
InitializeCriticalSection (&wave->crit_sect);
err = waveformat_from_audio_settings (&wfx, as);
if (err) {
goto err0;
}
mr = waveInOpen (&wave->hwi, WAVE_MAPPER, &wfx,
(DWORD_PTR) winwave_callback_in,
(DWORD_PTR) wave, CALLBACK_FUNCTION);
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveInOpen");
goto err1;
}
wave->hdrs = audio_calloc (AUDIO_FUNC, conf.dac_headers,
sizeof (*wave->hdrs));
if (!wave->hdrs) {
goto err2;
}
audio_pcm_init_info (&hw->info, as);
hw->samples = conf.adc_samples * conf.adc_headers;
wave->avail = 0;
wave->pcm_buf = audio_calloc (AUDIO_FUNC, conf.adc_samples,
conf.adc_headers << hw->info.shift);
if (!wave->pcm_buf) {
goto err3;
}
for (i = 0; i < conf.adc_headers; ++i) {
WAVEHDR *h = &wave->hdrs[i];
h->dwUser = 0;
h->dwBufferLength = conf.adc_samples << hw->info.shift;
h->lpData = advance (wave->pcm_buf, i * h->dwBufferLength);
h->dwFlags = 0;
mr = waveInPrepareHeader (wave->hwi, h, sizeof (*h));
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveInPrepareHeader(%d)", i);
goto err4;
}
}
wave->paused = 1;
winwave_add_buffers (wave, hw->samples);
return 0;
err4:
g_free (wave->pcm_buf);
err3:
g_free (wave->hdrs);
err2:
winwave_anal_close_in (wave);
err1:
err0:
return -1;
}
static void winwave_fini_in (HWVoiceIn *hw)
{
int i;
MMRESULT mr;
WaveVoiceIn *wave = (WaveVoiceIn *) hw;
mr = waveInReset (wave->hwi);
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveInReset");
}
for (i = 0; i < conf.adc_headers; ++i) {
mr = waveInUnprepareHeader (wave->hwi, &wave->hdrs[i],
sizeof (wave->hdrs[i]));
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveInUnprepareHeader(%d)", i);
}
}
winwave_anal_close_in (wave);
if (wave->event) {
qemu_del_wait_object (wave->event, winwave_poll, wave);
if (!CloseHandle (wave->event)) {
dolog ("ADC CloseHandle failed %lx\n", GetLastError ());
}
wave->event = NULL;
}
g_free (wave->pcm_buf);
wave->pcm_buf = NULL;
g_free (wave->hdrs);
wave->hdrs = NULL;
}
static int winwave_run_in (HWVoiceIn *hw)
{
WaveVoiceIn *wave = (WaveVoiceIn *) hw;
int live = audio_pcm_hw_get_live_in (hw);
int dead = hw->samples - live;
int decr, ret;
if (!dead) {
return 0;
}
EnterCriticalSection (&wave->crit_sect);
{
decr = audio_MIN (dead, wave->avail);
wave->avail -= decr;
}
LeaveCriticalSection (&wave->crit_sect);
ret = decr;
while (decr) {
int left = hw->samples - hw->wpos;
int conv = audio_MIN (left, decr);
hw->conv (hw->conv_buf + hw->wpos,
advance (wave->pcm_buf, wave->rpos << hw->info.shift),
conv);
wave->rpos = (wave->rpos + conv) % hw->samples;
hw->wpos = (hw->wpos + conv) % hw->samples;
decr -= conv;
}
winwave_add_buffers (wave, ret);
return ret;
}
static int winwave_read (SWVoiceIn *sw, void *buf, int size)
{
return audio_pcm_sw_read (sw, buf, size);
}
static int winwave_ctl_in (HWVoiceIn *hw, int cmd, ...)
{
MMRESULT mr;
WaveVoiceIn *wave = (WaveVoiceIn *) hw;
switch (cmd) {
case VOICE_ENABLE:
{
va_list ap;
int poll_mode;
va_start (ap, cmd);
poll_mode = va_arg (ap, int);
va_end (ap);
if (poll_mode && !wave->event) {
wave->event = CreateEvent (NULL, TRUE, TRUE, NULL);
if (!wave->event) {
dolog ("ADC CreateEvent: %lx, poll mode will be disabled\n",
GetLastError ());
}
}
if (wave->event) {
int ret;
ret = qemu_add_wait_object (wave->event, winwave_poll, wave);
hw->poll_mode = (ret == 0);
}
else {
hw->poll_mode = 0;
}
if (wave->paused) {
mr = waveInStart (wave->hwi);
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveInStart");
}
wave->paused = 0;
}
}
return 0;
case VOICE_DISABLE:
if (!wave->paused) {
mr = waveInStop (wave->hwi);
if (mr != MMSYSERR_NOERROR) {
winwave_logerr (mr, "waveInStop");
}
else {
wave->paused = 1;
}
}
if (wave->event) {
qemu_del_wait_object (wave->event, winwave_poll, wave);
}
return 0;
}
return 0;
}
static void *winwave_audio_init (void)
{
return &conf;
}
static void winwave_audio_fini (void *opaque)
{
(void) opaque;
}
static struct audio_option winwave_options[] = {
{
.name = "DAC_HEADERS",
.tag = AUD_OPT_INT,
.valp = &conf.dac_headers,
.descr = "DAC number of headers",
},
{
.name = "DAC_SAMPLES",
.tag = AUD_OPT_INT,
.valp = &conf.dac_samples,
.descr = "DAC number of samples per header",
},
{
.name = "ADC_HEADERS",
.tag = AUD_OPT_INT,
.valp = &conf.adc_headers,
.descr = "ADC number of headers",
},
{
.name = "ADC_SAMPLES",
.tag = AUD_OPT_INT,
.valp = &conf.adc_samples,
.descr = "ADC number of samples per header",
},
{ /* End of list */ }
};
static struct audio_pcm_ops winwave_pcm_ops = {
.init_out = winwave_init_out,
.fini_out = winwave_fini_out,
.run_out = winwave_run_out,
.write = winwave_write,
.ctl_out = winwave_ctl_out,
.init_in = winwave_init_in,
.fini_in = winwave_fini_in,
.run_in = winwave_run_in,
.read = winwave_read,
.ctl_in = winwave_ctl_in
};
struct audio_driver winwave_audio_driver = {
.name = "winwave",
.descr = "Windows Waveform Audio http://msdn.microsoft.com",
.options = winwave_options,
.init = winwave_audio_init,
.fini = winwave_audio_fini,
.pcm_ops = &winwave_pcm_ops,
.can_be_default = 1,
.max_voices_out = INT_MAX,
.max_voices_in = INT_MAX,
.voice_size_out = sizeof (WaveVoiceOut),
.voice_size_in = sizeof (WaveVoiceIn)
};

View File

@@ -1,11 +1,8 @@
common-obj-y += rng.o rng-egd.o
common-obj-$(CONFIG_POSIX) += rng-random.o
common-obj-y += msmouse.o testdev.o
common-obj-y += msmouse.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)
$(obj)/baum.o: QEMU_CFLAGS += $(SDL_CFLAGS)
common-obj-$(CONFIG_TPM) += tpm.o
common-obj-y += hostmem.o hostmem-ram.o
common-obj-$(CONFIG_LINUX) += hostmem-file.o

View File

@@ -566,15 +566,13 @@ CharDriverState *chr_baum_init(void)
BaumDriverState *baum;
CharDriverState *chr;
brlapi_handle_t *handle;
#if defined(CONFIG_SDL)
#if SDL_COMPILEDVERSION < SDL_VERSIONNUM(2, 0, 0)
#ifdef CONFIG_SDL
SDL_SysWMinfo info;
#endif
#endif
int tty;
baum = g_malloc0(sizeof(BaumDriverState));
baum->chr = chr = qemu_chr_alloc();
baum->chr = chr = g_malloc0(sizeof(CharDriverState));
chr->opaque = baum;
chr->chr_write = baum_write;
@@ -597,14 +595,12 @@ CharDriverState *chr_baum_init(void)
goto fail;
}
#if defined(CONFIG_SDL)
#if SDL_COMPILEDVERSION < SDL_VERSIONNUM(2, 0, 0)
#ifdef CONFIG_SDL
memset(&info, 0, sizeof(info));
SDL_VERSION(&info.version);
if (SDL_GetWMInfo(&info))
tty = info.info.x11.wmwindow;
else
#endif
#endif
tty = BRLAPI_TTY_DEFAULT;
@@ -629,7 +625,7 @@ fail_handle:
static void register_types(void)
{
register_char_driver("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL);
register_char_driver_qapi("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL);
}
type_init(register_types);

View File

@@ -1,134 +0,0 @@
/*
* QEMU Host Memory Backend for hugetlbfs
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Paolo Bonzini <pbonzini@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu-common.h"
#include "sysemu/hostmem.h"
#include "sysemu/sysemu.h"
#include "qom/object_interfaces.h"
/* hostmem-file.c */
/**
* @TYPE_MEMORY_BACKEND_FILE:
* name of backend that uses mmap on a file descriptor
*/
#define TYPE_MEMORY_BACKEND_FILE "memory-backend-file"
#define MEMORY_BACKEND_FILE(obj) \
OBJECT_CHECK(HostMemoryBackendFile, (obj), TYPE_MEMORY_BACKEND_FILE)
typedef struct HostMemoryBackendFile HostMemoryBackendFile;
struct HostMemoryBackendFile {
HostMemoryBackend parent_obj;
bool share;
char *mem_path;
};
static void
file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
if (!fb->mem_path) {
error_setg(errp, "mem-path property not set");
return;
}
#ifndef CONFIG_LINUX
error_setg(errp, "-mem-path not supported on this host");
#else
if (!memory_region_size(&backend->mr)) {
backend->force_prealloc = mem_prealloc;
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
object_get_canonical_path(OBJECT(backend)),
backend->size, fb->share,
fb->mem_path, errp);
}
#endif
}
static void
file_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = file_backend_memory_alloc;
}
static char *get_mem_path(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
return g_strdup(fb->mem_path);
}
static void set_mem_path(Object *o, const char *str, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
if (fb->mem_path) {
g_free(fb->mem_path);
}
fb->mem_path = g_strdup(str);
}
static bool file_memory_backend_get_share(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
return fb->share;
}
static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
fb->share = value;
}
static void
file_backend_instance_init(Object *o)
{
object_property_add_bool(o, "share",
file_memory_backend_get_share,
file_memory_backend_set_share, NULL);
object_property_add_str(o, "mem-path", get_mem_path,
set_mem_path, NULL);
}
static const TypeInfo file_backend_info = {
.name = TYPE_MEMORY_BACKEND_FILE,
.parent = TYPE_MEMORY_BACKEND,
.class_init = file_backend_class_init,
.instance_init = file_backend_instance_init,
.instance_size = sizeof(HostMemoryBackendFile),
};
static void register_types(void)
{
type_register_static(&file_backend_info);
}
type_init(register_types);

View File

@@ -1,53 +0,0 @@
/*
* QEMU Host Memory Backend
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Igor Mammedov <imammedo@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/hostmem.h"
#include "qom/object_interfaces.h"
#define TYPE_MEMORY_BACKEND_RAM "memory-backend-ram"
static void
ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
char *path;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
path = object_get_canonical_path_component(OBJECT(backend));
memory_region_init_ram(&backend->mr, OBJECT(backend), path,
backend->size, errp);
g_free(path);
}
static void
ram_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = ram_backend_memory_alloc;
}
static const TypeInfo ram_backend_info = {
.name = TYPE_MEMORY_BACKEND_RAM,
.parent = TYPE_MEMORY_BACKEND,
.class_init = ram_backend_class_init,
};
static void register_types(void)
{
type_register_static(&ram_backend_info);
}
type_init(register_types);

View File

@@ -1,372 +0,0 @@
/*
* QEMU Host Memory Backend
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Igor Mammedov <imammedo@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/hostmem.h"
#include "hw/boards.h"
#include "qapi/visitor.h"
#include "qapi-types.h"
#include "qapi-visit.h"
#include "qemu/config-file.h"
#include "qom/object_interfaces.h"
#ifdef CONFIG_NUMA
#include <numaif.h>
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_DEFAULT != MPOL_DEFAULT);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_PREFERRED != MPOL_PREFERRED);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_BIND != MPOL_BIND);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != MPOL_INTERLEAVE);
#endif
static void
host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint64_t value = backend->size;
visit_type_size(v, &value, name, errp);
}
static void
host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
Error *local_err = NULL;
uint64_t value;
if (memory_region_size(&backend->mr)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
visit_type_size(v, &value, name, &local_err);
if (local_err) {
goto out;
}
if (!value) {
error_setg(&local_err, "Property '%s.%s' doesn't take value '%"
PRIu64 "'", object_get_typename(obj), name, value);
goto out;
}
backend->size = value;
out:
error_propagate(errp, local_err);
}
static void
host_memory_backend_get_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *host_nodes = NULL;
uint16List **node = &host_nodes;
unsigned long value;
value = find_first_bit(backend->host_nodes, MAX_NODES);
if (value == MAX_NODES) {
return;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
do {
value = find_next_bit(backend->host_nodes, MAX_NODES, value + 1);
if (value == MAX_NODES) {
break;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
} while (true);
visit_type_uint16List(v, &host_nodes, name, errp);
}
static void
host_memory_backend_set_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
#ifdef CONFIG_NUMA
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *l = NULL;
visit_type_uint16List(v, &l, name, errp);
while (l) {
bitmap_set(backend->host_nodes, l->value, 1);
l = l->next;
}
#else
error_setg(errp, "NUMA node binding are not supported by this QEMU");
#endif
}
static int
host_memory_backend_get_policy(Object *obj, Error **errp G_GNUC_UNUSED)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->policy;
}
static void
host_memory_backend_set_policy(Object *obj, int policy, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
backend->policy = policy;
#ifndef CONFIG_NUMA
if (policy != HOST_MEM_POLICY_DEFAULT) {
error_setg(errp, "NUMA policies are not supported by this QEMU");
}
#endif
}
static bool host_memory_backend_get_merge(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->merge;
}
static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
backend->merge = value;
return;
}
if (value != backend->merge) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz,
value ? QEMU_MADV_MERGEABLE : QEMU_MADV_UNMERGEABLE);
backend->merge = value;
}
}
static bool host_memory_backend_get_dump(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->dump;
}
static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
backend->dump = value;
return;
}
if (value != backend->dump) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz,
value ? QEMU_MADV_DODUMP : QEMU_MADV_DONTDUMP);
backend->dump = value;
}
}
static bool host_memory_backend_get_prealloc(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->prealloc || backend->force_prealloc;
}
static void host_memory_backend_set_prealloc(Object *obj, bool value,
Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (backend->force_prealloc) {
if (value) {
error_setg(errp,
"remove -mem-prealloc to use the prealloc property");
return;
}
}
if (!memory_region_size(&backend->mr)) {
backend->prealloc = value;
return;
}
if (value && !backend->prealloc) {
int fd = memory_region_get_fd(&backend->mr);
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
os_mem_prealloc(fd, ptr, sz);
backend->prealloc = true;
}
}
static void host_memory_backend_init(Object *obj)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
MachineState *machine = MACHINE(qdev_get_machine());
backend->merge = machine_mem_merge(machine);
backend->dump = machine_dump_guest_core(machine);
backend->prealloc = mem_prealloc;
object_property_add_bool(obj, "merge",
host_memory_backend_get_merge,
host_memory_backend_set_merge, NULL);
object_property_add_bool(obj, "dump",
host_memory_backend_get_dump,
host_memory_backend_set_dump, NULL);
object_property_add_bool(obj, "prealloc",
host_memory_backend_get_prealloc,
host_memory_backend_set_prealloc, NULL);
object_property_add(obj, "size", "int",
host_memory_backend_get_size,
host_memory_backend_set_size, NULL, NULL, NULL);
object_property_add(obj, "host-nodes", "int",
host_memory_backend_get_host_nodes,
host_memory_backend_set_host_nodes, NULL, NULL, NULL);
object_property_add_enum(obj, "policy", "HostMemPolicy",
HostMemPolicy_lookup,
host_memory_backend_get_policy,
host_memory_backend_set_policy, NULL);
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
return memory_region_size(&backend->mr) ? &backend->mr : NULL;
}
static void
host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(uc);
HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
Error *local_err = NULL;
void *ptr;
uint64_t sz;
if (bc->alloc) {
bc->alloc(backend, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
ptr = memory_region_get_ram_ptr(&backend->mr);
sz = memory_region_size(&backend->mr);
if (backend->merge) {
qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
}
if (!backend->dump) {
qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
}
#ifdef CONFIG_NUMA
unsigned long lastbit = find_last_bit(backend->host_nodes, MAX_NODES);
/* lastbit == MAX_NODES means maxnode = 0 */
unsigned long maxnode = (lastbit + 1) % (MAX_NODES + 1);
/* ensure policy won't be ignored in case memory is preallocated
* before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
* this doesn't catch hugepage case. */
unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
/* check for invalid host-nodes and policies and give more verbose
* error messages than mbind(). */
if (maxnode && backend->policy == MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be empty for policy default,"
" or you should explicitly specify a policy other"
" than default");
return;
} else if (maxnode == 0 && backend->policy != MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be set for policy %s",
HostMemPolicy_lookup[backend->policy]);
return;
}
/* We can have up to MAX_NODES nodes, but we need to pass maxnode+1
* as argument to mbind() due to an old Linux bug (feature?) which
* cuts off the last specified node. This means backend->host_nodes
* must have MAX_NODES+1 bits available.
*/
assert(sizeof(backend->host_nodes) >=
BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
assert(maxnode <= MAX_NODES);
if (mbind(ptr, sz, backend->policy,
maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
error_setg_errno(errp, errno,
"cannot bind memory to host NUMA nodes");
return;
}
#endif
/* Preallocate memory after the NUMA policy has been instantiated.
* This is necessary to guarantee memory is allocated with
* specified NUMA policy in place.
*/
if (backend->prealloc) {
os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz);
}
}
}
static bool
host_memory_backend_can_be_deleted(UserCreatable *uc, Error **errp)
{
MemoryRegion *mr;
mr = host_memory_backend_get_memory(MEMORY_BACKEND(uc), errp);
if (memory_region_is_mapped(mr)) {
return false;
} else {
return true;
}
}
static void
host_memory_backend_class_init(ObjectClass *oc, void *data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
ucc->complete = host_memory_backend_memory_complete;
ucc->can_be_deleted = host_memory_backend_can_be_deleted;
}
static const TypeInfo host_memory_backend_info = {
.name = TYPE_MEMORY_BACKEND,
.parent = TYPE_OBJECT,
.abstract = true,
.class_size = sizeof(HostMemoryBackendClass),
.class_init = host_memory_backend_class_init,
.instance_size = sizeof(HostMemoryBackend),
.instance_init = host_memory_backend_init,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }
}
};
static void register_types(void)
{
type_register_static(&host_memory_backend_info);
}
type_init(register_types);

View File

@@ -67,7 +67,7 @@ CharDriverState *qemu_chr_open_msmouse(void)
{
CharDriverState *chr;
chr = qemu_chr_alloc();
chr = g_malloc0(sizeof(CharDriverState));
chr->chr_write = msmouse_chr_write;
chr->chr_close = msmouse_chr_close;
chr->explicit_be_open = true;
@@ -79,7 +79,7 @@ CharDriverState *qemu_chr_open_msmouse(void)
static void register_types(void)
{
register_char_driver("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL);
register_char_driver_qapi("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL);
}
type_init(register_types);

View File

@@ -140,20 +140,19 @@ static void rng_egd_opened(RngBackend *b, Error **errp)
RngEgd *s = RNG_EGD(b);
if (s->chr_name == NULL) {
error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
"chardev", "a valid character device");
error_set(errp, QERR_INVALID_PARAMETER_VALUE,
"chardev", "a valid character device");
return;
}
s->chr = qemu_chr_find(s->chr_name);
if (s->chr == NULL) {
error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
"Device '%s' not found", s->chr_name);
error_set(errp, QERR_DEVICE_NOT_FOUND, s->chr_name);
return;
}
if (qemu_chr_fe_claim(s->chr) != 0) {
error_setg(errp, QERR_DEVICE_IN_USE, s->chr_name);
error_set(errp, QERR_DEVICE_IN_USE, s->chr_name);
return;
}
@@ -168,9 +167,8 @@ static void rng_egd_set_chardev(Object *obj, const char *value, Error **errp)
RngEgd *s = RNG_EGD(b);
if (b->opened) {
error_setg(errp, QERR_PERMISSION_DENIED);
error_set(errp, QERR_PERMISSION_DENIED);
} else {
g_free(s->chr_name);
s->chr_name = g_strdup(value);
}
}

View File

@@ -74,8 +74,8 @@ static void rng_random_opened(RngBackend *b, Error **errp)
RndRandom *s = RNG_RANDOM(b);
if (s->filename == NULL) {
error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
"filename", "a valid filename");
error_set(errp, QERR_INVALID_PARAMETER_VALUE,
"filename", "a valid filename");
} else {
s->fd = qemu_open(s->filename, O_RDONLY | O_NONBLOCK);
if (s->fd == -1) {
@@ -88,7 +88,11 @@ static char *rng_random_get_filename(Object *obj, Error **errp)
{
RndRandom *s = RNG_RANDOM(obj);
return g_strdup(s->filename);
if (s->filename) {
return g_strdup(s->filename);
}
return NULL;
}
static void rng_random_set_filename(Object *obj, const char *filename,
@@ -98,11 +102,14 @@ static void rng_random_set_filename(Object *obj, const char *filename,
RndRandom *s = RNG_RANDOM(obj);
if (b->opened) {
error_setg(errp, QERR_PERMISSION_DENIED);
error_set(errp, QERR_PERMISSION_DENIED);
return;
}
g_free(s->filename);
if (s->filename) {
g_free(s->filename);
}
s->filename = g_strdup(filename);
}
@@ -116,15 +123,15 @@ static void rng_random_init(Object *obj)
NULL);
s->filename = g_strdup("/dev/random");
s->fd = -1;
}
static void rng_random_finalize(Object *obj)
{
RndRandom *s = RNG_RANDOM(obj);
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
if (s->fd != -1) {
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
qemu_close(s->fd);
}

View File

@@ -12,7 +12,6 @@
#include "sysemu/rng.h"
#include "qapi/qmp/qerror.h"
#include "qom/object_interfaces.h"
void rng_backend_request_entropy(RngBackend *s, size_t size,
EntropyReceiveFunc *receive_entropy,
@@ -41,35 +40,32 @@ static bool rng_backend_prop_get_opened(Object *obj, Error **errp)
return s->opened;
}
static void rng_backend_complete(UserCreatable *uc, Error **errp)
void rng_backend_open(RngBackend *s, Error **errp)
{
object_property_set_bool(OBJECT(uc), true, "opened", errp);
object_property_set_bool(OBJECT(s), true, "opened", errp);
}
static void rng_backend_prop_set_opened(Object *obj, bool value, Error **errp)
{
RngBackend *s = RNG_BACKEND(obj);
RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
Error *local_err = NULL;
if (value == s->opened) {
return;
}
if (!value && s->opened) {
error_setg(errp, QERR_PERMISSION_DENIED);
error_set(errp, QERR_PERMISSION_DENIED);
return;
}
if (k->opened) {
k->opened(s, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
k->opened(s, errp);
}
s->opened = true;
if (!error_is_set(errp)) {
s->opened = value;
}
}
static void rng_backend_init(Object *obj)
@@ -80,25 +76,13 @@ static void rng_backend_init(Object *obj)
NULL);
}
static void rng_backend_class_init(ObjectClass *oc, void *data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
ucc->complete = rng_backend_complete;
}
static const TypeInfo rng_backend_info = {
.name = TYPE_RNG_BACKEND,
.parent = TYPE_OBJECT,
.instance_size = sizeof(RngBackend),
.instance_init = rng_backend_init,
.class_size = sizeof(RngBackendClass),
.class_init = rng_backend_class_init,
.abstract = true,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }
}
};
static void register_types(void)

View File

@@ -1,131 +0,0 @@
/*
* QEMU Char Device for testsuite control
*
* Copyright (c) 2014 Red Hat, Inc.
*
* Author: Paolo Bonzini <pbonzini@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "sysemu/char.h"
#define BUF_SIZE 32
typedef struct {
CharDriverState *chr;
uint8_t in_buf[32];
int in_buf_used;
} TestdevCharState;
/* Try to interpret a whole incoming packet */
static int testdev_eat_packet(TestdevCharState *testdev)
{
const uint8_t *cur = testdev->in_buf;
int len = testdev->in_buf_used;
uint8_t c;
int arg;
#define EAT(c) do { \
if (!len--) { \
return 0; \
} \
c = *cur++; \
} while (0)
EAT(c);
while (isspace(c)) {
EAT(c);
}
arg = 0;
while (isdigit(c)) {
arg = arg * 10 + c - '0';
EAT(c);
}
while (isspace(c)) {
EAT(c);
}
switch (c) {
case 'q':
exit((arg << 1) | 1);
break;
default:
break;
}
return cur - testdev->in_buf;
}
/* The other end is writing some data. Store it and try to interpret */
static int testdev_write(CharDriverState *chr, const uint8_t *buf, int len)
{
TestdevCharState *testdev = chr->opaque;
int tocopy, eaten, orig_len = len;
while (len) {
/* Complete our buffer as much as possible */
tocopy = MIN(len, BUF_SIZE - testdev->in_buf_used);
memcpy(testdev->in_buf + testdev->in_buf_used, buf, tocopy);
testdev->in_buf_used += tocopy;
buf += tocopy;
len -= tocopy;
/* Interpret it as much as possible */
while (testdev->in_buf_used > 0 &&
(eaten = testdev_eat_packet(testdev)) > 0) {
memmove(testdev->in_buf, testdev->in_buf + eaten,
testdev->in_buf_used - eaten);
testdev->in_buf_used -= eaten;
}
}
return orig_len;
}
static void testdev_close(struct CharDriverState *chr)
{
TestdevCharState *testdev = chr->opaque;
g_free(testdev);
}
CharDriverState *chr_testdev_init(void)
{
TestdevCharState *testdev;
CharDriverState *chr;
testdev = g_malloc0(sizeof(TestdevCharState));
testdev->chr = chr = g_malloc0(sizeof(CharDriverState));
chr->opaque = testdev;
chr->chr_write = testdev_write;
chr->chr_close = testdev_close;
return chr;
}
static void register_types(void)
{
register_char_driver("testdev", CHARDEV_BACKEND_KIND_TESTDEV, NULL);
}
type_init(register_types);

View File

@@ -36,7 +36,7 @@ void tpm_backend_destroy(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->destroy(s);
return k->ops->destroy(s);
}
int tpm_backend_init(TPMBackend *s, TPMState *state,
@@ -96,20 +96,6 @@ bool tpm_backend_get_tpm_established_flag(TPMBackend *s)
return k->ops->get_tpm_established_flag(s);
}
int tpm_backend_reset_tpm_established_flag(TPMBackend *s, uint8_t locty)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->reset_tpm_established_flag(s, locty);
}
TPMVersion tpm_backend_get_tpm_version(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->get_tpm_version(s);
}
static bool tpm_backend_prop_get_opened(Object *obj, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
@@ -126,26 +112,23 @@ static void tpm_backend_prop_set_opened(Object *obj, bool value, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
Error *local_err = NULL;
if (value == s->opened) {
return;
}
if (!value && s->opened) {
error_setg(errp, QERR_PERMISSION_DENIED);
error_set(errp, QERR_PERMISSION_DENIED);
return;
}
if (k->opened) {
k->opened(s, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
k->opened(s, errp);
}
s->opened = true;
if (!error_is_set(errp)) {
s->opened = value;
}
}
static void tpm_backend_instance_init(Object *obj)
@@ -179,6 +162,17 @@ void tpm_backend_thread_end(TPMBackendThread *tbt)
}
}
void tpm_backend_thread_tpm_reset(TPMBackendThread *tbt,
GFunc func, gpointer user_data)
{
if (!tbt->pool) {
tpm_backend_thread_create(tbt, func, user_data);
} else {
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_TPM_RESET,
NULL);
}
}
static const TypeInfo tpm_backend_info = {
.name = TYPE_TPM_BACKEND,
.parent = TYPE_OBJECT,

View File

@@ -24,34 +24,18 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "monitor/monitor.h"
#include "exec/cpu-common.h"
#include "sysemu/kvm.h"
#include "sysemu/balloon.h"
#include "trace.h"
#include "qmp-commands.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qjson.h"
static QEMUBalloonEvent *balloon_event_fn;
static QEMUBalloonStatus *balloon_stat_fn;
static void *balloon_opaque;
static bool have_balloon(Error **errp)
{
if (kvm_enabled() && !kvm_has_sync_mmu()) {
error_set(errp, ERROR_CLASS_KVM_MISSING_CAP,
"Using KVM without synchronous MMU, balloon unavailable");
return false;
}
if (!balloon_event_fn) {
error_set(errp, ERROR_CLASS_DEVICE_NOT_ACTIVE,
"No balloon device has been activated");
return false;
}
return true;
}
int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
QEMUBalloonStatus *stat_func, void *opaque)
{
@@ -59,6 +43,7 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
/* We're already registered one balloon handler. How many can
* a guest really have?
*/
error_report("Another balloon device already registered");
return -1;
}
balloon_event_fn = event_func;
@@ -77,30 +62,71 @@ void qemu_remove_balloon_handler(void *opaque)
balloon_opaque = NULL;
}
static int qemu_balloon(ram_addr_t target)
{
if (!balloon_event_fn) {
return 0;
}
trace_balloon_event(balloon_opaque, target);
balloon_event_fn(balloon_opaque, target);
return 1;
}
static int qemu_balloon_status(BalloonInfo *info)
{
if (!balloon_stat_fn) {
return 0;
}
balloon_stat_fn(balloon_opaque, info);
return 1;
}
void qemu_balloon_changed(int64_t actual)
{
QObject *data;
data = qobject_from_jsonf("{ 'actual': %" PRId64 " }",
actual);
monitor_protocol_event(QEVENT_BALLOON_CHANGE, data);
qobject_decref(data);
}
BalloonInfo *qmp_query_balloon(Error **errp)
{
BalloonInfo *info;
if (!have_balloon(errp)) {
if (kvm_enabled() && !kvm_has_sync_mmu()) {
error_set(errp, QERR_KVM_MISSING_CAP, "synchronous MMU", "balloon");
return NULL;
}
info = g_malloc0(sizeof(*info));
balloon_stat_fn(balloon_opaque, info);
if (qemu_balloon_status(info) == 0) {
error_set(errp, QERR_DEVICE_NOT_ACTIVE, "balloon");
qapi_free_BalloonInfo(info);
return NULL;
}
return info;
}
void qmp_balloon(int64_t target, Error **errp)
void qmp_balloon(int64_t value, Error **errp)
{
if (!have_balloon(errp)) {
if (kvm_enabled() && !kvm_has_sync_mmu()) {
error_set(errp, QERR_KVM_MISSING_CAP, "synchronous MMU", "balloon");
return;
}
if (target <= 0) {
error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "target", "a size");
if (value <= 0) {
error_set(errp, QERR_INVALID_PARAMETER_VALUE, "target", "a size");
return;
}
trace_balloon_event(balloon_opaque, target);
balloon_event_fn(balloon_opaque, target);
if (qemu_balloon(value) == 0) {
error_set(errp, QERR_DEVICE_NOT_ACTIVE, "balloon");
}
}

View File

@@ -14,16 +14,13 @@
*/
#include "qemu-common.h"
#include "block/block.h"
#include "qemu/error-report.h"
#include "qemu/main-loop.h"
#include "block/block_int.h"
#include "hw/hw.h"
#include "qemu/queue.h"
#include "qemu/timer.h"
#include "migration/block.h"
#include "migration/migration.h"
#include "sysemu/blockdev.h"
#include "sysemu/block-backend.h"
#include <assert.h>
#define BLOCK_SIZE (1 << 20)
@@ -61,8 +58,6 @@ typedef struct BlkMigDevState {
/* Protected by block migration lock. */
unsigned long *aio_bitmap;
int64_t completed_sectors;
BdrvDirtyBitmap *dirty_bitmap;
Error *blocker;
} BlkMigDevState;
typedef struct BlkMigBlock {
@@ -73,7 +68,7 @@ typedef struct BlkMigBlock {
int nr_sectors;
struct iovec iov;
QEMUIOVector qiov;
BlockAIOCB *aiocb;
BlockDriverAIOCB *aiocb;
/* Protected by block migration lock. */
int ret;
@@ -133,9 +128,9 @@ static void blk_send(QEMUFile *f, BlkMigBlock * blk)
| flags);
/* device name */
len = strlen(bdrv_get_device_name(blk->bmds->bs));
len = strlen(blk->bmds->bs->device_name);
qemu_put_byte(f, len);
qemu_put_buffer(f, (uint8_t *)bdrv_get_device_name(blk->bmds->bs), len);
qemu_put_buffer(f, (uint8_t *)blk->bmds->bs->device_name, len);
/* if a block is zero we need to flush here since the network
* bandwidth is now a lot higher than the storage device bandwidth.
@@ -189,7 +184,7 @@ static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
{
int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
if (sector < bdrv_nb_sectors(bmds->bs)) {
if ((sector << BDRV_SECTOR_BITS) < bdrv_getlength(bmds->bs)) {
return !!(bmds->aio_bitmap[chunk / (sizeof(unsigned long) * 8)] &
(1UL << (chunk % (sizeof(unsigned long) * 8))));
} else {
@@ -226,7 +221,8 @@ static void alloc_aio_bitmap(BlkMigDevState *bmds)
BlockDriverState *bs = bmds->bs;
int64_t bitmap_size;
bitmap_size = bdrv_nb_sectors(bs) + BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size = (bdrv_getlength(bs) >> BDRV_SECTOR_BITS) +
BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
bmds->aio_bitmap = g_malloc0(bitmap_size);
@@ -286,7 +282,7 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
nr_sectors = total_sectors - cur_sector;
}
blk = g_new(BlkMigBlock, 1);
blk = g_malloc(sizeof(BlkMigBlock));
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = cur_sector;
@@ -304,7 +300,7 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
nr_sectors, blk_mig_read_cb, blk);
bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, cur_sector, nr_sectors);
bdrv_reset_dirty(bs, cur_sector, nr_sectors);
qemu_mutex_unlock_iothread();
bmds->cur_sector = cur_sector + nr_sectors;
@@ -313,45 +309,51 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
/* Called with iothread lock taken. */
static int set_dirty_tracking(void)
static void set_dirty_tracking(int enable)
{
BlkMigDevState *bmds;
int ret;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE,
NULL, NULL);
if (!bmds->dirty_bitmap) {
ret = -errno;
goto fail;
}
bdrv_set_dirty_tracking(bmds->bs, enable ? BLOCK_SIZE : 0);
}
return 0;
fail:
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
if (bmds->dirty_bitmap) {
bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
}
}
return ret;
}
static void unset_dirty_tracking(void)
static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
{
BlkMigDevState *bmds;
int64_t sectors;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
bdrv_release_dirty_bitmap(bmds->bs, bmds->dirty_bitmap);
if (!bdrv_is_read_only(bs)) {
sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
if (sectors <= 0) {
return;
}
bmds = g_malloc0(sizeof(BlkMigDevState));
bmds->bs = bs;
bmds->bulk_completed = 0;
bmds->total_sectors = sectors;
bmds->completed_sectors = 0;
bmds->shared_base = block_mig_state.shared_base;
alloc_aio_bitmap(bmds);
bdrv_set_in_use(bs, 1);
bdrv_ref(bs);
block_mig_state.total_sector_sum += sectors;
if (bmds->shared_base) {
DPRINTF("Start migration for %s with shared base image\n",
bs->device_name);
} else {
DPRINTF("Start full migration for %s\n", bs->device_name);
}
QSIMPLEQ_INSERT_TAIL(&block_mig_state.bmds_list, bmds, entry);
}
}
static void init_blk_migration(QEMUFile *f)
{
BlockDriverState *bs;
BlkMigDevState *bmds;
int64_t sectors;
block_mig_state.submitted = 0;
block_mig_state.read_done = 0;
block_mig_state.transferred = 0;
@@ -360,38 +362,7 @@ static void init_blk_migration(QEMUFile *f)
block_mig_state.bulk_completed = 0;
block_mig_state.zero_blocks = migrate_zero_blocks();
for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
if (bdrv_is_read_only(bs)) {
continue;
}
sectors = bdrv_nb_sectors(bs);
if (sectors <= 0) {
return;
}
bmds = g_new0(BlkMigDevState, 1);
bmds->bs = bs;
bmds->bulk_completed = 0;
bmds->total_sectors = sectors;
bmds->completed_sectors = 0;
bmds->shared_base = block_mig_state.shared_base;
alloc_aio_bitmap(bmds);
error_setg(&bmds->blocker, "block device is in use by migration");
bdrv_op_block_all(bs, bmds->blocker);
bdrv_ref(bs);
block_mig_state.total_sector_sum += sectors;
if (bmds->shared_base) {
DPRINTF("Start migration for %s with shared base image\n",
bdrv_get_device_name(bs));
} else {
DPRINTF("Start full migration for %s\n", bdrv_get_device_name(bs));
}
QSIMPLEQ_INSERT_TAIL(&block_mig_state.bmds_list, bmds, entry);
}
bdrv_iterate(init_blk_migration_it, NULL);
}
/* Called with no lock taken. */
@@ -457,18 +428,18 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
blk_mig_lock();
if (bmds_aio_inflight(bmds, sector)) {
blk_mig_unlock();
bdrv_drain(bmds->bs);
bdrv_drain_all();
} else {
blk_mig_unlock();
}
if (bdrv_get_dirty(bmds->bs, bmds->dirty_bitmap, sector)) {
if (bdrv_get_dirty(bmds->bs, sector)) {
if (total_sectors - sector < BDRV_SECTORS_PER_DIRTY_CHUNK) {
nr_sectors = total_sectors - sector;
} else {
nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
}
blk = g_new(BlkMigBlock, 1);
blk = g_malloc(sizeof(BlkMigBlock));
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = sector;
@@ -497,7 +468,7 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
g_free(blk);
}
bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, sector, nr_sectors);
bdrv_reset_dirty(bmds->bs, sector, nr_sectors);
break;
}
sector += BDRV_SECTORS_PER_DIRTY_CHUNK;
@@ -583,7 +554,7 @@ static int64_t get_remaining_dirty(void)
int64_t dirty = 0;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
dirty += bdrv_get_dirty_count(bmds->dirty_bitmap);
dirty += bdrv_get_dirty_count(bmds->bs);
}
return dirty << BDRV_SECTOR_BITS;
@@ -598,13 +569,12 @@ static void blk_mig_cleanup(void)
bdrv_drain_all();
unset_dirty_tracking();
set_dirty_tracking(0);
blk_mig_lock();
while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
bdrv_op_unblock_all(bmds->bs, bmds->blocker);
error_free(bmds->blocker);
bdrv_set_in_use(bmds->bs, 0);
bdrv_unref(bmds->bs);
g_free(bmds->aio_bitmap);
g_free(bmds);
@@ -634,13 +604,7 @@ static int block_save_setup(QEMUFile *f, void *opaque)
init_blk_migration(f);
/* start track dirty blocks */
ret = set_dirty_tracking();
if (ret) {
qemu_mutex_unlock_iothread();
return ret;
}
set_dirty_tracking(1);
qemu_mutex_unlock_iothread();
ret = flush_blks(f);
@@ -654,7 +618,6 @@ static int block_save_iterate(QEMUFile *f, void *opaque)
{
int ret;
int64_t last_ftell = qemu_ftell(f);
int64_t delta_ftell;
DPRINTF("Enter save live iterate submitted %d transferred %d\n",
block_mig_state.submitted, block_mig_state.transferred);
@@ -704,14 +667,7 @@ static int block_save_iterate(QEMUFile *f, void *opaque)
}
qemu_put_be64(f, BLK_MIG_FLAG_EOS);
delta_ftell = qemu_ftell(f) - last_ftell;
if (delta_ftell > 0) {
return 1;
} else if (delta_ftell < 0) {
return -1;
} else {
return 0;
}
return qemu_ftell(f) - last_ftell;
}
/* Called with iothread lock taken. */
@@ -766,8 +722,8 @@ static uint64_t block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size)
block_mig_state.read_done * BLOCK_SIZE;
/* Report at least one block pending during bulk phase */
if (pending <= max_size && !block_mig_state.bulk_completed) {
pending = max_size + BLOCK_SIZE;
if (pending == 0 && !block_mig_state.bulk_completed) {
pending = BLOCK_SIZE;
}
blk_mig_unlock();
qemu_mutex_unlock_iothread();
@@ -783,7 +739,6 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
char device_name[256];
int64_t addr;
BlockDriverState *bs, *bs_prev = NULL;
BlockBackend *blk;
uint8_t *buf;
int64_t total_sectors = 0;
int nr_sectors;
@@ -801,17 +756,16 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
qemu_get_buffer(f, (uint8_t *)device_name, len);
device_name[len] = '\0';
blk = blk_by_name(device_name);
if (!blk) {
bs = bdrv_find(device_name);
if (!bs) {
fprintf(stderr, "Error unknown block device %s\n",
device_name);
return -EINVAL;
}
bs = blk_bs(blk);
if (bs != bs_prev) {
bs_prev = bs;
total_sectors = bdrv_nb_sectors(bs);
total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
if (total_sectors <= 0) {
error_report("Error getting length of block device %s",
device_name);
@@ -826,8 +780,7 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
}
if (flags & BLK_MIG_FLAG_ZERO_BLOCK) {
ret = bdrv_write_zeroes(bs, addr, nr_sectors,
BDRV_REQ_MAY_UNMAP);
ret = bdrv_write_zeroes(bs, addr, nr_sectors);
} else {
buf = g_malloc(BLOCK_SIZE);
qemu_get_buffer(f, buf, BLOCK_SIZE);
@@ -873,7 +826,7 @@ static bool block_is_active(void *opaque)
return block_mig_state.blk_enable == 1;
}
static SaveVMHandlers savevm_block_handlers = {
SaveVMHandlers savevm_block_handlers = {
.set_params = block_set_params,
.save_live_setup = block_save_setup,
.save_live_iterate = block_save_iterate,

4685
block.c

File diff suppressed because it is too large Load Diff

View File

@@ -1,44 +1,26 @@
block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o
block-obj-y += raw_bsd.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
block-obj-y += quorum.o
block-obj-y += parallels.o blkdebug.o blkverify.o
block-obj-y += block-backend.o snapshot.o qapi.o
block-obj-y += snapshot.o qapi.o
block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += raw-posix.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
block-obj-y += null.o mirror.o io.o
block-obj-y += throttle-groups.o
block-obj-y += nbd.o nbd-client.o sheepdog.o
ifeq ($(CONFIG_POSIX),y)
block-obj-y += nbd.o sheepdog.o
block-obj-$(CONFIG_LIBISCSI) += iscsi.o
block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
block-obj-y += accounting.o
block-obj-y += write-threshold.o
endif
common-obj-y += stream.o
common-obj-y += commit.o
common-obj-y += mirror.o
common-obj-y += backup.o
iscsi.o-cflags := $(LIBISCSI_CFLAGS)
iscsi.o-libs := $(LIBISCSI_LIBS)
curl.o-cflags := $(CURL_CFLAGS)
curl.o-libs := $(CURL_LIBS)
rbd.o-cflags := $(RBD_CFLAGS)
rbd.o-libs := $(RBD_LIBS)
gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
archipelago.o-libs := $(ARCHIPELAGO_LIBS)
block-obj-m += dmg.o
dmg.o-libs := $(BZIP2_LIBS)
qcow.o-libs := -lz
linux-aio.o-libs := -laio
$(obj)/curl.o: QEMU_CFLAGS+=$(CURL_CFLAGS)

View File

@@ -1,63 +0,0 @@
/*
* QEMU System Emulator block accounting
*
* Copyright (c) 2011 Christoph Hellwig
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "block/accounting.h"
#include "block/block_int.h"
#include "qemu/timer.h"
void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
int64_t bytes, enum BlockAcctType type)
{
assert(type < BLOCK_MAX_IOTYPE);
cookie->bytes = bytes;
cookie->start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
cookie->type = type;
}
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
assert(cookie->type < BLOCK_MAX_IOTYPE);
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
stats->total_time_ns[cookie->type] +=
qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - cookie->start_time_ns;
}
void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num,
unsigned int nb_sectors)
{
if (stats->wr_highest_sector < sector_num + nb_sectors - 1) {
stats->wr_highest_sector = sector_num + nb_sectors - 1;
}
}
void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
int num_requests)
{
assert(type < BLOCK_MAX_IOTYPE);
stats->merged[type] += num_requests;
}

File diff suppressed because it is too large Load Diff

View File

@@ -19,7 +19,6 @@
#include "block/block.h"
#include "block/block_int.h"
#include "block/blockjob.h"
#include "qapi/qmp/qerror.h"
#include "qemu/ratelimit.h"
#define BACKUP_CLUSTER_BITS 16
@@ -38,8 +37,6 @@ typedef struct CowRequest {
typedef struct BackupBlockJob {
BlockJob common;
BlockDriverState *target;
/* bitmap for sync=incremental */
BdrvDirtyBitmap *sync_bitmap;
MirrorSyncMode sync_mode;
RateLimit limit;
BlockdevOnError on_source_error;
@@ -141,8 +138,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
ret = bdrv_co_write_zeroes(job->target,
start * BACKUP_SECTORS_PER_CLUSTER,
n, BDRV_REQ_MAY_UNMAP);
start * BACKUP_SECTORS_PER_CLUSTER, n);
} else {
ret = bdrv_co_writev(job->target,
start * BACKUP_SECTORS_PER_CLUSTER, n,
@@ -184,13 +180,8 @@ static int coroutine_fn backup_before_write_notify(
void *opaque)
{
BdrvTrackedRequest *req = opaque;
int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
return backup_do_cow(req->bs, sector_num, nb_sectors, NULL);
return backup_do_cow(req->bs, req->sector_num, req->nb_sectors, NULL);
}
static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -198,7 +189,7 @@ static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
if (speed < 0) {
error_setg(errp, QERR_INVALID_PARAMETER, "speed");
error_set(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
@@ -230,110 +221,9 @@ static BlockErrorAction backup_error_action(BackupBlockJob *job,
}
}
typedef struct {
int ret;
} BackupCompleteData;
static void backup_complete(BlockJob *job, void *opaque)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
BackupCompleteData *data = opaque;
bdrv_unref(s->target);
block_job_completed(job, data->ret);
g_free(data);
}
static bool coroutine_fn yield_and_check(BackupBlockJob *job)
{
if (block_job_is_cancelled(&job->common)) {
return true;
}
/* we need to yield so that bdrv_drain_all() returns.
* (without, VM does not reboot)
*/
if (job->common.speed) {
uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
job->sectors_read);
job->sectors_read = 0;
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
} else {
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
}
if (block_job_is_cancelled(&job->common)) {
return true;
}
return false;
}
static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
{
bool error_is_read;
int ret = 0;
int clusters_per_iter;
uint32_t granularity;
int64_t sector;
int64_t cluster;
int64_t end;
int64_t last_cluster = -1;
BlockDriverState *bs = job->common.bs;
HBitmapIter hbi;
granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
/* Find the next dirty sector(s) */
while ((sector = hbitmap_iter_next(&hbi)) != -1) {
cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
/* Fake progress updates for any clusters we skipped */
if (cluster != last_cluster + 1) {
job->common.offset += ((cluster - last_cluster - 1) *
BACKUP_CLUSTER_SIZE);
}
for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
do {
if (yield_and_check(job)) {
return ret;
}
ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
if ((ret < 0) &&
backup_error_action(job, error_is_read, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
return ret;
}
} while (ret < 0);
}
/* If the bitmap granularity is smaller than the backup granularity,
* we need to advance the iterator pointer to the next cluster. */
if (granularity < BACKUP_CLUSTER_SIZE) {
bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
}
last_cluster = cluster - 1;
}
/* Play some final catchup with the progress meter */
end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
if (last_cluster + 1 < end) {
job->common.offset += ((end - last_cluster - 1) * BACKUP_CLUSTER_SIZE);
}
return ret;
}
static void coroutine_fn backup_run(void *opaque)
{
BackupBlockJob *job = opaque;
BackupCompleteData *data;
BlockDriverState *bs = job->common.bs;
BlockDriverState *target = job->target;
BlockdevOnError on_target_error = job->on_target_error;
@@ -347,7 +237,8 @@ static void coroutine_fn backup_run(void *opaque)
qemu_co_rwlock_init(&job->flush_rwlock);
start = 0;
end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
end = DIV_ROUND_UP(job->common.len / BDRV_SECTOR_SIZE,
BACKUP_SECTORS_PER_CLUSTER);
job->bitmap = hbitmap_alloc(end, 0);
@@ -365,13 +256,28 @@ static void coroutine_fn backup_run(void *opaque)
qemu_coroutine_yield();
job->common.busy = true;
}
} else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
ret = backup_run_incremental(job);
} else {
/* Both FULL and TOP SYNC_MODE's require copying.. */
for (; start < end; start++) {
bool error_is_read;
if (yield_and_check(job)) {
if (block_job_is_cancelled(&job->common)) {
break;
}
/* we need to yield so that qemu_aio_flush() returns.
* (without, VM does not reboot)
*/
if (job->common.speed) {
uint64_t delay_ns = ratelimit_calculate_delay(
&job->limit, job->sectors_read);
job->sectors_read = 0;
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
} else {
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
}
if (block_job_is_cancelled(&job->common)) {
break;
}
@@ -395,7 +301,7 @@ static void coroutine_fn backup_run(void *opaque)
BACKUP_SECTORS_PER_CLUSTER - i, &n);
i += n;
if (alloced == 1 || n == 0) {
if (alloced == 1) {
break;
}
}
@@ -413,7 +319,7 @@ static void coroutine_fn backup_run(void *opaque)
/* Depending on error action, fail now or retry cluster */
BlockErrorAction action =
backup_error_action(job, error_is_read, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT) {
if (action == BDRV_ACTION_REPORT) {
break;
} else {
start--;
@@ -429,34 +335,19 @@ static void coroutine_fn backup_run(void *opaque)
qemu_co_rwlock_wrlock(&job->flush_rwlock);
qemu_co_rwlock_unlock(&job->flush_rwlock);
if (job->sync_bitmap) {
BdrvDirtyBitmap *bm;
if (ret < 0 || block_job_is_cancelled(&job->common)) {
/* Merge the successor back into the parent, delete nothing. */
bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
assert(bm);
} else {
/* Everything is fine, delete this bitmap and install the backup. */
bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
assert(bm);
}
}
hbitmap_free(job->bitmap);
bdrv_iostatus_disable(target);
bdrv_op_unblock_all(target, job->common.blocker);
bdrv_unref(target);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&job->common, backup_complete, data);
block_job_completed(&job->common, ret);
}
void backup_start(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, MirrorSyncMode sync_mode,
BdrvDirtyBitmap *sync_bitmap,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockCompletionFunc *cb, void *opaque,
BlockDriverCompletionFunc *cb, void *opaque,
Error **errp)
{
int64_t len;
@@ -465,54 +356,10 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
assert(target);
assert(cb);
if (bs == target) {
error_setg(errp, "Source and target cannot be the same");
return;
}
if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
error_setg(errp, QERR_INVALID_PARAMETER, "on-source-error");
return;
}
if (!bdrv_is_inserted(bs)) {
error_setg(errp, "Device is not inserted: %s",
bdrv_get_device_name(bs));
return;
}
if (!bdrv_is_inserted(target)) {
error_setg(errp, "Device is not inserted: %s",
bdrv_get_device_name(target));
return;
}
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
return;
}
if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) {
return;
}
if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
if (!sync_bitmap) {
error_setg(errp, "must provide a valid bitmap name for "
"\"incremental\" sync mode");
return;
}
/* Create a new bitmap, and freeze/disable this one. */
if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
return;
}
} else if (sync_bitmap) {
error_setg(errp,
"a sync_bitmap was provided to backup_run, "
"but received an incompatible sync_mode (%s)",
MirrorSyncMode_lookup[sync_mode]);
error_set(errp, QERR_INVALID_PARAMETER, "on-source-error");
return;
}
@@ -520,30 +367,20 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
if (len < 0) {
error_setg_errno(errp, -len, "unable to get length for '%s'",
bdrv_get_device_name(bs));
goto error;
return;
}
BackupBlockJob *job = block_job_create(&backup_job_driver, bs, speed,
cb, opaque, errp);
if (!job) {
goto error;
return;
}
bdrv_op_block_all(target, job->common.blocker);
job->on_source_error = on_source_error;
job->on_target_error = on_target_error;
job->target = target;
job->sync_mode = sync_mode;
job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
sync_bitmap : NULL;
job->common.len = len;
job->common.co = qemu_coroutine_create(backup_run);
qemu_coroutine_enter(job->common.co, job);
return;
error:
if (sync_bitmap) {
bdrv_reclaim_dirty_bitmap(bs, sync_bitmap, NULL);
}
}

View File

@@ -26,10 +26,6 @@
#include "qemu/config-file.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
typedef struct BDRVBlkdebugState {
int state;
@@ -41,7 +37,7 @@ typedef struct BDRVBlkdebugState {
} BDRVBlkdebugState;
typedef struct BlkdebugAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
int ret;
} BlkdebugAIOCB;
@@ -52,8 +48,11 @@ typedef struct BlkdebugSuspendedReq {
QLIST_ENTRY(BlkdebugSuspendedReq) next;
} BlkdebugSuspendedReq;
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb);
static const AIOCBInfo blkdebug_aiocb_info = {
.aiocb_size = sizeof(BlkdebugAIOCB),
.aiocb_size = sizeof(BlkdebugAIOCB),
.cancel = blkdebug_aio_cancel,
};
enum {
@@ -187,16 +186,6 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
[BLKDBG_FLUSH_TO_OS] = "flush_to_os",
[BLKDBG_FLUSH_TO_DISK] = "flush_to_disk",
[BLKDBG_PWRITEV_RMW_HEAD] = "pwritev_rmw.head",
[BLKDBG_PWRITEV_RMW_AFTER_HEAD] = "pwritev_rmw.after_head",
[BLKDBG_PWRITEV_RMW_TAIL] = "pwritev_rmw.tail",
[BLKDBG_PWRITEV_RMW_AFTER_TAIL] = "pwritev_rmw.after_tail",
[BLKDBG_PWRITEV] = "pwritev",
[BLKDBG_PWRITEV_ZERO] = "pwritev_zero",
[BLKDBG_PWRITEV_DONE] = "pwritev_done",
[BLKDBG_EMPTY_IMAGE_PREPARE] = "empty_image_prepare",
};
static int get_event_by_name(const char *name, BlkDebugEvent *event)
@@ -218,7 +207,7 @@ struct add_rule_data {
int action;
};
static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
static int add_rule(QemuOpts *opts, void *opaque)
{
struct add_rule_data *d = opaque;
BDRVBlkdebugState *s = d->s;
@@ -228,11 +217,7 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
/* Find the right event for the rule */
event_name = qemu_opt_get(opts, "event");
if (!event_name) {
error_setg(errp, "Missing event name for rule");
return -1;
} else if (get_event_by_name(event_name, &event) < 0) {
error_setg(errp, "Invalid event name \"%s\"", event_name);
if (!event_name || get_event_by_name(event_name, &event) < 0) {
return -1;
}
@@ -286,60 +271,34 @@ static void remove_rule(BlkdebugRule *rule)
g_free(rule);
}
static int read_config(BDRVBlkdebugState *s, const char *filename,
QDict *options, Error **errp)
static int read_config(BDRVBlkdebugState *s, const char *filename)
{
FILE *f = NULL;
FILE *f;
int ret;
struct add_rule_data d;
Error *local_err = NULL;
if (filename) {
f = fopen(filename, "r");
if (f == NULL) {
error_setg_errno(errp, errno, "Could not read blkdebug config file");
return -errno;
}
ret = qemu_config_parse(f, config_groups, filename);
if (ret < 0) {
error_setg(errp, "Could not parse blkdebug config file");
ret = -EINVAL;
goto fail;
}
f = fopen(filename, "r");
if (f == NULL) {
return -errno;
}
qemu_config_parse_qdict(options, config_groups, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
ret = qemu_config_parse(f, config_groups, filename);
if (ret < 0) {
goto fail;
}
d.s = s;
d.action = ACTION_INJECT_ERROR;
qemu_opts_foreach(&inject_error_opts, add_rule, &d, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
qemu_opts_foreach(&inject_error_opts, add_rule, &d, 0);
d.action = ACTION_SET_STATE;
qemu_opts_foreach(&set_state_opts, add_rule, &d, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
qemu_opts_foreach(&set_state_opts, add_rule, &d, 0);
ret = 0;
fail:
qemu_opts_reset(&inject_error_opts);
qemu_opts_reset(&set_state_opts);
if (f) {
fclose(f);
}
fclose(f);
return ret;
}
@@ -351,9 +310,7 @@ static void blkdebug_parse_filename(const char *filename, QDict *options,
/* Parse the blkdebug: prefix */
if (!strstart(filename, "blkdebug:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put(options, "x-image", qstring_from_str(filename));
error_setg(errp, "File name string must start with 'blkdebug:'");
return;
}
@@ -389,11 +346,6 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "[internal use only, will be removed]",
},
{
.name = "align",
.type = QEMU_OPT_SIZE,
.help = "Required alignment in bytes",
},
{ /* end of list */ }
},
};
@@ -404,53 +356,46 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
BDRVBlkdebugState *s = bs->opaque;
QemuOpts *opts;
Error *local_err = NULL;
const char *config;
uint64_t align;
const char *filename, *config;
int ret;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
opts = qemu_opts_create_nofail(&runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
goto fail;
}
/* Read rules from config file or command line options */
/* Read rules from config file */
config = qemu_opt_get(opts, "config");
ret = read_config(s, config, options, errp);
if (ret) {
goto out;
if (config) {
ret = read_config(s, config);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not read blkdebug config file");
goto fail;
}
}
/* Set initial state */
s->state = 1;
/* Open the backing file */
assert(bs->file == NULL);
ret = bdrv_open_image(&bs->file, qemu_opt_get(opts, "x-image"), options, "image",
bs, &child_file, false, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
filename = qemu_opt_get(opts, "x-image");
if (filename == NULL) {
error_setg(errp, "Could not retrieve image file name");
ret = -EINVAL;
goto fail;
}
/* Set request alignment */
align = qemu_opt_get_size(opts, "align", bs->request_alignment);
if (align > 0 && align < INT_MAX && !(align & (align - 1))) {
bs->request_alignment = align;
} else {
error_setg(errp, "Invalid alignment");
ret = -EINVAL;
goto fail_unref;
ret = bdrv_file_open(&bs->file, filename, NULL, flags, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
ret = 0;
goto out;
fail_unref:
bdrv_unref(bs->file);
out:
fail:
qemu_opts_del(opts);
return ret;
}
@@ -460,40 +405,44 @@ static void error_callback_bh(void *opaque)
struct BlkdebugAIOCB *acb = opaque;
qemu_bh_delete(acb->bh);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
qemu_aio_release(acb);
}
static BlockAIOCB *inject_error(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkdebugAIOCB *acb = container_of(blockacb, BlkdebugAIOCB, common);
qemu_aio_release(acb);
}
static BlockDriverAIOCB *inject_error(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
{
BDRVBlkdebugState *s = bs->opaque;
int error = rule->options.inject.error;
struct BlkdebugAIOCB *acb;
QEMUBH *bh;
bool immediately = rule->options.inject.immediately;
if (rule->options.inject.once) {
QSIMPLEQ_REMOVE(&s->active_rules, rule, BlkdebugRule, active_next);
remove_rule(rule);
QSIMPLEQ_INIT(&s->active_rules);
}
if (immediately) {
if (rule->options.inject.immediately) {
return NULL;
}
acb = qemu_aio_get(&blkdebug_aiocb_info, bs, cb, opaque);
acb->ret = -error;
bh = aio_bh_new(bdrv_get_aio_context(bs), error_callback_bh, acb);
bh = qemu_bh_new(error_callback_bh, acb);
acb->bh = bh;
qemu_bh_schedule(bh);
return &acb->common;
}
static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
@@ -513,9 +462,9 @@ static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
return bdrv_aio_readv(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
static BlockDriverAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
@@ -535,25 +484,6 @@ static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.sector == -1) {
break;
}
}
if (rule && rule->options.inject.error) {
return inject_error(bs, cb, opaque, rule);
}
return bdrv_aio_flush(bs->file, cb, opaque);
}
static void blkdebug_close(BlockDriverState *bs)
{
@@ -664,9 +594,9 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq *r, *next;
BlkdebugSuspendedReq *r;
QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, next) {
QLIST_FOREACH(r, &s->suspended_reqs, next) {
if (!strcmp(r->tag, tag)) {
qemu_coroutine_enter(r->co, NULL);
return 0;
@@ -675,31 +605,6 @@ static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
return -ENOENT;
}
static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq *r, *r_next;
BlkdebugRule *rule, *next;
int i, ret = -ENOENT;
for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
if (rule->action == ACTION_SUSPEND &&
!strcmp(rule->options.suspend.tag, tag)) {
remove_rule(rule);
ret = 0;
}
}
}
QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, r_next) {
if (!strcmp(r->tag, tag)) {
qemu_coroutine_enter(r->co, NULL);
ret = 0;
}
}
return ret;
}
static bool blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)
{
@@ -719,60 +624,6 @@ static int64_t blkdebug_getlength(BlockDriverState *bs)
return bdrv_getlength(bs->file);
}
static int blkdebug_truncate(BlockDriverState *bs, int64_t offset)
{
return bdrv_truncate(bs->file, offset);
}
static void blkdebug_refresh_filename(BlockDriverState *bs)
{
QDict *opts;
const QDictEntry *e;
bool force_json = false;
for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
if (strcmp(qdict_entry_key(e), "config") &&
strcmp(qdict_entry_key(e), "x-image") &&
strcmp(qdict_entry_key(e), "image") &&
strncmp(qdict_entry_key(e), "image.", strlen("image.")))
{
force_json = true;
break;
}
}
if (force_json && !bs->file->full_open_options) {
/* The config file cannot be recreated, so creating a plain filename
* is impossible */
return;
}
if (!force_json && bs->file->exact_filename[0]) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkdebug:%s:%s",
qdict_get_try_str(bs->options, "config") ?: "",
bs->file->exact_filename);
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "image", QOBJECT(bs->file->full_open_options));
for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
if (strcmp(qdict_entry_key(e), "x-image") &&
strcmp(qdict_entry_key(e), "image") &&
strncmp(qdict_entry_key(e), "image.", strlen("image.")))
{
qobject_incref(qdict_entry_value(e));
qdict_put_obj(opts, qdict_entry_key(e), qdict_entry_value(e));
}
}
bs->full_open_options = opts;
}
static BlockDriver bdrv_blkdebug = {
.format_name = "blkdebug",
.protocol_name = "blkdebug",
@@ -782,17 +633,12 @@ static BlockDriver bdrv_blkdebug = {
.bdrv_file_open = blkdebug_open,
.bdrv_close = blkdebug_close,
.bdrv_getlength = blkdebug_getlength,
.bdrv_truncate = blkdebug_truncate,
.bdrv_refresh_filename = blkdebug_refresh_filename,
.bdrv_aio_readv = blkdebug_aio_readv,
.bdrv_aio_writev = blkdebug_aio_writev,
.bdrv_aio_flush = blkdebug_aio_flush,
.bdrv_debug_event = blkdebug_debug_event,
.bdrv_debug_breakpoint = blkdebug_debug_breakpoint,
.bdrv_debug_remove_breakpoint
= blkdebug_debug_remove_breakpoint,
.bdrv_debug_resume = blkdebug_debug_resume,
.bdrv_debug_is_suspended = blkdebug_debug_is_suspended,
};

View File

@@ -10,8 +10,6 @@
#include <stdarg.h>
#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
#include "block/block_int.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
typedef struct {
BlockDriverState *test_file;
@@ -19,7 +17,7 @@ typedef struct {
typedef struct BlkverifyAIOCB BlkverifyAIOCB;
struct BlkverifyAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
/* Request metadata */
@@ -29,6 +27,7 @@ struct BlkverifyAIOCB {
int ret; /* first completed request's result */
unsigned int done; /* completion counter */
bool *finished; /* completion signal for cancel */
QEMUIOVector *qiov; /* user I/O vector */
QEMUIOVector raw_qiov; /* cloned I/O vector for raw file */
@@ -37,8 +36,21 @@ struct BlkverifyAIOCB {
void (*verify)(BlkverifyAIOCB *acb);
};
static void blkverify_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkverifyAIOCB *acb = (BlkverifyAIOCB *)blockacb;
bool finished = false;
/* Wait until request completes, invokes its callback, and frees itself */
acb->finished = &finished;
while (!finished) {
qemu_aio_wait();
}
}
static const AIOCBInfo blkverify_aiocb_info = {
.aiocb_size = sizeof(BlkverifyAIOCB),
.cancel = blkverify_aio_cancel,
};
static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
@@ -66,9 +78,7 @@ static void blkverify_parse_filename(const char *filename, QDict *options,
/* Parse the blkverify: prefix */
if (!strstart(filename, "blkverify:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put(options, "x-image", qstring_from_str(filename));
error_setg(errp, "File name string must start with 'blkverify:'");
return;
}
@@ -112,38 +122,50 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
BDRVBlkverifyState *s = bs->opaque;
QemuOpts *opts;
Error *local_err = NULL;
const char *filename, *raw;
int ret;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
opts = qemu_opts_create_nofail(&runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
/* Open the raw file */
assert(bs->file == NULL);
ret = bdrv_open_image(&bs->file, qemu_opt_get(opts, "x-raw"), options,
"raw", bs, &child_file, false, &local_err);
/* Parse the raw image filename */
raw = qemu_opt_get(opts, "x-raw");
if (raw == NULL) {
error_setg(errp, "Could not retrieve raw image filename");
ret = -EINVAL;
goto fail;
}
ret = bdrv_file_open(&bs->file, raw, NULL, flags, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
/* Open the test file */
assert(s->test_file == NULL);
ret = bdrv_open_image(&s->test_file, qemu_opt_get(opts, "x-image"), options,
"test", bs, &child_format, false, &local_err);
filename = qemu_opt_get(opts, "x-image");
if (filename == NULL) {
error_setg(errp, "Could not retrieve test image filename");
ret = -EINVAL;
goto fail;
}
s->test_file = bdrv_new("");
ret = bdrv_open(s->test_file, filename, NULL, flags, NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
bdrv_unref(s->test_file);
s->test_file = NULL;
goto fail;
}
ret = 0;
fail:
qemu_opts_del(opts);
return ret;
}
@@ -162,10 +184,114 @@ static int64_t blkverify_getlength(BlockDriverState *bs)
return bdrv_getlength(s->test_file);
}
/**
* Check that I/O vector contents are identical
*
* @a: I/O vector
* @b: I/O vector
* @ret: Offset to first mismatching byte or -1 if match
*/
static ssize_t blkverify_iovec_compare(QEMUIOVector *a, QEMUIOVector *b)
{
int i;
ssize_t offset = 0;
assert(a->niov == b->niov);
for (i = 0; i < a->niov; i++) {
size_t len = 0;
uint8_t *p = (uint8_t *)a->iov[i].iov_base;
uint8_t *q = (uint8_t *)b->iov[i].iov_base;
assert(a->iov[i].iov_len == b->iov[i].iov_len);
while (len < a->iov[i].iov_len && *p++ == *q++) {
len++;
}
offset += len;
if (len != a->iov[i].iov_len) {
return offset;
}
}
return -1;
}
typedef struct {
int src_index;
struct iovec *src_iov;
void *dest_base;
} IOVectorSortElem;
static int sortelem_cmp_src_base(const void *a, const void *b)
{
const IOVectorSortElem *elem_a = a;
const IOVectorSortElem *elem_b = b;
/* Don't overflow */
if (elem_a->src_iov->iov_base < elem_b->src_iov->iov_base) {
return -1;
} else if (elem_a->src_iov->iov_base > elem_b->src_iov->iov_base) {
return 1;
} else {
return 0;
}
}
static int sortelem_cmp_src_index(const void *a, const void *b)
{
const IOVectorSortElem *elem_a = a;
const IOVectorSortElem *elem_b = b;
return elem_a->src_index - elem_b->src_index;
}
/**
* Copy contents of I/O vector
*
* The relative relationships of overlapping iovecs are preserved. This is
* necessary to ensure identical semantics in the cloned I/O vector.
*/
static void blkverify_iovec_clone(QEMUIOVector *dest, const QEMUIOVector *src,
void *buf)
{
IOVectorSortElem sortelems[src->niov];
void *last_end;
int i;
/* Sort by source iovecs by base address */
for (i = 0; i < src->niov; i++) {
sortelems[i].src_index = i;
sortelems[i].src_iov = &src->iov[i];
}
qsort(sortelems, src->niov, sizeof(sortelems[0]), sortelem_cmp_src_base);
/* Allocate buffer space taking into account overlapping iovecs */
last_end = NULL;
for (i = 0; i < src->niov; i++) {
struct iovec *cur = sortelems[i].src_iov;
ptrdiff_t rewind = 0;
/* Detect overlap */
if (last_end && last_end > cur->iov_base) {
rewind = last_end - cur->iov_base;
}
sortelems[i].dest_base = buf - rewind;
buf += cur->iov_len - MIN(rewind, cur->iov_len);
last_end = MAX(cur->iov_base + cur->iov_len, last_end);
}
/* Sort by source iovec index and build destination iovec */
qsort(sortelems, src->niov, sizeof(sortelems[0]), sortelem_cmp_src_index);
for (i = 0; i < src->niov; i++) {
qemu_iovec_add(dest, sortelems[i].dest_base, src->iov[i].iov_len);
}
}
static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
BlockDriverCompletionFunc *cb,
void *opaque)
{
BlkverifyAIOCB *acb = qemu_aio_get(&blkverify_aiocb_info, bs, cb, opaque);
@@ -179,6 +305,7 @@ static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
acb->qiov = qiov;
acb->buf = NULL;
acb->verify = NULL;
acb->finished = NULL;
return acb;
}
@@ -192,7 +319,10 @@ static void blkverify_aio_bh(void *opaque)
qemu_vfree(acb->buf);
}
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
if (acb->finished) {
*acb->finished = true;
}
qemu_aio_release(acb);
}
static void blkverify_aio_cb(void *opaque, int ret)
@@ -213,8 +343,7 @@ static void blkverify_aio_cb(void *opaque, int ret)
acb->verify(acb);
}
acb->bh = aio_bh_new(bdrv_get_aio_context(acb->common.bs),
blkverify_aio_bh, acb);
acb->bh = qemu_bh_new(blkverify_aio_bh, acb);
qemu_bh_schedule(acb->bh);
break;
}
@@ -222,16 +351,16 @@ static void blkverify_aio_cb(void *opaque, int ret)
static void blkverify_verify_readv(BlkverifyAIOCB *acb)
{
ssize_t offset = qemu_iovec_compare(acb->qiov, &acb->raw_qiov);
ssize_t offset = blkverify_iovec_compare(acb->qiov, &acb->raw_qiov);
if (offset != -1) {
blkverify_err(acb, "contents mismatch in sector %" PRId64,
acb->sector_num + (int64_t)(offset / BDRV_SECTOR_SIZE));
}
}
static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *blkverify_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, false, sector_num, qiov,
@@ -240,7 +369,7 @@ static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
acb->verify = blkverify_verify_readv;
acb->buf = qemu_blockalign(bs->file, qiov->size);
qemu_iovec_init(&acb->raw_qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
blkverify_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
bdrv_aio_readv(s->test_file, sector_num, qiov, nb_sectors,
blkverify_aio_cb, acb);
@@ -249,9 +378,9 @@ static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
return &acb->common;
}
static BlockAIOCB *blkverify_aio_writev(BlockDriverState *bs,
static BlockDriverAIOCB *blkverify_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, true, sector_num, qiov,
@@ -264,9 +393,9 @@ static BlockAIOCB *blkverify_aio_writev(BlockDriverState *bs,
return &acb->common;
}
static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB *blkverify_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
@@ -274,82 +403,21 @@ static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
return bdrv_aio_flush(s->test_file, cb, opaque);
}
static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
BlockDriverState *candidate)
{
BDRVBlkverifyState *s = bs->opaque;
bool perm = bdrv_recurse_is_first_non_filter(bs->file, candidate);
if (perm) {
return true;
}
return bdrv_recurse_is_first_non_filter(s->test_file, candidate);
}
/* Propagate AioContext changes to ->test_file */
static void blkverify_detach_aio_context(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_detach_aio_context(s->test_file);
}
static void blkverify_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_attach_aio_context(s->test_file, new_context);
}
static void blkverify_refresh_filename(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
/* bs->file has already been refreshed */
bdrv_refresh_filename(s->test_file);
if (bs->file->full_open_options && s->test_file->full_open_options) {
QDict *opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "raw", QOBJECT(bs->file->full_open_options));
QINCREF(s->test_file->full_open_options);
qdict_put_obj(opts, "test", QOBJECT(s->test_file->full_open_options));
bs->full_open_options = opts;
}
if (bs->file->exact_filename[0] && s->test_file->exact_filename[0]) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkverify:%s:%s",
bs->file->exact_filename, s->test_file->exact_filename);
}
}
static BlockDriver bdrv_blkverify = {
.format_name = "blkverify",
.protocol_name = "blkverify",
.instance_size = sizeof(BDRVBlkverifyState),
.format_name = "blkverify",
.protocol_name = "blkverify",
.instance_size = sizeof(BDRVBlkverifyState),
.bdrv_parse_filename = blkverify_parse_filename,
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.bdrv_getlength = blkverify_getlength,
.bdrv_refresh_filename = blkverify_refresh_filename,
.bdrv_parse_filename = blkverify_parse_filename,
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.bdrv_getlength = blkverify_getlength,
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
.bdrv_attach_aio_context = blkverify_attach_aio_context,
.bdrv_detach_aio_context = blkverify_detach_aio_context,
.is_filter = true,
.bdrv_recurse_is_first_non_filter = blkverify_recurse_is_first_non_filter,
.bdrv_check_ext_snapshot = bdrv_check_ext_snapshot_forbidden,
};
static void bdrv_blkverify_init(void)

View File

@@ -1,920 +0,0 @@
/*
* QEMU Block backends
*
* Copyright (C) 2014 Red Hat, Inc.
*
* Authors:
* Markus Armbruster <armbru@redhat.com>,
*
* This work is licensed under the terms of the GNU LGPL, version 2.1
* or later. See the COPYING.LIB file in the top-level directory.
*/
#include "sysemu/block-backend.h"
#include "block/block_int.h"
#include "sysemu/blockdev.h"
#include "qapi-event.h"
/* Number of coroutines to reserve per attached device model */
#define COROUTINE_POOL_RESERVATION 64
struct BlockBackend {
char *name;
int refcnt;
BlockDriverState *bs;
DriveInfo *legacy_dinfo; /* null unless created by drive_new() */
QTAILQ_ENTRY(BlockBackend) link; /* for blk_backends */
void *dev; /* attached device model, if any */
/* TODO change to DeviceState when all users are qdevified */
const BlockDevOps *dev_ops;
void *dev_opaque;
};
typedef struct BlockBackendAIOCB {
BlockAIOCB common;
QEMUBH *bh;
int ret;
} BlockBackendAIOCB;
static const AIOCBInfo block_backend_aiocb_info = {
.aiocb_size = sizeof(BlockBackendAIOCB),
};
static void drive_info_del(DriveInfo *dinfo);
/* All the BlockBackends (except for hidden ones) */
static QTAILQ_HEAD(, BlockBackend) blk_backends =
QTAILQ_HEAD_INITIALIZER(blk_backends);
/*
* Create a new BlockBackend with @name, with a reference count of one.
* @name must not be null or empty.
* Fail if a BlockBackend with this name already exists.
* Store an error through @errp on failure, unless it's null.
* Return the new BlockBackend on success, null on failure.
*/
BlockBackend *blk_new(const char *name, Error **errp)
{
BlockBackend *blk;
assert(name && name[0]);
if (!id_wellformed(name)) {
error_setg(errp, "Invalid device name");
return NULL;
}
if (blk_by_name(name)) {
error_setg(errp, "Device with id '%s' already exists", name);
return NULL;
}
if (bdrv_find_node(name)) {
error_setg(errp,
"Device name '%s' conflicts with an existing node name",
name);
return NULL;
}
blk = g_new0(BlockBackend, 1);
blk->name = g_strdup(name);
blk->refcnt = 1;
QTAILQ_INSERT_TAIL(&blk_backends, blk, link);
return blk;
}
/*
* Create a new BlockBackend with a new BlockDriverState attached.
* Otherwise just like blk_new(), which see.
*/
BlockBackend *blk_new_with_bs(const char *name, Error **errp)
{
BlockBackend *blk;
BlockDriverState *bs;
blk = blk_new(name, errp);
if (!blk) {
return NULL;
}
bs = bdrv_new_root();
blk->bs = bs;
bs->blk = blk;
return blk;
}
/*
* Calls blk_new_with_bs() and then calls bdrv_open() on the BlockDriverState.
*
* Just as with bdrv_open(), after having called this function the reference to
* @options belongs to the block layer (even on failure).
*
* TODO: Remove @filename and @flags; it should be possible to specify a whole
* BDS tree just by specifying the @options QDict (or @reference,
* alternatively). At the time of adding this function, this is not possible,
* though, so callers of this function have to be able to specify @filename and
* @flags.
*/
BlockBackend *blk_new_open(const char *name, const char *filename,
const char *reference, QDict *options, int flags,
Error **errp)
{
BlockBackend *blk;
int ret;
blk = blk_new_with_bs(name, errp);
if (!blk) {
QDECREF(options);
return NULL;
}
ret = bdrv_open(&blk->bs, filename, reference, options, flags, NULL, errp);
if (ret < 0) {
blk_unref(blk);
return NULL;
}
return blk;
}
static void blk_delete(BlockBackend *blk)
{
assert(!blk->refcnt);
assert(!blk->dev);
if (blk->bs) {
assert(blk->bs->blk == blk);
blk->bs->blk = NULL;
bdrv_unref(blk->bs);
blk->bs = NULL;
}
/* Avoid double-remove after blk_hide_on_behalf_of_hmp_drive_del() */
if (blk->name[0]) {
QTAILQ_REMOVE(&blk_backends, blk, link);
}
g_free(blk->name);
drive_info_del(blk->legacy_dinfo);
g_free(blk);
}
static void drive_info_del(DriveInfo *dinfo)
{
if (!dinfo) {
return;
}
qemu_opts_del(dinfo->opts);
g_free(dinfo->serial);
g_free(dinfo);
}
/*
* Increment @blk's reference count.
* @blk must not be null.
*/
void blk_ref(BlockBackend *blk)
{
blk->refcnt++;
}
/*
* Decrement @blk's reference count.
* If this drops it to zero, destroy @blk.
* For convenience, do nothing if @blk is null.
*/
void blk_unref(BlockBackend *blk)
{
if (blk) {
assert(blk->refcnt > 0);
if (!--blk->refcnt) {
blk_delete(blk);
}
}
}
/*
* Return the BlockBackend after @blk.
* If @blk is null, return the first one.
* Else, return @blk's next sibling, which may be null.
*
* To iterate over all BlockBackends, do
* for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
* ...
* }
*/
BlockBackend *blk_next(BlockBackend *blk)
{
return blk ? QTAILQ_NEXT(blk, link) : QTAILQ_FIRST(&blk_backends);
}
/*
* Return @blk's name, a non-null string.
* Wart: the name is empty iff @blk has been hidden with
* blk_hide_on_behalf_of_hmp_drive_del().
*/
const char *blk_name(BlockBackend *blk)
{
return blk->name;
}
/*
* Return the BlockBackend with name @name if it exists, else null.
* @name must not be null.
*/
BlockBackend *blk_by_name(const char *name)
{
BlockBackend *blk;
assert(name);
QTAILQ_FOREACH(blk, &blk_backends, link) {
if (!strcmp(name, blk->name)) {
return blk;
}
}
return NULL;
}
/*
* Return the BlockDriverState attached to @blk if any, else null.
*/
BlockDriverState *blk_bs(BlockBackend *blk)
{
return blk->bs;
}
/*
* Return @blk's DriveInfo if any, else null.
*/
DriveInfo *blk_legacy_dinfo(BlockBackend *blk)
{
return blk->legacy_dinfo;
}
/*
* Set @blk's DriveInfo to @dinfo, and return it.
* @blk must not have a DriveInfo set already.
* No other BlockBackend may have the same DriveInfo set.
*/
DriveInfo *blk_set_legacy_dinfo(BlockBackend *blk, DriveInfo *dinfo)
{
assert(!blk->legacy_dinfo);
return blk->legacy_dinfo = dinfo;
}
/*
* Return the BlockBackend with DriveInfo @dinfo.
* It must exist.
*/
BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
{
BlockBackend *blk;
QTAILQ_FOREACH(blk, &blk_backends, link) {
if (blk->legacy_dinfo == dinfo) {
return blk;
}
}
abort();
}
/*
* Hide @blk.
* @blk must not have been hidden already.
* Make attached BlockDriverState, if any, anonymous.
* Once hidden, @blk is invisible to all functions that don't receive
* it as argument. For example, blk_by_name() won't return it.
* Strictly for use by do_drive_del().
* TODO get rid of it!
*/
void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend *blk)
{
QTAILQ_REMOVE(&blk_backends, blk, link);
blk->name[0] = 0;
if (blk->bs) {
bdrv_make_anon(blk->bs);
}
}
/*
* Attach device model @dev to @blk.
* Return 0 on success, -EBUSY when a device model is attached already.
*/
int blk_attach_dev(BlockBackend *blk, void *dev)
/* TODO change to DeviceState *dev when all users are qdevified */
{
if (blk->dev) {
return -EBUSY;
}
blk_ref(blk);
blk->dev = dev;
bdrv_iostatus_reset(blk->bs);
return 0;
}
/*
* Attach device model @dev to @blk.
* @blk must not have a device model attached already.
* TODO qdevified devices don't use this, remove when devices are qdevified
*/
void blk_attach_dev_nofail(BlockBackend *blk, void *dev)
{
if (blk_attach_dev(blk, dev) < 0) {
abort();
}
}
/*
* Detach device model @dev from @blk.
* @dev must be currently attached to @blk.
*/
void blk_detach_dev(BlockBackend *blk, void *dev)
/* TODO change to DeviceState *dev when all users are qdevified */
{
assert(blk->dev == dev);
blk->dev = NULL;
blk->dev_ops = NULL;
blk->dev_opaque = NULL;
bdrv_set_guest_block_size(blk->bs, 512);
blk_unref(blk);
}
/*
* Return the device model attached to @blk if any, else null.
*/
void *blk_get_attached_dev(BlockBackend *blk)
/* TODO change to return DeviceState * when all users are qdevified */
{
return blk->dev;
}
/*
* Set @blk's device model callbacks to @ops.
* @opaque is the opaque argument to pass to the callbacks.
* This is for use by device models.
*/
void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
void *opaque)
{
blk->dev_ops = ops;
blk->dev_opaque = opaque;
}
/*
* Notify @blk's attached device model of media change.
* If @load is true, notify of media load.
* Else, notify of media eject.
* Also send DEVICE_TRAY_MOVED events as appropriate.
*/
void blk_dev_change_media_cb(BlockBackend *blk, bool load)
{
if (blk->dev_ops && blk->dev_ops->change_media_cb) {
bool tray_was_closed = !blk_dev_is_tray_open(blk);
blk->dev_ops->change_media_cb(blk->dev_opaque, load);
if (tray_was_closed) {
/* tray open */
qapi_event_send_device_tray_moved(blk_name(blk),
true, &error_abort);
}
if (load) {
/* tray close */
qapi_event_send_device_tray_moved(blk_name(blk),
false, &error_abort);
}
}
}
/*
* Does @blk's attached device model have removable media?
* %true if no device model is attached.
*/
bool blk_dev_has_removable_media(BlockBackend *blk)
{
return !blk->dev || (blk->dev_ops && blk->dev_ops->change_media_cb);
}
/*
* Notify @blk's attached device model of a media eject request.
* If @force is true, the medium is about to be yanked out forcefully.
*/
void blk_dev_eject_request(BlockBackend *blk, bool force)
{
if (blk->dev_ops && blk->dev_ops->eject_request_cb) {
blk->dev_ops->eject_request_cb(blk->dev_opaque, force);
}
}
/*
* Does @blk's attached device model have a tray, and is it open?
*/
bool blk_dev_is_tray_open(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->is_tray_open) {
return blk->dev_ops->is_tray_open(blk->dev_opaque);
}
return false;
}
/*
* Does @blk's attached device model have the medium locked?
* %false if the device model has no such lock.
*/
bool blk_dev_is_medium_locked(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->is_medium_locked) {
return blk->dev_ops->is_medium_locked(blk->dev_opaque);
}
return false;
}
/*
* Notify @blk's attached device model of a backend size change.
*/
void blk_dev_resize_cb(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->resize_cb) {
blk->dev_ops->resize_cb(blk->dev_opaque);
}
}
void blk_iostatus_enable(BlockBackend *blk)
{
bdrv_iostatus_enable(blk->bs);
}
static int blk_check_byte_request(BlockBackend *blk, int64_t offset,
size_t size)
{
int64_t len;
if (size > INT_MAX) {
return -EIO;
}
if (!blk_is_inserted(blk)) {
return -ENOMEDIUM;
}
len = blk_getlength(blk);
if (len < 0) {
return len;
}
if (offset < 0) {
return -EIO;
}
if (offset > len || len - offset < size) {
return -EIO;
}
return 0;
}
static int blk_check_request(BlockBackend *blk, int64_t sector_num,
int nb_sectors)
{
if (sector_num < 0 || sector_num > INT64_MAX / BDRV_SECTOR_SIZE) {
return -EIO;
}
if (nb_sectors < 0 || nb_sectors > INT_MAX / BDRV_SECTOR_SIZE) {
return -EIO;
}
return blk_check_byte_request(blk, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE);
}
int blk_read(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_read(blk->bs, sector_num, buf, nb_sectors);
}
int blk_read_unthrottled(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_read_unthrottled(blk->bs, sector_num, buf, nb_sectors);
}
int blk_write(BlockBackend *blk, int64_t sector_num, const uint8_t *buf,
int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_write(blk->bs, sector_num, buf, nb_sectors);
}
int blk_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_write_zeroes(blk->bs, sector_num, nb_sectors, flags);
}
static void error_callback_bh(void *opaque)
{
struct BlockBackendAIOCB *acb = opaque;
qemu_bh_delete(acb->bh);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
}
static BlockAIOCB *abort_aio_request(BlockBackend *blk, BlockCompletionFunc *cb,
void *opaque, int ret)
{
struct BlockBackendAIOCB *acb;
QEMUBH *bh;
acb = blk_aio_get(&block_backend_aiocb_info, blk, cb, opaque);
acb->ret = ret;
bh = aio_bh_new(blk_get_aio_context(blk), error_callback_bh, acb);
acb->bh = bh;
qemu_bh_schedule(bh);
return &acb->common;
}
BlockAIOCB *blk_aio_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
cb, opaque);
}
int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int count)
{
int ret = blk_check_byte_request(blk, offset, count);
if (ret < 0) {
return ret;
}
return bdrv_pread(blk->bs, offset, buf, count);
}
int blk_pwrite(BlockBackend *blk, int64_t offset, const void *buf, int count)
{
int ret = blk_check_byte_request(blk, offset, count);
if (ret < 0) {
return ret;
}
return bdrv_pwrite(blk->bs, offset, buf, count);
}
int64_t blk_getlength(BlockBackend *blk)
{
return bdrv_getlength(blk->bs);
}
void blk_get_geometry(BlockBackend *blk, uint64_t *nb_sectors_ptr)
{
bdrv_get_geometry(blk->bs, nb_sectors_ptr);
}
int64_t blk_nb_sectors(BlockBackend *blk)
{
return bdrv_nb_sectors(blk->bs);
}
BlockAIOCB *blk_aio_readv(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
}
BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
}
BlockAIOCB *blk_aio_flush(BlockBackend *blk,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_flush(blk->bs, cb, opaque);
}
BlockAIOCB *blk_aio_discard(BlockBackend *blk,
int64_t sector_num, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
}
void blk_aio_cancel(BlockAIOCB *acb)
{
bdrv_aio_cancel(acb);
}
void blk_aio_cancel_async(BlockAIOCB *acb)
{
bdrv_aio_cancel_async(acb);
}
int blk_aio_multiwrite(BlockBackend *blk, BlockRequest *reqs, int num_reqs)
{
int i, ret;
for (i = 0; i < num_reqs; i++) {
ret = blk_check_request(blk, reqs[i].sector, reqs[i].nb_sectors);
if (ret < 0) {
return ret;
}
}
return bdrv_aio_multiwrite(blk->bs, reqs, num_reqs);
}
int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
{
return bdrv_ioctl(blk->bs, req, buf);
}
BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_ioctl(blk->bs, req, buf, cb, opaque);
}
int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_co_discard(blk->bs, sector_num, nb_sectors);
}
int blk_co_flush(BlockBackend *blk)
{
return bdrv_co_flush(blk->bs);
}
int blk_flush(BlockBackend *blk)
{
return bdrv_flush(blk->bs);
}
int blk_flush_all(void)
{
return bdrv_flush_all();
}
void blk_drain(BlockBackend *blk)
{
bdrv_drain(blk->bs);
}
void blk_drain_all(void)
{
bdrv_drain_all();
}
BlockdevOnError blk_get_on_error(BlockBackend *blk, bool is_read)
{
return bdrv_get_on_error(blk->bs, is_read);
}
BlockErrorAction blk_get_error_action(BlockBackend *blk, bool is_read,
int error)
{
return bdrv_get_error_action(blk->bs, is_read, error);
}
void blk_error_action(BlockBackend *blk, BlockErrorAction action,
bool is_read, int error)
{
bdrv_error_action(blk->bs, action, is_read, error);
}
int blk_is_read_only(BlockBackend *blk)
{
return bdrv_is_read_only(blk->bs);
}
int blk_is_sg(BlockBackend *blk)
{
return bdrv_is_sg(blk->bs);
}
int blk_enable_write_cache(BlockBackend *blk)
{
return bdrv_enable_write_cache(blk->bs);
}
void blk_set_enable_write_cache(BlockBackend *blk, bool wce)
{
bdrv_set_enable_write_cache(blk->bs, wce);
}
void blk_invalidate_cache(BlockBackend *blk, Error **errp)
{
bdrv_invalidate_cache(blk->bs, errp);
}
int blk_is_inserted(BlockBackend *blk)
{
return bdrv_is_inserted(blk->bs);
}
void blk_lock_medium(BlockBackend *blk, bool locked)
{
bdrv_lock_medium(blk->bs, locked);
}
void blk_eject(BlockBackend *blk, bool eject_flag)
{
bdrv_eject(blk->bs, eject_flag);
}
int blk_get_flags(BlockBackend *blk)
{
return bdrv_get_flags(blk->bs);
}
int blk_get_max_transfer_length(BlockBackend *blk)
{
return blk->bs->bl.max_transfer_length;
}
void blk_set_guest_block_size(BlockBackend *blk, int align)
{
bdrv_set_guest_block_size(blk->bs, align);
}
void *blk_blockalign(BlockBackend *blk, size_t size)
{
return qemu_blockalign(blk ? blk->bs : NULL, size);
}
bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp)
{
return bdrv_op_is_blocked(blk->bs, op, errp);
}
void blk_op_unblock(BlockBackend *blk, BlockOpType op, Error *reason)
{
bdrv_op_unblock(blk->bs, op, reason);
}
void blk_op_block_all(BlockBackend *blk, Error *reason)
{
bdrv_op_block_all(blk->bs, reason);
}
void blk_op_unblock_all(BlockBackend *blk, Error *reason)
{
bdrv_op_unblock_all(blk->bs, reason);
}
AioContext *blk_get_aio_context(BlockBackend *blk)
{
return bdrv_get_aio_context(blk->bs);
}
void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
{
bdrv_set_aio_context(blk->bs, new_context);
}
void blk_add_aio_context_notifier(BlockBackend *blk,
void (*attached_aio_context)(AioContext *new_context, void *opaque),
void (*detach_aio_context)(void *opaque), void *opaque)
{
bdrv_add_aio_context_notifier(blk->bs, attached_aio_context,
detach_aio_context, opaque);
}
void blk_remove_aio_context_notifier(BlockBackend *blk,
void (*attached_aio_context)(AioContext *,
void *),
void (*detach_aio_context)(void *),
void *opaque)
{
bdrv_remove_aio_context_notifier(blk->bs, attached_aio_context,
detach_aio_context, opaque);
}
void blk_add_close_notifier(BlockBackend *blk, Notifier *notify)
{
bdrv_add_close_notifier(blk->bs, notify);
}
void blk_io_plug(BlockBackend *blk)
{
bdrv_io_plug(blk->bs);
}
void blk_io_unplug(BlockBackend *blk)
{
bdrv_io_unplug(blk->bs);
}
BlockAcctStats *blk_get_stats(BlockBackend *blk)
{
return bdrv_get_stats(blk->bs);
}
void *blk_aio_get(const AIOCBInfo *aiocb_info, BlockBackend *blk,
BlockCompletionFunc *cb, void *opaque)
{
return qemu_aio_get(aiocb_info, blk_bs(blk), cb, opaque);
}
int coroutine_fn blk_co_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_co_write_zeroes(blk->bs, sector_num, nb_sectors, flags);
}
int blk_write_compressed(BlockBackend *blk, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_write_compressed(blk->bs, sector_num, buf, nb_sectors);
}
int blk_truncate(BlockBackend *blk, int64_t offset)
{
return bdrv_truncate(blk->bs, offset);
}
int blk_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_discard(blk->bs, sector_num, nb_sectors);
}
int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
int64_t pos, int size)
{
return bdrv_save_vmstate(blk->bs, buf, pos, size);
}
int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size)
{
return bdrv_load_vmstate(blk->bs, buf, pos, size);
}
int blk_probe_blocksizes(BlockBackend *blk, BlockSizes *bsz)
{
return bdrv_probe_blocksizes(blk->bs, bsz);
}
int blk_probe_geometry(BlockBackend *blk, HDGeometry *geo)
{
return bdrv_probe_geometry(blk->bs, geo);
}

View File

@@ -113,8 +113,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
strcmp(bochs.subtype, GROWING_TYPE) ||
((le32_to_cpu(bochs.version) != HEADER_VERSION) &&
(le32_to_cpu(bochs.version) != HEADER_V1))) {
error_setg(errp, "Image not in Bochs format");
return -EINVAL;
return -EMEDIUMTYPE;
}
if (le32_to_cpu(bochs.version) == HEADER_V1) {
@@ -131,11 +130,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
return -EFBIG;
}
s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
if (s->catalog_size && s->catalog_bitmap == NULL) {
error_setg(errp, "Could not allocate memory for catalog");
return -ENOMEM;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
ret = bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
s->catalog_size * 4);
@@ -152,26 +147,16 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
s->extent_blocks = 1 + (le32_to_cpu(bochs.extent) - 1) / 512;
s->extent_size = le32_to_cpu(bochs.extent);
if (s->extent_size < BDRV_SECTOR_SIZE) {
/* bximage actually never creates extents smaller than 4k */
error_setg(errp, "Extent size must be at least 512");
ret = -EINVAL;
goto fail;
} else if (!is_power_of_2(s->extent_size)) {
error_setg(errp, "Extent size %" PRIu32 " is not a power of two",
s->extent_size);
ret = -EINVAL;
goto fail;
if (s->extent_size == 0) {
error_setg(errp, "Extent size may not be zero");
return -EINVAL;
} else if (s->extent_size > 0x800000) {
error_setg(errp, "Extent size %" PRIu32 " is too large",
s->extent_size);
ret = -EINVAL;
goto fail;
return -EINVAL;
}
if (s->catalog_size < DIV_ROUND_UP(bs->total_sectors,
s->extent_size / BDRV_SECTOR_SIZE))
{
if (s->catalog_size < bs->total_sectors / s->extent_size) {
error_setg(errp, "Catalog size is too small for this disk size");
ret = -EINVAL;
goto fail;
@@ -191,14 +176,13 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
uint64_t offset = sector_num * 512;
uint64_t extent_index, extent_offset, bitmap_offset;
char bitmap_entry;
int ret;
// seek to sector
extent_index = offset / s->extent_size;
extent_offset = (offset % s->extent_size) / 512;
if (s->catalog_bitmap[extent_index] == 0xffffffff) {
return 0; /* not allocated */
return -1; /* not allocated */
}
bitmap_offset = s->data_offset +
@@ -206,14 +190,13 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
(s->extent_blocks + s->bitmap_blocks));
/* read in bitmap for current extent */
ret = bdrv_pread(bs->file, bitmap_offset + (extent_offset / 8),
&bitmap_entry, 1);
if (ret < 0) {
return ret;
if (bdrv_pread(bs->file, bitmap_offset + (extent_offset / 8),
&bitmap_entry, 1) != 1) {
return -1;
}
if (!((bitmap_entry >> (extent_offset % 8)) & 1)) {
return 0; /* not allocated */
return -1; /* not allocated */
}
return bitmap_offset + (512 * (s->bitmap_blocks + extent_offset));
@@ -226,16 +209,13 @@ static int bochs_read(BlockDriverState *bs, int64_t sector_num,
while (nb_sectors > 0) {
int64_t block_offset = seek_to_sector(bs, sector_num);
if (block_offset < 0) {
return block_offset;
} else if (block_offset > 0) {
if (block_offset >= 0) {
ret = bdrv_pread(bs->file, block_offset, buf, 512);
if (ret < 0) {
return ret;
if (ret != 512) {
return -1;
}
} else {
} else
memset(buf, 0, 512);
}
nb_sectors--;
sector_num++;
buf += 512;

View File

@@ -72,7 +72,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
}
s->block_size = be32_to_cpu(s->block_size);
if (s->block_size % 512) {
error_setg(errp, "block_size %" PRIu32 " must be a multiple of 512",
error_setg(errp, "block_size %u must be a multiple of 512",
s->block_size);
return -EINVAL;
}
@@ -86,7 +86,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
* need a buffer this big.
*/
if (s->block_size > MAX_BLOCK_SIZE) {
error_setg(errp, "block_size %" PRIu32 " must be %u MB or less",
error_setg(errp, "block_size %u must be %u MB or less",
s->block_size,
MAX_BLOCK_SIZE / (1024 * 1024));
return -EINVAL;
@@ -101,7 +101,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
/* read offsets */
if (s->n_blocks > (UINT32_MAX - 1) / sizeof(uint64_t)) {
/* Prevent integer overflow */
error_setg(errp, "n_blocks %" PRIu32 " must be %zu or less",
error_setg(errp, "n_blocks %u must be %zu or less",
s->n_blocks,
(UINT32_MAX - 1) / sizeof(uint64_t));
return -EINVAL;
@@ -116,12 +116,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
"try increasing block size");
return -EINVAL;
}
s->offsets = g_try_malloc(offsets_size);
if (s->offsets == NULL) {
error_setg(errp, "Could not allocate offsets table");
return -ENOMEM;
}
s->offsets = g_malloc(offsets_size);
ret = bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size);
if (ret < 0) {
@@ -138,7 +133,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
if (s->offsets[i] < s->offsets[i - 1]) {
error_setg(errp, "offsets not monotonically increasing at "
"index %" PRIu32 ", image file is corrupt", i);
"index %u, image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
@@ -151,8 +146,8 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
* ridiculous s->compressed_block allocation.
*/
if (size > 2 * MAX_BLOCK_SIZE) {
error_setg(errp, "invalid compressed block size at index %" PRIu32
", image file is corrupt", i);
error_setg(errp, "invalid compressed block size at index %u, "
"image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
@@ -163,20 +158,8 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
}
/* initialize zlib engine */
s->compressed_block = g_try_malloc(max_compressed_block_size + 1);
if (s->compressed_block == NULL) {
error_setg(errp, "Could not allocate compressed_block");
ret = -ENOMEM;
goto fail;
}
s->uncompressed_block = g_try_malloc(s->block_size);
if (s->uncompressed_block == NULL) {
error_setg(errp, "Could not allocate uncompressed_block");
ret = -ENOMEM;
goto fail;
}
s->compressed_block = g_malloc(max_compressed_block_size + 1);
s->uncompressed_block = g_malloc(s->block_size);
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;

View File

@@ -15,7 +15,6 @@
#include "trace.h"
#include "block/block_int.h"
#include "block/blockjob.h"
#include "qapi/qmp/qerror.h"
#include "qemu/ratelimit.h"
enum {
@@ -38,7 +37,6 @@ typedef struct CommitBlockJob {
BlockdevOnError on_error;
int base_flags;
int orig_overlay_flags;
char *backing_file_str;
} CommitBlockJob;
static int coroutine_fn commit_populate(BlockDriverState *bs,
@@ -61,50 +59,17 @@ static int coroutine_fn commit_populate(BlockDriverState *bs,
return 0;
}
typedef struct {
int ret;
} CommitCompleteData;
static void commit_complete(BlockJob *job, void *opaque)
static void coroutine_fn commit_run(void *opaque)
{
CommitBlockJob *s = container_of(job, CommitBlockJob, common);
CommitCompleteData *data = opaque;
CommitBlockJob *s = opaque;
BlockDriverState *active = s->active;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
BlockDriverState *overlay_bs;
int ret = data->ret;
if (!block_job_is_cancelled(&s->common) && ret == 0) {
/* success */
ret = bdrv_drop_intermediate(active, top, base, s->backing_file_str);
}
/* restore base open flags here if appropriate (e.g., change the base back
* to r/o). These reopens do not need to be atomic, since we won't abort
* even on failure here */
if (s->base_flags != bdrv_get_flags(base)) {
bdrv_reopen(base, s->base_flags, NULL);
}
overlay_bs = bdrv_find_overlay(active, top);
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
g_free(data);
}
static void coroutine_fn commit_run(void *opaque)
{
CommitBlockJob *s = opaque;
CommitCompleteData *data;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
int64_t sector_num, end;
int ret = 0;
int n = 0;
void *buf = NULL;
void *buf;
int bytes_written = 0;
int64_t base_len;
@@ -112,18 +77,18 @@ static void coroutine_fn commit_run(void *opaque)
if (s->common.len < 0) {
goto out;
goto exit_restore_reopen;
}
ret = base_len = bdrv_getlength(base);
if (base_len < 0) {
goto out;
goto exit_restore_reopen;
}
if (base_len < s->common.len) {
ret = bdrv_truncate(base, s->common.len);
if (ret) {
goto out;
goto exit_restore_reopen;
}
}
@@ -162,7 +127,7 @@ wait:
if (s->on_error == BLOCKDEV_ON_ERROR_STOP ||
s->on_error == BLOCKDEV_ON_ERROR_REPORT||
(s->on_error == BLOCKDEV_ON_ERROR_ENOSPC && ret == -ENOSPC)) {
goto out;
goto exit_free_buf;
} else {
n = 0;
continue;
@@ -174,12 +139,27 @@ wait:
ret = 0;
out:
if (!block_job_is_cancelled(&s->common) && sector_num == end) {
/* success */
ret = bdrv_drop_intermediate(active, top, base);
}
exit_free_buf:
qemu_vfree(buf);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&s->common, commit_complete, data);
exit_restore_reopen:
/* restore base open flags here if appropriate (e.g., change the base back
* to r/o). These reopens do not need to be atomic, since we won't abort
* even on failure here */
if (s->base_flags != bdrv_get_flags(base)) {
bdrv_reopen(base, s->base_flags, NULL);
}
overlay_bs = bdrv_find_overlay(active, top);
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
block_job_completed(&s->common, ret);
}
static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -187,7 +167,7 @@ static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
CommitBlockJob *s = container_of(job, CommitBlockJob, common);
if (speed < 0) {
error_setg(errp, QERR_INVALID_PARAMETER, "speed");
error_set(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
@@ -201,8 +181,8 @@ static const BlockJobDriver commit_job_driver = {
void commit_start(BlockDriverState *bs, BlockDriverState *base,
BlockDriverState *top, int64_t speed,
BlockdevOnError on_error, BlockCompletionFunc *cb,
void *opaque, const char *backing_file_str, Error **errp)
BlockdevOnError on_error, BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
{
CommitBlockJob *s;
BlockReopenQueue *reopen_queue = NULL;
@@ -214,11 +194,17 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
if ((on_error == BLOCKDEV_ON_ERROR_STOP ||
on_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
error_setg(errp, "Invalid parameter combination");
error_set(errp, QERR_INVALID_PARAMETER_COMBINATION);
return;
}
/* Once we support top == active layer, remove this check */
if (top == bs) {
error_setg(errp,
"Top image as the active layer is currently unsupported");
return;
}
assert(top != bs);
if (top == base) {
error_setg(errp, "Invalid files for merge: top and base are the same");
return;
@@ -264,8 +250,6 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
s->base_flags = orig_base_flags;
s->orig_overlay_flags = orig_overlay_flags;
s->backing_file_str = g_strdup(backing_file_str);
s->on_error = on_error;
s->common.co = qemu_coroutine_create(commit_run);

403
block/cow.c Normal file
View File

@@ -0,0 +1,403 @@
/*
* Block driver for the COW format
*
* Copyright (c) 2004 Fabrice Bellard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
/**************************************************************/
/* COW block driver using file system holes */
/* user mode linux compatible COW file */
#define COW_MAGIC 0x4f4f4f4d /* MOOO */
#define COW_VERSION 2
struct cow_header_v2 {
uint32_t magic;
uint32_t version;
char backing_file[1024];
int32_t mtime;
uint64_t size;
uint32_t sectorsize;
};
typedef struct BDRVCowState {
CoMutex lock;
int64_t cow_sectors_offset;
} BDRVCowState;
static int cow_probe(const uint8_t *buf, int buf_size, const char *filename)
{
const struct cow_header_v2 *cow_header = (const void *)buf;
if (buf_size >= sizeof(struct cow_header_v2) &&
be32_to_cpu(cow_header->magic) == COW_MAGIC &&
be32_to_cpu(cow_header->version) == COW_VERSION)
return 100;
else
return 0;
}
static int cow_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVCowState *s = bs->opaque;
struct cow_header_v2 cow_header;
int bitmap_size;
int64_t size;
int ret;
/* see if it is a cow image */
ret = bdrv_pread(bs->file, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto fail;
}
if (be32_to_cpu(cow_header.magic) != COW_MAGIC) {
ret = -EMEDIUMTYPE;
goto fail;
}
if (be32_to_cpu(cow_header.version) != COW_VERSION) {
char version[64];
snprintf(version, sizeof(version),
"COW version %d", cow_header.version);
qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "cow", version);
ret = -ENOTSUP;
goto fail;
}
/* cow image found */
size = be64_to_cpu(cow_header.size);
bs->total_sectors = size / 512;
pstrcpy(bs->backing_file, sizeof(bs->backing_file),
cow_header.backing_file);
bitmap_size = ((bs->total_sectors + 7) >> 3) + sizeof(cow_header);
s->cow_sectors_offset = (bitmap_size + 511) & ~511;
qemu_co_mutex_init(&s->lock);
return 0;
fail:
return ret;
}
/*
* XXX(hch): right now these functions are extremely inefficient.
* We should just read the whole bitmap we'll need in one go instead.
*/
static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum, bool *first)
{
uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
uint8_t bitmap;
int ret;
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
if (bitmap & (1 << (bitnum % 8))) {
return 0;
}
if (*first) {
ret = bdrv_flush(bs->file);
if (ret < 0) {
return ret;
}
*first = false;
}
bitmap |= (1 << (bitnum % 8));
ret = bdrv_pwrite(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
return 0;
}
#define BITS_PER_BITMAP_SECTOR (512 * 8)
/* Cannot use bitmap.c on big-endian machines. */
static int cow_test_bit(int64_t bitnum, const uint8_t *bitmap)
{
return (bitmap[bitnum / 8] & (1 << (bitnum & 7))) != 0;
}
static int cow_find_streak(const uint8_t *bitmap, int value, int start, int nb_sectors)
{
int streak_value = value ? 0xFF : 0;
int last = MIN(start + nb_sectors, BITS_PER_BITMAP_SECTOR);
int bitnum = start;
while (bitnum < last) {
if ((bitnum & 7) == 0 && bitmap[bitnum / 8] == streak_value) {
bitnum += 8;
continue;
}
if (cow_test_bit(bitnum, bitmap) == value) {
bitnum++;
continue;
}
break;
}
return MIN(bitnum, last) - start;
}
/* Return true if first block has been changed (ie. current version is
* in COW file). Set the number of continuous blocks for which that
* is true. */
static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
uint8_t bitmap[BDRV_SECTOR_SIZE];
int ret;
int changed;
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
bitnum &= BITS_PER_BITMAP_SECTOR - 1;
changed = cow_test_bit(bitnum, bitmap);
*num_same = cow_find_streak(bitmap, changed, bitnum, nb_sectors);
return changed;
}
static int64_t coroutine_fn cow_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
BDRVCowState *s = bs->opaque;
int ret = cow_co_is_allocated(bs, sector_num, nb_sectors, num_same);
int64_t offset = s->cow_sectors_offset + (sector_num << BDRV_SECTOR_BITS);
if (ret < 0) {
return ret;
}
return (ret ? BDRV_BLOCK_DATA : 0) | offset | BDRV_BLOCK_OFFSET_VALID;
}
static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
int error = 0;
int i;
bool first = true;
for (i = 0; i < nb_sectors; i++) {
error = cow_set_bit(bs, sector_num + i, &first);
if (error) {
break;
}
}
return error;
}
static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret, n;
while (nb_sectors > 0) {
ret = cow_co_is_allocated(bs, sector_num, nb_sectors, &n);
if (ret < 0) {
return ret;
}
if (ret) {
ret = bdrv_pread(bs->file,
s->cow_sectors_offset + sector_num * 512,
buf, n * 512);
if (ret < 0) {
return ret;
}
} else {
if (bs->backing_hd) {
/* read from the base image */
ret = bdrv_read(bs->backing_hd, sector_num, buf, n);
if (ret < 0) {
return ret;
}
} else {
memset(buf, 0, n * 512);
}
}
nb_sectors -= n;
sector_num += n;
buf += n * 512;
}
return 0;
}
static coroutine_fn int cow_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int cow_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret;
ret = bdrv_pwrite(bs->file, s->cow_sectors_offset + sector_num * 512,
buf, nb_sectors * 512);
if (ret < 0) {
return ret;
}
return cow_update_bitmap(bs, sector_num, nb_sectors);
}
static coroutine_fn int cow_co_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_write(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static void cow_close(BlockDriverState *bs)
{
}
static int cow_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
struct cow_header_v2 cow_header;
struct stat st;
int64_t image_sectors = 0;
const char *image_filename = NULL;
Error *local_err = NULL;
int ret;
BlockDriverState *cow_bs;
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
image_sectors = options->value.n / 512;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
image_filename = options->value.s;
}
options++;
}
ret = bdrv_create_file(filename, options, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
ret = bdrv_file_open(&cow_bs, filename, NULL, BDRV_O_RDWR, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
memset(&cow_header, 0, sizeof(cow_header));
cow_header.magic = cpu_to_be32(COW_MAGIC);
cow_header.version = cpu_to_be32(COW_VERSION);
if (image_filename) {
/* Note: if no file, we put a dummy mtime */
cow_header.mtime = cpu_to_be32(0);
if (stat(image_filename, &st) != 0) {
goto mtime_fail;
}
cow_header.mtime = cpu_to_be32(st.st_mtime);
mtime_fail:
pstrcpy(cow_header.backing_file, sizeof(cow_header.backing_file),
image_filename);
}
cow_header.sectorsize = cpu_to_be32(512);
cow_header.size = cpu_to_be64(image_sectors * 512);
ret = bdrv_pwrite(cow_bs, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto exit;
}
/* resize to include at least all the bitmap */
ret = bdrv_truncate(cow_bs,
sizeof(cow_header) + ((image_sectors + 7) >> 3));
if (ret < 0) {
goto exit;
}
exit:
bdrv_unref(cow_bs);
return ret;
}
static QEMUOptionParameter cow_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
},
{ NULL }
};
static BlockDriver bdrv_cow = {
.format_name = "cow",
.instance_size = sizeof(BDRVCowState),
.bdrv_probe = cow_probe,
.bdrv_open = cow_open,
.bdrv_close = cow_close,
.bdrv_create = cow_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_read = cow_co_read,
.bdrv_write = cow_co_write,
.bdrv_co_get_block_status = cow_co_get_block_status,
.create_options = cow_create_options,
};
static void bdrv_cow_init(void)
{
bdrv_register(&bdrv_cow);
}
block_init(bdrv_cow_init);

View File

@@ -22,13 +22,10 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "qemu/error-report.h"
#include "block/block_int.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qstring.h"
#include <curl/curl.h>
// #define DEBUG_CURL
// #define DEBUG
// #define DEBUG_VERBOSE
#ifdef DEBUG_CURL
@@ -40,21 +37,6 @@
#if LIBCURL_VERSION_NUM >= 0x071000
/* The multi interface timer callback was introduced in 7.16.0 */
#define NEED_CURL_TIMER_CALLBACK
#define HAVE_SOCKET_ACTION
#endif
#ifndef HAVE_SOCKET_ACTION
/* If curl_multi_socket_action isn't available, define it statically here in
* terms of curl_multi_socket. Note that ev_bitmask will be ignored, which is
* less efficient but still safe. */
static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
curl_socket_t sockfd,
int ev_bitmask,
int *running_handles)
{
return curl_multi_socket(multi_handle, sockfd, running_handles);
}
#define curl_multi_socket_action __curl_multi_socket_action
#endif
#define PROTOCOLS (CURLPROTO_HTTP | CURLPROTO_HTTPS | \
@@ -64,24 +46,16 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_NUM_STATES 8
#define CURL_NUM_ACB 8
#define SECTOR_SIZE 512
#define READ_AHEAD_DEFAULT (256 * 1024)
#define CURL_TIMEOUT_DEFAULT 5
#define CURL_TIMEOUT_MAX 10000
#define READ_AHEAD_SIZE (256 * 1024)
#define FIND_RET_NONE 0
#define FIND_RET_OK 1
#define FIND_RET_WAIT 2
#define CURL_BLOCK_OPT_URL "url"
#define CURL_BLOCK_OPT_READAHEAD "readahead"
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
struct BDRVCURLState;
typedef struct CURLAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
QEMUIOVector *qiov;
@@ -97,7 +71,6 @@ typedef struct CURLState
struct BDRVCURLState *s;
CURLAIOCB *acb[CURL_NUM_ACB];
CURL *curl;
curl_socket_t sock_fd;
char *orig_buf;
size_t buf_start;
size_t buf_off;
@@ -114,16 +87,11 @@ typedef struct BDRVCURLState {
CURLState states[CURL_NUM_STATES];
char *url;
size_t readahead_size;
bool sslverify;
uint64_t timeout;
char *cookie;
bool accept_range;
AioContext *aio_context;
} BDRVCURLState;
static void curl_clean_state(CURLState *s);
static void curl_multi_do(void *arg);
static void curl_multi_read(void *arg);
#ifdef NEED_CURL_TIMER_CALLBACK
static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
@@ -143,29 +111,21 @@ static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
#endif
static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
void *userp, void *sp)
void *s, void *sp)
{
BDRVCURLState *s;
CURLState *state = NULL;
curl_easy_getinfo(curl, CURLINFO_PRIVATE, (char **)&state);
state->sock_fd = fd;
s = state->s;
DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
switch (action) {
case CURL_POLL_IN:
aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
NULL, state);
qemu_aio_set_fd_handler(fd, curl_multi_do, NULL, s);
break;
case CURL_POLL_OUT:
aio_set_fd_handler(s->aio_context, fd, NULL, curl_multi_do, state);
qemu_aio_set_fd_handler(fd, NULL, curl_multi_do, s);
break;
case CURL_POLL_INOUT:
aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
curl_multi_do, state);
qemu_aio_set_fd_handler(fd, curl_multi_do, curl_multi_do, s);
break;
case CURL_POLL_REMOVE:
aio_set_fd_handler(s->aio_context, fd, NULL, NULL, NULL);
qemu_aio_set_fd_handler(fd, NULL, NULL, NULL);
break;
}
@@ -195,7 +155,7 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
DPRINTF("CURL: Just reading %zd bytes\n", realsize);
if (!s || !s->orig_buf)
return 0;
goto read_end;
if (s->buf_off >= s->buf_len) {
/* buffer full, read nothing */
@@ -215,11 +175,12 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
acb->end - acb->start);
acb->common.cb(acb->common.opaque, 0);
qemu_aio_unref(acb);
qemu_aio_release(acb);
s->acb[i] = NULL;
}
}
read_end:
return realsize;
}
@@ -254,8 +215,7 @@ static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
}
// Wait for unfinished chunks
if (state->in_use &&
(start >= state->buf_start) &&
if ((start >= state->buf_start) &&
(start <= buf_fend) &&
(end >= state->buf_start) &&
(end <= buf_fend))
@@ -277,81 +237,68 @@ static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
return FIND_RET_NONE;
}
static void curl_multi_check_completion(BDRVCURLState *s)
static void curl_multi_read(BDRVCURLState *s)
{
int msgs_in_queue;
/* Try to find done transfers, so we can free the easy
* handle again. */
for (;;) {
do {
CURLMsg *msg;
msg = curl_multi_info_read(s->multi, &msgs_in_queue);
/* Quit when there are no more completions */
if (!msg)
break;
if (msg->msg == CURLMSG_DONE) {
CURLState *state = NULL;
curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE,
(char **)&state);
/* ACBs for successful messages get completed in curl_read_cb */
if (msg->data.result != CURLE_OK) {
int i;
static int errcount = 100;
/* Don't lose the original error message from curl, since
* it contains extra data.
*/
if (errcount > 0) {
error_report("curl: %s", state->errmsg);
if (--errcount == 0) {
error_report("curl: further errors suppressed");
}
}
for (i = 0; i < CURL_NUM_ACB; i++) {
CURLAIOCB *acb = state->acb[i];
if (acb == NULL) {
continue;
}
acb->common.cb(acb->common.opaque, -EPROTO);
qemu_aio_unref(acb);
state->acb[i] = NULL;
}
}
curl_clean_state(state);
if (msg->msg == CURLMSG_NONE)
break;
switch (msg->msg) {
case CURLMSG_DONE:
{
CURLState *state = NULL;
curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE, (char**)&state);
/* ACBs for successful messages get completed in curl_read_cb */
if (msg->data.result != CURLE_OK) {
int i;
for (i = 0; i < CURL_NUM_ACB; i++) {
CURLAIOCB *acb = state->acb[i];
if (acb == NULL) {
continue;
}
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_release(acb);
state->acb[i] = NULL;
}
}
curl_clean_state(state);
break;
}
default:
msgs_in_queue = 0;
break;
}
}
} while(msgs_in_queue);
}
static void curl_multi_do(void *arg)
{
CURLState *s = (CURLState *)arg;
BDRVCURLState *s = (BDRVCURLState *)arg;
int running;
int r;
if (!s->s->multi) {
if (!s->multi) {
return;
}
do {
r = curl_multi_socket_action(s->s->multi, s->sock_fd, 0, &running);
r = curl_multi_socket_all(s->multi, &running);
} while(r == CURLM_CALL_MULTI_PERFORM);
}
static void curl_multi_read(void *arg)
{
CURLState *s = (CURLState *)arg;
curl_multi_do(arg);
curl_multi_check_completion(s->s);
curl_multi_read(s);
}
static void curl_multi_timeout_do(void *arg)
@@ -366,13 +313,13 @@ static void curl_multi_timeout_do(void *arg)
curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
curl_multi_check_completion(s);
curl_multi_read(s);
#else
abort();
#endif
}
static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
static CURLState *curl_init_state(BDRVCURLState *s)
{
CURLState *state = NULL;
int i, j;
@@ -390,47 +337,44 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
break;
}
if (!state) {
aio_poll(bdrv_get_aio_context(bs), true);
g_usleep(100);
curl_multi_do(s);
}
} while(!state);
if (!state->curl) {
state->curl = curl_easy_init();
if (!state->curl) {
return NULL;
}
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
(long) s->sslverify);
if (s->cookie) {
curl_easy_setopt(state->curl, CURLOPT_COOKIE, s->cookie);
}
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, (long)s->timeout);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION,
(void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_WRITEDATA, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_PRIVATE, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_AUTOREFERER, 1);
curl_easy_setopt(state->curl, CURLOPT_FOLLOWLOCATION, 1);
curl_easy_setopt(state->curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(state->curl, CURLOPT_ERRORBUFFER, state->errmsg);
curl_easy_setopt(state->curl, CURLOPT_FAILONERROR, 1);
if (state->curl)
goto has_curl;
/* Restrict supported protocols to avoid security issues in the more
* obscure protocols. For example, do not allow POP3/SMTP/IMAP see
* CVE-2013-0249.
*
* Restricting protocols is only supported from 7.19.4 upwards.
*/
state->curl = curl_easy_init();
if (!state->curl)
return NULL;
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, 5);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION, (void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_WRITEDATA, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_PRIVATE, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_AUTOREFERER, 1);
curl_easy_setopt(state->curl, CURLOPT_FOLLOWLOCATION, 1);
curl_easy_setopt(state->curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(state->curl, CURLOPT_ERRORBUFFER, state->errmsg);
curl_easy_setopt(state->curl, CURLOPT_FAILONERROR, 1);
/* Restrict supported protocols to avoid security issues in the more
* obscure protocols. For example, do not allow POP3/SMTP/IMAP see
* CVE-2013-0249.
*
* Restricting protocols is only supported from 7.19.4 upwards.
*/
#if LIBCURL_VERSION_NUM >= 0x071304
curl_easy_setopt(state->curl, CURLOPT_PROTOCOLS, PROTOCOLS);
curl_easy_setopt(state->curl, CURLOPT_REDIR_PROTOCOLS, PROTOCOLS);
curl_easy_setopt(state->curl, CURLOPT_PROTOCOLS, PROTOCOLS);
curl_easy_setopt(state->curl, CURLOPT_REDIR_PROTOCOLS, PROTOCOLS);
#endif
#ifdef DEBUG_VERBOSE
curl_easy_setopt(state->curl, CURLOPT_VERBOSE, 1);
curl_easy_setopt(state->curl, CURLOPT_VERBOSE, 1);
#endif
}
has_curl:
state->s = s;
@@ -447,50 +391,43 @@ static void curl_clean_state(CURLState *s)
static void curl_parse_filename(const char *filename, QDict *options,
Error **errp)
{
qdict_put(options, CURL_BLOCK_OPT_URL, qstring_from_str(filename));
}
static void curl_detach_aio_context(BlockDriverState *bs)
{
BDRVCURLState *s = bs->opaque;
int i;
#define RA_OPTSTR ":readahead="
char *file;
char *ra;
const char *ra_val;
int parse_state = 0;
for (i = 0; i < CURL_NUM_STATES; i++) {
if (s->states[i].in_use) {
curl_clean_state(&s->states[i]);
file = g_strdup(filename);
/* Parse a trailing ":readahead=#:" param, if present. */
ra = file + strlen(file) - 1;
while (ra >= file) {
if (parse_state == 0) {
if (*ra == ':') {
parse_state++;
} else {
break;
}
} else if (parse_state == 1) {
if (*ra > '9' || *ra < '0') {
char *opt_start = ra - strlen(RA_OPTSTR) + 1;
if (opt_start > file &&
strncmp(opt_start, RA_OPTSTR, strlen(RA_OPTSTR)) == 0) {
ra_val = ra + 1;
ra -= strlen(RA_OPTSTR) - 1;
*ra = '\0';
qdict_put(options, "readahead", qstring_from_str(ra_val));
}
break;
}
}
if (s->states[i].curl) {
curl_easy_cleanup(s->states[i].curl);
s->states[i].curl = NULL;
}
g_free(s->states[i].orig_buf);
s->states[i].orig_buf = NULL;
}
if (s->multi) {
curl_multi_cleanup(s->multi);
s->multi = NULL;
ra--;
}
timer_del(&s->timer);
}
qdict_put(options, "url", qstring_from_str(file));
static void curl_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVCURLState *s = bs->opaque;
aio_timer_init(new_context, &s->timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
curl_multi_timeout_do, s);
assert(!s->multi);
s->multi = curl_multi_init();
s->aio_context = new_context;
curl_multi_setopt(s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb);
#ifdef NEED_CURL_TIMER_CALLBACK
curl_multi_setopt(s->multi, CURLMOPT_TIMERDATA, s);
curl_multi_setopt(s->multi, CURLMOPT_TIMERFUNCTION, curl_timer_cb);
#endif
g_free(file);
}
static QemuOptsList runtime_opts = {
@@ -498,30 +435,15 @@ static QemuOptsList runtime_opts = {
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = CURL_BLOCK_OPT_URL,
.name = "url",
.type = QEMU_OPT_STRING,
.help = "URL to open",
},
{
.name = CURL_BLOCK_OPT_READAHEAD,
.name = "readahead",
.type = QEMU_OPT_SIZE,
.help = "Readahead size",
},
{
.name = CURL_BLOCK_OPT_SSLVERIFY,
.type = QEMU_OPT_BOOL,
.help = "Verify SSL certificate"
},
{
.name = CURL_BLOCK_OPT_TIMEOUT,
.type = QEMU_OPT_NUMBER,
.help = "Curl timeout"
},
{
.name = CURL_BLOCK_OPT_COOKIE,
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{ /* end of list */ }
},
};
@@ -534,46 +456,35 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
QemuOpts *opts;
Error *local_err = NULL;
const char *file;
const char *cookie;
double d;
static int inited = 0;
if (flags & BDRV_O_RDWR) {
error_setg(errp, "curl block device does not support writes");
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"curl block device does not support writes");
return -EROFS;
}
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
opts = qemu_opts_create_nofail(&runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
goto out_noclean;
}
s->readahead_size = qemu_opt_get_size(opts, CURL_BLOCK_OPT_READAHEAD,
READ_AHEAD_DEFAULT);
s->readahead_size = qemu_opt_get_size(opts, "readahead", READ_AHEAD_SIZE);
if ((s->readahead_size & 0x1ff) != 0) {
error_setg(errp, "HTTP_READAHEAD_SIZE %zd is not a multiple of 512",
s->readahead_size);
fprintf(stderr, "HTTP_READAHEAD_SIZE %zd is not a multiple of 512\n",
s->readahead_size);
goto out_noclean;
}
s->timeout = qemu_opt_get_number(opts, CURL_BLOCK_OPT_TIMEOUT,
CURL_TIMEOUT_DEFAULT);
if (s->timeout > CURL_TIMEOUT_MAX) {
error_setg(errp, "timeout parameter is too large or negative");
goto out_noclean;
}
s->sslverify = qemu_opt_get_bool(opts, CURL_BLOCK_OPT_SSLVERIFY, true);
cookie = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE);
s->cookie = g_strdup(cookie);
file = qemu_opt_get(opts, CURL_BLOCK_OPT_URL);
file = qemu_opt_get(opts, "url");
if (file == NULL) {
error_setg(errp, "curl block driver requires an 'url' option");
qerror_report(ERROR_CLASS_GENERIC_ERROR, "curl block driver requires "
"an 'url' option");
goto out_noclean;
}
@@ -583,9 +494,8 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
}
DPRINTF("CURL: Opening %s\n", file);
s->aio_context = bdrv_get_aio_context(bs);
s->url = g_strdup(file);
state = curl_init_state(bs, s);
state = curl_init_state(s);
if (!state)
goto out_noclean;
@@ -616,31 +526,49 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
curl_easy_cleanup(state->curl);
state->curl = NULL;
curl_attach_aio_context(bs, bdrv_get_aio_context(bs));
aio_timer_init(bdrv_get_aio_context(bs), &s->timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
curl_multi_timeout_do, s);
// Now we know the file exists and its size, so let's
// initialize the multi interface!
s->multi = curl_multi_init();
curl_multi_setopt(s->multi, CURLMOPT_SOCKETDATA, s);
curl_multi_setopt(s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb);
#ifdef NEED_CURL_TIMER_CALLBACK
curl_multi_setopt(s->multi, CURLMOPT_TIMERDATA, s);
curl_multi_setopt(s->multi, CURLMOPT_TIMERFUNCTION, curl_timer_cb);
#endif
curl_multi_do(s);
qemu_opts_del(opts);
return 0;
out:
error_setg(errp, "CURL: Error opening file: %s", state->errmsg);
fprintf(stderr, "CURL: Error opening file: %s\n", state->errmsg);
curl_easy_cleanup(state->curl);
state->curl = NULL;
out_noclean:
g_free(s->cookie);
g_free(s->url);
qemu_opts_del(opts);
return -EINVAL;
}
static void curl_aio_cancel(BlockDriverAIOCB *blockacb)
{
// Do we have to implement canceling? Seems to work without...
}
static const AIOCBInfo curl_aiocb_info = {
.aiocb_size = sizeof(CURLAIOCB),
.cancel = curl_aio_cancel,
};
static void curl_readv_bh_cb(void *p)
{
CURLState *state;
int running;
CURLAIOCB *acb = p;
BDRVCURLState *s = acb->common.bs->opaque;
@@ -655,7 +583,7 @@ static void curl_readv_bh_cb(void *p)
// we can just call the callback and be done.
switch (curl_find_buf(s, start, acb->nb_sectors * SECTOR_SIZE, acb)) {
case FIND_RET_OK:
qemu_aio_unref(acb);
qemu_aio_release(acb);
// fall through
case FIND_RET_WAIT:
return;
@@ -664,10 +592,10 @@ static void curl_readv_bh_cb(void *p)
}
// No cache found, so let's start a new request
state = curl_init_state(acb->common.bs, s);
state = curl_init_state(s);
if (!state) {
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
qemu_aio_release(acb);
return;
}
@@ -675,17 +603,12 @@ static void curl_readv_bh_cb(void *p)
acb->end = (acb->nb_sectors * SECTOR_SIZE);
state->buf_off = 0;
g_free(state->orig_buf);
if (state->orig_buf)
g_free(state->orig_buf);
state->buf_start = start;
state->buf_len = acb->end + s->readahead_size;
end = MIN(start + state->buf_len, s->len) - 1;
state->orig_buf = g_try_malloc(state->buf_len);
if (state->buf_len && state->orig_buf == NULL) {
curl_clean_state(state);
acb->common.cb(acb->common.opaque, -ENOMEM);
qemu_aio_unref(acb);
return;
}
state->orig_buf = g_malloc(state->buf_len);
state->acb[0] = acb;
snprintf(state->range, 127, "%zd-%zd", start, end);
@@ -694,14 +617,13 @@ static void curl_readv_bh_cb(void *p)
curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
curl_multi_add_handle(s->multi, state->curl);
curl_multi_do(s);
/* Tell curl it needs to kick things off */
curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
}
static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *curl_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
CURLAIOCB *acb;
@@ -711,7 +633,7 @@ static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
acb->sector_num = sector_num;
acb->nb_sectors = nb_sectors;
acb->bh = aio_bh_new(bdrv_get_aio_context(bs), curl_readv_bh_cb, acb);
acb->bh = qemu_bh_new(curl_readv_bh_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
@@ -719,11 +641,26 @@ static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
static void curl_close(BlockDriverState *bs)
{
BDRVCURLState *s = bs->opaque;
int i;
DPRINTF("CURL: Close\n");
curl_detach_aio_context(bs);
for (i=0; i<CURL_NUM_STATES; i++) {
if (s->states[i].in_use)
curl_clean_state(&s->states[i]);
if (s->states[i].curl) {
curl_easy_cleanup(s->states[i].curl);
s->states[i].curl = NULL;
}
if (s->states[i].orig_buf) {
g_free(s->states[i].orig_buf);
s->states[i].orig_buf = NULL;
}
}
if (s->multi)
curl_multi_cleanup(s->multi);
timer_del(&s->timer);
g_free(s->cookie);
g_free(s->url);
}
@@ -734,83 +671,68 @@ static int64_t curl_getlength(BlockDriverState *bs)
}
static BlockDriver bdrv_http = {
.format_name = "http",
.protocol_name = "http",
.format_name = "http",
.protocol_name = "http",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_https = {
.format_name = "https",
.protocol_name = "https",
.format_name = "https",
.protocol_name = "https",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_ftp = {
.format_name = "ftp",
.protocol_name = "ftp",
.format_name = "ftp",
.protocol_name = "ftp",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_ftps = {
.format_name = "ftps",
.protocol_name = "ftps",
.format_name = "ftps",
.protocol_name = "ftps",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_tftp = {
.format_name = "tftp",
.protocol_name = "tftp",
.format_name = "tftp",
.protocol_name = "tftp",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static void curl_block_init(void)

View File

@@ -24,13 +24,8 @@
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/bswap.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
#include <zlib.h>
#ifdef CONFIG_BZIP2
#include <bzlib.h>
#endif
#include <glib.h>
enum {
/* Limit chunk sizes to prevent unreasonable amounts of memory being used
@@ -60,9 +55,6 @@ typedef struct BDRVDMGState {
uint8_t *compressed_chunk;
uint8_t *uncompressed_chunk;
z_stream zstream;
#ifdef CONFIG_BZIP2
bz_stream bzstream;
#endif
} BDRVDMGState;
static int dmg_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -108,16 +100,6 @@ static int read_uint32(BlockDriverState *bs, int64_t offset, uint32_t *result)
return 0;
}
static inline uint64_t buff_read_uint64(const uint8_t *buffer, int64_t offset)
{
return be64_to_cpu(*(uint64_t *)&buffer[offset]);
}
static inline uint32_t buff_read_uint32(const uint8_t *buffer, int64_t offset)
{
return be32_to_cpu(*(uint32_t *)&buffer[offset]);
}
/* Increase max chunk sizes, if necessary. This function is used to calculate
* the buffer sizes needed for compressed/uncompressed chunk I/O.
*/
@@ -130,7 +112,6 @@ static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
switch (s->types[chunk]) {
case 0x80000005: /* zlib compressed */
case 0x80000006: /* bzip2 compressed */
compressed_size = s->lengths[chunk];
uncompressed_sectors = s->sectorcounts[chunk];
break;
@@ -138,9 +119,7 @@ static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
uncompressed_sectors = (s->lengths[chunk] + 511) / 512;
break;
case 2: /* zero */
/* as the all-zeroes block may be large, it is treated specially: the
* sector is not copied from a large buffer, a simple memset is used
* instead. Therefore uncompressed_sectors does not need to be set. */
uncompressed_sectors = s->sectorcounts[chunk];
break;
}
@@ -152,377 +131,161 @@ static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
}
}
static int64_t dmg_find_koly_offset(BlockDriverState *file_bs, Error **errp)
{
int64_t length;
int64_t offset = 0;
uint8_t buffer[515];
int i, ret;
/* bdrv_getlength returns a multiple of block size (512), rounded up. Since
* dmg images can have odd sizes, try to look for the "koly" magic which
* marks the begin of the UDIF trailer (512 bytes). This magic can be found
* in the last 511 bytes of the second-last sector or the first 4 bytes of
* the last sector (search space: 515 bytes) */
length = bdrv_getlength(file_bs);
if (length < 0) {
error_setg_errno(errp, -length,
"Failed to get file size while reading UDIF trailer");
return length;
} else if (length < 512) {
error_setg(errp, "dmg file must be at least 512 bytes long");
return -EINVAL;
}
if (length > 511 + 512) {
offset = length - 511 - 512;
}
length = length < 515 ? length : 515;
ret = bdrv_pread(file_bs, offset, buffer, length);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed while reading UDIF trailer");
return ret;
}
for (i = 0; i < length - 3; i++) {
if (buffer[i] == 'k' && buffer[i+1] == 'o' &&
buffer[i+2] == 'l' && buffer[i+3] == 'y') {
return offset + i;
}
}
error_setg(errp, "Could not locate UDIF trailer in dmg file");
return -EINVAL;
}
/* used when building the sector table */
typedef struct DmgHeaderState {
/* used internally by dmg_read_mish_block to remember offsets of blocks
* across calls */
uint64_t data_fork_offset;
/* exported for dmg_open */
uint32_t max_compressed_size;
uint32_t max_sectors_per_chunk;
} DmgHeaderState;
static bool dmg_is_known_block_type(uint32_t entry_type)
{
switch (entry_type) {
case 0x00000001: /* uncompressed */
case 0x00000002: /* zeroes */
case 0x80000005: /* zlib */
#ifdef CONFIG_BZIP2
case 0x80000006: /* bzip2 */
#endif
return true;
default:
return false;
}
}
static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
uint8_t *buffer, uint32_t count)
{
uint32_t type, i;
int ret;
size_t new_size;
uint32_t chunk_count;
int64_t offset = 0;
uint64_t data_offset;
uint64_t in_offset = ds->data_fork_offset;
uint64_t out_offset;
type = buff_read_uint32(buffer, offset);
/* skip data that is not a valid MISH block (invalid magic or too small) */
if (type != 0x6d697368 || count < 244) {
/* assume success for now */
return 0;
}
/* chunk offsets are relative to this sector number */
out_offset = buff_read_uint64(buffer, offset + 8);
/* location in data fork for (compressed) blob (in bytes) */
data_offset = buff_read_uint64(buffer, offset + 0x18);
in_offset += data_offset;
/* move to begin of chunk entries */
offset += 204;
chunk_count = (count - 204) / 40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size / 2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
for (i = s->n_chunks; i < s->n_chunks + chunk_count; i++) {
s->types[i] = buff_read_uint32(buffer, offset);
if (!dmg_is_known_block_type(s->types[i])) {
chunk_count--;
i--;
offset += 40;
continue;
}
/* sector number */
s->sectors[i] = buff_read_uint64(buffer, offset + 8);
s->sectors[i] += out_offset;
/* sector count */
s->sectorcounts[i] = buff_read_uint64(buffer, offset + 0x10);
/* all-zeroes sector (type 2) does not need to be "uncompressed" and can
* therefore be unbounded. */
if (s->types[i] != 2 && s->sectorcounts[i] > DMG_SECTORCOUNTS_MAX) {
error_report("sector count %" PRIu64 " for chunk %" PRIu32
" is larger than max (%u)",
s->sectorcounts[i], i, DMG_SECTORCOUNTS_MAX);
ret = -EINVAL;
goto fail;
}
/* offset in (compressed) data fork */
s->offsets[i] = buff_read_uint64(buffer, offset + 0x18);
s->offsets[i] += in_offset;
/* length in (compressed) data fork */
s->lengths[i] = buff_read_uint64(buffer, offset + 0x20);
if (s->lengths[i] > DMG_LENGTHS_MAX) {
error_report("length %" PRIu64 " for chunk %" PRIu32
" is larger than max (%u)",
s->lengths[i], i, DMG_LENGTHS_MAX);
ret = -EINVAL;
goto fail;
}
update_max_chunk_size(s, i, &ds->max_compressed_size,
&ds->max_sectors_per_chunk);
offset += 40;
}
s->n_chunks += chunk_count;
return 0;
fail:
return ret;
}
static int dmg_read_resource_fork(BlockDriverState *bs, DmgHeaderState *ds,
uint64_t info_begin, uint64_t info_length)
{
BDRVDMGState *s = bs->opaque;
int ret;
uint32_t count, rsrc_data_offset;
uint8_t *buffer = NULL;
uint64_t info_end;
uint64_t offset;
/* read offset from begin of resource fork (info_begin) to resource data */
ret = read_uint32(bs, info_begin, &rsrc_data_offset);
if (ret < 0) {
goto fail;
} else if (rsrc_data_offset > info_length) {
ret = -EINVAL;
goto fail;
}
/* read length of resource data */
ret = read_uint32(bs, info_begin + 8, &count);
if (ret < 0) {
goto fail;
} else if (count == 0 || rsrc_data_offset + count > info_length) {
ret = -EINVAL;
goto fail;
}
/* begin of resource data (consisting of one or more resources) */
offset = info_begin + rsrc_data_offset;
/* end of resource data (there is possibly a following resource map
* which will be ignored). */
info_end = offset + count;
/* read offsets (mish blocks) from one or more resources in resource data */
while (offset < info_end) {
/* size of following resource */
ret = read_uint32(bs, offset, &count);
if (ret < 0) {
goto fail;
} else if (count == 0 || count > info_end - offset) {
ret = -EINVAL;
goto fail;
}
offset += 4;
buffer = g_realloc(buffer, count);
ret = bdrv_pread(bs->file, offset, buffer, count);
if (ret < 0) {
goto fail;
}
ret = dmg_read_mish_block(s, ds, buffer, count);
if (ret < 0) {
goto fail;
}
/* advance offset by size of resource */
offset += count;
}
ret = 0;
fail:
g_free(buffer);
return ret;
}
static int dmg_read_plist_xml(BlockDriverState *bs, DmgHeaderState *ds,
uint64_t info_begin, uint64_t info_length)
{
BDRVDMGState *s = bs->opaque;
int ret;
uint8_t *buffer = NULL;
char *data_begin, *data_end;
/* Have at least some length to avoid NULL for g_malloc. Attempt to set a
* safe upper cap on the data length. A test sample had a XML length of
* about 1 MiB. */
if (info_length == 0 || info_length > 16 * 1024 * 1024) {
ret = -EINVAL;
goto fail;
}
buffer = g_malloc(info_length + 1);
buffer[info_length] = '\0';
ret = bdrv_pread(bs->file, info_begin, buffer, info_length);
if (ret != info_length) {
ret = -EINVAL;
goto fail;
}
/* look for <data>...</data>. The data is 284 (0x11c) bytes after base64
* decode. The actual data element has 431 (0x1af) bytes which includes tabs
* and line feeds. */
data_end = (char *)buffer;
while ((data_begin = strstr(data_end, "<data>")) != NULL) {
guchar *mish;
gsize out_len = 0;
data_begin += 6;
data_end = strstr(data_begin, "</data>");
/* malformed XML? */
if (data_end == NULL) {
ret = -EINVAL;
goto fail;
}
*data_end++ = '\0';
mish = g_base64_decode(data_begin, &out_len);
ret = dmg_read_mish_block(s, ds, mish, (uint32_t)out_len);
g_free(mish);
if (ret < 0) {
goto fail;
}
}
ret = 0;
fail:
g_free(buffer);
return ret;
}
static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVDMGState *s = bs->opaque;
DmgHeaderState ds;
uint64_t rsrc_fork_offset, rsrc_fork_length;
uint64_t plist_xml_offset, plist_xml_length;
uint64_t info_begin, info_end, last_in_offset, last_out_offset;
uint32_t count, tmp;
uint32_t max_compressed_size = 1, max_sectors_per_chunk = 1, i;
int64_t offset;
int ret;
bs->read_only = 1;
s->n_chunks = 0;
s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
/* used by dmg_read_mish_block to keep track of the current I/O position */
ds.data_fork_offset = 0;
ds.max_compressed_size = 1;
ds.max_sectors_per_chunk = 1;
/* locate the UDIF trailer */
offset = dmg_find_koly_offset(bs->file, errp);
/* read offset of info blocks */
offset = bdrv_getlength(bs->file);
if (offset < 0) {
ret = offset;
goto fail;
}
offset -= 0x1d8;
/* offset of data fork (DataForkOffset) */
ret = read_uint64(bs, offset + 0x18, &ds.data_fork_offset);
ret = read_uint64(bs, offset, &info_begin);
if (ret < 0) {
goto fail;
} else if (ds.data_fork_offset > offset) {
} else if (info_begin == 0) {
ret = -EINVAL;
goto fail;
}
/* offset of resource fork (RsrcForkOffset) */
ret = read_uint64(bs, offset + 0x28, &rsrc_fork_offset);
ret = read_uint32(bs, info_begin, &tmp);
if (ret < 0) {
goto fail;
}
ret = read_uint64(bs, offset + 0x30, &rsrc_fork_length);
if (ret < 0) {
goto fail;
}
if (rsrc_fork_offset >= offset ||
rsrc_fork_length > offset - rsrc_fork_offset) {
} else if (tmp != 0x100) {
ret = -EINVAL;
goto fail;
}
/* offset of property list (XMLOffset) */
ret = read_uint64(bs, offset + 0xd8, &plist_xml_offset);
ret = read_uint32(bs, info_begin + 4, &count);
if (ret < 0) {
goto fail;
}
ret = read_uint64(bs, offset + 0xe0, &plist_xml_length);
if (ret < 0) {
goto fail;
}
if (plist_xml_offset >= offset ||
plist_xml_length > offset - plist_xml_offset) {
} else if (count == 0) {
ret = -EINVAL;
goto fail;
}
ret = read_uint64(bs, offset + 0x1ec, (uint64_t *)&bs->total_sectors);
if (ret < 0) {
goto fail;
}
if (bs->total_sectors < 0) {
ret = -EINVAL;
goto fail;
}
if (rsrc_fork_length != 0) {
ret = dmg_read_resource_fork(bs, &ds,
rsrc_fork_offset, rsrc_fork_length);
info_end = info_begin + count;
offset = info_begin + 0x100;
/* read offsets */
last_in_offset = last_out_offset = 0;
while (offset < info_end) {
uint32_t type;
ret = read_uint32(bs, offset, &count);
if (ret < 0) {
goto fail;
} else if (count == 0) {
ret = -EINVAL;
goto fail;
}
offset += 4;
ret = read_uint32(bs, offset, &type);
if (ret < 0) {
goto fail;
}
} else if (plist_xml_length != 0) {
ret = dmg_read_plist_xml(bs, &ds, plist_xml_offset, plist_xml_length);
if (ret < 0) {
goto fail;
if (type == 0x6d697368 && count >= 244) {
size_t new_size;
uint32_t chunk_count;
offset += 4;
offset += 200;
chunk_count = (count - 204) / 40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size / 2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
for (i = s->n_chunks; i < s->n_chunks + chunk_count; i++) {
ret = read_uint32(bs, offset, &s->types[i]);
if (ret < 0) {
goto fail;
}
offset += 4;
if (s->types[i] != 0x80000005 && s->types[i] != 1 &&
s->types[i] != 2) {
if (s->types[i] == 0xffffffff && i > 0) {
last_in_offset = s->offsets[i - 1] + s->lengths[i - 1];
last_out_offset = s->sectors[i - 1] +
s->sectorcounts[i - 1];
}
chunk_count--;
i--;
offset += 36;
continue;
}
offset += 4;
ret = read_uint64(bs, offset, &s->sectors[i]);
if (ret < 0) {
goto fail;
}
s->sectors[i] += last_out_offset;
offset += 8;
ret = read_uint64(bs, offset, &s->sectorcounts[i]);
if (ret < 0) {
goto fail;
}
offset += 8;
if (s->sectorcounts[i] > DMG_SECTORCOUNTS_MAX) {
error_report("sector count %" PRIu64 " for chunk %u is "
"larger than max (%u)",
s->sectorcounts[i], i, DMG_SECTORCOUNTS_MAX);
ret = -EINVAL;
goto fail;
}
ret = read_uint64(bs, offset, &s->offsets[i]);
if (ret < 0) {
goto fail;
}
s->offsets[i] += last_in_offset;
offset += 8;
ret = read_uint64(bs, offset, &s->lengths[i]);
if (ret < 0) {
goto fail;
}
offset += 8;
if (s->lengths[i] > DMG_LENGTHS_MAX) {
error_report("length %" PRIu64 " for chunk %u is larger "
"than max (%u)",
s->lengths[i], i, DMG_LENGTHS_MAX);
ret = -EINVAL;
goto fail;
}
update_max_chunk_size(s, i, &max_compressed_size,
&max_sectors_per_chunk);
}
s->n_chunks += chunk_count;
}
} else {
ret = -EINVAL;
goto fail;
}
/* initialize zlib engine */
s->compressed_chunk = qemu_try_blockalign(bs->file,
ds.max_compressed_size + 1);
s->uncompressed_chunk = qemu_try_blockalign(bs->file,
512 * ds.max_sectors_per_chunk);
if (s->compressed_chunk == NULL || s->uncompressed_chunk == NULL) {
ret = -ENOMEM;
goto fail;
}
s->compressed_chunk = g_malloc(max_compressed_size + 1);
s->uncompressed_chunk = g_malloc(512 * max_sectors_per_chunk);
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
@@ -539,8 +302,8 @@ fail:
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
g_free(s->compressed_chunk);
g_free(s->uncompressed_chunk);
return ret;
}
@@ -579,16 +342,13 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
if (!is_sector_in_chunk(s, s->current_chunk, sector_num)) {
int ret;
uint32_t chunk = search_chunk(s, sector_num);
#ifdef CONFIG_BZIP2
uint64_t total_out;
#endif
if (chunk >= s->n_chunks) {
return -1;
}
s->current_chunk = s->n_chunks;
switch (s->types[chunk]) { /* block entry type */
switch (s->types[chunk]) {
case 0x80000005: { /* zlib compressed */
/* we need to buffer, because only the chunk as whole can be
* inflated. */
@@ -612,34 +372,6 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
return -1;
}
break; }
#ifdef CONFIG_BZIP2
case 0x80000006: /* bzip2 compressed */
/* we need to buffer, because only the chunk as whole can be
* inflated. */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->compressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
ret = BZ2_bzDecompressInit(&s->bzstream, 0, 0);
if (ret != BZ_OK) {
return -1;
}
s->bzstream.next_in = (char *)s->compressed_chunk;
s->bzstream.avail_in = (unsigned int) s->lengths[chunk];
s->bzstream.next_out = (char *)s->uncompressed_chunk;
s->bzstream.avail_out = (unsigned int) 512 * s->sectorcounts[chunk];
ret = BZ2_bzDecompress(&s->bzstream);
total_out = ((uint64_t)s->bzstream.total_out_hi32 << 32) +
s->bzstream.total_out_lo32;
BZ2_bzDecompressEnd(&s->bzstream);
if (ret != BZ_STREAM_END ||
total_out != 512 * s->sectorcounts[chunk]) {
return -1;
}
break;
#endif /* CONFIG_BZIP2 */
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->uncompressed_chunk, s->lengths[chunk]);
@@ -648,8 +380,7 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
}
break;
case 2: /* zero */
/* see dmg_read, it is treated specially. No buffer needs to be
* pre-filled, the zeroes can be set directly. */
memset(s->uncompressed_chunk, 0, 512 * s->sectorcounts[chunk]);
break;
}
s->current_chunk = chunk;
@@ -668,13 +399,6 @@ static int dmg_read(BlockDriverState *bs, int64_t sector_num,
if (dmg_read_chunk(bs, sector_num + i) != 0) {
return -1;
}
/* Special case: current chunk is all zeroes. Do not perform a memcpy as
* s->uncompressed_chunk may be too small to cover the large all-zeroes
* section. dmg_read_chunk is called to find s->current_chunk */
if (s->types[s->current_chunk] == 2) { /* all zeroes block entry */
memset(buf + i * 512, 0, 512);
continue;
}
sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
memcpy(buf + i * 512,
s->uncompressed_chunk + sector_offset_in_chunk * 512, 512);
@@ -702,8 +426,8 @@ static void dmg_close(BlockDriverState *bs)
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
g_free(s->compressed_chunk);
g_free(s->uncompressed_chunk);
inflateEnd(&s->zstream);
}

View File

@@ -3,27 +3,42 @@
*
* Copyright (C) 2012 Bharata B Rao <bharata@linux.vnet.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
* Pipe handling mechanism in AIO implementation is derived from
* block/rbd.c. Hence,
*
* Copyright (C) 2010-2011 Christian Brunner <chb@muc.de>,
* Josh Durgin <josh.durgin@dreamhost.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
* Contributions after 2012-01-13 are licensed under the terms of the
* GNU GPL, version 2 or (at your option) any later version.
*/
#include <glusterfs/api/glfs.h>
#include "block/block_int.h"
#include "qemu/sockets.h"
#include "qemu/uri.h"
typedef struct GlusterAIOCB {
BlockDriverAIOCB common;
int64_t size;
int ret;
bool *finished;
QEMUBH *bh;
Coroutine *coroutine;
AioContext *aio_context;
} GlusterAIOCB;
typedef struct BDRVGlusterState {
struct glfs *glfs;
int fds[2];
struct glfs_fd *fd;
int event_reader_pos;
GlusterAIOCB *event_acb;
} BDRVGlusterState;
#define GLUSTER_FD_READ 0
#define GLUSTER_FD_WRITE 1
typedef struct GlusterConf {
char *server;
int port;
@@ -34,13 +49,11 @@ typedef struct GlusterConf {
static void qemu_gluster_gconf_free(GlusterConf *gconf)
{
if (gconf) {
g_free(gconf->server);
g_free(gconf->volname);
g_free(gconf->image);
g_free(gconf->transport);
g_free(gconf);
}
g_free(gconf->server);
g_free(gconf->volname);
g_free(gconf->image);
g_free(gconf->transport);
g_free(gconf);
}
static int parse_volume_options(GlusterConf *gconf, char *path)
@@ -81,7 +94,7 @@ static int parse_volume_options(GlusterConf *gconf, char *path)
* 'server' specifies the server where the volume file specification for
* the given volume resides. This can be either hostname, ipv4 address
* or ipv6 address. ipv6 address needs to be within square brackets [ ].
* If transport type is 'unix', then 'server' field should not be specified.
* If transport type is 'unix', then 'server' field should not be specifed.
* The 'socket' field needs to be populated with the path to unix domain
* socket.
*
@@ -118,7 +131,7 @@ static int qemu_gluster_parseuri(GlusterConf *gconf, const char *filename)
}
/* transport */
if (!uri->scheme || !strcmp(uri->scheme, "gluster")) {
if (!strcmp(uri->scheme, "gluster")) {
gconf->transport = g_strdup("tcp");
} else if (!strcmp(uri->scheme, "gluster+tcp")) {
gconf->transport = g_strdup("tcp");
@@ -154,7 +167,7 @@ static int qemu_gluster_parseuri(GlusterConf *gconf, const char *filename)
}
gconf->server = g_strdup(qp->p[0].value);
} else {
gconf->server = g_strdup(uri->server ? uri->server : "localhost");
gconf->server = g_strdup(uri->server);
gconf->port = uri->port;
}
@@ -166,8 +179,7 @@ out:
return ret;
}
static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename,
Error **errp)
static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename)
{
struct glfs *glfs = NULL;
int ret;
@@ -175,8 +187,8 @@ static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename,
ret = qemu_gluster_parseuri(gconf, filename);
if (ret < 0) {
error_setg(errp, "Usage: file=gluster[+transport]://[server[:port]]/"
"volname/image[?socket=...]");
error_report("Usage: file=gluster[+transport]://[server[:port]]/"
"volname/image[?socket=...]");
errno = -ret;
goto out;
}
@@ -203,16 +215,9 @@ static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename,
ret = glfs_init(glfs);
if (ret) {
error_setg_errno(errp, errno,
"Gluster connection failed for server=%s port=%d "
"volume=%s image=%s transport=%s", gconf->server,
gconf->port, gconf->volname, gconf->image,
gconf->transport);
/* glfs_init sometimes doesn't set errno although docs suggest that */
if (errno == 0)
errno = EINVAL;
error_report("Gluster connection failed for server=%s port=%d "
"volume=%s image=%s transport=%s", gconf->server, gconf->port,
gconf->volname, gconf->image, gconf->transport);
goto out;
}
return glfs;
@@ -226,32 +231,46 @@ out:
return NULL;
}
static void qemu_gluster_complete_aio(void *opaque)
static void qemu_gluster_complete_aio(GlusterAIOCB *acb, BDRVGlusterState *s)
{
GlusterAIOCB *acb = (GlusterAIOCB *)opaque;
int ret;
bool *finished = acb->finished;
BlockDriverCompletionFunc *cb = acb->common.cb;
void *opaque = acb->common.opaque;
qemu_bh_delete(acb->bh);
acb->bh = NULL;
qemu_coroutine_enter(acb->coroutine, NULL);
}
/*
* AIO callback routine called from GlusterFS thread.
*/
static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
{
GlusterAIOCB *acb = (GlusterAIOCB *)arg;
if (!ret || ret == acb->size) {
acb->ret = 0; /* Success */
} else if (ret < 0) {
acb->ret = ret; /* Read/Write failed */
if (!acb->ret || acb->ret == acb->size) {
ret = 0; /* Success */
} else if (acb->ret < 0) {
ret = acb->ret; /* Read/Write failed */
} else {
acb->ret = -EIO; /* Partial read/write - fail it */
ret = -EIO; /* Partial read/write - fail it */
}
acb->bh = aio_bh_new(acb->aio_context, qemu_gluster_complete_aio, acb);
qemu_bh_schedule(acb->bh);
qemu_aio_release(acb);
cb(opaque, ret);
if (finished) {
*finished = true;
}
}
static void qemu_gluster_aio_event_reader(void *opaque)
{
BDRVGlusterState *s = opaque;
ssize_t ret;
do {
char *p = (char *)&s->event_acb;
ret = read(s->fds[GLUSTER_FD_READ], p + s->event_reader_pos,
sizeof(s->event_acb) - s->event_reader_pos);
if (ret > 0) {
s->event_reader_pos += ret;
if (s->event_reader_pos == sizeof(s->event_acb)) {
s->event_reader_pos = 0;
qemu_gluster_complete_aio(s->event_acb, s);
}
}
} while (ret < 0 && errno == EINTR);
}
/* TODO Convert to fine grained options */
@@ -268,57 +287,60 @@ static QemuOptsList runtime_opts = {
},
};
static void qemu_gluster_parse_flags(int bdrv_flags, int *open_flags)
{
assert(open_flags != NULL);
*open_flags |= O_BINARY;
if (bdrv_flags & BDRV_O_RDWR) {
*open_flags |= O_RDWR;
} else {
*open_flags |= O_RDONLY;
}
if ((bdrv_flags & BDRV_O_NOCACHE)) {
*open_flags |= O_DIRECT;
}
}
static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
int bdrv_flags, Error **errp)
{
BDRVGlusterState *s = bs->opaque;
int open_flags = 0;
int open_flags = O_BINARY;
int ret = 0;
GlusterConf *gconf = g_new0(GlusterConf, 1);
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
opts = qemu_opts_create_nofail(&runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
ret = -EINVAL;
goto out;
}
filename = qemu_opt_get(opts, "filename");
s->glfs = qemu_gluster_init(gconf, filename, errp);
s->glfs = qemu_gluster_init(gconf, filename);
if (!s->glfs) {
ret = -errno;
goto out;
}
qemu_gluster_parse_flags(bdrv_flags, &open_flags);
if (bdrv_flags & BDRV_O_RDWR) {
open_flags |= O_RDWR;
} else {
open_flags |= O_RDONLY;
}
if ((bdrv_flags & BDRV_O_NOCACHE)) {
open_flags |= O_DIRECT;
}
s->fd = glfs_open(s->glfs, gconf->image, open_flags);
if (!s->fd) {
ret = -errno;
goto out;
}
ret = qemu_pipe(s->fds);
if (ret < 0) {
ret = -errno;
goto out;
}
fcntl(s->fds[GLUSTER_FD_READ], F_SETFL, O_NONBLOCK);
qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ],
qemu_gluster_aio_event_reader, NULL, s);
out:
qemu_opts_del(opts);
qemu_gluster_gconf_free(gconf);
@@ -334,181 +356,26 @@ out:
return ret;
}
typedef struct BDRVGlusterReopenState {
struct glfs *glfs;
struct glfs_fd *fd;
} BDRVGlusterReopenState;
static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
{
int ret = 0;
BDRVGlusterReopenState *reop_s;
GlusterConf *gconf = NULL;
int open_flags = 0;
assert(state != NULL);
assert(state->bs != NULL);
state->opaque = g_new0(BDRVGlusterReopenState, 1);
reop_s = state->opaque;
qemu_gluster_parse_flags(state->flags, &open_flags);
gconf = g_new0(GlusterConf, 1);
reop_s->glfs = qemu_gluster_init(gconf, state->bs->filename, errp);
if (reop_s->glfs == NULL) {
ret = -errno;
goto exit;
}
reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
if (reop_s->fd == NULL) {
/* reops->glfs will be cleaned up in _abort */
ret = -errno;
goto exit;
}
exit:
/* state->opaque will be freed in either the _abort or _commit */
qemu_gluster_gconf_free(gconf);
return ret;
}
static void qemu_gluster_reopen_commit(BDRVReopenState *state)
{
BDRVGlusterReopenState *reop_s = state->opaque;
BDRVGlusterState *s = state->bs->opaque;
/* close the old */
if (s->fd) {
glfs_close(s->fd);
}
if (s->glfs) {
glfs_fini(s->glfs);
}
/* use the newly opened image / connection */
s->fd = reop_s->fd;
s->glfs = reop_s->glfs;
g_free(state->opaque);
state->opaque = NULL;
return;
}
static void qemu_gluster_reopen_abort(BDRVReopenState *state)
{
BDRVGlusterReopenState *reop_s = state->opaque;
if (reop_s == NULL) {
return;
}
if (reop_s->fd) {
glfs_close(reop_s->fd);
}
if (reop_s->glfs) {
glfs_fini(reop_s->glfs);
}
g_free(state->opaque);
state->opaque = NULL;
return;
}
#ifdef CONFIG_GLUSTERFS_ZEROFILL
static coroutine_fn int qemu_gluster_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
BDRVGlusterState *s = bs->opaque;
off_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
acb->size = size;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
ret = glfs_zerofill_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
}
static inline bool gluster_supports_zerofill(void)
{
return 1;
}
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
return glfs_zerofill(fd, offset, size);
}
#else
static inline bool gluster_supports_zerofill(void)
{
return 0;
}
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
return 0;
}
#endif
static int qemu_gluster_create(const char *filename,
QemuOpts *opts, Error **errp)
QEMUOptionParameter *options, Error **errp)
{
struct glfs *glfs;
struct glfs_fd *fd;
int ret = 0;
int prealloc = 0;
int64_t total_size = 0;
char *tmp = NULL;
GlusterConf *gconf = g_new0(GlusterConf, 1);
GlusterConf *gconf = g_malloc0(sizeof(GlusterConf));
glfs = qemu_gluster_init(gconf, filename, errp);
glfs = qemu_gluster_init(gconf, filename);
if (!glfs) {
ret = -errno;
goto out;
}
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!tmp || !strcmp(tmp, "off")) {
prealloc = 0;
} else if (!strcmp(tmp, "full") &&
gluster_supports_zerofill()) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'"
" or GlusterFS doesn't support zerofill API",
tmp);
ret = -EINVAL;
goto out;
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n / BDRV_SECTOR_SIZE;
}
options++;
}
fd = glfs_creat(glfs, gconf->image,
@@ -516,20 +383,14 @@ static int qemu_gluster_create(const char *filename,
if (!fd) {
ret = -errno;
} else {
if (!glfs_ftruncate(fd, total_size)) {
if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
ret = -errno;
}
} else {
if (glfs_ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
ret = -errno;
}
if (glfs_close(fd) != 0) {
ret = -errno;
}
}
out:
g_free(tmp);
qemu_gluster_gconf_free(gconf);
if (glfs) {
glfs_fini(glfs);
@@ -537,19 +398,58 @@ out:
return ret;
}
static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
static void qemu_gluster_aio_cancel(BlockDriverAIOCB *blockacb)
{
GlusterAIOCB *acb = (GlusterAIOCB *)blockacb;
bool finished = false;
acb->finished = &finished;
while (!finished) {
qemu_aio_wait();
}
}
static const AIOCBInfo gluster_aiocb_info = {
.aiocb_size = sizeof(GlusterAIOCB),
.cancel = qemu_gluster_aio_cancel,
};
static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
{
GlusterAIOCB *acb = (GlusterAIOCB *)arg;
BlockDriverState *bs = acb->common.bs;
BDRVGlusterState *s = bs->opaque;
int retval;
acb->ret = ret;
retval = qemu_write_full(s->fds[GLUSTER_FD_WRITE], &acb, sizeof(acb));
if (retval != sizeof(acb)) {
/*
* Gluster AIO callback thread failed to notify the waiting
* QEMU thread about IO completion.
*/
error_report("Gluster AIO completion failed: %s", strerror(errno));
abort();
}
}
static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque, int write)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
GlusterAIOCB *acb;
BDRVGlusterState *s = bs->opaque;
size_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
size_t size;
off_t offset;
offset = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
acb = qemu_aio_get(&gluster_aiocb_info, bs, cb, opaque);
acb->size = size;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
acb->finished = NULL;
if (write) {
ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,
@@ -560,16 +460,13 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
}
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
return &acb->common;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
qemu_aio_release(acb);
return NULL;
}
static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset)
@@ -585,70 +482,71 @@ static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset)
return 0;
}
static coroutine_fn int qemu_gluster_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
static BlockDriverAIOCB *qemu_gluster_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 0);
return qemu_gluster_aio_rw(bs, sector_num, qiov, nb_sectors, cb, opaque, 0);
}
static coroutine_fn int qemu_gluster_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
static BlockDriverAIOCB *qemu_gluster_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 1);
return qemu_gluster_aio_rw(bs, sector_num, qiov, nb_sectors, cb, opaque, 1);
}
static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
static BlockDriverAIOCB *qemu_gluster_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
GlusterAIOCB *acb;
BDRVGlusterState *s = bs->opaque;
acb = qemu_aio_get(&gluster_aiocb_info, bs, cb, opaque);
acb->size = 0;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
acb->finished = NULL;
ret = glfs_fsync_async(s->fd, &gluster_finish_aiocb, acb);
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
return &acb->common;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
qemu_aio_release(acb);
return NULL;
}
#ifdef CONFIG_GLUSTERFS_DISCARD
static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors)
static BlockDriverAIOCB *qemu_gluster_aio_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, BlockDriverCompletionFunc *cb,
void *opaque)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
GlusterAIOCB *acb;
BDRVGlusterState *s = bs->opaque;
size_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
size_t size;
off_t offset;
offset = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
acb = qemu_aio_get(&gluster_aiocb_info, bs, cb, opaque);
acb->size = 0;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
acb->finished = NULL;
ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
return &acb->common;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
qemu_aio_release(acb);
return NULL;
}
#endif
@@ -683,6 +581,10 @@ static void qemu_gluster_close(BlockDriverState *bs)
{
BDRVGlusterState *s = bs->opaque;
close(s->fds[GLUSTER_FD_READ]);
close(s->fds[GLUSTER_FD_WRITE]);
qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL);
if (s->fd) {
glfs_close(s->fd);
s->fd = NULL;
@@ -696,22 +598,13 @@ static int qemu_gluster_has_zero_init(BlockDriverState *bs)
return 0;
}
static QemuOptsList qemu_gluster_create_opts = {
.name = "qemu-gluster-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qemu_gluster_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, full)"
},
{ /* end of list */ }
}
static QEMUOptionParameter qemu_gluster_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ NULL }
};
static BlockDriver bdrv_gluster = {
@@ -720,25 +613,19 @@ static BlockDriver bdrv_gluster = {
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_aio_readv = qemu_gluster_aio_readv,
.bdrv_aio_writev = qemu_gluster_aio_writev,
.bdrv_aio_flush = qemu_gluster_aio_flush,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
.bdrv_aio_discard = qemu_gluster_aio_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
.create_options = qemu_gluster_create_options,
};
static BlockDriver bdrv_gluster_tcp = {
@@ -747,25 +634,19 @@ static BlockDriver bdrv_gluster_tcp = {
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_aio_readv = qemu_gluster_aio_readv,
.bdrv_aio_writev = qemu_gluster_aio_writev,
.bdrv_aio_flush = qemu_gluster_aio_flush,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
.bdrv_aio_discard = qemu_gluster_aio_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
.create_options = qemu_gluster_create_options,
};
static BlockDriver bdrv_gluster_unix = {
@@ -774,25 +655,19 @@ static BlockDriver bdrv_gluster_unix = {
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_aio_readv = qemu_gluster_aio_readv,
.bdrv_aio_writev = qemu_gluster_aio_writev,
.bdrv_aio_flush = qemu_gluster_aio_flush,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
.bdrv_aio_discard = qemu_gluster_aio_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
.create_options = qemu_gluster_create_options,
};
static BlockDriver bdrv_gluster_rdma = {
@@ -801,25 +676,19 @@ static BlockDriver bdrv_gluster_rdma = {
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_aio_readv = qemu_gluster_aio_readv,
.bdrv_aio_writev = qemu_gluster_aio_writev,
.bdrv_aio_flush = qemu_gluster_aio_flush,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
.bdrv_aio_discard = qemu_gluster_aio_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
.create_options = qemu_gluster_create_options,
};
static void bdrv_gluster_init(void)

2610
block/io.c

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -25,42 +25,22 @@
*/
#define MAX_EVENTS 128
#define MAX_QUEUED_IO 128
struct qemu_laiocb {
BlockAIOCB common;
BlockDriverAIOCB common;
struct qemu_laio_state *ctx;
struct iocb iocb;
ssize_t ret;
size_t nbytes;
QEMUIOVector *qiov;
bool is_read;
QSIMPLEQ_ENTRY(qemu_laiocb) next;
QLIST_ENTRY(qemu_laiocb) node;
};
typedef struct {
int plugged;
unsigned int n;
bool blocked;
QSIMPLEQ_HEAD(, qemu_laiocb) pending;
} LaioQueue;
struct qemu_laio_state {
io_context_t ctx;
EventNotifier e;
/* io queue for submit at batch */
LaioQueue io_q;
/* I/O completion processing */
QEMUBH *completion_bh;
struct io_event events[MAX_EVENTS];
int event_idx;
int event_max;
};
static void ioq_submit(struct qemu_laio_state *s);
static inline ssize_t io_event_ret(struct io_event *ev)
{
return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
@@ -87,159 +67,77 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
ret = -EINVAL;
}
}
}
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
}
/* The completion BH fetches completed I/O requests and invokes their
* callbacks.
*
* The function is somewhat tricky because it supports nested event loops, for
* example when a request callback invokes aio_poll(). In order to do this,
* the completion events array and index are kept in qemu_laio_state. The BH
* reschedules itself as long as there are completions pending so it will
* either be called again in a nested event loop or will be called after all
* events have been completed. When there are no events left to complete, the
* BH returns without rescheduling.
*/
static void qemu_laio_completion_bh(void *opaque)
{
struct qemu_laio_state *s = opaque;
/* Fetch more completion events when empty */
if (s->event_idx == s->event_max) {
do {
struct timespec ts = { 0 };
s->event_max = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS,
s->events, &ts);
} while (s->event_max == -EINTR);
s->event_idx = 0;
if (s->event_max <= 0) {
s->event_max = 0;
return; /* no more events */
}
laiocb->common.cb(laiocb->common.opaque, ret);
}
/* Reschedule so nested event loops see currently pending completions */
qemu_bh_schedule(s->completion_bh);
/* Process completion events */
while (s->event_idx < s->event_max) {
struct iocb *iocb = s->events[s->event_idx].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&s->events[s->event_idx]);
s->event_idx++;
qemu_laio_process_completion(s, laiocb);
}
if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
qemu_aio_release(laiocb);
}
static void qemu_laio_completion_cb(EventNotifier *e)
{
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
if (event_notifier_test_and_clear(&s->e)) {
qemu_bh_schedule(s->completion_bh);
while (event_notifier_test_and_clear(&s->e)) {
struct io_event events[MAX_EVENTS];
struct timespec ts = { 0 };
int nevents, i;
do {
nevents = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS, events, &ts);
} while (nevents == -EINTR);
for (i = 0; i < nevents; i++) {
struct iocb *iocb = events[i].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&events[i]);
qemu_laio_process_completion(s, laiocb);
}
}
}
static void laio_cancel(BlockAIOCB *blockacb)
static void laio_cancel(BlockDriverAIOCB *blockacb)
{
struct qemu_laiocb *laiocb = (struct qemu_laiocb *)blockacb;
struct io_event event;
int ret;
if (laiocb->ret != -EINPROGRESS) {
if (laiocb->ret != -EINPROGRESS)
return;
}
/*
* Note that as of Linux 2.6.31 neither the block device code nor any
* filesystem implements cancellation of AIO request.
* Thus the polling loop below is the normal code path.
*/
ret = io_cancel(laiocb->ctx->ctx, &laiocb->iocb, &event);
laiocb->ret = -ECANCELED;
if (ret != 0) {
/* iocb is not cancelled, cb will be called by the event loop later */
if (ret == 0) {
laiocb->ret = -ECANCELED;
return;
}
laiocb->common.cb(laiocb->common.opaque, laiocb->ret);
/*
* We have to wait for the iocb to finish.
*
* The only way to get the iocb status update is by polling the io context.
* We might be able to do this slightly more optimal by removing the
* O_NONBLOCK flag.
*/
while (laiocb->ret == -EINPROGRESS) {
qemu_laio_completion_cb(&laiocb->ctx->e);
}
}
static const AIOCBInfo laio_aiocb_info = {
.aiocb_size = sizeof(struct qemu_laiocb),
.cancel_async = laio_cancel,
.cancel = laio_cancel,
};
static void ioq_init(LaioQueue *io_q)
{
QSIMPLEQ_INIT(&io_q->pending);
io_q->plugged = 0;
io_q->n = 0;
io_q->blocked = false;
}
static void ioq_submit(struct qemu_laio_state *s)
{
int ret, len;
struct qemu_laiocb *aiocb;
struct iocb *iocbs[MAX_QUEUED_IO];
QSIMPLEQ_HEAD(, qemu_laiocb) completed;
do {
len = 0;
QSIMPLEQ_FOREACH(aiocb, &s->io_q.pending, next) {
iocbs[len++] = &aiocb->iocb;
if (len == MAX_QUEUED_IO) {
break;
}
}
ret = io_submit(s->ctx, len, iocbs);
if (ret == -EAGAIN) {
break;
}
if (ret < 0) {
abort();
}
s->io_q.n -= ret;
aiocb = container_of(iocbs[ret - 1], struct qemu_laiocb, iocb);
QSIMPLEQ_SPLIT_AFTER(&s->io_q.pending, aiocb, next, &completed);
} while (ret == len && !QSIMPLEQ_EMPTY(&s->io_q.pending));
s->io_q.blocked = (s->io_q.n > 0);
}
void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
{
struct qemu_laio_state *s = aio_ctx;
s->io_q.plugged++;
}
void laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
{
struct qemu_laio_state *s = aio_ctx;
assert(s->io_q.plugged > 0 || !unplug);
if (unplug && --s->io_q.plugged > 0) {
return;
}
if (!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
}
BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
BlockDriverCompletionFunc *cb, void *opaque, int type)
{
struct qemu_laio_state *s = aio_ctx;
struct qemu_laiocb *laiocb;
@@ -270,35 +168,15 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
QSIMPLEQ_INSERT_TAIL(&s->io_q.pending, laiocb, next);
s->io_q.n++;
if (!s->io_q.blocked &&
(!s->io_q.plugged || s->io_q.n >= MAX_QUEUED_IO)) {
ioq_submit(s);
}
if (io_submit(s->ctx, 1, &iocbs) < 0)
goto out_free_aiocb;
return &laiocb->common;
out_free_aiocb:
qemu_aio_unref(laiocb);
qemu_aio_release(laiocb);
return NULL;
}
void laio_detach_aio_context(void *s_, AioContext *old_context)
{
struct qemu_laio_state *s = s_;
aio_set_event_notifier(old_context, &s->e, NULL);
qemu_bh_delete(s->completion_bh);
}
void laio_attach_aio_context(void *s_, AioContext *new_context)
{
struct qemu_laio_state *s = s_;
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
}
void *laio_init(void)
{
struct qemu_laio_state *s;
@@ -312,7 +190,7 @@ void *laio_init(void)
goto out_close_efd;
}
ioq_init(&s->io_q);
qemu_aio_set_event_notifier(&s->e, qemu_laio_completion_cb);
return s;
@@ -322,16 +200,3 @@ out_free_state:
g_free(s);
return NULL;
}
void laio_cleanup(void *s_)
{
struct qemu_laio_state *s = s_;
event_notifier_cleanup(&s->e);
if (io_destroy(s->ctx) != 0) {
fprintf(stderr, "%s: destroy AIO context %p failed\n",
__func__, &s->ctx);
}
g_free(s);
}

View File

@@ -14,13 +14,11 @@
#include "trace.h"
#include "block/blockjob.h"
#include "block/block_int.h"
#include "qapi/qmp/qerror.h"
#include "qemu/ratelimit.h"
#include "qemu/bitmap.h"
#define SLICE_TIME 100000000ULL /* ns */
#define MAX_IN_FLIGHT 16
#define DEFAULT_MIRROR_BUF_SIZE (10 << 20)
/* The mirroring buffer is a list of granularity-sized chunks.
* Free chunks are organized in a list.
@@ -33,23 +31,14 @@ typedef struct MirrorBlockJob {
BlockJob common;
RateLimit limit;
BlockDriverState *target;
BlockDriverState *base;
/* The name of the graph node to replace */
char *replaces;
/* The BDS to replace */
BlockDriverState *to_replace;
/* Used to block operations on the drive-mirror-replace target */
Error *replace_blocker;
bool is_none_mode;
MirrorSyncMode mode;
BlockdevOnError on_source_error, on_target_error;
bool synced;
bool should_complete;
int64_t sector_num;
int64_t granularity;
size_t buf_size;
int64_t bdev_length;
unsigned long *cow_bitmap;
BdrvDirtyBitmap *dirty_bitmap;
HBitmapIter hbi;
uint8_t *buf;
QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
@@ -57,9 +46,7 @@ typedef struct MirrorBlockJob {
unsigned long *in_flight_bitmap;
int in_flight;
int sectors_in_flight;
int ret;
bool unmap;
} MirrorBlockJob;
typedef struct MirrorOp {
@@ -92,7 +79,6 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors, ret);
s->in_flight--;
s->sectors_in_flight -= op->nb_sectors;
iov = op->qiov.iov;
for (i = 0; i < op->qiov.niov; i++) {
MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base;
@@ -104,14 +90,10 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
chunk_num = op->sector_num / sectors_per_chunk;
nb_chunks = op->nb_sectors / sectors_per_chunk;
bitmap_clear(s->in_flight_bitmap, chunk_num, nb_chunks);
if (ret >= 0) {
if (s->cow_bitmap) {
bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
}
s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
if (s->cow_bitmap && ret >= 0) {
bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
}
qemu_iovec_destroy(&op->qiov);
g_slice_free(MirrorOp, op);
/* Enter coroutine when it is not sleeping. The coroutine sleeps to
@@ -128,11 +110,12 @@ static void mirror_write_complete(void *opaque, int ret)
MirrorOp *op = opaque;
MirrorBlockJob *s = op->s;
if (ret < 0) {
BlockDriverState *source = s->common.bs;
BlockErrorAction action;
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
bdrv_set_dirty(source, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, false, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
if (action == BDRV_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
}
}
@@ -144,11 +127,12 @@ static void mirror_read_complete(void *opaque, int ret)
MirrorOp *op = opaque;
MirrorBlockJob *s = op->s;
if (ret < 0) {
BlockDriverState *source = s->common.bs;
BlockErrorAction action;
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
bdrv_set_dirty(source, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, true, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
if (action == BDRV_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
}
@@ -164,23 +148,21 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
BlockDriverState *source = s->common.bs;
int nb_sectors, sectors_per_chunk, nb_chunks;
int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
uint64_t delay_ns = 0;
uint64_t delay_ns;
MirrorOp *op;
int pnum;
int64_t ret;
s->sector_num = hbitmap_iter_next(&s->hbi);
if (s->sector_num < 0) {
bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
bdrv_dirty_iter_init(source, &s->hbi);
s->sector_num = hbitmap_iter_next(&s->hbi);
trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
trace_mirror_restart_iter(s, bdrv_get_dirty_count(source));
assert(s->sector_num >= 0);
}
hbitmap_next_sector = s->sector_num;
sector_num = s->sector_num;
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
end = s->bdev_length / BDRV_SECTOR_SIZE;
end = s->common.len >> BDRV_SECTOR_BITS;
/* Extend the QEMUIOVector to include all adjacent blocks that will
* be copied in this operation.
@@ -209,7 +191,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
do {
int added_sectors, added_chunks;
if (!bdrv_get_dirty(source, s->dirty_bitmap, next_sector) ||
if (!bdrv_get_dirty(source, next_sector) ||
test_bit(next_chunk, s->in_flight_bitmap)) {
assert(nb_sectors > 0);
break;
@@ -255,6 +237,8 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
next_chunk += added_chunks;
if (!s->synced && s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, added_sectors);
} else {
delay_ns = 0;
}
} while (delay_ns == 0 && next_sector < end);
@@ -271,45 +255,27 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
next_sector = sector_num;
while (nb_chunks-- > 0) {
MirrorBuffer *buf = QSIMPLEQ_FIRST(&s->buf_free);
size_t remaining = (nb_sectors * BDRV_SECTOR_SIZE) - op->qiov.size;
QSIMPLEQ_REMOVE_HEAD(&s->buf_free, next);
s->buf_free_count--;
qemu_iovec_add(&op->qiov, buf, MIN(s->granularity, remaining));
qemu_iovec_add(&op->qiov, buf, s->granularity);
/* Advance the HBitmapIter in parallel, so that we do not examine
* the same sector twice.
*/
if (next_sector > hbitmap_next_sector
&& bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
if (next_sector > hbitmap_next_sector && bdrv_get_dirty(source, next_sector)) {
hbitmap_next_sector = hbitmap_iter_next(&s->hbi);
}
next_sector += sectors_per_chunk;
}
bdrv_reset_dirty_bitmap(s->dirty_bitmap, sector_num, nb_sectors);
bdrv_reset_dirty(source, sector_num, nb_sectors);
/* Copy the dirty cluster. */
s->in_flight++;
s->sectors_in_flight += nb_sectors;
trace_mirror_one_iteration(s, sector_num, nb_sectors);
ret = bdrv_get_block_status_above(source, NULL, sector_num,
nb_sectors, &pnum);
if (ret < 0 || pnum < nb_sectors ||
(ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
mirror_read_complete, op);
} else if (ret & BDRV_BLOCK_ZERO) {
bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
mirror_write_complete, op);
} else {
assert(!(ret & BDRV_BLOCK_DATA));
bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
mirror_write_complete, op);
}
bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
mirror_read_complete, op);
return delay_ns;
}
@@ -337,62 +303,14 @@ static void mirror_drain(MirrorBlockJob *s)
}
}
typedef struct {
int ret;
} MirrorExitData;
static void mirror_exit(BlockJob *job, void *opaque)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
MirrorExitData *data = opaque;
AioContext *replace_aio_context = NULL;
if (s->to_replace) {
replace_aio_context = bdrv_get_aio_context(s->to_replace);
aio_context_acquire(replace_aio_context);
}
if (s->should_complete && data->ret == 0) {
BlockDriverState *to_replace = s->common.bs;
if (s->to_replace) {
to_replace = s->to_replace;
}
if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
}
bdrv_swap(s->target, to_replace);
if (s->common.driver->job_type == BLOCK_JOB_TYPE_COMMIT) {
/* drop the bs loop chain formed by the swap: break the loop then
* trigger the unref from the top one */
BlockDriverState *p = s->base->backing_hd;
bdrv_set_backing_hd(s->base, NULL);
bdrv_unref(p);
}
}
if (s->to_replace) {
bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
error_free(s->replace_blocker);
bdrv_unref(s->to_replace);
}
if (replace_aio_context) {
aio_context_release(replace_aio_context);
}
g_free(s->replaces);
bdrv_unref(s->target);
block_job_completed(&s->common, data->ret);
g_free(data);
}
static void coroutine_fn mirror_run(void *opaque)
{
MirrorBlockJob *s = opaque;
MirrorExitData *data;
BlockDriverState *bs = s->common.bs;
int64_t sector_num, end, length;
int64_t sector_num, end, sectors_per_chunk, length;
uint64_t last_pause_ns;
BlockDriverInfo bdi;
char backing_filename[2]; /* we only need 2 characters because we are only
checking for a NULL string */
char backing_filename[1024];
int ret = 0;
int n;
@@ -400,22 +318,13 @@ static void coroutine_fn mirror_run(void *opaque)
goto immediate_exit;
}
s->bdev_length = bdrv_getlength(bs);
if (s->bdev_length < 0) {
ret = s->bdev_length;
goto immediate_exit;
} else if (s->bdev_length == 0) {
/* Report BLOCK_JOB_READY and wait for complete. */
block_job_event_ready(&s->common);
s->synced = true;
while (!block_job_is_cancelled(&s->common) && !s->should_complete) {
block_job_yield(&s->common);
}
s->common.cancelled = false;
goto immediate_exit;
s->common.len = bdrv_getlength(bs);
if (s->common.len <= 0) {
block_job_completed(&s->common, s->common.len);
return;
}
length = DIV_ROUND_UP(s->bdev_length, s->granularity);
length = (bdrv_getlength(bs) + s->granularity - 1) / s->granularity;
s->in_flight_bitmap = bitmap_new(length);
/* If we have no backing file yet in the destination, we cannot let
@@ -425,45 +334,26 @@ static void coroutine_fn mirror_run(void *opaque)
bdrv_get_backing_filename(s->target, backing_filename,
sizeof(backing_filename));
if (backing_filename[0] && !s->target->backing_hd) {
ret = bdrv_get_info(s->target, &bdi);
if (ret < 0) {
goto immediate_exit;
}
bdrv_get_info(s->target, &bdi);
if (s->granularity < bdi.cluster_size) {
s->buf_size = MAX(s->buf_size, bdi.cluster_size);
s->cow_bitmap = bitmap_new(length);
}
}
end = s->bdev_length / BDRV_SECTOR_SIZE;
s->buf = qemu_try_blockalign(bs, s->buf_size);
if (s->buf == NULL) {
ret = -ENOMEM;
goto immediate_exit;
}
end = s->common.len >> BDRV_SECTOR_BITS;
s->buf = qemu_blockalign(bs, s->buf_size);
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
mirror_free_init(s);
last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
if (!s->is_none_mode) {
if (s->mode != MIRROR_SYNC_MODE_NONE) {
/* First part, loop on the sectors and initialize the dirty bitmap. */
BlockDriverState *base = s->base;
BlockDriverState *base;
base = s->mode == MIRROR_SYNC_MODE_FULL ? NULL : bs->backing_hd;
for (sector_num = 0; sector_num < end; ) {
/* Just to make sure we are not exceeding int limit. */
int nb_sectors = MIN(INT_MAX >> BDRV_SECTOR_BITS,
end - sector_num);
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
if (now - last_pause_ns > SLICE_TIME) {
last_pause_ns = now;
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, 0);
}
if (block_job_is_cancelled(&s->common)) {
goto immediate_exit;
}
ret = bdrv_is_allocated_above(bs, base, sector_num, nb_sectors, &n);
int64_t next = (sector_num | (sectors_per_chunk - 1)) + 1;
ret = bdrv_is_allocated_above(bs, base,
sector_num, next - sector_num, &n);
if (ret < 0) {
goto immediate_exit;
@@ -471,13 +361,16 @@ static void coroutine_fn mirror_run(void *opaque)
assert(n > 0);
if (ret == 1) {
bdrv_set_dirty_bitmap(s->dirty_bitmap, sector_num, n);
bdrv_set_dirty(bs, sector_num, n);
sector_num = next;
} else {
sector_num += n;
}
sector_num += n;
}
}
bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
bdrv_dirty_iter_init(bs, &s->hbi);
last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
for (;;) {
uint64_t delay_ns = 0;
int64_t cnt;
@@ -488,16 +381,10 @@ static void coroutine_fn mirror_run(void *opaque)
goto immediate_exit;
}
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
/* s->common.offset contains the number of bytes already processed so
* far, cnt is the number of dirty sectors remaining and
* s->sectors_in_flight is the number of sectors currently being
* processed; together those are the current total operation length */
s->common.len = s->common.offset +
(cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
cnt = bdrv_get_dirty_count(bs);
/* Note that even when no rate limit is applied we need to yield
* periodically with no pending I/O so that bdrv_drain_all() returns.
* periodically with no pending I/O so that qemu_aio_flush() returns.
* We do so every SLICE_TIME nanoseconds, or when there is an error,
* or when the source is clean, whichever comes first.
*/
@@ -510,6 +397,9 @@ static void coroutine_fn mirror_run(void *opaque)
continue;
} else if (cnt != 0) {
delay_ns = mirror_iteration(s);
if (delay_ns == 0) {
continue;
}
}
}
@@ -518,8 +408,7 @@ static void coroutine_fn mirror_run(void *opaque)
trace_mirror_before_flush(s);
ret = bdrv_flush(s->target);
if (ret < 0) {
if (mirror_error_action(s, false, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
if (mirror_error_action(s, false, -ret) == BDRV_ACTION_REPORT) {
goto immediate_exit;
}
} else {
@@ -528,14 +417,15 @@ static void coroutine_fn mirror_run(void *opaque)
* report completion. This way, block-job-cancel will leave
* the target in a consistent state.
*/
s->common.offset = end * BDRV_SECTOR_SIZE;
if (!s->synced) {
block_job_event_ready(&s->common);
block_job_ready(&s->common);
s->synced = true;
}
should_complete = s->should_complete ||
block_job_is_cancelled(&s->common);
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
cnt = bdrv_get_dirty_count(bs);
}
}
@@ -549,13 +439,15 @@ static void coroutine_fn mirror_run(void *opaque)
* mirror_populate runs.
*/
trace_mirror_before_drain(s, cnt);
bdrv_drain(bs);
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
bdrv_drain_all();
cnt = bdrv_get_dirty_count(bs);
}
ret = 0;
trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
if (!s->synced) {
/* Publish progress */
s->common.offset = (end - cnt) * BDRV_SECTOR_SIZE;
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
@@ -588,12 +480,17 @@ immediate_exit:
qemu_vfree(s->buf);
g_free(s->cow_bitmap);
g_free(s->in_flight_bitmap);
bdrv_release_dirty_bitmap(bs, s->dirty_bitmap);
bdrv_set_dirty_tracking(bs, 0);
bdrv_iostatus_disable(s->target);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&s->common, mirror_exit, data);
if (s->should_complete && ret == 0) {
if (bdrv_get_flags(s->target) != bdrv_get_flags(s->common.bs)) {
bdrv_reopen(s->target, bdrv_get_flags(s->common.bs), NULL);
}
bdrv_swap(s->target, s->common.bs);
}
bdrv_close(s->target);
bdrv_unref(s->target);
block_job_completed(&s->common, ret);
}
static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -601,7 +498,7 @@ static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
if (speed < 0) {
error_setg(errp, QERR_INVALID_PARAMETER, "speed");
error_set(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
@@ -622,38 +519,19 @@ static void mirror_complete(BlockJob *job, Error **errp)
ret = bdrv_open_backing_file(s->target, NULL, &local_err);
if (ret < 0) {
char backing_filename[PATH_MAX];
bdrv_get_full_backing_filename(s->target, backing_filename,
sizeof(backing_filename));
error_propagate(errp, local_err);
return;
}
if (!s->synced) {
error_setg(errp, QERR_BLOCK_JOB_NOT_READY,
bdrv_get_device_name(job->bs));
error_set(errp, QERR_BLOCK_JOB_NOT_READY, job->bs->device_name);
return;
}
/* check the target bs is not blocked and block all operations on it */
if (s->replaces) {
AioContext *replace_aio_context;
s->to_replace = check_to_replace_node(s->replaces, &local_err);
if (!s->to_replace) {
error_propagate(errp, local_err);
return;
}
replace_aio_context = bdrv_get_aio_context(s->to_replace);
aio_context_acquire(replace_aio_context);
error_setg(&s->replace_blocker,
"block device is in use by block-job-complete");
bdrv_op_block_all(s->to_replace, s->replace_blocker);
bdrv_ref(s->to_replace);
aio_context_release(replace_aio_context);
}
s->should_complete = true;
block_job_enter(&s->common);
block_job_resume(job);
}
static const BlockJobDriver mirror_job_driver = {
@@ -664,31 +542,25 @@ static const BlockJobDriver mirror_job_driver = {
.complete = mirror_complete,
};
static const BlockJobDriver commit_active_job_driver = {
.instance_size = sizeof(MirrorBlockJob),
.job_type = BLOCK_JOB_TYPE_COMMIT,
.set_speed = mirror_set_speed,
.iostatus_reset
= mirror_iostatus_reset,
.complete = mirror_complete,
};
static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, uint32_t granularity,
int64_t buf_size,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap,
BlockCompletionFunc *cb,
void *opaque, Error **errp,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base)
void mirror_start(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, int64_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
{
MirrorBlockJob *s;
if (granularity == 0) {
granularity = bdrv_get_default_bitmap_granularity(target);
/* Choose the default granularity based on the target file's cluster
* size, clamped between 4k and 64k. */
BlockDriverInfo bdi;
if (bdrv_get_info(target, &bdi) >= 0 && bdi.cluster_size != 0) {
granularity = MAX(4096, bdi.cluster_size);
granularity = MIN(65536, granularity);
} else {
granularity = 65536;
}
}
assert ((granularity & (granularity - 1)) == 0);
@@ -696,40 +568,23 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
error_setg(errp, QERR_INVALID_PARAMETER, "on-source-error");
error_set(errp, QERR_INVALID_PARAMETER, "on-source-error");
return;
}
if (buf_size < 0) {
error_setg(errp, "Invalid parameter 'buf-size'");
return;
}
if (buf_size == 0) {
buf_size = DEFAULT_MIRROR_BUF_SIZE;
}
s = block_job_create(driver, bs, speed, cb, opaque, errp);
s = block_job_create(&mirror_job_driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
s->target = target;
s->is_none_mode = is_none_mode;
s->base = base;
s->mode = mode;
s->granularity = granularity;
s->buf_size = ROUND_UP(buf_size, granularity);
s->unmap = unmap;
s->buf_size = MAX(buf_size, granularity);
s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
if (!s->dirty_bitmap) {
g_free(s->replaces);
block_job_release(bs);
return;
}
bdrv_set_dirty_tracking(bs, granularity);
bdrv_set_enable_write_cache(s->target, true);
bdrv_set_on_error(s->target, on_target_error, on_target_error);
bdrv_iostatus_enable(s->target);
@@ -737,87 +592,3 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
trace_mirror_start(bs, s, s->common.co, opaque);
qemu_coroutine_enter(s->common.co, s);
}
void mirror_start(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, uint32_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap,
BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
bool is_none_mode;
BlockDriverState *base;
if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
error_setg(errp, "Sync mode 'incremental' not supported");
return;
}
is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
base = mode == MIRROR_SYNC_MODE_TOP ? bs->backing_hd : NULL;
mirror_start_job(bs, target, replaces,
speed, granularity, buf_size,
on_source_error, on_target_error, unmap, cb, opaque, errp,
&mirror_job_driver, is_none_mode, base);
}
void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
int64_t speed,
BlockdevOnError on_error,
BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
int64_t length, base_length;
int orig_base_flags;
int ret;
Error *local_err = NULL;
orig_base_flags = bdrv_get_flags(base);
if (bdrv_reopen(base, bs->open_flags, errp)) {
return;
}
length = bdrv_getlength(bs);
if (length < 0) {
error_setg_errno(errp, -length,
"Unable to determine length of %s", bs->filename);
goto error_restore_flags;
}
base_length = bdrv_getlength(base);
if (base_length < 0) {
error_setg_errno(errp, -base_length,
"Unable to determine length of %s", base->filename);
goto error_restore_flags;
}
if (length > base_length) {
ret = bdrv_truncate(base, length);
if (ret < 0) {
error_setg_errno(errp, -ret,
"Top image %s is larger than base image %s, and "
"resize of base image failed",
bs->filename, base->filename);
goto error_restore_flags;
}
}
bdrv_ref(base);
mirror_start_job(bs, base, NULL, speed, 0, 0,
on_error, on_error, false, cb, opaque, &local_err,
&commit_active_job_driver, false, base);
if (local_err) {
error_propagate(errp, local_err);
goto error_restore_flags;
}
return;
error_restore_flags:
/* ignore error and errp for bdrv_reopen, because we want to propagate
* the original error */
bdrv_reopen(base, orig_base_flags, NULL);
return;
}

View File

@@ -1,407 +0,0 @@
/*
* QEMU Block driver for NBD
*
* Copyright (C) 2008 Bull S.A.S.
* Author: Laurent Vivier <Laurent.Vivier@bull.net>
*
* Some parts:
* Copyright (C) 2007 Anthony Liguori <anthony@codemonkey.ws>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "nbd-client.h"
#include "qemu/sockets.h"
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs))
static void nbd_recv_coroutines_enter_all(NbdClientSession *s)
{
int i;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
}
}
}
static void nbd_teardown_connection(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
/* finish any pending coroutines */
shutdown(client->sock, 2);
nbd_recv_coroutines_enter_all(client);
nbd_client_detach_aio_context(bs);
closesocket(client->sock);
client->sock = -1;
}
static void nbd_reply_ready(void *opaque)
{
BlockDriverState *bs = opaque;
NbdClientSession *s = nbd_get_client_session(bs);
uint64_t i;
int ret;
if (s->reply.handle == 0) {
/* No reply already in flight. Fetch a header. It is possible
* that another thread has done the same thing in parallel, so
* the socket is not readable anymore.
*/
ret = nbd_receive_reply(s->sock, &s->reply);
if (ret == -EAGAIN) {
return;
}
if (ret < 0) {
s->reply.handle = 0;
goto fail;
}
}
/* There's no need for a mutex on the receive side, because the
* handler acts as a synchronization point and ensures that only
* one coroutine is called until the reply finishes. */
i = HANDLE_TO_INDEX(s, s->reply.handle);
if (i >= MAX_NBD_REQUESTS) {
goto fail;
}
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
return;
}
fail:
nbd_teardown_connection(bs);
}
static void nbd_restart_write(void *opaque)
{
BlockDriverState *bs = opaque;
qemu_coroutine_enter(nbd_get_client_session(bs)->send_coroutine, NULL);
}
static int nbd_co_send_request(BlockDriverState *bs,
struct nbd_request *request,
QEMUIOVector *qiov, int offset)
{
NbdClientSession *s = nbd_get_client_session(bs);
AioContext *aio_context;
int rc, ret, i;
qemu_co_mutex_lock(&s->send_mutex);
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i] == NULL) {
s->recv_coroutine[i] = qemu_coroutine_self();
break;
}
}
assert(i < MAX_NBD_REQUESTS);
request->handle = INDEX_TO_HANDLE(s, i);
s->send_coroutine = qemu_coroutine_self();
aio_context = bdrv_get_aio_context(bs);
aio_set_fd_handler(aio_context, s->sock,
nbd_reply_ready, nbd_restart_write, bs);
if (qiov) {
if (!s->is_unix) {
socket_set_cork(s->sock, 1);
}
rc = nbd_send_request(s->sock, request);
if (rc >= 0) {
ret = qemu_co_sendv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
rc = -EIO;
}
}
if (!s->is_unix) {
socket_set_cork(s->sock, 0);
}
} else {
rc = nbd_send_request(s->sock, request);
}
aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, NULL, bs);
s->send_coroutine = NULL;
qemu_co_mutex_unlock(&s->send_mutex);
return rc;
}
static void nbd_co_receive_reply(NbdClientSession *s,
struct nbd_request *request, struct nbd_reply *reply,
QEMUIOVector *qiov, int offset)
{
int ret;
/* Wait until we're woken up by the read handler. TODO: perhaps
* peek at the next reply and avoid yielding if it's ours? */
qemu_coroutine_yield();
*reply = s->reply;
if (reply->handle != request->handle) {
reply->error = EIO;
} else {
if (qiov && reply->error == 0) {
ret = qemu_co_recvv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
reply->error = EIO;
}
}
/* Tell the read handler to read another header. */
s->reply.handle = 0;
}
}
static void nbd_coroutine_start(NbdClientSession *s,
struct nbd_request *request)
{
/* Poor man semaphore. The free_sema is locked when no other request
* can be accepted, and unlocked after receiving one reply. */
if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
qemu_co_mutex_lock(&s->free_sema);
assert(s->in_flight < MAX_NBD_REQUESTS);
}
s->in_flight++;
/* s->recv_coroutine[i] is set as soon as we get the send_lock. */
}
static void nbd_coroutine_end(NbdClientSession *s,
struct nbd_request *request)
{
int i = HANDLE_TO_INDEX(s, request->handle);
s->recv_coroutine[i] = NULL;
if (s->in_flight-- == MAX_NBD_REQUESTS) {
qemu_co_mutex_unlock(&s->free_sema);
}
}
static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_READ };
struct nbd_reply reply;
ssize_t ret;
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, qiov, offset);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_WRITE };
struct nbd_reply reply;
ssize_t ret;
if (!bdrv_enable_write_cache(bs) &&
(client->nbdflags & NBD_FLAG_SEND_FUA)) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, qiov, offset);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL, 0);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
/* qemu-nbd has a limit of slightly less than 1M per request. Try to
* remain aligned to 4K. */
#define NBD_MAX_SECTORS 2040
int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_readv_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_readv_1(bs, sector_num, nb_sectors, qiov, offset);
}
int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_writev_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_writev_1(bs, sector_num, nb_sectors, qiov, offset);
}
int nbd_client_co_flush(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_FLUSH };
struct nbd_reply reply;
ssize_t ret;
if (!(client->nbdflags & NBD_FLAG_SEND_FLUSH)) {
return 0;
}
if (client->nbdflags & NBD_FLAG_SEND_FUA) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = 0;
request.len = 0;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL, 0);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_TRIM };
struct nbd_reply reply;
ssize_t ret;
if (!(client->nbdflags & NBD_FLAG_SEND_TRIM)) {
return 0;
}
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL, 0);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
void nbd_client_detach_aio_context(BlockDriverState *bs)
{
aio_set_fd_handler(bdrv_get_aio_context(bs),
nbd_get_client_session(bs)->sock, NULL, NULL, NULL);
}
void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sock,
nbd_reply_ready, NULL, bs);
}
void nbd_client_close(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = {
.type = NBD_CMD_DISC,
.from = 0,
.len = 0
};
if (client->sock == -1) {
return;
}
nbd_send_request(client->sock, &request);
nbd_teardown_connection(bs);
}
int nbd_client_init(BlockDriverState *bs, int sock, const char *export,
Error **errp)
{
NbdClientSession *client = nbd_get_client_session(bs);
int ret;
/* NBD handshake */
logout("session init %s\n", export);
qemu_set_block(sock);
ret = nbd_receive_negotiate(sock, export,
&client->nbdflags, &client->size, errp);
if (ret < 0) {
logout("Failed to negotiate with the NBD server\n");
closesocket(sock);
return ret;
}
qemu_co_mutex_init(&client->send_mutex);
qemu_co_mutex_init(&client->free_sema);
client->sock = sock;
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
qemu_set_nonblock(sock);
nbd_client_attach_aio_context(bs, bdrv_get_aio_context(bs));
logout("Established connection with NBD server\n");
return 0;
}

View File

@@ -1,53 +0,0 @@
#ifndef NBD_CLIENT_H
#define NBD_CLIENT_H
#include "qemu-common.h"
#include "block/nbd.h"
#include "block/block_int.h"
/* #define DEBUG_NBD */
#if defined(DEBUG_NBD)
#define logout(fmt, ...) \
fprintf(stderr, "nbd\t%-24s" fmt, __func__, ##__VA_ARGS__)
#else
#define logout(fmt, ...) ((void)0)
#endif
#define MAX_NBD_REQUESTS 16
typedef struct NbdClientSession {
int sock;
uint32_t nbdflags;
off_t size;
CoMutex send_mutex;
CoMutex free_sema;
Coroutine *send_coroutine;
int in_flight;
Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
struct nbd_reply reply;
bool is_unix;
} NbdClientSession;
NbdClientSession *nbd_get_client_session(BlockDriverState *bs);
int nbd_client_init(BlockDriverState *bs, int sock, const char *export_name,
Error **errp);
void nbd_client_close(BlockDriverState *bs);
int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors);
int nbd_client_co_flush(BlockDriverState *bs);
int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
void nbd_client_detach_aio_context(BlockDriverState *bs);
void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context);
#endif /* NBD_CLIENT_H */

View File

@@ -26,24 +26,51 @@
* THE SOFTWARE.
*/
#include "block/nbd-client.h"
#include "qemu-common.h"
#include "block/nbd.h"
#include "qemu/uri.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/sockets.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include <sys/types.h>
#include <unistd.h>
#define EN_OPTSTR ":exportname="
/* #define DEBUG_NBD */
#if defined(DEBUG_NBD)
#define logout(fmt, ...) \
fprintf(stderr, "nbd\t%-24s" fmt, __func__, ##__VA_ARGS__)
#else
#define logout(fmt, ...) ((void)0)
#endif
#define MAX_NBD_REQUESTS 16
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs))
typedef struct BDRVNBDState {
NbdClientSession client;
int sock;
uint32_t nbdflags;
off_t size;
size_t blocksize;
CoMutex send_mutex;
CoMutex free_sema;
Coroutine *send_coroutine;
int in_flight;
Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
struct nbd_reply reply;
bool is_unix;
QemuOpts *socket_opts;
char *export_name; /* An NBD server may export several devices */
} BDRVNBDState;
static int nbd_parse_uri(const char *filename, QDict *options)
@@ -177,7 +204,7 @@ static void nbd_parse_filename(const char *filename, QDict *options,
InetSocketAddress *addr = NULL;
addr = inet_parse(host_spec, errp);
if (!addr) {
if (error_is_set(errp)) {
goto out;
}
@@ -190,56 +217,195 @@ out:
g_free(file);
}
static void nbd_config(BDRVNBDState *s, QDict *options, char **export,
Error **errp)
static int nbd_config(BDRVNBDState *s, QDict *options)
{
Error *local_err = NULL;
if (qdict_haskey(options, "path") == qdict_haskey(options, "host")) {
if (qdict_haskey(options, "path")) {
error_setg(errp, "path and host may not be used at the same time.");
} else {
error_setg(errp, "one of path and host must be specified.");
if (qdict_haskey(options, "path")) {
if (qdict_haskey(options, "host")) {
qerror_report(ERROR_CLASS_GENERIC_ERROR, "path and host may not "
"be used at the same time.");
return -EINVAL;
}
return;
s->is_unix = true;
} else if (qdict_haskey(options, "host")) {
s->is_unix = false;
} else {
return -EINVAL;
}
s->client.is_unix = qdict_haskey(options, "path");
s->socket_opts = qemu_opts_create(&socket_optslist, NULL, 0,
&error_abort);
s->socket_opts = qemu_opts_create_nofail(&socket_optslist);
qemu_opts_absorb_qdict(s->socket_opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
return -EINVAL;
}
if (!qemu_opt_get(s->socket_opts, "port")) {
qemu_opt_set_number(s->socket_opts, "port", NBD_DEFAULT_PORT,
&error_abort);
qemu_opt_set_number(s->socket_opts, "port", NBD_DEFAULT_PORT);
}
*export = g_strdup(qdict_get_try_str(options, "export"));
if (*export) {
s->export_name = g_strdup(qdict_get_try_str(options, "export"));
if (s->export_name) {
qdict_del(options, "export");
}
return 0;
}
NbdClientSession *nbd_get_client_session(BlockDriverState *bs)
static void nbd_coroutine_start(BDRVNBDState *s, struct nbd_request *request)
{
BDRVNBDState *s = bs->opaque;
return &s->client;
int i;
/* Poor man semaphore. The free_sema is locked when no other request
* can be accepted, and unlocked after receiving one reply. */
if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
qemu_co_mutex_lock(&s->free_sema);
assert(s->in_flight < MAX_NBD_REQUESTS);
}
s->in_flight++;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i] == NULL) {
s->recv_coroutine[i] = qemu_coroutine_self();
break;
}
}
assert(i < MAX_NBD_REQUESTS);
request->handle = INDEX_TO_HANDLE(s, i);
}
static int nbd_establish_connection(BlockDriverState *bs, Error **errp)
static void nbd_reply_ready(void *opaque)
{
BDRVNBDState *s = opaque;
uint64_t i;
int ret;
if (s->reply.handle == 0) {
/* No reply already in flight. Fetch a header. It is possible
* that another thread has done the same thing in parallel, so
* the socket is not readable anymore.
*/
ret = nbd_receive_reply(s->sock, &s->reply);
if (ret == -EAGAIN) {
return;
}
if (ret < 0) {
s->reply.handle = 0;
goto fail;
}
}
/* There's no need for a mutex on the receive side, because the
* handler acts as a synchronization point and ensures that only
* one coroutine is called until the reply finishes. */
i = HANDLE_TO_INDEX(s, s->reply.handle);
if (i >= MAX_NBD_REQUESTS) {
goto fail;
}
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
return;
}
fail:
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
}
}
}
static void nbd_restart_write(void *opaque)
{
BDRVNBDState *s = opaque;
qemu_coroutine_enter(s->send_coroutine, NULL);
}
static int nbd_co_send_request(BDRVNBDState *s, struct nbd_request *request,
QEMUIOVector *qiov, int offset)
{
int rc, ret;
qemu_co_mutex_lock(&s->send_mutex);
s->send_coroutine = qemu_coroutine_self();
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, nbd_restart_write, s);
if (qiov) {
if (!s->is_unix) {
socket_set_cork(s->sock, 1);
}
rc = nbd_send_request(s->sock, request);
if (rc >= 0) {
ret = qemu_co_sendv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
rc = -EIO;
}
}
if (!s->is_unix) {
socket_set_cork(s->sock, 0);
}
} else {
rc = nbd_send_request(s->sock, request);
}
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL, s);
s->send_coroutine = NULL;
qemu_co_mutex_unlock(&s->send_mutex);
return rc;
}
static void nbd_co_receive_reply(BDRVNBDState *s, struct nbd_request *request,
struct nbd_reply *reply,
QEMUIOVector *qiov, int offset)
{
int ret;
/* Wait until we're woken up by the read handler. TODO: perhaps
* peek at the next reply and avoid yielding if it's ours? */
qemu_coroutine_yield();
*reply = s->reply;
if (reply->handle != request->handle) {
reply->error = EIO;
} else {
if (qiov && reply->error == 0) {
ret = qemu_co_recvv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
reply->error = EIO;
}
}
/* Tell the read handler to read another header. */
s->reply.handle = 0;
}
}
static void nbd_coroutine_end(BDRVNBDState *s, struct nbd_request *request)
{
int i = HANDLE_TO_INDEX(s, request->handle);
s->recv_coroutine[i] = NULL;
if (s->in_flight-- == MAX_NBD_REQUESTS) {
qemu_co_mutex_unlock(&s->free_sema);
}
}
static int nbd_establish_connection(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
int sock;
int ret;
off_t size;
size_t blocksize;
if (s->client.is_unix) {
sock = unix_connect_opts(s->socket_opts, errp, NULL, NULL);
if (s->is_unix) {
sock = unix_socket_outgoing(qemu_opt_get(s->socket_opts, "path"));
} else {
sock = inet_connect_opts(s->socket_opts, errp, NULL, NULL);
sock = tcp_socket_outgoing_opts(s->socket_opts);
if (sock >= 0) {
socket_set_nodelay(sock);
}
@@ -248,194 +414,271 @@ static int nbd_establish_connection(BlockDriverState *bs, Error **errp)
/* Failed to establish connection */
if (sock < 0) {
logout("Failed to establish connection to NBD server\n");
return -EIO;
return -errno;
}
return sock;
/* NBD handshake */
ret = nbd_receive_negotiate(sock, s->export_name, &s->nbdflags, &size,
&blocksize);
if (ret < 0) {
logout("Failed to negotiate with the NBD server\n");
closesocket(sock);
return ret;
}
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
qemu_set_nonblock(sock);
qemu_aio_set_fd_handler(sock, nbd_reply_ready, NULL, s);
s->sock = sock;
s->size = size;
s->blocksize = blocksize;
logout("Established connection with NBD server\n");
return 0;
}
static void nbd_teardown_connection(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
struct nbd_request request;
request.type = NBD_CMD_DISC;
request.from = 0;
request.len = 0;
nbd_send_request(s->sock, &request);
qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL);
closesocket(s->sock);
}
static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVNBDState *s = bs->opaque;
char *export = NULL;
int result, sock;
Error *local_err = NULL;
int result;
qemu_co_mutex_init(&s->send_mutex);
qemu_co_mutex_init(&s->free_sema);
/* Pop the config into our state object. Exit if invalid. */
nbd_config(s, options, &export, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return -EINVAL;
result = nbd_config(s, options);
if (result != 0) {
return result;
}
/* establish TCP connection, return error if it fails
* TODO: Configurable retry-until-timeout behaviour.
*/
sock = nbd_establish_connection(bs, errp);
if (sock < 0) {
g_free(export);
return sock;
}
result = nbd_establish_connection(bs);
/* NBD handshake */
result = nbd_client_init(bs, sock, export, errp);
g_free(export);
return result;
}
static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
BDRVNBDState *s = bs->opaque;
struct nbd_request request;
struct nbd_reply reply;
ssize_t ret;
request.type = NBD_CMD_READ;
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(s, &request);
ret = nbd_co_send_request(s, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(s, &request, &reply, qiov, offset);
}
nbd_coroutine_end(s, &request);
return -reply.error;
}
static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
BDRVNBDState *s = bs->opaque;
struct nbd_request request;
struct nbd_reply reply;
ssize_t ret;
request.type = NBD_CMD_WRITE;
if (!bdrv_enable_write_cache(bs) && (s->nbdflags & NBD_FLAG_SEND_FUA)) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(s, &request);
ret = nbd_co_send_request(s, &request, qiov, offset);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(s, &request, &reply, NULL, 0);
}
nbd_coroutine_end(s, &request);
return -reply.error;
}
/* qemu-nbd has a limit of slightly less than 1M per request. Try to
* remain aligned to 4K. */
#define NBD_MAX_SECTORS 2040
static int nbd_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
return nbd_client_co_readv(bs, sector_num, nb_sectors, qiov);
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_readv_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_readv_1(bs, sector_num, nb_sectors, qiov, offset);
}
static int nbd_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
return nbd_client_co_writev(bs, sector_num, nb_sectors, qiov);
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_writev_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_writev_1(bs, sector_num, nb_sectors, qiov, offset);
}
static int nbd_co_flush(BlockDriverState *bs)
{
return nbd_client_co_flush(bs);
}
BDRVNBDState *s = bs->opaque;
struct nbd_request request;
struct nbd_reply reply;
ssize_t ret;
static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl.max_discard = UINT32_MAX >> BDRV_SECTOR_BITS;
bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS;
if (!(s->nbdflags & NBD_FLAG_SEND_FLUSH)) {
return 0;
}
request.type = NBD_CMD_FLUSH;
if (s->nbdflags & NBD_FLAG_SEND_FUA) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = 0;
request.len = 0;
nbd_coroutine_start(s, &request);
ret = nbd_co_send_request(s, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(s, &request, &reply, NULL, 0);
}
nbd_coroutine_end(s, &request);
return -reply.error;
}
static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
return nbd_client_co_discard(bs, sector_num, nb_sectors);
BDRVNBDState *s = bs->opaque;
struct nbd_request request;
struct nbd_reply reply;
ssize_t ret;
if (!(s->nbdflags & NBD_FLAG_SEND_TRIM)) {
return 0;
}
request.type = NBD_CMD_TRIM;
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(s, &request);
ret = nbd_co_send_request(s, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(s, &request, &reply, NULL, 0);
}
nbd_coroutine_end(s, &request);
return -reply.error;
}
static void nbd_close(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
g_free(s->export_name);
qemu_opts_del(s->socket_opts);
nbd_client_close(bs);
nbd_teardown_connection(bs);
}
static int64_t nbd_getlength(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
return s->client.size;
}
static void nbd_detach_aio_context(BlockDriverState *bs)
{
nbd_client_detach_aio_context(bs);
}
static void nbd_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
nbd_client_attach_aio_context(bs, new_context);
}
static void nbd_refresh_filename(BlockDriverState *bs)
{
QDict *opts = qdict_new();
const char *path = qdict_get_try_str(bs->options, "path");
const char *host = qdict_get_try_str(bs->options, "host");
const char *port = qdict_get_try_str(bs->options, "port");
const char *export = qdict_get_try_str(bs->options, "export");
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));
if (path && export) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd+unix:///%s?socket=%s", export, path);
} else if (path && !export) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd+unix://?socket=%s", path);
} else if (!path && export && port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s:%s/%s", host, port, export);
} else if (!path && export && !port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s/%s", host, export);
} else if (!path && !export && port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s:%s", host, port);
} else if (!path && !export && !port) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"nbd://%s", host);
}
if (path) {
qdict_put_obj(opts, "path", QOBJECT(qstring_from_str(path)));
} else if (port) {
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
qdict_put_obj(opts, "port", QOBJECT(qstring_from_str(port)));
} else {
qdict_put_obj(opts, "host", QOBJECT(qstring_from_str(host)));
}
if (export) {
qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(export)));
}
bs->full_open_options = opts;
return s->size;
}
static BlockDriver bdrv_nbd = {
.format_name = "nbd",
.protocol_name = "nbd",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_refresh_limits = nbd_refresh_limits,
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
.format_name = "nbd",
.protocol_name = "nbd",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
};
static BlockDriver bdrv_nbd_tcp = {
.format_name = "nbd",
.protocol_name = "nbd+tcp",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_refresh_limits = nbd_refresh_limits,
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
.format_name = "nbd",
.protocol_name = "nbd+tcp",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
};
static BlockDriver bdrv_nbd_unix = {
.format_name = "nbd",
.protocol_name = "nbd+unix",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_refresh_limits = nbd_refresh_limits,
.bdrv_getlength = nbd_getlength,
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
.format_name = "nbd",
.protocol_name = "nbd+unix",
.instance_size = sizeof(BDRVNBDState),
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
.bdrv_getlength = nbd_getlength,
};
static void bdrv_nbd_init(void)

View File

@@ -1,516 +0,0 @@
/*
* QEMU Block driver for native access to files on NFS shares
*
* Copyright (c) 2014 Peter Lieven <pl@kamp.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "config-host.h"
#include <poll.h>
#include "qemu-common.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
#include "block/block_int.h"
#include "trace.h"
#include "qemu/iov.h"
#include "qemu/uri.h"
#include "sysemu/sysemu.h"
#include <nfsc/libnfs.h>
#define QEMU_NFS_MAX_READAHEAD_SIZE 1048576
typedef struct NFSClient {
struct nfs_context *context;
struct nfsfh *fh;
int events;
bool has_zero_init;
AioContext *aio_context;
} NFSClient;
typedef struct NFSRPC {
int ret;
int complete;
QEMUIOVector *iov;
struct stat *st;
Coroutine *co;
QEMUBH *bh;
NFSClient *client;
} NFSRPC;
static void nfs_process_read(void *arg);
static void nfs_process_write(void *arg);
static void nfs_set_events(NFSClient *client)
{
int ev = nfs_which_events(client->context);
if (ev != client->events) {
aio_set_fd_handler(client->aio_context,
nfs_get_fd(client->context),
(ev & POLLIN) ? nfs_process_read : NULL,
(ev & POLLOUT) ? nfs_process_write : NULL,
client);
}
client->events = ev;
}
static void nfs_process_read(void *arg)
{
NFSClient *client = arg;
nfs_service(client->context, POLLIN);
nfs_set_events(client);
}
static void nfs_process_write(void *arg)
{
NFSClient *client = arg;
nfs_service(client->context, POLLOUT);
nfs_set_events(client);
}
static void nfs_co_init_task(NFSClient *client, NFSRPC *task)
{
*task = (NFSRPC) {
.co = qemu_coroutine_self(),
.client = client,
};
}
static void nfs_co_generic_bh_cb(void *opaque)
{
NFSRPC *task = opaque;
task->complete = 1;
qemu_bh_delete(task->bh);
qemu_coroutine_enter(task->co, NULL);
}
static void
nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
void *private_data)
{
NFSRPC *task = private_data;
task->ret = ret;
if (task->ret > 0 && task->iov) {
if (task->ret <= task->iov->size) {
qemu_iovec_from_buf(task->iov, 0, data, task->ret);
} else {
task->ret = -EIO;
}
}
if (task->ret == 0 && task->st) {
memcpy(task->st, data, sizeof(struct stat));
}
if (task->ret < 0) {
error_report("NFS Error: %s", nfs_get_error(nfs));
}
if (task->co) {
task->bh = aio_bh_new(task->client->aio_context,
nfs_co_generic_bh_cb, task);
qemu_bh_schedule(task->bh);
} else {
task->complete = 1;
}
}
static int coroutine_fn nfs_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *iov)
{
NFSClient *client = bs->opaque;
NFSRPC task;
nfs_co_init_task(client, &task);
task.iov = iov;
if (nfs_pread_async(client->context, client->fh,
sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE,
nfs_co_generic_cb, &task) != 0) {
return -ENOMEM;
}
while (!task.complete) {
nfs_set_events(client);
qemu_coroutine_yield();
}
if (task.ret < 0) {
return task.ret;
}
/* zero pad short reads */
if (task.ret < iov->size) {
qemu_iovec_memset(iov, task.ret, 0, iov->size - task.ret);
}
return 0;
}
static int coroutine_fn nfs_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *iov)
{
NFSClient *client = bs->opaque;
NFSRPC task;
char *buf = NULL;
nfs_co_init_task(client, &task);
buf = g_try_malloc(nb_sectors * BDRV_SECTOR_SIZE);
if (nb_sectors && buf == NULL) {
return -ENOMEM;
}
qemu_iovec_to_buf(iov, 0, buf, nb_sectors * BDRV_SECTOR_SIZE);
if (nfs_pwrite_async(client->context, client->fh,
sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE,
buf, nfs_co_generic_cb, &task) != 0) {
g_free(buf);
return -ENOMEM;
}
while (!task.complete) {
nfs_set_events(client);
qemu_coroutine_yield();
}
g_free(buf);
if (task.ret != nb_sectors * BDRV_SECTOR_SIZE) {
return task.ret < 0 ? task.ret : -EIO;
}
return 0;
}
static int coroutine_fn nfs_co_flush(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
NFSRPC task;
nfs_co_init_task(client, &task);
if (nfs_fsync_async(client->context, client->fh, nfs_co_generic_cb,
&task) != 0) {
return -ENOMEM;
}
while (!task.complete) {
nfs_set_events(client);
qemu_coroutine_yield();
}
return task.ret;
}
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "nfs",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "URL to the NFS file",
},
{ /* end of list */ }
},
};
static void nfs_detach_aio_context(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
aio_set_fd_handler(client->aio_context,
nfs_get_fd(client->context),
NULL, NULL, NULL);
client->events = 0;
}
static void nfs_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
NFSClient *client = bs->opaque;
client->aio_context = new_context;
nfs_set_events(client);
}
static void nfs_client_close(NFSClient *client)
{
if (client->context) {
if (client->fh) {
nfs_close(client->context, client->fh);
}
aio_set_fd_handler(client->aio_context,
nfs_get_fd(client->context),
NULL, NULL, NULL);
nfs_destroy_context(client->context);
}
memset(client, 0, sizeof(NFSClient));
}
static void nfs_file_close(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
nfs_client_close(client);
}
static int64_t nfs_client_open(NFSClient *client, const char *filename,
int flags, Error **errp)
{
int ret = -EINVAL, i;
struct stat st;
URI *uri;
QueryParams *qp = NULL;
char *file = NULL, *strp = NULL;
uri = uri_parse(filename);
if (!uri) {
error_setg(errp, "Invalid URL specified");
goto fail;
}
if (!uri->server) {
error_setg(errp, "Invalid URL specified");
goto fail;
}
strp = strrchr(uri->path, '/');
if (strp == NULL) {
error_setg(errp, "Invalid URL specified");
goto fail;
}
file = g_strdup(strp);
*strp = 0;
client->context = nfs_init_context();
if (client->context == NULL) {
error_setg(errp, "Failed to init NFS context");
goto fail;
}
qp = query_params_parse(uri->query);
for (i = 0; i < qp->n; i++) {
unsigned long long val;
if (!qp->p[i].value) {
error_setg(errp, "Value for NFS parameter expected: %s",
qp->p[i].name);
goto fail;
}
if (parse_uint_full(qp->p[i].value, &val, 0)) {
error_setg(errp, "Illegal value for NFS parameter: %s",
qp->p[i].name);
goto fail;
}
if (!strcmp(qp->p[i].name, "uid")) {
nfs_set_uid(client->context, val);
} else if (!strcmp(qp->p[i].name, "gid")) {
nfs_set_gid(client->context, val);
} else if (!strcmp(qp->p[i].name, "tcp-syncnt")) {
nfs_set_tcp_syncnt(client->context, val);
#ifdef LIBNFS_FEATURE_READAHEAD
} else if (!strcmp(qp->p[i].name, "readahead")) {
if (val > QEMU_NFS_MAX_READAHEAD_SIZE) {
error_report("NFS Warning: Truncating NFS readahead"
" size to %d", QEMU_NFS_MAX_READAHEAD_SIZE);
val = QEMU_NFS_MAX_READAHEAD_SIZE;
}
nfs_set_readahead(client->context, val);
#endif
} else {
error_setg(errp, "Unknown NFS parameter name: %s",
qp->p[i].name);
goto fail;
}
}
ret = nfs_mount(client->context, uri->server, uri->path);
if (ret < 0) {
error_setg(errp, "Failed to mount nfs share: %s",
nfs_get_error(client->context));
goto fail;
}
if (flags & O_CREAT) {
ret = nfs_creat(client->context, file, 0600, &client->fh);
if (ret < 0) {
error_setg(errp, "Failed to create file: %s",
nfs_get_error(client->context));
goto fail;
}
} else {
ret = nfs_open(client->context, file, flags, &client->fh);
if (ret < 0) {
error_setg(errp, "Failed to open file : %s",
nfs_get_error(client->context));
goto fail;
}
}
ret = nfs_fstat(client->context, client->fh, &st);
if (ret < 0) {
error_setg(errp, "Failed to fstat file: %s",
nfs_get_error(client->context));
goto fail;
}
ret = DIV_ROUND_UP(st.st_size, BDRV_SECTOR_SIZE);
client->has_zero_init = S_ISREG(st.st_mode);
goto out;
fail:
nfs_client_close(client);
out:
if (qp) {
query_params_free(qp);
}
uri_free(uri);
g_free(file);
return ret;
}
static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp) {
NFSClient *client = bs->opaque;
int64_t ret;
QemuOpts *opts;
Error *local_err = NULL;
client->aio_context = bdrv_get_aio_context(bs);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
}
ret = nfs_client_open(client, qemu_opt_get(opts, "filename"),
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
errp);
if (ret < 0) {
goto out;
}
bs->total_sectors = ret;
ret = 0;
out:
qemu_opts_del(opts);
return ret;
}
static QemuOptsList nfs_create_opts = {
.name = "nfs-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(nfs_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{ /* end of list */ }
}
};
static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
{
int ret = 0;
int64_t total_size = 0;
NFSClient *client = g_new0(NFSClient, 1);
client->aio_context = qemu_get_aio_context();
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
ret = nfs_client_open(client, url, O_CREAT, errp);
if (ret < 0) {
goto out;
}
ret = nfs_ftruncate(client->context, client->fh, total_size);
nfs_client_close(client);
out:
g_free(client);
return ret;
}
static int nfs_has_zero_init(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
return client->has_zero_init;
}
static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
NFSRPC task = {0};
struct stat st;
task.st = &st;
if (nfs_fstat_async(client->context, client->fh, nfs_co_generic_cb,
&task) != 0) {
return -ENOMEM;
}
while (!task.complete) {
nfs_set_events(client);
aio_poll(client->aio_context, true);
}
return (task.ret < 0 ? task.ret : st.st_blocks * st.st_blksize);
}
static int nfs_file_truncate(BlockDriverState *bs, int64_t offset)
{
NFSClient *client = bs->opaque;
return nfs_ftruncate(client->context, client->fh, offset);
}
static BlockDriver bdrv_nfs = {
.format_name = "nfs",
.protocol_name = "nfs",
.instance_size = sizeof(NFSClient),
.bdrv_needs_filename = true,
.create_opts = &nfs_create_opts,
.bdrv_has_zero_init = nfs_has_zero_init,
.bdrv_get_allocated_file_size = nfs_get_allocated_file_size,
.bdrv_truncate = nfs_file_truncate,
.bdrv_file_open = nfs_file_open,
.bdrv_close = nfs_file_close,
.bdrv_create = nfs_file_create,
.bdrv_co_readv = nfs_co_readv,
.bdrv_co_writev = nfs_co_writev,
.bdrv_co_flush_to_disk = nfs_co_flush,
.bdrv_detach_aio_context = nfs_detach_aio_context,
.bdrv_attach_aio_context = nfs_attach_aio_context,
};
static void nfs_block_init(void)
{
bdrv_register(&bdrv_nfs);
}
block_init(nfs_block_init);

View File

@@ -1,222 +0,0 @@
/*
* Null block driver
*
* Authors:
* Fam Zheng <famz@redhat.com>
*
* Copyright (C) 2014 Red Hat, Inc.
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "block/block_int.h"
#define NULL_OPT_LATENCY "latency-ns"
typedef struct {
int64_t length;
int64_t latency_ns;
} BDRVNullState;
static QemuOptsList runtime_opts = {
.name = "null",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "",
},
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "size of the null block",
},
{
.name = NULL_OPT_LATENCY,
.type = QEMU_OPT_NUMBER,
.help = "nanoseconds (approximated) to wait "
"before completing request",
},
{ /* end of list */ }
},
};
static int null_file_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
QemuOpts *opts;
BDRVNullState *s = bs->opaque;
int ret = 0;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &error_abort);
s->length =
qemu_opt_get_size(opts, BLOCK_OPT_SIZE, 1 << 30);
s->latency_ns =
qemu_opt_get_number(opts, NULL_OPT_LATENCY, 0);
if (s->latency_ns < 0) {
error_setg(errp, "latency-ns is invalid");
ret = -EINVAL;
}
qemu_opts_del(opts);
return ret;
}
static void null_close(BlockDriverState *bs)
{
}
static int64_t null_getlength(BlockDriverState *bs)
{
BDRVNullState *s = bs->opaque;
return s->length;
}
static coroutine_fn int null_co_common(BlockDriverState *bs)
{
BDRVNullState *s = bs->opaque;
if (s->latency_ns) {
co_aio_sleep_ns(bdrv_get_aio_context(bs), QEMU_CLOCK_REALTIME,
s->latency_ns);
}
return 0;
}
static coroutine_fn int null_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *qiov)
{
return null_co_common(bs);
}
static coroutine_fn int null_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *qiov)
{
return null_co_common(bs);
}
static coroutine_fn int null_co_flush(BlockDriverState *bs)
{
return null_co_common(bs);
}
typedef struct {
BlockAIOCB common;
QEMUBH *bh;
QEMUTimer timer;
} NullAIOCB;
static const AIOCBInfo null_aiocb_info = {
.aiocb_size = sizeof(NullAIOCB),
};
static void null_bh_cb(void *opaque)
{
NullAIOCB *acb = opaque;
acb->common.cb(acb->common.opaque, 0);
qemu_bh_delete(acb->bh);
qemu_aio_unref(acb);
}
static void null_timer_cb(void *opaque)
{
NullAIOCB *acb = opaque;
acb->common.cb(acb->common.opaque, 0);
timer_deinit(&acb->timer);
qemu_aio_unref(acb);
}
static inline BlockAIOCB *null_aio_common(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
{
NullAIOCB *acb;
BDRVNullState *s = bs->opaque;
acb = qemu_aio_get(&null_aiocb_info, bs, cb, opaque);
/* Only emulate latency after vcpu is running. */
if (s->latency_ns) {
aio_timer_init(bdrv_get_aio_context(bs), &acb->timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
null_timer_cb, acb);
timer_mod_ns(&acb->timer,
qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + s->latency_ns);
} else {
acb->bh = aio_bh_new(bdrv_get_aio_context(bs), null_bh_cb, acb);
qemu_bh_schedule(acb->bh);
}
return &acb->common;
}
static BlockAIOCB *null_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
return null_aio_common(bs, cb, opaque);
}
static BlockAIOCB *null_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
return null_aio_common(bs, cb, opaque);
}
static BlockAIOCB *null_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
{
return null_aio_common(bs, cb, opaque);
}
static int null_reopen_prepare(BDRVReopenState *reopen_state,
BlockReopenQueue *queue, Error **errp)
{
return 0;
}
static BlockDriver bdrv_null_co = {
.format_name = "null-co",
.protocol_name = "null-co",
.instance_size = sizeof(BDRVNullState),
.bdrv_file_open = null_file_open,
.bdrv_close = null_close,
.bdrv_getlength = null_getlength,
.bdrv_co_readv = null_co_readv,
.bdrv_co_writev = null_co_writev,
.bdrv_co_flush_to_disk = null_co_flush,
.bdrv_reopen_prepare = null_reopen_prepare,
};
static BlockDriver bdrv_null_aio = {
.format_name = "null-aio",
.protocol_name = "null-aio",
.instance_size = sizeof(BDRVNullState),
.bdrv_file_open = null_file_open,
.bdrv_close = null_close,
.bdrv_getlength = null_getlength,
.bdrv_aio_readv = null_aio_readv,
.bdrv_aio_writev = null_aio_writev,
.bdrv_aio_flush = null_aio_flush,
.bdrv_reopen_prepare = null_reopen_prepare,
};
static void bdrv_null_init(void)
{
bdrv_register(&bdrv_null_co);
bdrv_register(&bdrv_null_aio);
}
block_init(bdrv_null_init);

View File

@@ -2,12 +2,8 @@
* Block driver for Parallels disk image format
*
* Copyright (c) 2007 Alex Beregszaszi
* Copyright (c) 2015 Denis V. Lunev <den@openvz.org>
*
* This code was originally based on comparing different disk images created
* by Parallels. Currently it is based on opened OpenVZ sources
* available at
* http://git.openvz.org/?p=ploop;a=summary
* This code is based on comparing different disk images created by Parallels.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -30,558 +26,70 @@
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/bitmap.h"
#include "qapi/util.h"
/**************************************************************/
#define HEADER_MAGIC "WithoutFreeSpace"
#define HEADER_MAGIC2 "WithouFreSpacExt"
#define HEADER_VERSION 2
#define HEADER_INUSE_MAGIC (0x746F6E59)
#define DEFAULT_CLUSTER_SIZE 1048576 /* 1 MiB */
#define HEADER_SIZE 64
// always little-endian
typedef struct ParallelsHeader {
struct parallels_header {
char magic[16]; // "WithoutFreeSpace"
uint32_t version;
uint32_t heads;
uint32_t cylinders;
uint32_t tracks;
uint32_t bat_entries;
uint64_t nb_sectors;
uint32_t inuse;
uint32_t data_off;
char padding[12];
} QEMU_PACKED ParallelsHeader;
typedef enum ParallelsPreallocMode {
PRL_PREALLOC_MODE_FALLOCATE = 0,
PRL_PREALLOC_MODE_TRUNCATE = 1,
PRL_PREALLOC_MODE_MAX = 2,
} ParallelsPreallocMode;
static const char *prealloc_mode_lookup[] = {
"falloc",
"truncate",
NULL,
};
uint32_t catalog_entries;
uint32_t nb_sectors;
char padding[24];
} QEMU_PACKED;
typedef struct BDRVParallelsState {
/** Locking is conservative, the lock protects
* - image file extending (truncate, fallocate)
* - any access to block allocation table
*/
CoMutex lock;
ParallelsHeader *header;
uint32_t header_size;
bool header_unclean;
unsigned long *bat_dirty_bmap;
unsigned int bat_dirty_block;
uint32_t *bat_bitmap;
unsigned int bat_size;
int64_t data_end;
uint64_t prealloc_size;
ParallelsPreallocMode prealloc_mode;
uint32_t *catalog_bitmap;
unsigned int catalog_size;
unsigned int tracks;
unsigned int off_multiplier;
} BDRVParallelsState;
#define PARALLELS_OPT_PREALLOC_MODE "prealloc-mode"
#define PARALLELS_OPT_PREALLOC_SIZE "prealloc-size"
static QemuOptsList parallels_runtime_opts = {
.name = "parallels",
.head = QTAILQ_HEAD_INITIALIZER(parallels_runtime_opts.head),
.desc = {
{
.name = PARALLELS_OPT_PREALLOC_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Preallocation size on image expansion",
.def_value_str = "128MiB",
},
{
.name = PARALLELS_OPT_PREALLOC_MODE,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode on image expansion "
"(allowed values: falloc, truncate)",
.def_value_str = "falloc",
},
{ /* end of list */ },
},
};
static int64_t bat2sect(BDRVParallelsState *s, uint32_t idx)
static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename)
{
return (uint64_t)le32_to_cpu(s->bat_bitmap[idx]) * s->off_multiplier;
}
const struct parallels_header *ph = (const void *)buf;
static uint32_t bat_entry_off(uint32_t idx)
{
return sizeof(ParallelsHeader) + sizeof(uint32_t) * idx;
}
if (buf_size < HEADER_SIZE)
return 0;
static int64_t seek_to_sector(BDRVParallelsState *s, int64_t sector_num)
{
uint32_t index, offset;
index = sector_num / s->tracks;
offset = sector_num % s->tracks;
/* not allocated */
if ((index >= s->bat_size) || (s->bat_bitmap[index] == 0)) {
return -1;
}
return bat2sect(s, index) + offset;
}
static int cluster_remainder(BDRVParallelsState *s, int64_t sector_num,
int nb_sectors)
{
int ret = s->tracks - sector_num % s->tracks;
return MIN(nb_sectors, ret);
}
static int64_t block_status(BDRVParallelsState *s, int64_t sector_num,
int nb_sectors, int *pnum)
{
int64_t start_off = -2, prev_end_off = -2;
*pnum = 0;
while (nb_sectors > 0 || start_off == -2) {
int64_t offset = seek_to_sector(s, sector_num);
int to_end;
if (start_off == -2) {
start_off = offset;
prev_end_off = offset;
} else if (offset != prev_end_off) {
break;
}
to_end = cluster_remainder(s, sector_num, nb_sectors);
nb_sectors -= to_end;
sector_num += to_end;
*pnum += to_end;
if (offset > 0) {
prev_end_off += to_end;
}
}
return start_off;
}
static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, int *pnum)
{
BDRVParallelsState *s = bs->opaque;
uint32_t idx, to_allocate, i;
int64_t pos, space;
pos = block_status(s, sector_num, nb_sectors, pnum);
if (pos > 0) {
return pos;
}
idx = sector_num / s->tracks;
if (idx >= s->bat_size) {
return -EINVAL;
}
to_allocate = (sector_num + *pnum + s->tracks - 1) / s->tracks - idx;
space = to_allocate * s->tracks;
if (s->data_end + space > bdrv_getlength(bs->file) >> BDRV_SECTOR_BITS) {
int ret;
space += s->prealloc_size;
if (s->prealloc_mode == PRL_PREALLOC_MODE_FALLOCATE) {
ret = bdrv_write_zeroes(bs->file, s->data_end, space, 0);
} else {
ret = bdrv_truncate(bs->file,
(s->data_end + space) << BDRV_SECTOR_BITS);
}
if (ret < 0) {
return ret;
}
}
for (i = 0; i < to_allocate; i++) {
s->bat_bitmap[idx + i] = cpu_to_le32(s->data_end / s->off_multiplier);
s->data_end += s->tracks;
bitmap_set(s->bat_dirty_bmap,
bat_entry_off(idx) / s->bat_dirty_block, 1);
}
return bat2sect(s, idx) + sector_num % s->tracks;
}
static coroutine_fn int parallels_co_flush_to_os(BlockDriverState *bs)
{
BDRVParallelsState *s = bs->opaque;
unsigned long size = DIV_ROUND_UP(s->header_size, s->bat_dirty_block);
unsigned long bit;
qemu_co_mutex_lock(&s->lock);
bit = find_first_bit(s->bat_dirty_bmap, size);
while (bit < size) {
uint32_t off = bit * s->bat_dirty_block;
uint32_t to_write = s->bat_dirty_block;
int ret;
if (off + to_write > s->header_size) {
to_write = s->header_size - off;
}
ret = bdrv_pwrite(bs->file, off, (uint8_t *)s->header + off, to_write);
if (ret < 0) {
qemu_co_mutex_unlock(&s->lock);
return ret;
}
bit = find_next_bit(s->bat_dirty_bmap, size, bit + 1);
}
bitmap_zero(s->bat_dirty_bmap, size);
qemu_co_mutex_unlock(&s->lock);
return 0;
}
static int64_t coroutine_fn parallels_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
{
BDRVParallelsState *s = bs->opaque;
int64_t offset;
qemu_co_mutex_lock(&s->lock);
offset = block_status(s, sector_num, nb_sectors, pnum);
qemu_co_mutex_unlock(&s->lock);
if (offset < 0) {
return 0;
}
return (offset << BDRV_SECTOR_BITS) |
BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
}
static coroutine_fn int parallels_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
BDRVParallelsState *s = bs->opaque;
uint64_t bytes_done = 0;
QEMUIOVector hd_qiov;
int ret = 0;
qemu_iovec_init(&hd_qiov, qiov->niov);
while (nb_sectors > 0) {
int64_t position;
int n, nbytes;
qemu_co_mutex_lock(&s->lock);
position = allocate_clusters(bs, sector_num, nb_sectors, &n);
qemu_co_mutex_unlock(&s->lock);
if (position < 0) {
ret = (int)position;
break;
}
nbytes = n << BDRV_SECTOR_BITS;
qemu_iovec_reset(&hd_qiov);
qemu_iovec_concat(&hd_qiov, qiov, bytes_done, nbytes);
ret = bdrv_co_writev(bs->file, position, n, &hd_qiov);
if (ret < 0) {
break;
}
nb_sectors -= n;
sector_num += n;
bytes_done += nbytes;
}
qemu_iovec_destroy(&hd_qiov);
return ret;
}
static coroutine_fn int parallels_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
BDRVParallelsState *s = bs->opaque;
uint64_t bytes_done = 0;
QEMUIOVector hd_qiov;
int ret = 0;
qemu_iovec_init(&hd_qiov, qiov->niov);
while (nb_sectors > 0) {
int64_t position;
int n, nbytes;
qemu_co_mutex_lock(&s->lock);
position = block_status(s, sector_num, nb_sectors, &n);
qemu_co_mutex_unlock(&s->lock);
nbytes = n << BDRV_SECTOR_BITS;
if (position < 0) {
qemu_iovec_memset(qiov, bytes_done, 0, nbytes);
} else {
qemu_iovec_reset(&hd_qiov);
qemu_iovec_concat(&hd_qiov, qiov, bytes_done, nbytes);
ret = bdrv_co_readv(bs->file, position, n, &hd_qiov);
if (ret < 0) {
break;
}
}
nb_sectors -= n;
sector_num += n;
bytes_done += nbytes;
}
qemu_iovec_destroy(&hd_qiov);
return ret;
}
static int parallels_check(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix)
{
BDRVParallelsState *s = bs->opaque;
int64_t size, prev_off, high_off;
int ret;
uint32_t i;
bool flush_bat = false;
int cluster_size = s->tracks << BDRV_SECTOR_BITS;
size = bdrv_getlength(bs->file);
if (size < 0) {
res->check_errors++;
return size;
}
if (s->header_unclean) {
fprintf(stderr, "%s image was not closed correctly\n",
fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR");
res->corruptions++;
if (fix & BDRV_FIX_ERRORS) {
/* parallels_close will do the job right */
res->corruptions_fixed++;
s->header_unclean = false;
}
}
res->bfi.total_clusters = s->bat_size;
res->bfi.compressed_clusters = 0; /* compression is not supported */
high_off = 0;
prev_off = 0;
for (i = 0; i < s->bat_size; i++) {
int64_t off = bat2sect(s, i) << BDRV_SECTOR_BITS;
if (off == 0) {
prev_off = 0;
continue;
}
/* cluster outside the image */
if (off > size) {
fprintf(stderr, "%s cluster %u is outside image\n",
fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR", i);
res->corruptions++;
if (fix & BDRV_FIX_ERRORS) {
prev_off = 0;
s->bat_bitmap[i] = 0;
res->corruptions_fixed++;
flush_bat = true;
continue;
}
}
res->bfi.allocated_clusters++;
if (off > high_off) {
high_off = off;
}
if (prev_off != 0 && (prev_off + cluster_size) != off) {
res->bfi.fragmented_clusters++;
}
prev_off = off;
}
if (flush_bat) {
ret = bdrv_pwrite_sync(bs->file, 0, s->header, s->header_size);
if (ret < 0) {
res->check_errors++;
return ret;
}
}
res->image_end_offset = high_off + cluster_size;
if (size > res->image_end_offset) {
int64_t count;
count = DIV_ROUND_UP(size - res->image_end_offset, cluster_size);
fprintf(stderr, "%s space leaked at the end of the image %" PRId64 "\n",
fix & BDRV_FIX_LEAKS ? "Repairing" : "ERROR",
size - res->image_end_offset);
res->leaks += count;
if (fix & BDRV_FIX_LEAKS) {
ret = bdrv_truncate(bs->file, res->image_end_offset);
if (ret < 0) {
res->check_errors++;
return ret;
}
res->leaks_fixed += count;
}
}
if (!memcmp(ph->magic, HEADER_MAGIC, 16) &&
(le32_to_cpu(ph->version) == HEADER_VERSION))
return 100;
return 0;
}
static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
{
int64_t total_size, cl_size;
uint8_t tmp[BDRV_SECTOR_SIZE];
Error *local_err = NULL;
BlockDriverState *file;
uint32_t bat_entries, bat_sectors;
ParallelsHeader header;
int ret;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
cl_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE), BDRV_SECTOR_SIZE);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
}
file = NULL;
ret = bdrv_open(&file, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
}
ret = bdrv_truncate(file, 0);
if (ret < 0) {
goto exit;
}
bat_entries = DIV_ROUND_UP(total_size, cl_size);
bat_sectors = DIV_ROUND_UP(bat_entry_off(bat_entries), cl_size);
bat_sectors = (bat_sectors * cl_size) >> BDRV_SECTOR_BITS;
memset(&header, 0, sizeof(header));
memcpy(header.magic, HEADER_MAGIC2, sizeof(header.magic));
header.version = cpu_to_le32(HEADER_VERSION);
/* don't care much about geometry, it is not used on image level */
header.heads = cpu_to_le32(16);
header.cylinders = cpu_to_le32(total_size / BDRV_SECTOR_SIZE / 16 / 32);
header.tracks = cpu_to_le32(cl_size >> BDRV_SECTOR_BITS);
header.bat_entries = cpu_to_le32(bat_entries);
header.nb_sectors = cpu_to_le64(DIV_ROUND_UP(total_size, BDRV_SECTOR_SIZE));
header.data_off = cpu_to_le32(bat_sectors);
/* write all the data */
memset(tmp, 0, sizeof(tmp));
memcpy(tmp, &header, sizeof(header));
ret = bdrv_pwrite(file, 0, tmp, BDRV_SECTOR_SIZE);
if (ret < 0) {
goto exit;
}
ret = bdrv_write_zeroes(file, 1, bat_sectors - 1, 0);
if (ret < 0) {
goto exit;
}
ret = 0;
done:
bdrv_unref(file);
return ret;
exit:
error_setg_errno(errp, -ret, "Failed to create Parallels image");
goto done;
}
static int parallels_probe(const uint8_t *buf, int buf_size,
const char *filename)
{
const ParallelsHeader *ph = (const void *)buf;
if (buf_size < sizeof(ParallelsHeader)) {
return 0;
}
if ((!memcmp(ph->magic, HEADER_MAGIC, 16) ||
!memcmp(ph->magic, HEADER_MAGIC2, 16)) &&
(le32_to_cpu(ph->version) == HEADER_VERSION)) {
return 100;
}
return 0;
}
static int parallels_update_header(BlockDriverState *bs)
{
BDRVParallelsState *s = bs->opaque;
unsigned size = MAX(bdrv_opt_mem_align(bs->file), sizeof(ParallelsHeader));
if (size > s->header_size) {
size = s->header_size;
}
return bdrv_pwrite_sync(bs->file, 0, s->header, size);
}
static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVParallelsState *s = bs->opaque;
ParallelsHeader ph;
int ret, size, i;
QemuOpts *opts = NULL;
Error *local_err = NULL;
char *buf;
int i;
struct parallels_header ph;
int ret;
bs->read_only = 1; // no write support yet
ret = bdrv_pread(bs->file, 0, &ph, sizeof(ph));
if (ret < 0) {
goto fail;
}
bs->total_sectors = le64_to_cpu(ph.nb_sectors);
if (memcmp(ph.magic, HEADER_MAGIC, 16) ||
(le32_to_cpu(ph.version) != HEADER_VERSION)) {
ret = -EMEDIUMTYPE;
goto fail;
}
if (le32_to_cpu(ph.version) != HEADER_VERSION) {
goto fail_format;
}
if (!memcmp(ph.magic, HEADER_MAGIC, 16)) {
s->off_multiplier = 1;
bs->total_sectors = 0xffffffff & bs->total_sectors;
} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) {
s->off_multiplier = le32_to_cpu(ph.tracks);
} else {
goto fail_format;
}
bs->total_sectors = le32_to_cpu(ph.nb_sectors);
s->tracks = le32_to_cpu(ph.tracks);
if (s->tracks == 0) {
@@ -589,165 +97,87 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
ret = -EINVAL;
goto fail;
}
if (s->tracks > INT32_MAX/513) {
error_setg(errp, "Invalid image: Too big cluster");
ret = -EFBIG;
goto fail;
}
s->bat_size = le32_to_cpu(ph.bat_entries);
if (s->bat_size > INT_MAX / sizeof(uint32_t)) {
s->catalog_size = le32_to_cpu(ph.catalog_entries);
if (s->catalog_size > INT_MAX / 4) {
error_setg(errp, "Catalog too large");
ret = -EFBIG;
goto fail;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
size = bat_entry_off(s->bat_size);
s->header_size = ROUND_UP(size, bdrv_opt_mem_align(bs->file));
s->header = qemu_try_blockalign(bs->file, s->header_size);
if (s->header == NULL) {
ret = -ENOMEM;
goto fail;
}
s->data_end = le32_to_cpu(ph.data_off);
if (s->data_end == 0) {
s->data_end = ROUND_UP(bat_entry_off(s->bat_size), BDRV_SECTOR_SIZE);
}
if (s->data_end < s->header_size) {
/* there is not enough unused space to fit to block align between BAT
and actual data. We can't avoid read-modify-write... */
s->header_size = size;
}
ret = bdrv_pread(bs->file, 0, s->header, s->header_size);
ret = bdrv_pread(bs->file, 64, s->catalog_bitmap, s->catalog_size * 4);
if (ret < 0) {
goto fail;
}
s->bat_bitmap = (uint32_t *)(s->header + 1);
for (i = 0; i < s->bat_size; i++) {
int64_t off = bat2sect(s, i);
if (off >= s->data_end) {
s->data_end = off + s->tracks;
}
}
if (le32_to_cpu(ph.inuse) == HEADER_INUSE_MAGIC) {
/* Image was not closed correctly. The check is mandatory */
s->header_unclean = true;
if ((flags & BDRV_O_RDWR) && !(flags & BDRV_O_CHECK)) {
error_setg(errp, "parallels: Image was not closed correctly; "
"cannot be opened read/write");
ret = -EACCES;
goto fail;
}
}
opts = qemu_opts_create(&parallels_runtime_opts, NULL, 0, &local_err);
if (local_err != NULL) {
goto fail_options;
}
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err != NULL) {
goto fail_options;
}
s->prealloc_size =
qemu_opt_get_size_del(opts, PARALLELS_OPT_PREALLOC_SIZE, 0);
s->prealloc_size = MAX(s->tracks, s->prealloc_size >> BDRV_SECTOR_BITS);
buf = qemu_opt_get_del(opts, PARALLELS_OPT_PREALLOC_MODE);
s->prealloc_mode = qapi_enum_parse(prealloc_mode_lookup, buf,
PRL_PREALLOC_MODE_MAX, PRL_PREALLOC_MODE_FALLOCATE, &local_err);
g_free(buf);
if (local_err != NULL) {
goto fail_options;
}
if (!bdrv_has_zero_init(bs->file) ||
bdrv_truncate(bs->file, bdrv_getlength(bs->file)) != 0) {
s->prealloc_mode = PRL_PREALLOC_MODE_FALLOCATE;
}
if (flags & BDRV_O_RDWR) {
s->header->inuse = cpu_to_le32(HEADER_INUSE_MAGIC);
ret = parallels_update_header(bs);
if (ret < 0) {
goto fail;
}
}
s->bat_dirty_block = 4 * getpagesize();
s->bat_dirty_bmap =
bitmap_new(DIV_ROUND_UP(s->header_size, s->bat_dirty_block));
for (i = 0; i < s->catalog_size; i++)
le32_to_cpus(&s->catalog_bitmap[i]);
qemu_co_mutex_init(&s->lock);
return 0;
fail_format:
error_setg(errp, "Image not in Parallels format");
ret = -EINVAL;
fail:
qemu_vfree(s->header);
g_free(s->catalog_bitmap);
return ret;
fail_options:
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
{
BDRVParallelsState *s = bs->opaque;
uint32_t index, offset;
index = sector_num / s->tracks;
offset = sector_num % s->tracks;
/* not allocated */
if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0))
return -1;
return (uint64_t)(s->catalog_bitmap[index] + offset) * 512;
}
static int parallels_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
while (nb_sectors > 0) {
int64_t position = seek_to_sector(bs, sector_num);
if (position >= 0) {
if (bdrv_pread(bs->file, position, buf, 512) != 512)
return -1;
} else {
memset(buf, 0, 512);
}
nb_sectors--;
sector_num++;
buf += 512;
}
return 0;
}
static coroutine_fn int parallels_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVParallelsState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = parallels_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static void parallels_close(BlockDriverState *bs)
{
BDRVParallelsState *s = bs->opaque;
if (bs->open_flags & BDRV_O_RDWR) {
s->header->inuse = 0;
parallels_update_header(bs);
}
if (bs->open_flags & BDRV_O_RDWR) {
bdrv_truncate(bs->file, s->data_end << BDRV_SECTOR_BITS);
}
g_free(s->bat_dirty_bmap);
qemu_vfree(s->header);
g_free(s->catalog_bitmap);
}
static QemuOptsList parallels_create_opts = {
.name = "parallels-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(parallels_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size",
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Parallels image cluster size",
.def_value_str = stringify(DEFAULT_CLUSTER_SIZE),
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_parallels = {
.format_name = "parallels",
.instance_size = sizeof(BDRVParallelsState),
.bdrv_probe = parallels_probe,
.bdrv_open = parallels_open,
.bdrv_read = parallels_co_read,
.bdrv_close = parallels_close,
.bdrv_co_get_block_status = parallels_co_get_block_status,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_flush_to_os = parallels_co_flush_to_os,
.bdrv_co_readv = parallels_co_readv,
.bdrv_co_writev = parallels_co_writev,
.bdrv_create = parallels_create,
.bdrv_check = parallels_check,
.create_opts = &parallels_create_opts,
};
static void bdrv_parallels_init(void)

View File

@@ -24,103 +24,10 @@
#include "block/qapi.h"
#include "block/block_int.h"
#include "block/throttle-groups.h"
#include "block/write-threshold.h"
#include "qmp-commands.h"
#include "qapi-visit.h"
#include "qapi/qmp-output-visitor.h"
#include "qapi/qmp/types.h"
#include "sysemu/block-backend.h"
BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs, Error **errp)
{
ImageInfo **p_image_info;
BlockDriverState *bs0;
BlockDeviceInfo *info = g_malloc0(sizeof(*info));
info->file = g_strdup(bs->filename);
info->ro = bs->read_only;
info->drv = g_strdup(bs->drv->format_name);
info->encrypted = bs->encrypted;
info->encryption_key_missing = bdrv_key_required(bs);
info->cache = g_new(BlockdevCacheInfo, 1);
*info->cache = (BlockdevCacheInfo) {
.writeback = bdrv_enable_write_cache(bs),
.direct = !!(bs->open_flags & BDRV_O_NOCACHE),
.no_flush = !!(bs->open_flags & BDRV_O_NO_FLUSH),
};
if (bs->node_name[0]) {
info->has_node_name = true;
info->node_name = g_strdup(bs->node_name);
}
if (bs->backing_file[0]) {
info->has_backing_file = true;
info->backing_file = g_strdup(bs->backing_file);
}
info->backing_file_depth = bdrv_get_backing_file_depth(bs);
info->detect_zeroes = bs->detect_zeroes;
if (bs->io_limits_enabled) {
ThrottleConfig cfg;
throttle_group_get_config(bs, &cfg);
info->bps = cfg.buckets[THROTTLE_BPS_TOTAL].avg;
info->bps_rd = cfg.buckets[THROTTLE_BPS_READ].avg;
info->bps_wr = cfg.buckets[THROTTLE_BPS_WRITE].avg;
info->iops = cfg.buckets[THROTTLE_OPS_TOTAL].avg;
info->iops_rd = cfg.buckets[THROTTLE_OPS_READ].avg;
info->iops_wr = cfg.buckets[THROTTLE_OPS_WRITE].avg;
info->has_bps_max = cfg.buckets[THROTTLE_BPS_TOTAL].max;
info->bps_max = cfg.buckets[THROTTLE_BPS_TOTAL].max;
info->has_bps_rd_max = cfg.buckets[THROTTLE_BPS_READ].max;
info->bps_rd_max = cfg.buckets[THROTTLE_BPS_READ].max;
info->has_bps_wr_max = cfg.buckets[THROTTLE_BPS_WRITE].max;
info->bps_wr_max = cfg.buckets[THROTTLE_BPS_WRITE].max;
info->has_iops_max = cfg.buckets[THROTTLE_OPS_TOTAL].max;
info->iops_max = cfg.buckets[THROTTLE_OPS_TOTAL].max;
info->has_iops_rd_max = cfg.buckets[THROTTLE_OPS_READ].max;
info->iops_rd_max = cfg.buckets[THROTTLE_OPS_READ].max;
info->has_iops_wr_max = cfg.buckets[THROTTLE_OPS_WRITE].max;
info->iops_wr_max = cfg.buckets[THROTTLE_OPS_WRITE].max;
info->has_iops_size = cfg.op_size;
info->iops_size = cfg.op_size;
info->has_group = true;
info->group = g_strdup(throttle_group_get_name(bs));
}
info->write_threshold = bdrv_write_threshold_get(bs);
bs0 = bs;
p_image_info = &info->image;
while (1) {
Error *local_err = NULL;
bdrv_query_image_info(bs0, p_image_info, &local_err);
if (local_err) {
error_propagate(errp, local_err);
qapi_free_BlockDeviceInfo(info);
return NULL;
}
if (bs0->drv && bs0->backing_hd) {
bs0 = bs0->backing_hd;
(*p_image_info)->has_backing_image = true;
p_image_info = &((*p_image_info)->backing_image);
} else {
break;
}
}
return info;
}
/*
* Returns 0 on success, with *p_list either set to describe snapshot
@@ -203,24 +110,19 @@ void bdrv_query_image_info(BlockDriverState *bs,
ImageInfo **p_info,
Error **errp)
{
int64_t size;
uint64_t total_sectors;
const char *backing_filename;
char backing_filename2[1024];
BlockDriverInfo bdi;
int ret;
Error *err = NULL;
ImageInfo *info;
ImageInfo *info = g_new0(ImageInfo, 1);
size = bdrv_getlength(bs);
if (size < 0) {
error_setg_errno(errp, -size, "Can't get size of device '%s'",
bdrv_get_device_name(bs));
return;
}
bdrv_get_geometry(bs, &total_sectors);
info = g_new0(ImageInfo, 1);
info->filename = g_strdup(bs->filename);
info->format = g_strdup(bdrv_get_format_name(bs));
info->virtual_size = size;
info->virtual_size = total_sectors * 512;
info->actual_size = bdrv_get_allocated_file_size(bs);
info->has_actual_size = info->actual_size >= 0;
if (bdrv_is_encrypted(bs)) {
@@ -240,16 +142,10 @@ void bdrv_query_image_info(BlockDriverState *bs,
backing_filename = bs->backing_file;
if (backing_filename[0] != '\0') {
char *backing_filename2 = g_malloc0(PATH_MAX);
info->backing_filename = g_strdup(backing_filename);
info->has_backing_filename = true;
bdrv_get_full_backing_filename(bs, backing_filename2, PATH_MAX, &err);
if (err) {
error_propagate(errp, err);
qapi_free_ImageInfo(info);
g_free(backing_filename2);
return;
}
bdrv_get_full_backing_filename(bs, backing_filename2,
sizeof(backing_filename2));
if (strcmp(backing_filename, backing_filename2) != 0) {
info->full_backing_filename =
@@ -261,7 +157,6 @@ void bdrv_query_image_info(BlockDriverState *bs,
info->backing_filename_format = g_strdup(bs->backing_format);
info->has_backing_filename_format = true;
}
g_free(backing_filename2);
}
ret = bdrv_query_snapshot_info_list(bs, &info->snapshots, &err);
@@ -286,19 +181,22 @@ void bdrv_query_image_info(BlockDriverState *bs,
}
/* @p_info will be set only on success. */
static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
Error **errp)
void bdrv_query_info(BlockDriverState *bs,
BlockInfo **p_info,
Error **errp)
{
BlockInfo *info = g_malloc0(sizeof(*info));
BlockDriverState *bs = blk_bs(blk);
info->device = g_strdup(blk_name(blk));
BlockDriverState *bs0;
ImageInfo **p_image_info;
Error *local_err = NULL;
info->device = g_strdup(bs->device_name);
info->type = g_strdup("unknown");
info->locked = blk_dev_is_medium_locked(blk);
info->removable = blk_dev_has_removable_media(blk);
info->locked = bdrv_dev_is_medium_locked(bs);
info->removable = bdrv_dev_has_removable_media(bs);
if (blk_dev_has_removable_media(blk)) {
if (bdrv_dev_has_removable_media(bs)) {
info->has_tray_open = true;
info->tray_open = blk_dev_is_tray_open(blk);
info->tray_open = bdrv_dev_is_tray_open(bs);
}
if (bdrv_iostatus_is_enabled(bs)) {
@@ -306,16 +204,86 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
info->io_status = bs->iostatus;
}
if (!QLIST_EMPTY(&bs->dirty_bitmaps)) {
info->has_dirty_bitmaps = true;
info->dirty_bitmaps = bdrv_query_dirty_bitmaps(bs);
if (bs->dirty_bitmap) {
info->has_dirty = true;
info->dirty = g_malloc0(sizeof(*info->dirty));
info->dirty->count = bdrv_get_dirty_count(bs) * BDRV_SECTOR_SIZE;
info->dirty->granularity =
((int64_t) BDRV_SECTOR_SIZE << hbitmap_granularity(bs->dirty_bitmap));
}
if (bs->drv) {
info->has_inserted = true;
info->inserted = bdrv_block_device_info(bs, errp);
if (info->inserted == NULL) {
goto err;
info->inserted = g_malloc0(sizeof(*info->inserted));
info->inserted->file = g_strdup(bs->filename);
info->inserted->ro = bs->read_only;
info->inserted->drv = g_strdup(bs->drv->format_name);
info->inserted->encrypted = bs->encrypted;
info->inserted->encryption_key_missing = bdrv_key_required(bs);
if (bs->backing_file[0]) {
info->inserted->has_backing_file = true;
info->inserted->backing_file = g_strdup(bs->backing_file);
}
info->inserted->backing_file_depth = bdrv_get_backing_file_depth(bs);
if (bs->io_limits_enabled) {
ThrottleConfig cfg;
throttle_get_config(&bs->throttle_state, &cfg);
info->inserted->bps = cfg.buckets[THROTTLE_BPS_TOTAL].avg;
info->inserted->bps_rd = cfg.buckets[THROTTLE_BPS_READ].avg;
info->inserted->bps_wr = cfg.buckets[THROTTLE_BPS_WRITE].avg;
info->inserted->iops = cfg.buckets[THROTTLE_OPS_TOTAL].avg;
info->inserted->iops_rd = cfg.buckets[THROTTLE_OPS_READ].avg;
info->inserted->iops_wr = cfg.buckets[THROTTLE_OPS_WRITE].avg;
info->inserted->has_bps_max =
cfg.buckets[THROTTLE_BPS_TOTAL].max;
info->inserted->bps_max =
cfg.buckets[THROTTLE_BPS_TOTAL].max;
info->inserted->has_bps_rd_max =
cfg.buckets[THROTTLE_BPS_READ].max;
info->inserted->bps_rd_max =
cfg.buckets[THROTTLE_BPS_READ].max;
info->inserted->has_bps_wr_max =
cfg.buckets[THROTTLE_BPS_WRITE].max;
info->inserted->bps_wr_max =
cfg.buckets[THROTTLE_BPS_WRITE].max;
info->inserted->has_iops_max =
cfg.buckets[THROTTLE_OPS_TOTAL].max;
info->inserted->iops_max =
cfg.buckets[THROTTLE_OPS_TOTAL].max;
info->inserted->has_iops_rd_max =
cfg.buckets[THROTTLE_OPS_READ].max;
info->inserted->iops_rd_max =
cfg.buckets[THROTTLE_OPS_READ].max;
info->inserted->has_iops_wr_max =
cfg.buckets[THROTTLE_OPS_WRITE].max;
info->inserted->iops_wr_max =
cfg.buckets[THROTTLE_OPS_WRITE].max;
info->inserted->has_iops_size = cfg.op_size;
info->inserted->iops_size = cfg.op_size;
}
bs0 = bs;
p_image_info = &info->inserted->image;
while (1) {
bdrv_query_image_info(bs0, p_image_info, &local_err);
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
goto err;
}
if (bs0->drv && bs0->backing_hd) {
bs0 = bs0->backing_hd;
(*p_image_info)->has_backing_image = true;
p_image_info = &((*p_image_info)->backing_image);
} else {
break;
}
}
}
@@ -326,45 +294,31 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
qapi_free_BlockInfo(info);
}
static BlockStats *bdrv_query_stats(const BlockDriverState *bs,
bool query_backing)
BlockStats *bdrv_query_stats(const BlockDriverState *bs)
{
BlockStats *s;
s = g_malloc0(sizeof(*s));
if (bdrv_get_device_name(bs)[0]) {
if (bs->device_name[0]) {
s->has_device = true;
s->device = g_strdup(bdrv_get_device_name(bs));
}
if (bdrv_get_node_name(bs)[0]) {
s->has_node_name = true;
s->node_name = g_strdup(bdrv_get_node_name(bs));
s->device = g_strdup(bs->device_name);
}
s->stats = g_malloc0(sizeof(*s->stats));
s->stats->rd_bytes = bs->stats.nr_bytes[BLOCK_ACCT_READ];
s->stats->wr_bytes = bs->stats.nr_bytes[BLOCK_ACCT_WRITE];
s->stats->rd_operations = bs->stats.nr_ops[BLOCK_ACCT_READ];
s->stats->wr_operations = bs->stats.nr_ops[BLOCK_ACCT_WRITE];
s->stats->rd_merged = bs->stats.merged[BLOCK_ACCT_READ];
s->stats->wr_merged = bs->stats.merged[BLOCK_ACCT_WRITE];
s->stats->wr_highest_offset =
bs->stats.wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->stats.nr_ops[BLOCK_ACCT_FLUSH];
s->stats->wr_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_WRITE];
s->stats->rd_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_READ];
s->stats->flush_total_time_ns = bs->stats.total_time_ns[BLOCK_ACCT_FLUSH];
s->stats->rd_bytes = bs->nr_bytes[BDRV_ACCT_READ];
s->stats->wr_bytes = bs->nr_bytes[BDRV_ACCT_WRITE];
s->stats->rd_operations = bs->nr_ops[BDRV_ACCT_READ];
s->stats->wr_operations = bs->nr_ops[BDRV_ACCT_WRITE];
s->stats->wr_highest_offset = bs->wr_highest_sector * BDRV_SECTOR_SIZE;
s->stats->flush_operations = bs->nr_ops[BDRV_ACCT_FLUSH];
s->stats->wr_total_time_ns = bs->total_time_ns[BDRV_ACCT_WRITE];
s->stats->rd_total_time_ns = bs->total_time_ns[BDRV_ACCT_READ];
s->stats->flush_total_time_ns = bs->total_time_ns[BDRV_ACCT_FLUSH];
if (bs->file) {
s->has_parent = true;
s->parent = bdrv_query_stats(bs->file, query_backing);
}
if (query_backing && bs->backing_hd) {
s->has_backing = true;
s->backing = bdrv_query_stats(bs->backing_hd, query_backing);
s->parent = bdrv_query_stats(bs->file);
}
return s;
@@ -373,13 +327,13 @@ static BlockStats *bdrv_query_stats(const BlockDriverState *bs,
BlockInfoList *qmp_query_block(Error **errp)
{
BlockInfoList *head = NULL, **p_next = &head;
BlockBackend *blk;
BlockDriverState *bs = NULL;
Error *local_err = NULL;
for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
while ((bs = bdrv_next(bs))) {
BlockInfoList *info = g_malloc0(sizeof(*info));
bdrv_query_info(blk, &info->value, &local_err);
if (local_err) {
bdrv_query_info(bs, &info->value, &local_err);
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
goto err;
}
@@ -395,23 +349,14 @@ BlockInfoList *qmp_query_block(Error **errp)
return NULL;
}
BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
bool query_nodes,
Error **errp)
BlockStatsList *qmp_query_blockstats(Error **errp)
{
BlockStatsList *head = NULL, **p_next = &head;
BlockDriverState *bs = NULL;
/* Just to be safe if query_nodes is not always initialized */
query_nodes = has_query_nodes && query_nodes;
while ((bs = query_nodes ? bdrv_next_node(bs) : bdrv_next(bs))) {
while ((bs = bdrv_next(bs))) {
BlockStatsList *info = g_malloc0(sizeof(*info));
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
info->value = bdrv_query_stats(bs, !query_nodes);
aio_context_release(ctx);
info->value = bdrv_query_stats(bs);
*p_next = info;
p_next = &info->next;
@@ -424,7 +369,7 @@ BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
static char *get_human_readable_size(char *buf, int buf_size, int64_t size)
{
static const char suffixes[NB_SUFFIXES] = {'K', 'M', 'G', 'T'};
static const char suffixes[NB_SUFFIXES] = "KMGT";
int64_t base;
int i;
@@ -520,9 +465,18 @@ static void dump_qobject(fprintf_function func_fprintf, void *f,
}
case QTYPE_QBOOL: {
QBool *value = qobject_to_qbool(obj);
func_fprintf(f, "%s", qbool_get_bool(value) ? "true" : "false");
func_fprintf(f, "%s", qbool_get_int(value) ? "true" : "false");
break;
}
case QTYPE_QERROR: {
QString *value = qerror_human((QError *)obj);
func_fprintf(f, "%s", qstring_get_str(value));
QDECREF(value);
break;
}
case QTYPE_NONE:
break;
case QTYPE_MAX:
default:
abort();
}
@@ -576,11 +530,12 @@ static void dump_qdict(fprintf_function func_fprintf, void *f, int indentation,
void bdrv_image_info_specific_dump(fprintf_function func_fprintf, void *f,
ImageInfoSpecific *info_spec)
{
Error *local_err = NULL;
QmpOutputVisitor *ov = qmp_output_visitor_new();
QObject *obj, *data;
visit_type_ImageInfoSpecific(qmp_output_get_visitor(ov), &info_spec, NULL,
&error_abort);
&local_err);
obj = qmp_output_get_qobject(ov);
assert(qobject_type(obj) == QTYPE_QDICT);
data = qdict_get(qobject_to_qdict(obj), "data");

View File

@@ -25,8 +25,7 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include <zlib.h>
#include "qapi/qmp/qerror.h"
#include "crypto/cipher.h"
#include "qemu/aes.h"
#include "migration/migration.h"
/**************************************************************/
@@ -72,8 +71,10 @@ typedef struct BDRVQcowState {
uint8_t *cluster_cache;
uint8_t *cluster_data;
uint64_t cluster_cache_offset;
QCryptoCipher *cipher; /* NULL if no key yet */
uint32_t crypt_method; /* current crypt method, 0 if no key yet */
uint32_t crypt_method_header;
AES_KEY aes_encrypt_key;
AES_KEY aes_decrypt_key;
CoMutex lock;
Error *migration_blocker;
} BDRVQcowState;
@@ -114,16 +115,14 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
be64_to_cpus(&header.l1_table_offset);
if (header.magic != QCOW_MAGIC) {
error_setg(errp, "Image not in qcow format");
ret = -EINVAL;
ret = -EMEDIUMTYPE;
goto fail;
}
if (header.version != QCOW_VERSION) {
char version[64];
snprintf(version, sizeof(version), "QCOW version %" PRIu32,
header.version);
error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bdrv_get_device_or_node_name(bs), "qcow", version);
snprintf(version, sizeof(version), "QCOW version %d", header.version);
qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "qcow", version);
ret = -ENOTSUP;
goto fail;
}
@@ -148,12 +147,6 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
}
if (header.crypt_method > QCOW_CRYPT_AES) {
error_setg(errp, "invalid encryption method in qcow header");
ret = -EINVAL;
goto fail;
}
if (!qcrypto_cipher_supports(QCRYPTO_CIPHER_ALG_AES_128)) {
error_setg(errp, "AES cipher not available");
ret = -EINVAL;
goto fail;
}
@@ -186,12 +179,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
}
s->l1_table_offset = header.l1_table_offset;
s->l1_table = g_try_new(uint64_t, s->l1_size);
if (s->l1_table == NULL) {
error_setg(errp, "Could not allocate memory for L1 table");
ret = -ENOMEM;
goto fail;
}
s->l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
ret = bdrv_pread(bs->file, s->l1_table_offset, s->l1_table,
s->l1_size * sizeof(uint64_t));
@@ -202,16 +190,8 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
for(i = 0;i < s->l1_size; i++) {
be64_to_cpus(&s->l1_table[i]);
}
/* alloc L2 cache (max. 64k * 16 * 8 = 8 MB) */
s->l2_cache =
qemu_try_blockalign(bs->file,
s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
if (s->l2_cache == NULL) {
error_setg(errp, "Could not allocate L2 table cache");
ret = -ENOMEM;
goto fail;
}
/* alloc L2 cache */
s->l2_cache = g_malloc(s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
s->cluster_cache = g_malloc(s->cluster_size);
s->cluster_data = g_malloc(s->cluster_size);
s->cluster_cache_offset = -1;
@@ -219,7 +199,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
/* read the backing file name */
if (header.backing_file_offset != 0) {
len = header.backing_file_size;
if (len > 1023 || len >= sizeof(bs->backing_file)) {
if (len > 1023) {
error_setg(errp, "Backing file name too long");
ret = -EINVAL;
goto fail;
@@ -233,9 +213,9 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
}
/* Disable migration when qcow images are used */
error_setg(&s->migration_blocker, "The qcow format used by node '%s' "
"does not support live migration",
bdrv_get_device_or_node_name(bs));
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"qcow", bs->device_name, "live migration");
migrate_add_blocker(s->migration_blocker);
qemu_co_mutex_init(&s->lock);
@@ -243,7 +223,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
fail:
g_free(s->l1_table);
qemu_vfree(s->l2_cache);
g_free(s->l2_cache);
g_free(s->cluster_cache);
g_free(s->cluster_data);
return ret;
@@ -263,7 +243,6 @@ static int qcow_set_key(BlockDriverState *bs, const char *key)
BDRVQcowState *s = bs->opaque;
uint8_t keybuf[16];
int len, i;
Error *err;
memset(keybuf, 0, 16);
len = strlen(key);
@@ -274,68 +253,38 @@ static int qcow_set_key(BlockDriverState *bs, const char *key)
for(i = 0;i < len;i++) {
keybuf[i] = key[i];
}
assert(bs->encrypted);
s->crypt_method = s->crypt_method_header;
qcrypto_cipher_free(s->cipher);
s->cipher = qcrypto_cipher_new(
QCRYPTO_CIPHER_ALG_AES_128,
QCRYPTO_CIPHER_MODE_CBC,
keybuf, G_N_ELEMENTS(keybuf),
&err);
if (!s->cipher) {
/* XXX would be nice if errors in this method could
* be properly propagate to the caller. Would need
* the bdrv_set_key() API signature to be fixed. */
error_free(err);
if (AES_set_encrypt_key(keybuf, 128, &s->aes_encrypt_key) != 0)
return -1;
if (AES_set_decrypt_key(keybuf, 128, &s->aes_decrypt_key) != 0)
return -1;
}
return 0;
}
/* The crypt function is compatible with the linux cryptoloop
algorithm for < 4 GB images. NOTE: out_buf == in_buf is
supported */
static int encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
uint8_t *out_buf, const uint8_t *in_buf,
int nb_sectors, bool enc, Error **errp)
static void encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
uint8_t *out_buf, const uint8_t *in_buf,
int nb_sectors, int enc,
const AES_KEY *key)
{
union {
uint64_t ll[2];
uint8_t b[16];
} ivec;
int i;
int ret;
for(i = 0; i < nb_sectors; i++) {
ivec.ll[0] = cpu_to_le64(sector_num);
ivec.ll[1] = 0;
if (qcrypto_cipher_setiv(s->cipher,
ivec.b, G_N_ELEMENTS(ivec.b),
errp) < 0) {
return -1;
}
if (enc) {
ret = qcrypto_cipher_encrypt(s->cipher,
in_buf,
out_buf,
512,
errp);
} else {
ret = qcrypto_cipher_decrypt(s->cipher,
in_buf,
out_buf,
512,
errp);
}
if (ret < 0) {
return -1;
}
AES_cbc_encrypt(in_buf, out_buf, 512, key,
ivec.b, enc);
sector_num++;
in_buf += 512;
out_buf += 512;
}
return 0;
}
/* 'allocate' is:
@@ -446,23 +395,17 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
bdrv_truncate(bs->file, cluster_offset + s->cluster_size);
/* if encrypted, we must initialize the cluster
content which won't be written */
if (bs->encrypted &&
if (s->crypt_method &&
(n_end - n_start) < s->cluster_sectors) {
uint64_t start_sect;
assert(s->cipher);
start_sect = (offset & ~(s->cluster_size - 1)) >> 9;
memset(s->cluster_data + 512, 0x00, 512);
for(i = 0; i < s->cluster_sectors; i++) {
if (i < n_start || i >= n_end) {
Error *err = NULL;
if (encrypt_sectors(s, start_sect + i,
s->cluster_data,
s->cluster_data + 512, 1,
true, &err) < 0) {
error_free(err);
errno = EIO;
return -1;
}
encrypt_sectors(s, start_sect + i,
s->cluster_data,
s->cluster_data + 512, 1, 1,
&s->aes_encrypt_key);
if (bdrv_pwrite(bs->file, cluster_offset + i * 512,
s->cluster_data, 512) != 512)
return -1;
@@ -502,7 +445,7 @@ static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs,
if (!cluster_offset) {
return 0;
}
if ((cluster_offset & QCOW_OFLAG_COMPRESSED) || s->cipher) {
if ((cluster_offset & QCOW_OFLAG_COMPRESSED) || s->crypt_method) {
return BDRV_BLOCK_DATA;
}
cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
@@ -569,13 +512,9 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
QEMUIOVector hd_qiov;
uint8_t *buf;
void *orig_buf;
Error *err = NULL;
if (qiov->niov > 1) {
buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
if (buf == NULL) {
return -ENOMEM;
}
buf = orig_buf = qemu_blockalign(bs, qiov->size);
} else {
orig_buf = NULL;
buf = (uint8_t *)qiov->iov->iov_base;
@@ -632,12 +571,10 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
if (ret < 0) {
break;
}
if (bs->encrypted) {
assert(s->cipher);
if (encrypt_sectors(s, sector_num, buf, buf,
n, false, &err) < 0) {
goto fail;
}
if (s->crypt_method) {
encrypt_sectors(s, sector_num, buf, buf,
n, 0,
&s->aes_decrypt_key);
}
}
ret = 0;
@@ -658,7 +595,6 @@ done:
return ret;
fail:
error_free(err);
ret = -EIO;
goto done;
}
@@ -680,10 +616,7 @@ static coroutine_fn int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,
s->cluster_cache_offset = -1; /* disable compressed cache */
if (qiov->niov > 1) {
buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
if (buf == NULL) {
return -ENOMEM;
}
buf = orig_buf = qemu_blockalign(bs, qiov->size);
qemu_iovec_to_buf(qiov, 0, buf, qiov->size);
} else {
orig_buf = NULL;
@@ -706,18 +639,12 @@ static coroutine_fn int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,
ret = -EIO;
break;
}
if (bs->encrypted) {
Error *err = NULL;
assert(s->cipher);
if (s->crypt_method) {
if (!cluster_data) {
cluster_data = g_malloc0(s->cluster_size);
}
if (encrypt_sectors(s, sector_num, cluster_data, buf,
n, true, &err) < 0) {
error_free(err);
ret = -EIO;
break;
}
encrypt_sectors(s, sector_num, cluster_data, buf,
n, 1, &s->aes_encrypt_key);
src_buf = cluster_data;
} else {
src_buf = buf;
@@ -754,10 +681,8 @@ static void qcow_close(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
qcrypto_cipher_free(s->cipher);
s->cipher = NULL;
g_free(s->l1_table);
qemu_vfree(s->l2_cache);
g_free(s->l2_cache);
g_free(s->cluster_cache);
g_free(s->cluster_data);
@@ -765,38 +690,43 @@ static void qcow_close(BlockDriverState *bs)
error_free(s->migration_blocker);
}
static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
static int qcow_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int header_size, backing_filename_len, l1_size, shift, i;
QCowHeader header;
uint8_t *tmp;
int64_t total_size = 0;
char *backing_file = NULL;
const char *backing_file = NULL;
int flags = 0;
Error *local_err = NULL;
int ret;
BlockDriverState *qcow_bs;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
flags |= BLOCK_FLAG_ENCRYPT;
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n / 512;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
backing_file = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_ENCRYPT)) {
flags |= options->value.n ? BLOCK_FLAG_ENCRYPT : 0;
}
options++;
}
ret = bdrv_create_file(filename, opts, &local_err);
ret = bdrv_create_file(filename, options, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto cleanup;
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
qcow_bs = NULL;
ret = bdrv_open(&qcow_bs, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
ret = bdrv_file_open(&qcow_bs, filename, NULL, BDRV_O_RDWR, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto cleanup;
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
ret = bdrv_truncate(qcow_bs, 0);
@@ -807,7 +737,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
memset(&header, 0, sizeof(header));
header.magic = cpu_to_be32(QCOW_MAGIC);
header.version = cpu_to_be32(QCOW_VERSION);
header.size = cpu_to_be64(total_size);
header.size = cpu_to_be64(total_size * 512);
header_size = sizeof(header);
backing_filename_len = 0;
if (backing_file) {
@@ -821,7 +751,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
backing_file = NULL;
}
header.cluster_bits = 9; /* 512 byte cluster to avoid copying
unmodified sectors */
unmodifyed sectors */
header.l2_bits = 12; /* 32 KB L2 tables */
} else {
header.cluster_bits = 12; /* 4 KB clusters */
@@ -829,7 +759,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
}
header_size = (header_size + 7) & ~7;
shift = header.cluster_bits + header.l2_bits;
l1_size = (total_size + (1LL << shift) - 1) >> shift;
l1_size = ((total_size * 512) + (1LL << shift) - 1) >> shift;
header.l1_table_offset = cpu_to_be64(header_size);
if (flags & BLOCK_FLAG_ENCRYPT) {
@@ -867,8 +797,6 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
ret = 0;
exit:
bdrv_unref(qcow_bs);
cleanup:
g_free(backing_file);
return ret;
}
@@ -981,28 +909,24 @@ static int qcow_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return 0;
}
static QemuOptsList qcow_create_opts = {
.name = "qcow-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qcow_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_ENCRYPT,
.type = QEMU_OPT_BOOL,
.help = "Encrypt the image",
.def_value_str = "off"
},
{ /* end of list */ }
}
static QEMUOptionParameter qcow_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_ENCRYPT,
.type = OPT_FLAG,
.help = "Encrypt the image"
},
{ NULL }
};
static BlockDriver bdrv_qcow = {
@@ -1011,10 +935,9 @@ static BlockDriver bdrv_qcow = {
.bdrv_probe = qcow_probe,
.bdrv_open = qcow_open,
.bdrv_close = qcow_close,
.bdrv_reopen_prepare = qcow_reopen_prepare,
.bdrv_create = qcow_create,
.bdrv_reopen_prepare = qcow_reopen_prepare,
.bdrv_create = qcow_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.supports_backing = true,
.bdrv_co_readv = qcow_co_readv,
.bdrv_co_writev = qcow_co_writev,
@@ -1025,7 +948,7 @@ static BlockDriver bdrv_qcow = {
.bdrv_write_compressed = qcow_write_compressed,
.bdrv_get_info = qcow_get_info,
.create_opts = &qcow_create_opts,
.create_options = qcow_create_options,
};
static void bdrv_qcow_init(void)

View File

@@ -28,68 +28,46 @@
#include "trace.h"
typedef struct Qcow2CachedTable {
int64_t offset;
bool dirty;
uint64_t lru_counter;
int ref;
void* table;
int64_t offset;
bool dirty;
int cache_hits;
int ref;
} Qcow2CachedTable;
struct Qcow2Cache {
Qcow2CachedTable *entries;
struct Qcow2Cache *depends;
Qcow2CachedTable* entries;
struct Qcow2Cache* depends;
int size;
bool depends_on_flush;
void *table_array;
uint64_t lru_counter;
};
static inline void *qcow2_cache_get_table_addr(BlockDriverState *bs,
Qcow2Cache *c, int table)
{
BDRVQcowState *s = bs->opaque;
return (uint8_t *) c->table_array + (size_t) table * s->cluster_size;
}
static inline int qcow2_cache_get_table_idx(BlockDriverState *bs,
Qcow2Cache *c, void *table)
{
BDRVQcowState *s = bs->opaque;
ptrdiff_t table_offset = (uint8_t *) table - (uint8_t *) c->table_array;
int idx = table_offset / s->cluster_size;
assert(idx >= 0 && idx < c->size && table_offset % s->cluster_size == 0);
return idx;
}
Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
{
BDRVQcowState *s = bs->opaque;
Qcow2Cache *c;
int i;
c = g_new0(Qcow2Cache, 1);
c = g_malloc0(sizeof(*c));
c->size = num_tables;
c->entries = g_try_new0(Qcow2CachedTable, num_tables);
c->table_array = qemu_try_blockalign(bs->file,
(size_t) num_tables * s->cluster_size);
c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
if (!c->entries || !c->table_array) {
qemu_vfree(c->table_array);
g_free(c->entries);
g_free(c);
c = NULL;
for (i = 0; i < c->size; i++) {
c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
}
return c;
}
int qcow2_cache_destroy(BlockDriverState *bs, Qcow2Cache *c)
int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)
{
int i;
for (i = 0; i < c->size; i++) {
assert(c->entries[i].ref == 0);
qemu_vfree(c->entries[i].table);
}
qemu_vfree(c->table_array);
g_free(c->entries);
g_free(c);
@@ -157,8 +135,8 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
}
ret = bdrv_pwrite(bs->file, c->entries[i].offset,
qcow2_cache_get_table_addr(bs, c, i), s->cluster_size);
ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table,
s->cluster_size);
if (ret < 0) {
return ret;
}
@@ -234,53 +212,66 @@ int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
for (i = 0; i < c->size; i++) {
assert(c->entries[i].ref == 0);
c->entries[i].offset = 0;
c->entries[i].lru_counter = 0;
c->entries[i].cache_hits = 0;
}
c->lru_counter = 0;
return 0;
}
static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
{
int i;
int min_count = INT_MAX;
int min_index = -1;
for (i = 0; i < c->size; i++) {
if (c->entries[i].ref) {
continue;
}
if (c->entries[i].cache_hits < min_count) {
min_index = i;
min_count = c->entries[i].cache_hits;
}
/* Give newer hits priority */
/* TODO Check how to optimize the replacement strategy */
c->entries[i].cache_hits /= 2;
}
if (min_index == -1) {
/* This can't happen in current synchronous code, but leave the check
* here as a reminder for whoever starts using AIO with the cache */
abort();
}
return min_index;
}
static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
uint64_t offset, void **table, bool read_from_disk)
{
BDRVQcowState *s = bs->opaque;
int i;
int ret;
int lookup_index;
uint64_t min_lru_counter = UINT64_MAX;
int min_lru_index = -1;
trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
offset, read_from_disk);
/* Check if the table is already cached */
i = lookup_index = (offset / s->cluster_size * 4) % c->size;
do {
const Qcow2CachedTable *t = &c->entries[i];
if (t->offset == offset) {
for (i = 0; i < c->size; i++) {
if (c->entries[i].offset == offset) {
goto found;
}
if (t->ref == 0 && t->lru_counter < min_lru_counter) {
min_lru_counter = t->lru_counter;
min_lru_index = i;
}
if (++i == c->size) {
i = 0;
}
} while (i != lookup_index);
if (min_lru_index == -1) {
/* This can't happen in current synchronous code, but leave the check
* here as a reminder for whoever starts using AIO with the cache */
abort();
}
/* Cache miss: write a table back and replace it */
i = min_lru_index;
/* If not, write a table back and replace it */
i = qcow2_cache_find_entry_to_replace(c);
trace_qcow2_cache_get_replace_entry(qemu_coroutine_self(),
c == s->l2_table_cache, i);
if (i < 0) {
return i;
}
ret = qcow2_cache_entry_flush(bs, c, i);
if (ret < 0) {
@@ -295,19 +286,22 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
}
ret = bdrv_pread(bs->file, offset, qcow2_cache_get_table_addr(bs, c, i),
s->cluster_size);
ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size);
if (ret < 0) {
return ret;
}
}
/* Give the table some hits for the start so that it won't be replaced
* immediately. The number 32 is completely arbitrary. */
c->entries[i].cache_hits = 32;
c->entries[i].offset = offset;
/* And return the right table */
found:
c->entries[i].cache_hits++;
c->entries[i].ref++;
*table = qcow2_cache_get_table_addr(bs, c, i);
*table = c->entries[i].table;
trace_qcow2_cache_get_done(qemu_coroutine_self(),
c == s->l2_table_cache, i);
@@ -327,24 +321,36 @@ int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
return qcow2_cache_do_get(bs, c, offset, table, false);
}
void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
{
int i = qcow2_cache_get_table_idx(bs, c, *table);
int i;
for (i = 0; i < c->size; i++) {
if (c->entries[i].table == *table) {
goto found;
}
}
return -ENOENT;
found:
c->entries[i].ref--;
*table = NULL;
if (c->entries[i].ref == 0) {
c->entries[i].lru_counter = ++c->lru_counter;
}
assert(c->entries[i].ref >= 0);
return 0;
}
void qcow2_cache_entry_mark_dirty(BlockDriverState *bs, Qcow2Cache *c,
void *table)
void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table)
{
int i = qcow2_cache_get_table_idx(bs, c, table);
assert(c->entries[i].offset != 0);
int i;
for (i = 0; i < c->size; i++) {
if (c->entries[i].table == table) {
goto found;
}
}
abort();
found:
c->entries[i].dirty = true;
}

View File

@@ -42,13 +42,6 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
if (min_size <= s->l1_size)
return 0;
/* Do a sanity check on min_size before trying to calculate new_l1_size
* (this prevents overflows during the while loop for the calculation of
* new_l1_size) */
if (min_size > INT_MAX / sizeof(uint64_t)) {
return -EFBIG;
}
if (exact_size) {
new_l1_size = min_size;
} else {
@@ -72,20 +65,14 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
#endif
new_l1_size2 = sizeof(uint64_t) * new_l1_size;
new_l1_table = qemu_try_blockalign(bs->file,
align_offset(new_l1_size2, 512));
if (new_l1_table == NULL) {
return -ENOMEM;
}
memset(new_l1_table, 0, align_offset(new_l1_size2, 512));
new_l1_table = g_malloc0(align_offset(new_l1_size2, 512));
memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));
/* write new table (align to cluster) */
BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ALLOC_TABLE);
new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
if (new_l1_table_offset < 0) {
qemu_vfree(new_l1_table);
g_free(new_l1_table);
return new_l1_table_offset;
}
@@ -119,7 +106,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
if (ret < 0) {
goto fail;
}
qemu_vfree(s->l1_table);
g_free(s->l1_table);
old_l1_table_offset = s->l1_table_offset;
s->l1_table_offset = new_l1_table_offset;
s->l1_table = new_l1_table;
@@ -129,7 +116,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
QCOW2_DISCARD_OTHER);
return 0;
fail:
qemu_vfree(new_l1_table);
g_free(new_l1_table);
qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2,
QCOW2_DISCARD_OTHER);
return ret;
@@ -164,14 +151,12 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
{
BDRVQcowState *s = bs->opaque;
uint64_t buf[L1_ENTRIES_PER_SECTOR] = { 0 };
uint64_t buf[L1_ENTRIES_PER_SECTOR];
int l1_start_index;
int i, ret;
l1_start_index = l1_index & ~(L1_ENTRIES_PER_SECTOR - 1);
for (i = 0; i < L1_ENTRIES_PER_SECTOR && l1_start_index + i < s->l1_size;
i++)
{
for (i = 0; i < L1_ENTRIES_PER_SECTOR; i++) {
buf[i] = cpu_to_be64(s->l1_table[l1_start_index + i]);
}
@@ -253,14 +238,17 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
memcpy(l2_table, old_table, s->cluster_size);
qcow2_cache_put(bs, s->l2_table_cache, (void **) &old_table);
ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &old_table);
if (ret < 0) {
goto fail;
}
}
/* write the l2 table to the file */
BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
trace_qcow2_l2_allocate_write_l2(bs, l1_index);
qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
ret = qcow2_cache_flush(bs, s->l2_table_cache);
if (ret < 0) {
goto fail;
@@ -339,47 +327,26 @@ static int count_contiguous_free_clusters(uint64_t nb_clusters, uint64_t *l2_tab
/* The crypt function is compatible with the linux cryptoloop
algorithm for < 4 GB images. NOTE: out_buf == in_buf is
supported */
int qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
uint8_t *out_buf, const uint8_t *in_buf,
int nb_sectors, bool enc,
Error **errp)
void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
uint8_t *out_buf, const uint8_t *in_buf,
int nb_sectors, int enc,
const AES_KEY *key)
{
union {
uint64_t ll[2];
uint8_t b[16];
} ivec;
int i;
int ret;
for(i = 0; i < nb_sectors; i++) {
ivec.ll[0] = cpu_to_le64(sector_num);
ivec.ll[1] = 0;
if (qcrypto_cipher_setiv(s->cipher,
ivec.b, G_N_ELEMENTS(ivec.b),
errp) < 0) {
return -1;
}
if (enc) {
ret = qcrypto_cipher_encrypt(s->cipher,
in_buf,
out_buf,
512,
errp);
} else {
ret = qcrypto_cipher_decrypt(s->cipher,
in_buf,
out_buf,
512,
errp);
}
if (ret < 0) {
return -1;
}
AES_cbc_encrypt(in_buf, out_buf, 512, key,
ivec.b, enc);
sector_num++;
in_buf += 512;
out_buf += 512;
}
return 0;
}
static int coroutine_fn copy_sectors(BlockDriverState *bs,
@@ -398,20 +365,12 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
}
iov.iov_len = n * BDRV_SECTOR_SIZE;
iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
if (iov.iov_base == NULL) {
return -ENOMEM;
}
iov.iov_base = qemu_blockalign(bs, iov.iov_len);
qemu_iovec_init_external(&qiov, &iov, 1);
BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
if (!bs->drv) {
ret = -ENOMEDIUM;
goto out;
}
/* Call .bdrv_co_readv() directly instead of using the public block-layer
* interface. This avoids double I/O throttling and request tracking,
* which can lead to deadlock when block layer copy-on-read is enabled.
@@ -421,16 +380,10 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
goto out;
}
if (bs->encrypted) {
Error *err = NULL;
assert(s->cipher);
if (qcow2_encrypt_sectors(s, start_sect + n_start,
iov.iov_base, iov.iov_base, n,
true, &err) < 0) {
ret = -EIO;
error_free(err);
goto out;
}
if (s->crypt_method) {
qcow2_encrypt_sectors(s, start_sect + n_start,
iov.iov_base, iov.iov_base, n, 1,
&s->aes_encrypt_key);
}
ret = qcow2_pre_write_overlap_check(bs, 0,
@@ -512,13 +465,6 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
goto out;
}
if (offset_into_cluster(s, l2_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
" unaligned (L1 index: %#" PRIx64 ")",
l2_offset, l1_index);
return -EIO;
}
/* load the l2 table in memory */
ret = l2_load(bs, l2_offset, &l2_table);
@@ -541,11 +487,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
break;
case QCOW2_CLUSTER_ZERO:
if (s->qcow_version < 3) {
qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
" in pre-v3 image (L2 offset: %#" PRIx64
", L2 index: %#x)", l2_offset, l2_index);
ret = -EIO;
goto fail;
return -EIO;
}
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], QCOW_OFLAG_ZERO);
@@ -561,14 +503,6 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], QCOW_OFLAG_ZERO);
*cluster_offset &= L2E_OFFSET_MASK;
if (offset_into_cluster(s, *cluster_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset %#"
PRIx64 " unaligned (L2 offset: %#" PRIx64
", L2 index: %#x)", *cluster_offset,
l2_offset, l2_index);
ret = -EIO;
goto fail;
}
break;
default:
abort();
@@ -585,10 +519,6 @@ out:
*num = nb_available - index_in_cluster;
return ret;
fail:
qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
return ret;
}
/*
@@ -624,12 +554,6 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
assert(l1_index < s->l1_size);
l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
if (offset_into_cluster(s, l2_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#" PRIx64
" unaligned (L1 index: %#" PRIx64 ")",
l2_offset, l1_index);
return -EIO;
}
/* seek the l2 table of the given l2 offset */
@@ -716,9 +640,12 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
/* compressed clusters never have the copied flag */
BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED);
qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
l2_table[l2_index] = cpu_to_be64(cluster_offset);
qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
if (ret < 0) {
return 0;
}
return cluster_offset;
}
@@ -762,11 +689,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
trace_qcow2_cluster_link_l2(qemu_coroutine_self(), m->nb_clusters);
assert(m->nb_clusters > 0);
old_cluster = g_try_new(uint64_t, m->nb_clusters);
if (old_cluster == NULL) {
ret = -ENOMEM;
goto err;
}
old_cluster = g_malloc(m->nb_clusters * sizeof(uint64_t));
/* copy content of unmodified sectors */
ret = perform_cow(bs, m, &m->cow_start);
@@ -792,7 +715,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
if (ret < 0) {
goto err;
}
qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
assert(l2_index + m->nb_clusters <= s->l2_size);
for (i = 0; i < m->nb_clusters; i++) {
@@ -810,7 +733,10 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
}
qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
if (ret < 0) {
goto err;
}
/*
* If this was a COW, we need to decrease the refcount of the old cluster.
@@ -962,7 +888,7 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
uint64_t *l2_table;
unsigned int nb_clusters;
unsigned int keep_clusters;
int ret;
int ret, pret;
trace_qcow2_handle_copied(qemu_coroutine_self(), guest_offset, *host_offset,
*bytes);
@@ -996,15 +922,6 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
bool offset_matches =
(cluster_offset & L2E_OFFSET_MASK) == *host_offset;
if (offset_into_cluster(s, cluster_offset & L2E_OFFSET_MASK)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
"%#llx unaligned (guest offset: %#" PRIx64
")", cluster_offset & L2E_OFFSET_MASK,
guest_offset);
ret = -EIO;
goto out;
}
if (*host_offset != 0 && !offset_matches) {
*bytes = 0;
ret = 0;
@@ -1029,11 +946,14 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
/* Cleanup */
out:
qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
pret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
if (pret < 0) {
return pret;
}
/* Only return a host offset if we actually made progress. Otherwise we
* would make requirements for handle_alloc() that it can't fulfill */
if (ret > 0) {
if (ret) {
*host_offset = (cluster_offset & L2E_OFFSET_MASK)
+ offset_into_cluster(s, guest_offset);
}
@@ -1154,7 +1074,10 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
* wrong with our code. */
assert(nb_clusters > 0);
qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
if (ret < 0) {
return ret;
}
/* Allocate, if necessary at a given offset in the image file */
alloc_cluster_offset = start_of_cluster(s, *host_offset);
@@ -1170,17 +1093,6 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
return 0;
}
/* !*host_offset would overwrite the image header and is reserved for "no
* host offset preferred". If 0 was a valid host offset, it'd trigger the
* following overlap check; do that now to avoid having an invalid value in
* *host_offset. */
if (!alloc_cluster_offset) {
ret = qcow2_pre_write_overlap_check(bs, 0, alloc_cluster_offset,
nb_clusters * s->cluster_size);
assert(ret < 0);
goto fail;
}
/*
* Save info needed for meta data update.
*
@@ -1261,7 +1173,7 @@ fail:
* Return 0 on success and -errno in error cases
*/
int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *host_offset, QCowL2Meta **m)
int n_start, int n_end, int *num, uint64_t *host_offset, QCowL2Meta **m)
{
BDRVQcowState *s = bs->opaque;
uint64_t start, remaining;
@@ -1269,13 +1181,15 @@ int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
uint64_t cur_bytes;
int ret;
trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset, *num);
trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset,
n_start, n_end);
assert((offset & ~BDRV_SECTOR_MASK) == 0);
assert(n_start * BDRV_SECTOR_SIZE == offset_into_cluster(s, offset));
offset = start_of_cluster(s, offset);
again:
start = offset;
remaining = (uint64_t)*num << BDRV_SECTOR_BITS;
start = offset + (n_start << BDRV_SECTOR_BITS);
remaining = (n_end - n_start) << BDRV_SECTOR_BITS;
cluster_offset = 0;
*host_offset = 0;
cur_bytes = 0;
@@ -1361,7 +1275,7 @@ again:
}
}
*num -= remaining >> BDRV_SECTOR_BITS;
*num = (n_end - n_start) - (remaining >> BDRV_SECTOR_BITS);
assert(*num > 0);
assert(*host_offset != 0);
@@ -1426,7 +1340,7 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
* clusters.
*/
static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
unsigned int nb_clusters, enum qcow2_discard_type type, bool full_discard)
unsigned int nb_clusters, enum qcow2_discard_type type)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l2_table;
@@ -1443,63 +1357,31 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
nb_clusters = MIN(nb_clusters, s->l2_size - l2_index);
for (i = 0; i < nb_clusters; i++) {
uint64_t old_l2_entry;
uint64_t old_offset;
old_l2_entry = be64_to_cpu(l2_table[l2_index + i]);
/*
* If full_discard is false, make sure that a discarded area reads back
* as zeroes for v3 images (we cannot do it for v2 without actually
* writing a zero-filled buffer). We can skip the operation if the
* cluster is already marked as zero, or if it's unallocated and we
* don't have a backing file.
*
* TODO We might want to use bdrv_get_block_status(bs) here, but we're
* holding s->lock, so that doesn't work today.
*
* If full_discard is true, the sector should not read back as zeroes,
* but rather fall through to the backing file.
*/
switch (qcow2_get_cluster_type(old_l2_entry)) {
case QCOW2_CLUSTER_UNALLOCATED:
if (full_discard || !bs->backing_hd) {
continue;
}
break;
case QCOW2_CLUSTER_ZERO:
if (!full_discard) {
continue;
}
break;
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_COMPRESSED:
break;
default:
abort();
old_offset = be64_to_cpu(l2_table[l2_index + i]);
if ((old_offset & L2E_OFFSET_MASK) == 0) {
continue;
}
/* First remove L2 entries */
qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
if (!full_discard && s->qcow_version >= 3) {
l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
} else {
l2_table[l2_index + i] = cpu_to_be64(0);
}
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
l2_table[l2_index + i] = cpu_to_be64(0);
/* Then decrease the refcount */
qcow2_free_any_clusters(bs, old_l2_entry, 1, type);
qcow2_free_any_clusters(bs, old_offset, 1, type);
}
qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
if (ret < 0) {
return ret;
}
return nb_clusters;
}
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors, enum qcow2_discard_type type, bool full_discard)
int nb_sectors, enum qcow2_discard_type type)
{
BDRVQcowState *s = bs->opaque;
uint64_t end_offset;
@@ -1510,7 +1392,7 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
/* Round start up and end down */
offset = align_offset(offset, s->cluster_size);
end_offset = start_of_cluster(s, end_offset);
end_offset &= ~(s->cluster_size - 1);
if (offset > end_offset) {
return 0;
@@ -1522,7 +1404,7 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
/* Each L2 table is handled by its own loop iteration */
while (nb_clusters > 0) {
ret = discard_single_l2(bs, offset, nb_clusters, type, full_discard);
ret = discard_single_l2(bs, offset, nb_clusters, type);
if (ret < 0) {
goto fail;
}
@@ -1567,7 +1449,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
old_offset = be64_to_cpu(l2_table[l2_index + i]);
/* Update L2 entries */
qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
if (old_offset & QCOW_OFLAG_COMPRESSED) {
l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
@@ -1576,7 +1458,10 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
}
}
qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
if (ret < 0) {
return ret;
}
return nb_clusters;
}
@@ -1619,14 +1504,15 @@ fail:
* Expands all zero clusters in a specific L1 table (or deallocates them, for
* non-backed non-pre-allocated zero clusters).
*
* l1_entries and *visited_l1_entries are used to keep track of progress for
* status_cb(). l1_entries contains the total number of L1 entries and
* *visited_l1_entries counts all visited L1 entries.
* expanded_clusters is a bitmap where every bit corresponds to one cluster in
* the image file; a bit gets set if the corresponding cluster has been used for
* zero expansion (i.e., has been filled with zeroes and is referenced from an
* L2 table). nb_clusters contains the total cluster count of the image file,
* i.e., the number of bits in expanded_clusters.
*/
static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
int l1_size, int64_t *visited_l1_entries,
int64_t l1_entries,
BlockDriverAmendStatusCB *status_cb)
int l1_size, uint8_t **expanded_clusters,
uint64_t *nb_clusters)
{
BDRVQcowState *s = bs->opaque;
bool is_active_l1 = (l1_table == s->l1_table);
@@ -1637,34 +1523,18 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
if (!is_active_l1) {
/* inactive L2 tables require a buffer to be stored in when loading
* them from disk */
l2_table = qemu_try_blockalign(bs->file, s->cluster_size);
if (l2_table == NULL) {
return -ENOMEM;
}
l2_table = qemu_blockalign(bs, s->cluster_size);
}
for (i = 0; i < l1_size; i++) {
uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
bool l2_dirty = false;
uint64_t l2_refcount;
if (!l2_offset) {
/* unallocated */
(*visited_l1_entries)++;
if (status_cb) {
status_cb(bs, *visited_l1_entries, l1_entries);
}
continue;
}
if (offset_into_cluster(s, l2_offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "L2 table offset %#"
PRIx64 " unaligned (L1 index: %#x)",
l2_offset, i);
ret = -EIO;
goto fail;
}
if (is_active_l1) {
/* get active L2 tables from cache */
ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
@@ -1678,19 +1548,33 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
goto fail;
}
ret = qcow2_get_refcount(bs, l2_offset >> s->cluster_bits,
&l2_refcount);
if (ret < 0) {
goto fail;
}
for (j = 0; j < s->l2_size; j++) {
uint64_t l2_entry = be64_to_cpu(l2_table[j]);
int64_t offset = l2_entry & L2E_OFFSET_MASK;
int64_t offset = l2_entry & L2E_OFFSET_MASK, cluster_index;
int cluster_type = qcow2_get_cluster_type(l2_entry);
bool preallocated = offset != 0;
if (cluster_type != QCOW2_CLUSTER_ZERO) {
if (cluster_type == QCOW2_CLUSTER_NORMAL) {
cluster_index = offset >> s->cluster_bits;
assert((cluster_index >= 0) && (cluster_index < *nb_clusters));
if ((*expanded_clusters)[cluster_index / 8] &
(1 << (cluster_index % 8))) {
/* Probably a shared L2 table; this cluster was a zero
* cluster which has been expanded, its refcount
* therefore most likely requires an update. */
ret = qcow2_update_cluster_refcount(bs, cluster_index, 1,
QCOW2_DISCARD_NEVER);
if (ret < 0) {
goto fail;
}
/* Since we just increased the refcount, the COPIED flag may
* no longer be set. */
l2_table[j] = cpu_to_be64(l2_entry & ~QCOW_OFLAG_COPIED);
l2_dirty = true;
}
continue;
}
else if (qcow2_get_cluster_type(l2_entry) != QCOW2_CLUSTER_ZERO) {
continue;
}
@@ -1708,33 +1592,6 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
ret = offset;
goto fail;
}
if (l2_refcount > 1) {
/* For shared L2 tables, set the refcount accordingly (it is
* already 1 and needs to be l2_refcount) */
ret = qcow2_update_cluster_refcount(bs,
offset >> s->cluster_bits,
refcount_diff(1, l2_refcount), false,
QCOW2_DISCARD_OTHER);
if (ret < 0) {
qcow2_free_clusters(bs, offset, s->cluster_size,
QCOW2_DISCARD_OTHER);
goto fail;
}
}
}
if (offset_into_cluster(s, offset)) {
qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
"%#" PRIx64 " unaligned (L2 offset: %#"
PRIx64 ", L2 index: %#x)", offset,
l2_offset, j);
if (!preallocated) {
qcow2_free_clusters(bs, offset, s->cluster_size,
QCOW2_DISCARD_ALWAYS);
}
ret = -EIO;
goto fail;
}
ret = qcow2_pre_write_overlap_check(bs, 0, offset, s->cluster_size);
@@ -1747,7 +1604,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
}
ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE,
s->cluster_sectors, 0);
s->cluster_sectors);
if (ret < 0) {
if (!preallocated) {
qcow2_free_clusters(bs, offset, s->cluster_size,
@@ -1756,20 +1613,41 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
goto fail;
}
if (l2_refcount == 1) {
l2_table[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
} else {
l2_table[j] = cpu_to_be64(offset);
}
l2_table[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
l2_dirty = true;
cluster_index = offset >> s->cluster_bits;
if (cluster_index >= *nb_clusters) {
uint64_t old_bitmap_size = (*nb_clusters + 7) / 8;
uint64_t new_bitmap_size;
/* The offset may lie beyond the old end of the underlying image
* file for growable files only */
assert(bs->file->growable);
*nb_clusters = size_to_clusters(s, bs->file->total_sectors *
BDRV_SECTOR_SIZE);
new_bitmap_size = (*nb_clusters + 7) / 8;
*expanded_clusters = g_realloc(*expanded_clusters,
new_bitmap_size);
/* clear the newly allocated space */
memset(&(*expanded_clusters)[old_bitmap_size], 0,
new_bitmap_size - old_bitmap_size);
}
assert((cluster_index >= 0) && (cluster_index < *nb_clusters));
(*expanded_clusters)[cluster_index / 8] |= 1 << (cluster_index % 8);
}
if (is_active_l1) {
if (l2_dirty) {
qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
qcow2_cache_depends_on_flush(s->l2_table_cache);
}
qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
ret = qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
if (ret < 0) {
l2_table = NULL;
goto fail;
}
} else {
if (l2_dirty) {
ret = qcow2_pre_write_overlap_check(bs,
@@ -1786,11 +1664,6 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
}
}
}
(*visited_l1_entries)++;
if (status_cb) {
status_cb(bs, *visited_l1_entries, l1_entries);
}
}
ret = 0;
@@ -1800,7 +1673,12 @@ fail:
if (!is_active_l1) {
qemu_vfree(l2_table);
} else {
qcow2_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
if (ret < 0) {
qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
} else {
ret = qcow2_cache_put(bs, s->l2_table_cache,
(void **)&l2_table);
}
}
}
return ret;
@@ -1812,25 +1690,21 @@ fail:
* allocation for pre-allocated ones). This is important for downgrading to a
* qcow2 version which doesn't yet support metadata zero clusters.
*/
int qcow2_expand_zero_clusters(BlockDriverState *bs,
BlockDriverAmendStatusCB *status_cb)
int qcow2_expand_zero_clusters(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l1_table = NULL;
int64_t l1_entries = 0, visited_l1_entries = 0;
uint64_t nb_clusters;
uint8_t *expanded_clusters;
int ret;
int i, j;
if (status_cb) {
l1_entries = s->l1_size;
for (i = 0; i < s->nb_snapshots; i++) {
l1_entries += s->snapshots[i].l1_size;
}
}
nb_clusters = size_to_clusters(s, bs->file->total_sectors *
BDRV_SECTOR_SIZE);
expanded_clusters = g_malloc0((nb_clusters + 7) / 8);
ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
&visited_l1_entries, l1_entries,
status_cb);
&expanded_clusters, &nb_clusters);
if (ret < 0) {
goto fail;
}
@@ -1864,8 +1738,7 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,
}
ret = expand_zero_clusters_in_l1(bs, l1_table, s->snapshots[i].l1_size,
&visited_l1_entries, l1_entries,
status_cb);
&expanded_clusters, &nb_clusters);
if (ret < 0) {
goto fail;
}
@@ -1874,6 +1747,7 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,
ret = 0;
fail:
g_free(expanded_clusters);
g_free(l1_table);
return ret;
}

File diff suppressed because it is too large Load Diff

View File

@@ -25,7 +25,6 @@
#include "qemu-common.h"
#include "block/block_int.h"
#include "block/qcow2.h"
#include "qemu/error-report.h"
void qcow2_free_snapshots(BlockDriverState *bs)
{
@@ -59,7 +58,7 @@ int qcow2_read_snapshots(BlockDriverState *bs)
}
offset = s->snapshots_offset;
s->snapshots = g_new0(QCowSnapshot, s->nb_snapshots);
s->snapshots = g_malloc0(s->nb_snapshots * sizeof(QCowSnapshot));
for(i = 0; i < s->nb_snapshots; i++) {
/* Read statically sized part of the snapshot header */
@@ -117,14 +116,8 @@ int qcow2_read_snapshots(BlockDriverState *bs)
}
offset += name_size;
sn->name[name_size] = '\0';
if (offset - s->snapshots_offset > QCOW_MAX_SNAPSHOTS_SIZE) {
ret = -EFBIG;
goto fail;
}
}
assert(offset - s->snapshots_offset <= INT_MAX);
s->snapshots_size = offset - s->snapshots_offset;
return 0;
@@ -145,7 +138,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
uint32_t nb_snapshots;
uint64_t snapshots_offset;
} QEMU_PACKED header_data;
int64_t offset, snapshots_offset = 0;
int64_t offset, snapshots_offset;
int ret;
/* compute the size of the snapshots */
@@ -157,14 +150,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
offset += sizeof(extra);
offset += strlen(sn->id_str);
offset += strlen(sn->name);
if (offset > QCOW_MAX_SNAPSHOTS_SIZE) {
ret = -EFBIG;
goto fail;
}
}
assert(offset <= INT_MAX);
snapshots_size = offset;
/* Allocate space for the new snapshot list */
@@ -352,8 +338,10 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
memset(sn, 0, sizeof(*sn));
/* Generate an ID */
find_new_snapshot_id(bs, sn_info->id_str, sizeof(sn_info->id_str));
/* Generate an ID if it wasn't passed */
if (sn_info->id_str[0] == '\0') {
find_new_snapshot_id(bs, sn_info->id_str, sizeof(sn_info->id_str));
}
/* Check that the ID is unique */
if (find_snapshot_by_id_and_name(bs, sn_info->id_str, NULL) >= 0) {
@@ -380,12 +368,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
sn->l1_table_offset = l1_table_offset;
sn->l1_size = s->l1_size;
l1_table = g_try_new(uint64_t, s->l1_size);
if (s->l1_size && l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
l1_table = g_malloc(s->l1_size * sizeof(uint64_t));
for(i = 0; i < s->l1_size; i++) {
l1_table[i] = cpu_to_be64(s->l1_table[i]);
}
@@ -416,7 +399,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
}
/* Append the new snapshot to the snapshot list */
new_snapshot_list = g_new(QCowSnapshot, s->nb_snapshots + 1);
new_snapshot_list = g_malloc((s->nb_snapshots + 1) * sizeof(QCowSnapshot));
if (s->snapshots) {
memcpy(new_snapshot_list, s->snapshots,
s->nb_snapshots * sizeof(QCowSnapshot));
@@ -440,7 +423,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
qcow2_discard_clusters(bs, qcow2_vm_state_offset(s),
align_offset(sn->vm_state_size, s->cluster_size)
>> BDRV_SECTOR_BITS,
QCOW2_DISCARD_NEVER, false);
QCOW2_DISCARD_NEVER);
#ifdef DEBUG_ALLOC
{
@@ -503,11 +486,7 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
* Decrease the refcount referenced by the old one only when the L1
* table is overwritten.
*/
sn_l1_table = g_try_malloc0(cur_l1_bytes);
if (cur_l1_bytes && sn_l1_table == NULL) {
ret = -ENOMEM;
goto fail;
}
sn_l1_table = g_malloc0(cur_l1_bytes);
ret = bdrv_pread(bs->file, sn->l1_table_offset, sn_l1_table, sn_l1_bytes);
if (ret < 0) {
@@ -606,8 +585,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
s->nb_snapshots--;
ret = qcow2_write_snapshots(bs);
if (ret < 0) {
error_setg_errno(errp, -ret,
"Failed to remove snapshot from snapshot list");
error_setg(errp, "Failed to remove snapshot from snapshot list");
return ret;
}
@@ -625,7 +603,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
ret = qcow2_update_snapshot_refcount(bs, sn.l1_table_offset,
sn.l1_size, -1);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to free the cluster and L1 table");
error_setg(errp, "Failed to free the cluster and L1 table");
return ret;
}
qcow2_free_clusters(bs, sn.l1_table_offset, sn.l1_size * sizeof(uint64_t),
@@ -634,8 +612,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
/* must update the copied flag on the current cluster offsets */
ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0);
if (ret < 0) {
error_setg_errno(errp, -ret,
"Failed to update snapshot status in disk");
error_setg(errp, "Failed to update snapshot status in disk");
return ret;
}
@@ -660,7 +637,7 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
return s->nb_snapshots;
}
sn_tab = g_new0(QEMUSnapshotInfo, s->nb_snapshots);
sn_tab = g_malloc0(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
for(i = 0; i < s->nb_snapshots; i++) {
sn_info = sn_tab + i;
sn = s->snapshots + i;
@@ -677,10 +654,7 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
return s->nb_snapshots;
}
int qcow2_snapshot_load_tmp(BlockDriverState *bs,
const char *snapshot_id,
const char *name,
Error **errp)
int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name)
{
int i, snapshot_index;
BDRVQcowState *s = bs->opaque;
@@ -692,35 +666,28 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
assert(bs->read_only);
/* Search the snapshot */
snapshot_index = find_snapshot_by_id_and_name(bs, snapshot_id, name);
snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_name);
if (snapshot_index < 0) {
error_setg(errp,
"Can't find snapshot");
return -ENOENT;
}
sn = &s->snapshots[snapshot_index];
/* Allocate and read in the snapshot's L1 table */
if (sn->l1_size > QCOW_MAX_L1_SIZE / sizeof(uint64_t)) {
error_setg(errp, "Snapshot L1 table too large");
if (sn->l1_size > QCOW_MAX_L1_SIZE) {
error_report("Snapshot L1 table too large");
return -EFBIG;
}
new_l1_bytes = sn->l1_size * sizeof(uint64_t);
new_l1_table = qemu_try_blockalign(bs->file,
align_offset(new_l1_bytes, 512));
if (new_l1_table == NULL) {
return -ENOMEM;
}
new_l1_table = g_malloc0(align_offset(new_l1_bytes, 512));
ret = bdrv_pread(bs->file, sn->l1_table_offset, new_l1_table, new_l1_bytes);
if (ret < 0) {
error_setg(errp, "Failed to read l1 table for snapshot");
qemu_vfree(new_l1_table);
g_free(new_l1_table);
return ret;
}
/* Switch the L1 table */
qemu_vfree(s->l1_table);
g_free(s->l1_table);
s->l1_size = sn->l1_size;
s->l1_table_offset = sn->l1_table_offset;

File diff suppressed because it is too large Load Diff

View File

@@ -25,7 +25,7 @@
#ifndef BLOCK_QCOW2_H
#define BLOCK_QCOW2_H
#include "crypto/cipher.h"
#include "qemu/aes.h"
#include "block/coroutine.h"
//#define DEBUG_ALLOC
@@ -48,10 +48,6 @@
* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
#define QCOW_MAX_L1_SIZE 0x2000000
/* Allow for an average of 1k per snapshot table entry, should be plenty of
* space for snapshot names and IDs */
#define QCOW_MAX_SNAPSHOTS_SIZE (1024 * QCOW_MAX_SNAPSHOTS)
/* indicate that the refcount of the referenced cluster is exactly one. */
#define QCOW_OFLAG_COPIED (1ULL << 63)
/* indicate that the cluster is compressed (they never have the copied flag) */
@@ -59,22 +55,15 @@
/* The cluster reads as all zeros */
#define QCOW_OFLAG_ZERO (1ULL << 0)
#define REFCOUNT_SHIFT 1 /* refcount size is 2 bytes */
#define MIN_CLUSTER_BITS 9
#define MAX_CLUSTER_BITS 21
/* Must be at least 2 to cover COW */
#define MIN_L2_CACHE_SIZE 2 /* clusters */
#define L2_CACHE_SIZE 16
/* Must be at least 4 to cover all cases of refcount table growth */
#define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
/* Whichever is more */
#define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
/* The refblock cache needs only a fourth of the L2 cache size to cover as many
* clusters */
#define DEFAULT_L2_REFCOUNT_SIZE_RATIO 4
#define REFCOUNT_CACHE_SIZE 4
#define DEFAULT_CLUSTER_SIZE 65536
@@ -84,7 +73,6 @@
#define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot"
#define QCOW2_OPT_DISCARD_OTHER "pass-discard-other"
#define QCOW2_OPT_OVERLAP "overlap-check"
#define QCOW2_OPT_OVERLAP_TEMPLATE "overlap-check.template"
#define QCOW2_OPT_OVERLAP_MAIN_HEADER "overlap-check.main-header"
#define QCOW2_OPT_OVERLAP_ACTIVE_L1 "overlap-check.active-l1"
#define QCOW2_OPT_OVERLAP_ACTIVE_L2 "overlap-check.active-l2"
@@ -93,9 +81,6 @@
#define QCOW2_OPT_OVERLAP_SNAPSHOT_TABLE "overlap-check.snapshot-table"
#define QCOW2_OPT_OVERLAP_INACTIVE_L1 "overlap-check.inactive-l1"
#define QCOW2_OPT_OVERLAP_INACTIVE_L2 "overlap-check.inactive-l2"
#define QCOW2_OPT_CACHE_SIZE "cache-size"
#define QCOW2_OPT_L2_CACHE_SIZE "l2-cache-size"
#define QCOW2_OPT_REFCOUNT_CACHE_SIZE "refcount-cache-size"
typedef struct QCowHeader {
uint32_t magic;
@@ -216,11 +201,6 @@ typedef struct Qcow2DiscardRegion {
QTAILQ_ENTRY(Qcow2DiscardRegion) next;
} Qcow2DiscardRegion;
typedef uint64_t Qcow2GetRefcountFunc(const void *refcount_array,
uint64_t index);
typedef void Qcow2SetRefcountFunc(void *refcount_array,
uint64_t index, uint64_t value);
typedef struct BDRVQcowState {
int cluster_bits;
int cluster_size;
@@ -229,8 +209,6 @@ typedef struct BDRVQcowState {
int l2_size;
int l1_size;
int l1_vm_state_index;
int refcount_block_bits;
int refcount_block_size;
int csize_shift;
int csize_mask;
uint64_t cluster_offset_mask;
@@ -253,8 +231,10 @@ typedef struct BDRVQcowState {
CoMutex lock;
QCryptoCipher *cipher; /* current cipher, NULL if no key yet */
uint32_t crypt_method; /* current crypt method, 0 if no key yet */
uint32_t crypt_method_header;
AES_KEY aes_encrypt_key;
AES_KEY aes_decrypt_key;
uint64_t snapshots_offset;
int snapshots_size;
unsigned int nb_snapshots;
@@ -264,16 +244,10 @@ typedef struct BDRVQcowState {
int qcow_version;
bool use_lazy_refcounts;
int refcount_order;
int refcount_bits;
uint64_t refcount_max;
Qcow2GetRefcountFunc *get_refcount;
Qcow2SetRefcountFunc *set_refcount;
bool discard_passthrough[QCOW2_DISCARD_MAX];
int overlap_check; /* bitmask of Qcow2MetadataOverlap values */
bool signaled_corruption;
uint64_t incompatible_features;
uint64_t compatible_features;
@@ -284,14 +258,19 @@ typedef struct BDRVQcowState {
QLIST_HEAD(, Qcow2UnknownHeaderExtension) unknown_header_ext;
QTAILQ_HEAD (, Qcow2DiscardRegion) discards;
bool cache_discards;
/* Backing file path and format as stored in the image (this is not the
* effective path/format, which may be the result of a runtime option
* override) */
char *image_backing_file;
char *image_backing_format;
} BDRVQcowState;
/* XXX: use std qcow open function ? */
typedef struct QCowCreateState {
int cluster_size;
int cluster_bits;
uint16_t *refcount_block;
uint64_t *refcount_table;
int64_t l1_table_offset;
int64_t refcount_table_offset;
int64_t refcount_block_offset;
} QCowCreateState;
struct QCowAIOCB;
typedef struct Qcow2COWRegion {
@@ -396,11 +375,11 @@ typedef enum QCow2MetadataOverlap {
#define QCOW2_OL_ALL \
(QCOW2_OL_CACHED | QCOW2_OL_INACTIVE_L2)
#define L1E_OFFSET_MASK 0x00fffffffffffe00ULL
#define L2E_OFFSET_MASK 0x00fffffffffffe00ULL
#define L1E_OFFSET_MASK 0x00ffffffffffff00ULL
#define L2E_OFFSET_MASK 0x00ffffffffffff00ULL
#define L2E_COMPRESSED_OFFSET_SIZE_MASK 0x3fffffffffffffffULL
#define REFT_OFFSET_MASK 0xfffffffffffffe00ULL
#define REFT_OFFSET_MASK 0xffffffffffffff00ULL
static inline int64_t start_of_cluster(BDRVQcowState *s, int64_t offset)
{
@@ -474,11 +453,6 @@ static inline uint64_t l2meta_cow_end(QCowL2Meta *m)
+ (m->cow_end.nb_sectors << BDRV_SECTOR_BITS);
}
static inline uint64_t refcount_diff(uint64_t r1, uint64_t r2)
{
return r1 > r2 ? r1 - r2 : r2 - r1;
}
// FIXME Need qcow2_ prefix to global functions
/* qcow2.c functions */
@@ -490,20 +464,12 @@ int qcow2_mark_corrupt(BlockDriverState *bs);
int qcow2_mark_consistent(BlockDriverState *bs);
int qcow2_update_header(BlockDriverState *bs);
void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
int64_t size, const char *message_format, ...)
GCC_FMT_ATTR(5, 6);
/* qcow2-refcount.c functions */
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);
int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index,
uint64_t *refcount);
int qcow2_update_cluster_refcount(BlockDriverState *bs, int64_t cluster_index,
uint64_t addend, bool decrease,
enum qcow2_discard_type type);
int addend, enum qcow2_discard_type type);
int64_t qcow2_alloc_clusters(BlockDriverState *bs, uint64_t size);
int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
@@ -534,25 +500,25 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index);
void qcow2_l2_cache_reset(BlockDriverState *bs);
int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
int qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
uint8_t *out_buf, const uint8_t *in_buf,
int nb_sectors, bool enc, Error **errp);
void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
uint8_t *out_buf, const uint8_t *in_buf,
int nb_sectors, int enc,
const AES_KEY *key);
int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *cluster_offset);
int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *host_offset, QCowL2Meta **m);
int n_start, int n_end, int *num, uint64_t *host_offset, QCowL2Meta **m);
uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
uint64_t offset,
int compressed_size);
int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m);
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors, enum qcow2_discard_type type, bool full_discard);
int nb_sectors, enum qcow2_discard_type type);
int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
int qcow2_expand_zero_clusters(BlockDriverState *bs,
BlockDriverAmendStatusCB *status_cb);
int qcow2_expand_zero_clusters(BlockDriverState *bs);
/* qcow2-snapshot.c functions */
int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);
@@ -562,10 +528,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
const char *name,
Error **errp);
int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab);
int qcow2_snapshot_load_tmp(BlockDriverState *bs,
const char *snapshot_id,
const char *name,
Error **errp);
int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name);
void qcow2_free_snapshots(BlockDriverState *bs);
int qcow2_read_snapshots(BlockDriverState *bs);
@@ -574,8 +537,7 @@ int qcow2_read_snapshots(BlockDriverState *bs);
Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
void qcow2_cache_entry_mark_dirty(BlockDriverState *bs, Qcow2Cache *c,
void *table);
void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table);
int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
Qcow2Cache *dependency);
@@ -587,6 +549,6 @@ int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
void **table);
int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
void **table);
void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
#endif

View File

@@ -227,10 +227,8 @@ int qed_check(BDRVQEDState *s, BdrvCheckResult *result, bool fix)
};
int ret;
check.used_clusters = g_try_new0(uint32_t, (check.nclusters + 31) / 32);
if (check.nclusters && check.used_clusters == NULL) {
return -ENOMEM;
}
check.used_clusters = g_malloc0(((check.nclusters + 31) / 32) *
sizeof(check.used_clusters[0]));
check.result->bfi.total_clusters =
(s->header.image_size + s->header.cluster_size - 1) /

View File

@@ -13,7 +13,7 @@
#include "qed.h"
void *gencb_alloc(size_t len, BlockCompletionFunc *cb, void *opaque)
void *gencb_alloc(size_t len, BlockDriverCompletionFunc *cb, void *opaque)
{
GenericCB *gencb = g_malloc(len);
gencb->cb = cb;
@@ -24,7 +24,7 @@ void *gencb_alloc(size_t len, BlockCompletionFunc *cb, void *opaque)
void gencb_complete(void *opaque, int ret)
{
GenericCB *gencb = opaque;
BlockCompletionFunc *cb = gencb->cb;
BlockDriverCompletionFunc *cb = gencb->cb;
void *user_opaque = gencb->opaque;
g_free(gencb);

View File

@@ -49,7 +49,7 @@ out:
}
static void qed_read_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
QEDReadTableCB *read_table_cb = gencb_alloc(sizeof(*read_table_cb),
cb, opaque);
@@ -119,7 +119,7 @@ out:
*/
static void qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
unsigned int index, unsigned int n, bool flush,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
QEDWriteTableCB *write_table_cb;
unsigned int sector_mask = BDRV_SECTOR_SIZE / sizeof(uint64_t) - 1;
@@ -173,14 +173,14 @@ int qed_read_l1_table_sync(BDRVQEDState *s)
qed_read_table(s, s->header.l1_table_offset,
s->l1_table, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(s->bs), true);
qemu_aio_wait();
}
return ret;
}
void qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BLKDBG_EVENT(s->bs->file, BLKDBG_L1_UPDATE);
qed_write_table(s, s->header.l1_table_offset,
@@ -194,7 +194,7 @@ int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
qed_write_l1_table(s, index, n, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(s->bs), true);
qemu_aio_wait();
}
return ret;
@@ -235,7 +235,7 @@ static void qed_read_l2_table_cb(void *opaque, int ret)
}
void qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
QEDReadL2TableCB *read_l2_table_cb;
@@ -267,7 +267,7 @@ int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset
qed_read_l2_table(s, request, offset, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(s->bs), true);
qemu_aio_wait();
}
return ret;
@@ -275,7 +275,7 @@ int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset
void qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
unsigned int index, unsigned int n, bool flush,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BLKDBG_EVENT(s->bs->file, BLKDBG_L2_UPDATE);
qed_write_table(s, request->l2_table->offset,
@@ -289,7 +289,7 @@ int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
qed_write_l2_table(s, request, index, n, flush, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(s->bs), true);
qemu_aio_wait();
}
return ret;

View File

@@ -18,8 +18,21 @@
#include "qapi/qmp/qerror.h"
#include "migration/migration.h"
static void qed_aio_cancel(BlockDriverAIOCB *blockacb)
{
QEDAIOCB *acb = (QEDAIOCB *)blockacb;
bool finished = false;
/* Wait for the request to finish */
acb->finished = &finished;
while (!finished) {
qemu_aio_wait();
}
}
static const AIOCBInfo qed_aiocb_info = {
.aiocb_size = sizeof(QEDAIOCB),
.cancel = qed_aio_cancel,
};
static int bdrv_qed_probe(const uint8_t *buf, int buf_size,
@@ -130,7 +143,7 @@ static void qed_write_header_read_cb(void *opaque, int ret)
* This function only updates known header fields in-place and does not affect
* extra data after the QED header.
*/
static void qed_write_header(BDRVQEDState *s, BlockCompletionFunc cb,
static void qed_write_header(BDRVQEDState *s, BlockDriverCompletionFunc cb,
void *opaque)
{
/* We must write full sectors for O_DIRECT but cannot necessarily generate
@@ -360,27 +373,6 @@ static void bdrv_qed_rebind(BlockDriverState *bs)
s->bs = bs;
}
static void bdrv_qed_detach_aio_context(BlockDriverState *bs)
{
BDRVQEDState *s = bs->opaque;
qed_cancel_need_check_timer(s);
timer_free(s->need_check_timer);
}
static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVQEDState *s = bs->opaque;
s->need_check_timer = aio_timer_new(new_context,
QEMU_CLOCK_VIRTUAL, SCALE_NS,
qed_need_check_timer_cb, s);
if (s->header.features & QED_F_NEED_CHECK) {
qed_start_need_check_timer(s);
}
}
static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -399,16 +391,15 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
qed_header_le_to_cpu(&le_header, &s->header);
if (s->header.magic != QED_MAGIC) {
error_setg(errp, "Image not in QED format");
return -EINVAL;
return -EMEDIUMTYPE;
}
if (s->header.features & ~QED_FEATURE_MASK) {
/* image uses unsupported feature bits */
char buf[64];
snprintf(buf, sizeof(buf), "%" PRIx64,
s->header.features & ~QED_FEATURE_MASK);
error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bdrv_get_device_or_node_name(bs), "QED", buf);
qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "QED", buf);
return -ENOTSUP;
}
if (!qed_is_cluster_size_valid(s->header.cluster_size)) {
@@ -436,14 +427,9 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
s->table_nelems = (s->header.cluster_size * s->header.table_size) /
sizeof(uint64_t);
s->l2_shift = ctz32(s->header.cluster_size);
s->l2_shift = ffs(s->header.cluster_size) - 1;
s->l2_mask = s->table_nelems - 1;
s->l1_shift = s->l2_shift + ctz32(s->table_nelems);
/* Header size calculation must not overflow uint32_t */
if (s->header.header_size > UINT32_MAX / s->header.cluster_size) {
return -EINVAL;
}
s->l1_shift = s->l2_shift + ffs(s->table_nelems) - 1;
if ((s->header.features & QED_F_BACKING_FILE)) {
if ((uint64_t)s->header.backing_filename_offset +
@@ -509,7 +495,8 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
}
}
bdrv_qed_attach_aio_context(bs, bdrv_get_aio_context(bs));
s->need_check_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
qed_need_check_timer_cb, s);
out:
if (ret) {
@@ -519,13 +506,6 @@ out:
return ret;
}
static void bdrv_qed_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVQEDState *s = bs->opaque;
bs->bl.write_zeroes_alignment = s->header.cluster_size >> BDRV_SECTOR_BITS;
}
/* We have nothing to do for QED reopen, stubs just return
* success */
static int bdrv_qed_reopen_prepare(BDRVReopenState *state,
@@ -538,7 +518,8 @@ static void bdrv_qed_close(BlockDriverState *bs)
{
BDRVQEDState *s = bs->opaque;
bdrv_qed_detach_aio_context(bs);
qed_cancel_need_check_timer(s);
timer_free(s->need_check_timer);
/* Ensure writes reach stable storage */
bdrv_flush(bs->file);
@@ -555,8 +536,7 @@ static void bdrv_qed_close(BlockDriverState *bs)
static int qed_create(const char *filename, uint32_t cluster_size,
uint64_t image_size, uint32_t table_size,
const char *backing_file, const char *backing_fmt,
QemuOpts *opts, Error **errp)
const char *backing_file, const char *backing_fmt)
{
QEDHeader header = {
.magic = QED_MAGIC,
@@ -573,20 +553,20 @@ static int qed_create(const char *filename, uint32_t cluster_size,
size_t l1_size = header.cluster_size * header.table_size;
Error *local_err = NULL;
int ret = 0;
BlockDriverState *bs;
BlockDriverState *bs = NULL;
ret = bdrv_create_file(filename, opts, &local_err);
ret = bdrv_create_file(filename, NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
bs = NULL;
ret = bdrv_open(&bs, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL, NULL,
&local_err);
ret = bdrv_file_open(&bs, filename, NULL, BDRV_O_RDWR | BDRV_O_CACHE_WB,
&local_err);
if (ret < 0) {
error_propagate(errp, local_err);
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
@@ -630,54 +610,53 @@ out:
return ret;
}
static int bdrv_qed_create(const char *filename, QemuOpts *opts, Error **errp)
static int bdrv_qed_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
uint64_t image_size = 0;
uint32_t cluster_size = QED_DEFAULT_CLUSTER_SIZE;
uint32_t table_size = QED_DEFAULT_TABLE_SIZE;
char *backing_file = NULL;
char *backing_fmt = NULL;
int ret;
const char *backing_file = NULL;
const char *backing_fmt = NULL;
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
cluster_size = qemu_opt_get_size_del(opts,
BLOCK_OPT_CLUSTER_SIZE,
QED_DEFAULT_CLUSTER_SIZE);
table_size = qemu_opt_get_size_del(opts, BLOCK_OPT_TABLE_SIZE,
QED_DEFAULT_TABLE_SIZE);
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
image_size = options->value.n;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
backing_file = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) {
backing_fmt = options->value.s;
} else if (!strcmp(options->name, BLOCK_OPT_CLUSTER_SIZE)) {
if (options->value.n) {
cluster_size = options->value.n;
}
} else if (!strcmp(options->name, BLOCK_OPT_TABLE_SIZE)) {
if (options->value.n) {
table_size = options->value.n;
}
}
options++;
}
if (!qed_is_cluster_size_valid(cluster_size)) {
error_setg(errp, "QED cluster size must be within range [%u, %u] "
"and power of 2",
QED_MIN_CLUSTER_SIZE, QED_MAX_CLUSTER_SIZE);
ret = -EINVAL;
goto finish;
fprintf(stderr, "QED cluster size must be within range [%u, %u] and power of 2\n",
QED_MIN_CLUSTER_SIZE, QED_MAX_CLUSTER_SIZE);
return -EINVAL;
}
if (!qed_is_table_size_valid(table_size)) {
error_setg(errp, "QED table size must be within range [%u, %u] "
"and power of 2",
QED_MIN_TABLE_SIZE, QED_MAX_TABLE_SIZE);
ret = -EINVAL;
goto finish;
fprintf(stderr, "QED table size must be within range [%u, %u] and power of 2\n",
QED_MIN_TABLE_SIZE, QED_MAX_TABLE_SIZE);
return -EINVAL;
}
if (!qed_is_image_size_valid(image_size, cluster_size, table_size)) {
error_setg(errp, "QED image size must be a non-zero multiple of "
"cluster size and less than %" PRIu64 " bytes",
qed_max_image_size(cluster_size, table_size));
ret = -EINVAL;
goto finish;
fprintf(stderr, "QED image size must be a non-zero multiple of "
"cluster size and less than %" PRIu64 " bytes\n",
qed_max_image_size(cluster_size, table_size));
return -EINVAL;
}
ret = qed_create(filename, cluster_size, image_size, table_size,
backing_file, backing_fmt, opts, errp);
finish:
g_free(backing_file);
g_free(backing_fmt);
return ret;
return qed_create(filename, cluster_size, image_size, table_size,
backing_file, backing_fmt);
}
typedef struct {
@@ -743,6 +722,11 @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
return cb.status;
}
static int bdrv_qed_make_empty(BlockDriverState *bs)
{
return -ENOTSUP;
}
static BDRVQEDState *acb_to_s(QEDAIOCB *acb)
{
return acb->common.bs->opaque;
@@ -751,20 +735,18 @@ static BDRVQEDState *acb_to_s(QEDAIOCB *acb)
/**
* Read from the backing file or zero-fill if no backing file
*
* @s: QED state
* @pos: Byte position in device
* @qiov: Destination I/O vector
* @backing_qiov: Possibly shortened copy of qiov, to be allocated here
* @cb: Completion function
* @opaque: User data for completion function
* @s: QED state
* @pos: Byte position in device
* @qiov: Destination I/O vector
* @cb: Completion function
* @opaque: User data for completion function
*
* This function reads qiov->size bytes starting at pos from the backing file.
* If there is no backing file then zeroes are read.
*/
static void qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
QEMUIOVector *qiov,
QEMUIOVector **backing_qiov,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
uint64_t backing_length = 0;
size_t size;
@@ -796,21 +778,15 @@ static void qed_read_backing_file(BDRVQEDState *s, uint64_t pos,
/* If the read straddles the end of the backing file, shorten it */
size = MIN((uint64_t)backing_length - pos, qiov->size);
assert(*backing_qiov == NULL);
*backing_qiov = g_new(QEMUIOVector, 1);
qemu_iovec_init(*backing_qiov, qiov->niov);
qemu_iovec_concat(*backing_qiov, qiov, 0, size);
BLKDBG_EVENT(s->bs->file, BLKDBG_READ_BACKING_AIO);
bdrv_aio_readv(s->bs->backing_hd, pos / BDRV_SECTOR_SIZE,
*backing_qiov, size / BDRV_SECTOR_SIZE, cb, opaque);
qiov, size / BDRV_SECTOR_SIZE, cb, opaque);
}
typedef struct {
GenericCB gencb;
BDRVQEDState *s;
QEMUIOVector qiov;
QEMUIOVector *backing_qiov;
struct iovec iov;
uint64_t offset;
} CopyFromBackingFileCB;
@@ -827,12 +803,6 @@ static void qed_copy_from_backing_file_write(void *opaque, int ret)
CopyFromBackingFileCB *copy_cb = opaque;
BDRVQEDState *s = copy_cb->s;
if (copy_cb->backing_qiov) {
qemu_iovec_destroy(copy_cb->backing_qiov);
g_free(copy_cb->backing_qiov);
copy_cb->backing_qiov = NULL;
}
if (ret) {
qed_copy_from_backing_file_cb(copy_cb, ret);
return;
@@ -856,7 +826,7 @@ static void qed_copy_from_backing_file_write(void *opaque, int ret)
*/
static void qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
uint64_t len, uint64_t offset,
BlockCompletionFunc *cb,
BlockDriverCompletionFunc *cb,
void *opaque)
{
CopyFromBackingFileCB *copy_cb;
@@ -870,12 +840,11 @@ static void qed_copy_from_backing_file(BDRVQEDState *s, uint64_t pos,
copy_cb = gencb_alloc(sizeof(*copy_cb), cb, opaque);
copy_cb->s = s;
copy_cb->offset = offset;
copy_cb->backing_qiov = NULL;
copy_cb->iov.iov_base = qemu_blockalign(s->bs, len);
copy_cb->iov.iov_len = len;
qemu_iovec_init_external(&copy_cb->qiov, &copy_cb->iov, 1);
qed_read_backing_file(s, pos, &copy_cb->qiov, &copy_cb->backing_qiov,
qed_read_backing_file(s, pos, &copy_cb->qiov,
qed_copy_from_backing_file_write, copy_cb);
}
@@ -907,15 +876,21 @@ static void qed_update_l2_table(BDRVQEDState *s, QEDTable *table, int index,
static void qed_aio_complete_bh(void *opaque)
{
QEDAIOCB *acb = opaque;
BlockCompletionFunc *cb = acb->common.cb;
BlockDriverCompletionFunc *cb = acb->common.cb;
void *user_opaque = acb->common.opaque;
int ret = acb->bh_ret;
bool *finished = acb->finished;
qemu_bh_delete(acb->bh);
qemu_aio_unref(acb);
qemu_aio_release(acb);
/* Invoke callback */
cb(user_opaque, ret);
/* Signal cancel completion */
if (finished) {
*finished = true;
}
}
static void qed_aio_complete(QEDAIOCB *acb, int ret)
@@ -936,8 +911,7 @@ static void qed_aio_complete(QEDAIOCB *acb, int ret)
/* Arrange for a bh to invoke the completion function */
acb->bh_ret = ret;
acb->bh = aio_bh_new(bdrv_get_aio_context(acb->common.bs),
qed_aio_complete_bh, acb);
acb->bh = qemu_bh_new(qed_aio_complete_bh, acb);
qemu_bh_schedule(acb->bh);
/* Start next allocating write request waiting behind this one. Note that
@@ -1069,7 +1043,7 @@ static void qed_aio_write_main(void *opaque, int ret)
BDRVQEDState *s = acb_to_s(acb);
uint64_t offset = acb->cur_cluster +
qed_offset_into_cluster(s, acb->cur_pos);
BlockCompletionFunc *next_fn;
BlockDriverCompletionFunc *next_fn;
trace_qed_aio_write_main(s, acb, ret, offset, acb->cur_qiov.size);
@@ -1169,7 +1143,7 @@ static void qed_aio_write_zero_cluster(void *opaque, int ret)
static void qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
{
BDRVQEDState *s = acb_to_s(acb);
BlockCompletionFunc *cb;
BlockDriverCompletionFunc *cb;
/* Cancel timer when the first allocating request comes in */
if (QSIMPLEQ_EMPTY(&s->allocating_write_reqs)) {
@@ -1226,11 +1200,7 @@ static void qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
struct iovec *iov = acb->qiov->iov;
if (!iov->iov_base) {
iov->iov_base = qemu_try_blockalign(acb->common.bs, iov->iov_len);
if (iov->iov_base == NULL) {
qed_aio_complete(acb, -ENOMEM);
return;
}
iov->iov_base = qemu_blockalign(acb->common.bs, iov->iov_len);
memset(iov->iov_base, 0, iov->iov_len);
}
}
@@ -1316,7 +1286,7 @@ static void qed_aio_read_data(void *opaque, int ret,
return;
} else if (ret != QED_CLUSTER_FOUND) {
qed_read_backing_file(s, acb->cur_pos, &acb->cur_qiov,
&acb->backing_qiov, qed_aio_next_io, acb);
qed_aio_next_io, acb);
return;
}
@@ -1342,12 +1312,6 @@ static void qed_aio_next_io(void *opaque, int ret)
trace_qed_aio_next_io(s, acb, ret, acb->cur_pos + acb->cur_qiov.size);
if (acb->backing_qiov) {
qemu_iovec_destroy(acb->backing_qiov);
g_free(acb->backing_qiov);
acb->backing_qiov = NULL;
}
/* Handle I/O error */
if (ret) {
qed_aio_complete(acb, ret);
@@ -1370,11 +1334,11 @@ static void qed_aio_next_io(void *opaque, int ret)
io_fn, acb);
}
static BlockAIOCB *qed_aio_setup(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb,
void *opaque, int flags)
static BlockDriverAIOCB *qed_aio_setup(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque, int flags)
{
QEDAIOCB *acb = qemu_aio_get(&qed_aiocb_info, bs, cb, opaque);
@@ -1382,11 +1346,11 @@ static BlockAIOCB *qed_aio_setup(BlockDriverState *bs,
opaque, flags);
acb->flags = flags;
acb->finished = NULL;
acb->qiov = qiov;
acb->qiov_offset = 0;
acb->cur_pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE;
acb->end_pos = acb->cur_pos + nb_sectors * BDRV_SECTOR_SIZE;
acb->backing_qiov = NULL;
acb->request.l2_table = NULL;
qemu_iovec_init(&acb->cur_qiov, qiov->niov);
@@ -1395,20 +1359,20 @@ static BlockAIOCB *qed_aio_setup(BlockDriverState *bs,
return &acb->common;
}
static BlockAIOCB *bdrv_qed_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB *bdrv_qed_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return qed_aio_setup(bs, sector_num, qiov, nb_sectors, cb, opaque, 0);
}
static BlockAIOCB *bdrv_qed_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB *bdrv_qed_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return qed_aio_setup(bs, sector_num, qiov, nb_sectors, cb,
opaque, QED_AIOCB_WRITE);
@@ -1433,10 +1397,9 @@ static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors,
BdrvRequestFlags flags)
int nb_sectors)
{
BlockAIOCB *blockacb;
BlockDriverAIOCB *blockacb;
BDRVQEDState *s = bs->opaque;
QEDWriteZeroesCB cb = { .done = false };
QEMUIOVector qiov;
@@ -1511,8 +1474,6 @@ static int bdrv_qed_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
memset(bdi, 0, sizeof(*bdi));
bdi->cluster_size = s->header.cluster_size;
bdi->is_dirty = s->header.features & QED_F_NEED_CHECK;
bdi->unallocated_blocks_are_zero = true;
bdi->can_write_zeroes_with_unmap = true;
return 0;
}
@@ -1588,31 +1549,13 @@ static int bdrv_qed_change_backing_file(BlockDriverState *bs,
return ret;
}
static void bdrv_qed_invalidate_cache(BlockDriverState *bs, Error **errp)
static void bdrv_qed_invalidate_cache(BlockDriverState *bs)
{
BDRVQEDState *s = bs->opaque;
Error *local_err = NULL;
int ret;
bdrv_qed_close(bs);
bdrv_invalidate_cache(bs->file, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
memset(s, 0, sizeof(BDRVQEDState));
ret = bdrv_qed_open(bs, NULL, bs->open_flags, &local_err);
if (local_err) {
error_setg(errp, "Could not reopen qed layer: %s",
error_get_pretty(local_err));
error_free(local_err);
return;
} else if (ret < 0) {
error_setg_errno(errp, -ret, "Could not reopen qed layer");
return;
}
bdrv_qed_open(bs, NULL, bs->open_flags, NULL);
}
static int bdrv_qed_check(BlockDriverState *bs, BdrvCheckResult *result,
@@ -1623,45 +1566,36 @@ static int bdrv_qed_check(BlockDriverState *bs, BdrvCheckResult *result,
return qed_check(s, result, !!fix);
}
static QemuOptsList qed_create_opts = {
.name = "qed-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qed_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = QEMU_OPT_STRING,
.help = "File name of a base image"
},
{
.name = BLOCK_OPT_BACKING_FMT,
.type = QEMU_OPT_STRING,
.help = "Image format of the base image"
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Cluster size (in bytes)",
.def_value_str = stringify(QED_DEFAULT_CLUSTER_SIZE)
},
{
.name = BLOCK_OPT_TABLE_SIZE,
.type = QEMU_OPT_SIZE,
.help = "L1/L2 table size (in clusters)"
},
{ /* end of list */ }
}
static QEMUOptionParameter qed_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size (in bytes)"
}, {
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
}, {
.name = BLOCK_OPT_BACKING_FMT,
.type = OPT_STRING,
.help = "Image format of the base image"
}, {
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = OPT_SIZE,
.help = "Cluster size (in bytes)",
.value = { .n = QED_DEFAULT_CLUSTER_SIZE },
}, {
.name = BLOCK_OPT_TABLE_SIZE,
.type = OPT_SIZE,
.help = "L1/L2 table size (in clusters)"
},
{ /* end of list */ }
};
static BlockDriver bdrv_qed = {
.format_name = "qed",
.instance_size = sizeof(BDRVQEDState),
.create_opts = &qed_create_opts,
.supports_backing = true,
.create_options = qed_create_options,
.bdrv_probe = bdrv_qed_probe,
.bdrv_rebind = bdrv_qed_rebind,
@@ -1671,18 +1605,16 @@ static BlockDriver bdrv_qed = {
.bdrv_create = bdrv_qed_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_get_block_status = bdrv_qed_co_get_block_status,
.bdrv_make_empty = bdrv_qed_make_empty,
.bdrv_aio_readv = bdrv_qed_aio_readv,
.bdrv_aio_writev = bdrv_qed_aio_writev,
.bdrv_co_write_zeroes = bdrv_qed_co_write_zeroes,
.bdrv_truncate = bdrv_qed_truncate,
.bdrv_getlength = bdrv_qed_getlength,
.bdrv_get_info = bdrv_qed_get_info,
.bdrv_refresh_limits = bdrv_qed_refresh_limits,
.bdrv_change_backing_file = bdrv_qed_change_backing_file,
.bdrv_invalidate_cache = bdrv_qed_invalidate_cache,
.bdrv_check = bdrv_qed_check,
.bdrv_detach_aio_context = bdrv_qed_detach_aio_context,
.bdrv_attach_aio_context = bdrv_qed_attach_aio_context,
};
static void bdrv_qed_init(void)

View File

@@ -43,7 +43,7 @@
*
* All fields are little-endian on disk.
*/
#define QED_DEFAULT_CLUSTER_SIZE 65536
enum {
QED_MAGIC = 'Q' | 'E' << 8 | 'D' << 16 | '\0' << 24,
@@ -69,6 +69,7 @@ enum {
*/
QED_MIN_CLUSTER_SIZE = 4 * 1024, /* in bytes */
QED_MAX_CLUSTER_SIZE = 64 * 1024 * 1024,
QED_DEFAULT_CLUSTER_SIZE = 64 * 1024,
/* Allocated clusters are tracked using a 2-level pagetable. Table size is
* a multiple of clusters so large maximum image sizes can be supported
@@ -128,11 +129,12 @@ enum {
};
typedef struct QEDAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
int bh_ret; /* final return status for completion bh */
QSIMPLEQ_ENTRY(QEDAIOCB) next; /* next request */
int flags; /* QED_AIOCB_* bits ORed together */
bool *finished; /* signal for cancel completion */
uint64_t end_pos; /* request end on block device, in bytes */
/* User scatter-gather list */
@@ -141,7 +143,6 @@ typedef struct QEDAIOCB {
/* Current cluster scatter-gather list */
QEMUIOVector cur_qiov;
QEMUIOVector *backing_qiov;
uint64_t cur_pos; /* position on block device, in bytes */
uint64_t cur_cluster; /* cluster offset in image file */
unsigned int cur_nclusters; /* number of clusters being accessed */
@@ -202,11 +203,11 @@ typedef void QEDFindClusterFunc(void *opaque, int ret, uint64_t offset, size_t l
* Generic callback for chaining async callbacks
*/
typedef struct {
BlockCompletionFunc *cb;
BlockDriverCompletionFunc *cb;
void *opaque;
} GenericCB;
void *gencb_alloc(size_t len, BlockCompletionFunc *cb, void *opaque);
void *gencb_alloc(size_t len, BlockDriverCompletionFunc *cb, void *opaque);
void gencb_complete(void *opaque, int ret);
/**
@@ -229,16 +230,16 @@ void qed_commit_l2_cache_entry(L2TableCache *l2_cache, CachedL2Table *l2_table);
*/
int qed_read_l1_table_sync(BDRVQEDState *s);
void qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n,
BlockCompletionFunc *cb, void *opaque);
BlockDriverCompletionFunc *cb, void *opaque);
int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
unsigned int n);
int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
uint64_t offset);
void qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset,
BlockCompletionFunc *cb, void *opaque);
BlockDriverCompletionFunc *cb, void *opaque);
void qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
unsigned int index, unsigned int n, bool flush,
BlockCompletionFunc *cb, void *opaque);
BlockDriverCompletionFunc *cb, void *opaque);
int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
unsigned int index, unsigned int n, bool flush);

File diff suppressed because it is too large Load Diff

View File

@@ -21,10 +21,9 @@
#define QEMU_AIO_IOCTL 0x0004
#define QEMU_AIO_FLUSH 0x0008
#define QEMU_AIO_DISCARD 0x0010
#define QEMU_AIO_WRITE_ZEROES 0x0020
#define QEMU_AIO_TYPE_MASK \
(QEMU_AIO_READ|QEMU_AIO_WRITE|QEMU_AIO_IOCTL|QEMU_AIO_FLUSH| \
QEMU_AIO_DISCARD|QEMU_AIO_WRITE_ZEROES)
QEMU_AIO_DISCARD)
/* AIO flags */
#define QEMU_AIO_MISALIGNED 0x1000
@@ -34,29 +33,19 @@
/* linux-aio.c - Linux native implementation */
#ifdef CONFIG_LINUX_AIO
void *laio_init(void);
void laio_cleanup(void *s);
BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type);
void laio_detach_aio_context(void *s, AioContext *old_context);
void laio_attach_aio_context(void *s, AioContext *new_context);
void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
void laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug);
BlockDriverCompletionFunc *cb, void *opaque, int type);
#endif
#ifdef _WIN32
typedef struct QEMUWin32AIOState QEMUWin32AIOState;
QEMUWin32AIOState *win32_aio_init(void);
void win32_aio_cleanup(QEMUWin32AIOState *aio);
int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile);
BlockAIOCB *win32_aio_submit(BlockDriverState *bs,
BlockDriverAIOCB *win32_aio_submit(BlockDriverState *bs,
QEMUWin32AIOState *aio, HANDLE hfile,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type);
void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
AioContext *old_context);
void win32_aio_attach_aio_context(QEMUWin32AIOState *aio,
AioContext *new_context);
BlockDriverCompletionFunc *cb, void *opaque, int type);
#endif
#endif /* QEMU_RAW_AIO_H */

File diff suppressed because it is too large Load Diff

View File

@@ -29,7 +29,6 @@
#include "trace.h"
#include "block/thread-pool.h"
#include "qemu/iov.h"
#include "qapi/qmp/qstring.h"
#include <windows.h>
#include <winioctl.h>
@@ -37,6 +36,8 @@
#define FTYPE_CD 1
#define FTYPE_HARDDISK 2
static QEMUWin32AIOState *aio;
typedef struct RawWin32AIOData {
BlockDriverState *bs;
HANDLE hfile;
@@ -102,7 +103,7 @@ static int aio_worker(void *arg)
switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
case QEMU_AIO_READ:
count = handle_aiocb_rw(aiocb);
if (count < aiocb->aio_nbytes) {
if (count < aiocb->aio_nbytes && aiocb->bs->growable) {
/* A short read means that we have reached EOF. Pad the buffer
* with zeros for bytes after EOF. */
iov_memset(aiocb->aio_iov, aiocb->aio_niov, count,
@@ -139,9 +140,9 @@ static int aio_worker(void *arg)
return ret;
}
static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
BlockDriverCompletionFunc *cb, void *opaque, int type)
{
RawWin32AIOData *acb = g_slice_new(RawWin32AIOData);
ThreadPool *pool;
@@ -201,54 +202,6 @@ static int set_sparse(int fd)
NULL, 0, NULL, 0, &returned, NULL);
}
static void raw_detach_aio_context(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
win32_aio_detach_aio_context(s->aio, bdrv_get_aio_context(bs));
}
}
static void raw_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
win32_aio_attach_aio_context(s->aio, new_context);
}
}
static void raw_probe_alignment(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
DWORD sectorsPerCluster, freeClusters, totalClusters, count;
DISK_GEOMETRY_EX dg;
BOOL status;
if (s->type == FTYPE_CD) {
bs->request_alignment = 2048;
return;
}
if (s->type == FTYPE_HARDDISK) {
status = DeviceIoControl(s->hfile, IOCTL_DISK_GET_DRIVE_GEOMETRY_EX,
NULL, 0, &dg, sizeof(dg), &count, NULL);
if (status != 0) {
bs->request_alignment = dg.Geometry.BytesPerSector;
return;
}
/* try GetDiskFreeSpace too */
}
if (s->drive_path[0]) {
GetDiskFreeSpace(s->drive_path, &sectorsPerCluster,
&dg.Geometry.BytesPerSector,
&freeClusters, &totalClusters);
bs->request_alignment = dg.Geometry.BytesPerSector;
}
}
static void raw_parse_flags(int flags, int *access_flags, DWORD *overlapped)
{
assert(access_flags != NULL);
@@ -269,17 +222,6 @@ static void raw_parse_flags(int flags, int *access_flags, DWORD *overlapped)
}
}
static void raw_parse_filename(const char *filename, QDict *options,
Error **errp)
{
/* The filename does not have to be prefixed by the protocol name, since
* "file" is the default protocol; therefore, the return value of this
* function call can be ignored. */
strstart(filename, "file:", &filename);
qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
}
static QemuOptsList raw_runtime_opts = {
.name = "raw",
.head = QTAILQ_HEAD_INITIALIZER(raw_runtime_opts.head),
@@ -306,9 +248,9 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
s->type = FTYPE_FILE;
opts = qemu_opts_create(&raw_runtime_opts, NULL, 0, &error_abort);
opts = qemu_opts_create_nofail(&raw_runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
@@ -318,15 +260,13 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
raw_parse_flags(flags, &access_flags, &overlapped);
if (filename[0] && filename[1] == ':') {
snprintf(s->drive_path, sizeof(s->drive_path), "%c:\\", filename[0]);
} else if (filename[0] == '\\' && filename[1] == '\\') {
s->drive_path[0] = 0;
} else {
/* Relative path. */
char buf[MAX_PATH];
GetCurrentDirectory(MAX_PATH, buf);
snprintf(s->drive_path, sizeof(s->drive_path), "%c:\\", buf[0]);
if ((flags & BDRV_O_NATIVE_AIO) && aio == NULL) {
aio = win32_aio_init();
if (aio == NULL) {
error_setg(errp, "Could not initialize AIO");
ret = -EINVAL;
goto fail;
}
}
s->hfile = CreateFile(filename, access_flags,
@@ -344,35 +284,24 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
}
if (flags & BDRV_O_NATIVE_AIO) {
s->aio = win32_aio_init();
if (s->aio == NULL) {
CloseHandle(s->hfile);
error_setg(errp, "Could not initialize AIO");
ret = -EINVAL;
goto fail;
}
ret = win32_aio_attach(s->aio, s->hfile);
ret = win32_aio_attach(aio, s->hfile);
if (ret < 0) {
win32_aio_cleanup(s->aio);
CloseHandle(s->hfile);
error_setg_errno(errp, -ret, "Could not enable AIO");
goto fail;
}
win32_aio_attach_aio_context(s->aio, bdrv_get_aio_context(bs));
s->aio = aio;
}
raw_probe_alignment(bs);
ret = 0;
fail:
qemu_opts_del(opts);
return ret;
}
static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
@@ -384,9 +313,9 @@ static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
}
}
static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
@@ -398,8 +327,8 @@ static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
}
}
static BlockAIOCB *raw_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque)
static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVRawState *s = bs->opaque;
return paio_submit(bs, s->hfile, 0, NULL, 0, cb, opaque, QEMU_AIO_FLUSH);
@@ -408,17 +337,7 @@ static BlockAIOCB *raw_aio_flush(BlockDriverState *bs,
static void raw_close(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
if (s->aio) {
win32_aio_detach_aio_context(s->aio, bdrv_get_aio_context(bs));
win32_aio_cleanup(s->aio);
s->aio = NULL;
}
CloseHandle(s->hfile);
if (bs->open_flags & BDRV_O_TEMPORARY) {
unlink(bs->filename);
}
}
static int raw_truncate(BlockDriverState *bs, int64_t offset)
@@ -504,16 +423,19 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
return st.st_size;
}
static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
static int raw_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int fd;
int64_t total_size = 0;
strstart(filename, "file:", &filename);
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n / 512;
}
options++;
}
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
@@ -522,34 +444,28 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
return -EIO;
}
set_sparse(fd);
ftruncate(fd, total_size);
ftruncate(fd, total_size * 512);
qemu_close(fd);
return 0;
}
static QemuOptsList raw_create_opts = {
.name = "raw-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{ /* end of list */ }
}
static QEMUOptionParameter raw_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ NULL }
};
BlockDriver bdrv_file = {
static BlockDriver bdrv_file = {
.format_name = "file",
.protocol_name = "file",
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_parse_filename = raw_parse_filename,
.bdrv_file_open = raw_open,
.bdrv_close = raw_close,
.bdrv_create = raw_create,
.bdrv_file_open = raw_open,
.bdrv_close = raw_close,
.bdrv_create = raw_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_aio_readv = raw_aio_readv,
@@ -561,7 +477,7 @@ BlockDriver bdrv_file = {
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
.create_opts = &raw_create_opts,
.create_options = raw_create_options,
};
/***********************************************/
@@ -622,15 +538,6 @@ static int hdev_probe_device(const char *filename)
return 0;
}
static void hdev_parse_filename(const char *filename, QDict *options,
Error **errp)
{
/* The prefix is optional, just as for "file". */
strstart(filename, "host_device:", &filename);
qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
}
static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -643,10 +550,9 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
Error *local_err = NULL;
const char *filename;
QemuOpts *opts = qemu_opts_create(&raw_runtime_opts, NULL, 0,
&error_abort);
QemuOpts *opts = qemu_opts_create_nofail(&raw_runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto done;
@@ -701,7 +607,6 @@ static BlockDriver bdrv_host_device = {
.protocol_name = "host_device",
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_parse_filename = hdev_parse_filename,
.bdrv_probe_device = hdev_probe_device,
.bdrv_file_open = hdev_open,
.bdrv_close = raw_close,
@@ -710,9 +615,6 @@ static BlockDriver bdrv_host_device = {
.bdrv_aio_writev = raw_aio_writev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_detach_aio_context = raw_detach_aio_context,
.bdrv_attach_aio_context = raw_attach_aio_context,
.bdrv_getlength = raw_getlength,
.has_variable_length = true,

View File

@@ -29,17 +29,13 @@
#include "block/block_int.h"
#include "qemu/option.h"
static QemuOptsList raw_create_opts = {
.name = "raw-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{ /* end of list */ }
}
static QEMUOptionParameter raw_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ 0 }
};
static int raw_reopen_prepare(BDRVReopenState *reopen_state,
@@ -58,58 +54,8 @@ static int coroutine_fn raw_co_readv(BlockDriverState *bs, int64_t sector_num,
static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
void *buf = NULL;
BlockDriver *drv;
QEMUIOVector local_qiov;
int ret;
if (bs->probed && sector_num == 0) {
/* As long as these conditions are true, we can't get partial writes to
* the probe buffer and can just directly check the request. */
QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512);
QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512);
if (nb_sectors == 0) {
/* qemu_iovec_to_buf() would fail, but we want to return success
* instead of -EINVAL in this case. */
return 0;
}
buf = qemu_try_blockalign(bs->file, 512);
if (!buf) {
ret = -ENOMEM;
goto fail;
}
ret = qemu_iovec_to_buf(qiov, 0, buf, 512);
if (ret != 512) {
ret = -EINVAL;
goto fail;
}
drv = bdrv_probe_all(buf, 512, NULL);
if (drv != bs->drv) {
ret = -EPERM;
goto fail;
}
/* Use the checked buffer, a malicious guest might be overwriting its
* original buffer in the background. */
qemu_iovec_init(&local_qiov, qiov->niov + 1);
qemu_iovec_add(&local_qiov, buf, 512);
qemu_iovec_concat(&local_qiov, qiov, 512, qiov->size - 512);
qiov = &local_qiov;
}
BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
ret = bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
fail:
if (qiov == &local_qiov) {
qemu_iovec_destroy(&local_qiov);
}
qemu_vfree(buf);
return ret;
return bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
}
static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
@@ -122,10 +68,9 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
}
static int coroutine_fn raw_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
BdrvRequestFlags flags)
int64_t sector_num, int nb_sectors)
{
return bdrv_co_write_zeroes(bs->file, sector_num, nb_sectors, flags);
return bdrv_co_write_zeroes(bs->file, sector_num, nb_sectors);
}
static int coroutine_fn raw_co_discard(BlockDriverState *bs,
@@ -144,11 +89,6 @@ static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return bdrv_get_info(bs->file, bdi);
}
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl = bs->file->bl;
}
static int raw_truncate(BlockDriverState *bs, int64_t offset)
{
return bdrv_truncate(bs->file, offset);
@@ -179,10 +119,10 @@ static int raw_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
return bdrv_ioctl(bs->file, req, buf);
}
static BlockAIOCB *raw_aio_ioctl(BlockDriverState *bs,
unsigned long int req, void *buf,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB *raw_aio_ioctl(BlockDriverState *bs,
unsigned long int req, void *buf,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return bdrv_aio_ioctl(bs->file, req, buf, cb, opaque);
}
@@ -192,13 +132,14 @@ static int raw_has_zero_init(BlockDriverState *bs)
return bdrv_has_zero_init(bs->file);
}
static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
static int raw_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
Error *local_err = NULL;
int ret;
ret = bdrv_create_file(filename, opts, &local_err);
if (local_err) {
ret = bdrv_create_file(filename, options, &local_err);
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
}
return ret;
@@ -208,18 +149,6 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
bs->sg = bs->file->sg;
if (bs->probed && !bdrv_is_read_only(bs)) {
fprintf(stderr,
"WARNING: Image format was not specified for '%s' and probing "
"guessed raw.\n"
" Automatically detecting the format is dangerous for "
"raw images, write operations on block 0 will be restricted.\n"
" Specify the 'raw' format explicitly to remove the "
"restrictions.\n",
bs->file->filename);
}
return 0;
}
@@ -235,17 +164,7 @@ static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
return 1;
}
static int raw_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
{
return bdrv_probe_blocksizes(bs->file, bsz);
}
static int raw_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
{
return bdrv_probe_geometry(bs->file, geo);
}
BlockDriver bdrv_raw = {
static BlockDriver bdrv_raw = {
.format_name = "raw",
.bdrv_probe = &raw_probe,
.bdrv_reopen_prepare = &raw_reopen_prepare,
@@ -261,16 +180,13 @@ BlockDriver bdrv_raw = {
.bdrv_getlength = &raw_getlength,
.has_variable_length = true,
.bdrv_get_info = &raw_get_info,
.bdrv_refresh_limits = &raw_refresh_limits,
.bdrv_probe_blocksizes = &raw_probe_blocksizes,
.bdrv_probe_geometry = &raw_probe_geometry,
.bdrv_is_inserted = &raw_is_inserted,
.bdrv_media_changed = &raw_media_changed,
.bdrv_eject = &raw_eject,
.bdrv_lock_medium = &raw_lock_medium,
.bdrv_ioctl = &raw_ioctl,
.bdrv_aio_ioctl = &raw_aio_ioctl,
.create_opts = &raw_create_opts,
.create_options = &raw_create_options[0],
.bdrv_has_zero_init = &raw_has_zero_init
};

View File

@@ -68,36 +68,49 @@ typedef enum {
} RBDAIOCmd;
typedef struct RBDAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
int64_t ret;
QEMUIOVector *qiov;
char *bounce;
RBDAIOCmd cmd;
int64_t sector_num;
int error;
struct BDRVRBDState *s;
int cancelled;
int status;
} RBDAIOCB;
typedef struct RADOSCB {
int rcbid;
RBDAIOCB *acb;
struct BDRVRBDState *s;
int done;
int64_t size;
char *buf;
int64_t ret;
} RADOSCB;
#define RBD_FD_READ 0
#define RBD_FD_WRITE 1
typedef struct BDRVRBDState {
int fds[2];
rados_t cluster;
rados_ioctx_t io_ctx;
rbd_image_t image;
char name[RBD_MAX_IMAGE_NAME_SIZE];
char *snap;
int event_reader_pos;
RADOSCB *event_rcb;
} BDRVRBDState;
static void rbd_aio_bh_cb(void *opaque);
static int qemu_rbd_next_tok(char *dst, int dst_len,
char *src, char delim,
const char *name,
char **p, Error **errp)
char **p)
{
int l;
char *end;
@@ -120,10 +133,10 @@ static int qemu_rbd_next_tok(char *dst, int dst_len,
}
l = strlen(src);
if (l >= dst_len) {
error_setg(errp, "%s too long", name);
error_report("%s too long", name);
return -EINVAL;
} else if (l == 0) {
error_setg(errp, "%s too short", name);
error_report("%s too short", name);
return -EINVAL;
}
@@ -149,15 +162,13 @@ static int qemu_rbd_parsename(const char *filename,
char *pool, int pool_len,
char *snap, int snap_len,
char *name, int name_len,
char *conf, int conf_len,
Error **errp)
char *conf, int conf_len)
{
const char *start;
char *p, *buf;
int ret;
if (!strstart(filename, "rbd:", &start)) {
error_setg(errp, "File name must start with 'rbd:'");
return -EINVAL;
}
@@ -166,8 +177,7 @@ static int qemu_rbd_parsename(const char *filename,
*snap = '\0';
*conf = '\0';
ret = qemu_rbd_next_tok(pool, pool_len, p,
'/', "pool name", &p, errp);
ret = qemu_rbd_next_tok(pool, pool_len, p, '/', "pool name", &p);
if (ret < 0 || !p) {
ret = -EINVAL;
goto done;
@@ -175,25 +185,21 @@ static int qemu_rbd_parsename(const char *filename,
qemu_rbd_unescape(pool);
if (strchr(p, '@')) {
ret = qemu_rbd_next_tok(name, name_len, p,
'@', "object name", &p, errp);
ret = qemu_rbd_next_tok(name, name_len, p, '@', "object name", &p);
if (ret < 0) {
goto done;
}
ret = qemu_rbd_next_tok(snap, snap_len, p,
':', "snap name", &p, errp);
ret = qemu_rbd_next_tok(snap, snap_len, p, ':', "snap name", &p);
qemu_rbd_unescape(snap);
} else {
ret = qemu_rbd_next_tok(name, name_len, p,
':', "object name", &p, errp);
ret = qemu_rbd_next_tok(name, name_len, p, ':', "object name", &p);
}
qemu_rbd_unescape(name);
if (ret < 0 || !p) {
goto done;
}
ret = qemu_rbd_next_tok(conf, conf_len, p,
'\0', "configuration", &p, errp);
ret = qemu_rbd_next_tok(conf, conf_len, p, '\0', "configuration", &p);
done:
g_free(buf);
@@ -228,9 +234,7 @@ static char *qemu_rbd_parse_clientname(const char *conf, char *clientname)
return NULL;
}
static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
bool only_read_conf_file,
Error **errp)
static int qemu_rbd_set_conf(rados_t cluster, const char *conf)
{
char *p, *buf;
char name[RBD_MAX_CONF_NAME_SIZE];
@@ -242,41 +246,37 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
while (p) {
ret = qemu_rbd_next_tok(name, sizeof(name), p,
'=', "conf option name", &p, errp);
'=', "conf option name", &p);
if (ret < 0) {
break;
}
qemu_rbd_unescape(name);
if (!p) {
error_setg(errp, "conf option %s has no value", name);
error_report("conf option %s has no value", name);
ret = -EINVAL;
break;
}
ret = qemu_rbd_next_tok(value, sizeof(value), p,
':', "conf option value", &p, errp);
':', "conf option value", &p);
if (ret < 0) {
break;
}
qemu_rbd_unescape(value);
if (strcmp(name, "conf") == 0) {
/* read the conf file alone, so it doesn't override more
specific settings for a particular device */
if (only_read_conf_file) {
ret = rados_conf_read_file(cluster, value);
if (ret < 0) {
error_setg(errp, "error reading conf file %s", value);
break;
}
ret = rados_conf_read_file(cluster, value);
if (ret < 0) {
error_report("error reading conf file %s", value);
break;
}
} else if (strcmp(name, "id") == 0) {
/* ignore, this is parsed by qemu_rbd_parse_clientname() */
} else if (!only_read_conf_file) {
} else {
ret = rados_conf_set(cluster, name, value);
if (ret < 0) {
error_setg(errp, "invalid conf option %s", name);
error_report("invalid conf option %s", name);
ret = -EINVAL;
break;
}
@@ -287,9 +287,9 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
return ret;
}
static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
static int qemu_rbd_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
Error *local_err = NULL;
int64_t bytes = 0;
int64_t objsize;
int obj_order = 0;
@@ -306,58 +306,57 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
if (qemu_rbd_parsename(filename, pool, sizeof(pool),
snap_buf, sizeof(snap_buf),
name, sizeof(name),
conf, sizeof(conf), &local_err) < 0) {
error_propagate(errp, local_err);
conf, sizeof(conf)) < 0) {
return -EINVAL;
}
/* Read out options */
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
objsize = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE, 0);
if (objsize) {
if ((objsize - 1) & objsize) { /* not a power of 2? */
error_setg(errp, "obj size needs to be power of 2");
return -EINVAL;
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
bytes = options->value.n;
} else if (!strcmp(options->name, BLOCK_OPT_CLUSTER_SIZE)) {
if (options->value.n) {
objsize = options->value.n;
if ((objsize - 1) & objsize) { /* not a power of 2? */
error_report("obj size needs to be power of 2");
return -EINVAL;
}
if (objsize < 4096) {
error_report("obj size too small");
return -EINVAL;
}
obj_order = ffs(objsize) - 1;
}
}
if (objsize < 4096) {
error_setg(errp, "obj size too small");
return -EINVAL;
}
obj_order = ctz32(objsize);
options++;
}
clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
if (rados_create(&cluster, clientname) < 0) {
error_setg(errp, "error initializing");
error_report("error initializing");
return -EIO;
}
if (strstr(conf, "conf=") == NULL) {
/* try default location, but ignore failure */
rados_conf_read_file(cluster, NULL);
} else if (conf[0] != '\0' &&
qemu_rbd_set_conf(cluster, conf, true, &local_err) < 0) {
rados_shutdown(cluster);
error_propagate(errp, local_err);
return -EIO;
}
if (conf[0] != '\0' &&
qemu_rbd_set_conf(cluster, conf, false, &local_err) < 0) {
qemu_rbd_set_conf(cluster, conf) < 0) {
error_report("error setting config options");
rados_shutdown(cluster);
error_propagate(errp, local_err);
return -EIO;
}
if (rados_connect(cluster) < 0) {
error_setg(errp, "error connecting");
error_report("error connecting");
rados_shutdown(cluster);
return -EIO;
}
if (rados_ioctx_create(cluster, pool, &io_ctx) < 0) {
error_setg(errp, "error opening pool %s", pool);
error_report("error opening pool %s", pool);
rados_shutdown(cluster);
return -EIO;
}
@@ -370,8 +369,9 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
}
/*
* This aio completion is being called from rbd_finish_bh() and runs in qemu
* BH context.
* This aio completion is being called from qemu_rbd_aio_event_reader()
* and runs in qemu context. It schedules a bh, but just in case the aio
* was not cancelled before.
*/
static void qemu_rbd_complete_aio(RADOSCB *rcb)
{
@@ -401,16 +401,36 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
acb->ret = r;
}
}
/* Note that acb->bh can be NULL in case where the aio was cancelled */
acb->bh = qemu_bh_new(rbd_aio_bh_cb, acb);
qemu_bh_schedule(acb->bh);
g_free(rcb);
}
if (acb->cmd == RBD_AIO_READ) {
qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size);
}
qemu_vfree(acb->bounce);
acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
/*
* aio fd read handler. It runs in the qemu context and calls the
* completion handling of completed rados aio operations.
*/
static void qemu_rbd_aio_event_reader(void *opaque)
{
BDRVRBDState *s = opaque;
qemu_aio_unref(acb);
ssize_t ret;
do {
char *p = (char *)&s->event_rcb;
/* now read the rcb pointer that was sent from a non qemu thread */
ret = read(s->fds[RBD_FD_READ], p + s->event_reader_pos,
sizeof(s->event_rcb) - s->event_reader_pos);
if (ret > 0) {
s->event_reader_pos += ret;
if (s->event_reader_pos == sizeof(s->event_rcb)) {
s->event_reader_pos = 0;
qemu_rbd_complete_aio(s->event_rcb);
}
}
} while (ret < 0 && errno == EINTR);
}
/* TODO Convert to fine grained options */
@@ -441,10 +461,11 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
const char *filename;
int r;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
opts = qemu_opts_create_nofail(&runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
qemu_opts_del(opts);
return -EINVAL;
}
@@ -454,7 +475,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
if (qemu_rbd_parsename(filename, pool, sizeof(pool),
snap_buf, sizeof(snap_buf),
s->name, sizeof(s->name),
conf, sizeof(conf), errp) < 0) {
conf, sizeof(conf)) < 0) {
r = -EINVAL;
goto failed_opts;
}
@@ -462,7 +483,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
r = rados_create(&s->cluster, clientname);
if (r < 0) {
error_setg(errp, "error initializing");
error_report("error initializing");
goto failed_opts;
}
@@ -471,23 +492,6 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
s->snap = g_strdup(snap_buf);
}
if (strstr(conf, "conf=") == NULL) {
/* try default location, but ignore failure */
rados_conf_read_file(s->cluster, NULL);
} else if (conf[0] != '\0') {
r = qemu_rbd_set_conf(s->cluster, conf, true, errp);
if (r < 0) {
goto failed_shutdown;
}
}
if (conf[0] != '\0') {
r = qemu_rbd_set_conf(s->cluster, conf, false, errp);
if (r < 0) {
goto failed_shutdown;
}
}
/*
* Fallback to more conservative semantics if setting cache
* options fails. Ignore errors from setting rbd_cache because the
@@ -501,29 +505,56 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
rados_conf_set(s->cluster, "rbd_cache", "true");
}
if (strstr(conf, "conf=") == NULL) {
/* try default location, but ignore failure */
rados_conf_read_file(s->cluster, NULL);
}
if (conf[0] != '\0') {
r = qemu_rbd_set_conf(s->cluster, conf);
if (r < 0) {
error_report("error setting config options");
goto failed_shutdown;
}
}
r = rados_connect(s->cluster);
if (r < 0) {
error_setg(errp, "error connecting");
error_report("error connecting");
goto failed_shutdown;
}
r = rados_ioctx_create(s->cluster, pool, &s->io_ctx);
if (r < 0) {
error_setg(errp, "error opening pool %s", pool);
error_report("error opening pool %s", pool);
goto failed_shutdown;
}
r = rbd_open(s->io_ctx, s->name, &s->image, s->snap);
if (r < 0) {
error_setg(errp, "error reading header from %s", s->name);
error_report("error reading header from %s", s->name);
goto failed_open;
}
bs->read_only = (s->snap != NULL);
s->event_reader_pos = 0;
r = qemu_pipe(s->fds);
if (r < 0) {
error_report("error opening eventfd");
goto failed;
}
fcntl(s->fds[0], F_SETFL, O_NONBLOCK);
fcntl(s->fds[1], F_SETFL, O_NONBLOCK);
qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], qemu_rbd_aio_event_reader,
NULL, s);
qemu_opts_del(opts);
return 0;
failed:
rbd_close(s->image);
failed_open:
rados_ioctx_destroy(s->io_ctx);
failed_shutdown:
@@ -538,21 +569,65 @@ static void qemu_rbd_close(BlockDriverState *bs)
{
BDRVRBDState *s = bs->opaque;
close(s->fds[0]);
close(s->fds[1]);
qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], NULL, NULL, NULL);
rbd_close(s->image);
rados_ioctx_destroy(s->io_ctx);
g_free(s->snap);
rados_shutdown(s->cluster);
}
/*
* Cancel aio. Since we don't reference acb in a non qemu threads,
* it is safe to access it here.
*/
static void qemu_rbd_aio_cancel(BlockDriverAIOCB *blockacb)
{
RBDAIOCB *acb = (RBDAIOCB *) blockacb;
acb->cancelled = 1;
while (acb->status == -EINPROGRESS) {
qemu_aio_wait();
}
qemu_aio_release(acb);
}
static const AIOCBInfo rbd_aiocb_info = {
.aiocb_size = sizeof(RBDAIOCB),
.cancel = qemu_rbd_aio_cancel,
};
static void rbd_finish_bh(void *opaque)
static int qemu_rbd_send_pipe(BDRVRBDState *s, RADOSCB *rcb)
{
RADOSCB *rcb = opaque;
qemu_bh_delete(rcb->acb->bh);
qemu_rbd_complete_aio(rcb);
int ret = 0;
while (1) {
fd_set wfd;
int fd = s->fds[RBD_FD_WRITE];
/* send the op pointer to the qemu thread that is responsible
for the aio/op completion. Must do it in a qemu thread context */
ret = write(fd, (void *)&rcb, sizeof(rcb));
if (ret >= 0) {
break;
}
if (errno == EINTR) {
continue;
}
if (errno != EAGAIN) {
break;
}
FD_ZERO(&wfd);
FD_SET(fd, &wfd);
do {
ret = select(fd + 1, NULL, &wfd, NULL, NULL);
} while (ret < 0 && errno == EINTR);
}
return ret;
}
/*
@@ -560,19 +635,40 @@ static void rbd_finish_bh(void *opaque)
*
* Note: this function is being called from a non qemu thread so
* we need to be careful about what we do here. Generally we only
* schedule a BH, and do the rest of the io completion handling
* from rbd_finish_bh() which runs in a qemu context.
* write to the block notification pipe, and do the rest of the
* io completion handling from qemu_rbd_aio_event_reader() which
* runs in a qemu context.
*/
static void rbd_finish_aiocb(rbd_completion_t c, RADOSCB *rcb)
{
RBDAIOCB *acb = rcb->acb;
int ret;
rcb->ret = rbd_aio_get_return_value(c);
rbd_aio_release(c);
ret = qemu_rbd_send_pipe(rcb->s, rcb);
if (ret < 0) {
error_report("failed writing to acb->s->fds");
g_free(rcb);
}
}
acb->bh = aio_bh_new(bdrv_get_aio_context(acb->common.bs),
rbd_finish_bh, rcb);
qemu_bh_schedule(acb->bh);
/* Callback when all queued rbd_aio requests are complete */
static void rbd_aio_bh_cb(void *opaque)
{
RBDAIOCB *acb = opaque;
if (acb->cmd == RBD_AIO_READ) {
qemu_iovec_from_buf(acb->qiov, 0, acb->bounce, acb->qiov->size);
}
qemu_vfree(acb->bounce);
acb->common.cb(acb->common.opaque, (acb->ret > 0 ? 0 : acb->ret));
qemu_bh_delete(acb->bh);
acb->bh = NULL;
acb->status = 0;
if (!acb->cancelled) {
qemu_aio_release(acb);
}
}
static int rbd_aio_discard_wrapper(rbd_image_t image,
@@ -597,16 +693,16 @@ static int rbd_aio_flush_wrapper(rbd_image_t image,
#endif
}
static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque,
RBDAIOCmd cmd)
static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque,
RBDAIOCmd cmd)
{
RBDAIOCB *acb;
RADOSCB *rcb = NULL;
RADOSCB *rcb;
rbd_completion_t c;
int64_t off, size;
char *buf;
@@ -620,15 +716,14 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
acb->bounce = NULL;
} else {
acb->bounce = qemu_try_blockalign(bs, qiov->size);
if (acb->bounce == NULL) {
goto failed;
}
acb->bounce = qemu_blockalign(bs, qiov->size);
}
acb->ret = 0;
acb->error = 0;
acb->s = s;
acb->cancelled = 0;
acb->bh = NULL;
acb->status = -EINPROGRESS;
if (cmd == RBD_AIO_WRITE) {
qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size);
@@ -639,7 +734,8 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
off = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
rcb = g_new(RADOSCB, 1);
rcb = g_malloc(sizeof(RADOSCB));
rcb->done = 0;
rcb->acb = acb;
rcb->buf = buf;
rcb->s = acb->s;
@@ -667,46 +763,43 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs,
}
if (r < 0) {
goto failed_completion;
goto failed;
}
return &acb->common;
failed_completion:
rbd_aio_release(c);
failed:
g_free(rcb);
qemu_vfree(acb->bounce);
qemu_aio_unref(acb);
qemu_aio_release(acb);
return NULL;
}
static BlockAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, sector_num, qiov, nb_sectors, cb, opaque,
RBD_AIO_READ);
}
static BlockAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, sector_num, qiov, nb_sectors, cb, opaque,
RBD_AIO_WRITE);
}
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
static BlockAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, 0, NULL, 0, cb, opaque, RBD_AIO_FLUSH);
}
@@ -848,7 +941,7 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
int max_snaps = RBD_MAX_SNAPS;
do {
snaps = g_new(rbd_snap_info_t, max_snaps);
snaps = g_malloc(sizeof(*snaps) * max_snaps);
snap_count = rbd_snap_list(s->image, snaps, &max_snaps);
if (snap_count <= 0) {
g_free(snaps);
@@ -859,7 +952,7 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
goto done;
}
sn_tab = g_new0(QEMUSnapshotInfo, snap_count);
sn_tab = g_malloc0(snap_count * sizeof(QEMUSnapshotInfo));
for (i = 0; i < snap_count; i++) {
const char *snap_name = snaps[i].name;
@@ -882,45 +975,29 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
}
#ifdef LIBRBD_SUPPORTS_DISCARD
static BlockAIOCB* qemu_rbd_aio_discard(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB* qemu_rbd_aio_discard(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, sector_num, NULL, nb_sectors, cb, opaque,
RBD_AIO_DISCARD);
}
#endif
#ifdef LIBRBD_SUPPORTS_INVALIDATE
static void qemu_rbd_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
int r = rbd_invalidate_cache(s->image);
if (r < 0) {
error_setg_errno(errp, -r, "Failed to invalidate the cache");
}
}
#endif
static QemuOptsList qemu_rbd_create_opts = {
.name = "rbd-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qemu_rbd_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "RBD object size"
},
{ /* end of list */ }
}
static QEMUOptionParameter qemu_rbd_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = OPT_SIZE,
.help = "RBD object size"
},
{NULL}
};
static BlockDriver bdrv_rbd = {
@@ -932,7 +1009,7 @@ static BlockDriver bdrv_rbd = {
.bdrv_create = qemu_rbd_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_get_info = qemu_rbd_getinfo,
.create_opts = &qemu_rbd_create_opts,
.create_options = qemu_rbd_create_options,
.bdrv_getlength = qemu_rbd_getlength,
.bdrv_truncate = qemu_rbd_truncate,
.protocol_name = "rbd",
@@ -954,9 +1031,6 @@ static BlockDriver bdrv_rbd = {
.bdrv_snapshot_delete = qemu_rbd_snap_remove,
.bdrv_snapshot_list = qemu_rbd_snap_list,
.bdrv_snapshot_goto = qemu_rbd_snap_rollback,
#ifdef LIBRBD_SUPPORTS_INVALIDATE
.bdrv_invalidate_cache = qemu_rbd_invalidate_cache,
#endif
};
static void bdrv_rbd_init(void)

File diff suppressed because it is too large Load Diff

View File

@@ -24,25 +24,6 @@
#include "block/snapshot.h"
#include "block/block_int.h"
#include "qapi/qmp/qerror.h"
QemuOptsList internal_snapshot_opts = {
.name = "snapshot",
.head = QTAILQ_HEAD_INITIALIZER(internal_snapshot_opts.head),
.desc = {
{
.name = SNAPSHOT_OPT_ID,
.type = QEMU_OPT_STRING,
.help = "snapshot id"
},{
.name = SNAPSHOT_OPT_NAME,
.type = QEMU_OPT_STRING,
.help = "snapshot name"
},{
/* end of list */
}
},
};
int bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info,
const char *name)
@@ -213,7 +194,7 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
* If only @snapshot_id is specified, delete the first one with id
* @snapshot_id.
* If only @name is specified, delete the first one with name @name.
* if none is specified, return -EINVAL.
* if none is specified, return -ENINVAL.
*
* Returns: 0 on success, -errno on failure. If @bs is not inserted, return
* -ENOMEDIUM. If @snapshot_id and @name are both NULL, return -EINVAL. If @bs
@@ -230,26 +211,22 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
{
BlockDriver *drv = bs->drv;
if (!drv) {
error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, bdrv_get_device_name(bs));
error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, bdrv_get_device_name(bs));
return -ENOMEDIUM;
}
if (!snapshot_id && !name) {
error_setg(errp, "snapshot_id and name are both NULL");
return -EINVAL;
}
/* drain all pending i/o before deleting snapshot */
bdrv_drain(bs);
if (drv->bdrv_snapshot_delete) {
return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
}
if (bs->file) {
return bdrv_snapshot_delete(bs->file, snapshot_id, name, errp);
}
error_setg(errp, "Block format '%s' used by device '%s' "
"does not support internal snapshot deletion",
drv->format_name, bdrv_get_device_name(bs));
error_set(errp, QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
drv->format_name, bdrv_get_device_name(bs),
"internal snapshot deletion");
return -ENOTSUP;
}
@@ -288,71 +265,18 @@ int bdrv_snapshot_list(BlockDriverState *bs,
return -ENOTSUP;
}
/**
* Temporarily load an internal snapshot by @snapshot_id and @name.
* @bs: block device used in the operation
* @snapshot_id: unique snapshot ID, or NULL
* @name: snapshot name, or NULL
* @errp: location to store error
*
* If both @snapshot_id and @name are specified, load the first one with
* id @snapshot_id and name @name.
* If only @snapshot_id is specified, load the first one with id
* @snapshot_id.
* If only @name is specified, load the first one with name @name.
* if none is specified, return -EINVAL.
*
* Returns: 0 on success, -errno on fail. If @bs is not inserted, return
* -ENOMEDIUM. If @bs is not readonly, return -EINVAL. If @bs did not support
* internal snapshot, return -ENOTSUP. If qemu can't find a matching @id and
* @name, return -ENOENT. If @errp != NULL, it will always be filled on
* failure.
*/
int bdrv_snapshot_load_tmp(BlockDriverState *bs,
const char *snapshot_id,
const char *name,
Error **errp)
const char *snapshot_name)
{
BlockDriver *drv = bs->drv;
if (!drv) {
error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, bdrv_get_device_name(bs));
return -ENOMEDIUM;
}
if (!snapshot_id && !name) {
error_setg(errp, "snapshot_id and name are both NULL");
return -EINVAL;
}
if (!bs->read_only) {
error_setg(errp, "Device is not readonly");
return -EINVAL;
}
if (drv->bdrv_snapshot_load_tmp) {
return drv->bdrv_snapshot_load_tmp(bs, snapshot_id, name, errp);
return drv->bdrv_snapshot_load_tmp(bs, snapshot_name);
}
error_setg(errp, "Block format '%s' used by device '%s' "
"does not support temporarily loading internal snapshots",
drv->format_name, bdrv_get_device_name(bs));
return -ENOTSUP;
}
int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
const char *id_or_name,
Error **errp)
{
int ret;
Error *local_err = NULL;
ret = bdrv_snapshot_load_tmp(bs, id_or_name, NULL, &local_err);
if (ret == -ENOENT || ret == -EINVAL) {
error_free(local_err);
local_err = NULL;
ret = bdrv_snapshot_load_tmp(bs, NULL, id_or_name, &local_err);
}
if (local_err) {
error_propagate(errp, local_err);
}
return ret;
}

View File

@@ -30,11 +30,9 @@
#include <libssh2_sftp.h>
#include "block/block_int.h"
#include "qemu/error-report.h"
#include "qemu/sockets.h"
#include "qemu/uri.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
/* DEBUG_SSH=1 enables the DPRINTF (debugging printf) statements in
* this block driver code.
@@ -108,59 +106,30 @@ static void ssh_state_free(BDRVSSHState *s)
}
}
static void GCC_FMT_ATTR(3, 4)
session_error_setg(Error **errp, BDRVSSHState *s, const char *fs, ...)
/* Wrappers around error_report which make sure to dump as much
* information from libssh2 as possible.
*/
static void GCC_FMT_ATTR(2, 3)
session_error_report(BDRVSSHState *s, const char *fs, ...)
{
va_list args;
char *msg;
va_start(args, fs);
msg = g_strdup_vprintf(fs, args);
va_end(args);
error_vprintf(fs, args);
if (s->session) {
if ((s)->session) {
char *ssh_err;
int ssh_err_code;
libssh2_session_last_error((s)->session, &ssh_err, NULL, 0);
/* This is not an errno. See <libssh2.h>. */
ssh_err_code = libssh2_session_last_error(s->session,
&ssh_err, NULL, 0);
error_setg(errp, "%s: %s (libssh2 error code: %d)",
msg, ssh_err, ssh_err_code);
} else {
error_setg(errp, "%s", msg);
ssh_err_code = libssh2_session_last_errno((s)->session);
error_printf(": %s (libssh2 error code: %d)", ssh_err, ssh_err_code);
}
g_free(msg);
}
static void GCC_FMT_ATTR(3, 4)
sftp_error_setg(Error **errp, BDRVSSHState *s, const char *fs, ...)
{
va_list args;
char *msg;
va_start(args, fs);
msg = g_strdup_vprintf(fs, args);
va_end(args);
if (s->sftp) {
char *ssh_err;
int ssh_err_code;
unsigned long sftp_err_code;
/* This is not an errno. See <libssh2.h>. */
ssh_err_code = libssh2_session_last_error(s->session,
&ssh_err, NULL, 0);
/* See <libssh2_sftp.h>. */
sftp_err_code = libssh2_sftp_last_error((s)->sftp);
error_setg(errp,
"%s: %s (libssh2 error code: %d, sftp error code: %lu)",
msg, ssh_err, ssh_err_code, sftp_err_code);
} else {
error_setg(errp, "%s", msg);
}
g_free(msg);
error_printf("\n");
}
static void GCC_FMT_ATTR(2, 3)
@@ -176,9 +145,9 @@ sftp_error_report(BDRVSSHState *s, const char *fs, ...)
int ssh_err_code;
unsigned long sftp_err_code;
libssh2_session_last_error((s)->session, &ssh_err, NULL, 0);
/* This is not an errno. See <libssh2.h>. */
ssh_err_code = libssh2_session_last_error(s->session,
&ssh_err, NULL, 0);
ssh_err_code = libssh2_session_last_errno((s)->session);
/* See <libssh2_sftp.h>. */
sftp_err_code = libssh2_sftp_last_error((s)->sftp);
@@ -274,7 +243,7 @@ static void ssh_parse_filename(const char *filename, QDict *options,
}
static int check_host_key_knownhosts(BDRVSSHState *s,
const char *host, int port, Error **errp)
const char *host, int port)
{
const char *home;
char *knh_file = NULL;
@@ -288,15 +257,14 @@ static int check_host_key_knownhosts(BDRVSSHState *s,
hostkey = libssh2_session_hostkey(s->session, &len, &type);
if (!hostkey) {
ret = -EINVAL;
session_error_setg(errp, s, "failed to read remote host key");
session_error_report(s, "failed to read remote host key");
goto out;
}
knh = libssh2_knownhost_init(s->session);
if (!knh) {
ret = -EINVAL;
session_error_setg(errp, s,
"failed to initialize known hosts support");
session_error_report(s, "failed to initialize known hosts support");
goto out;
}
@@ -321,23 +289,21 @@ static int check_host_key_knownhosts(BDRVSSHState *s,
break;
case LIBSSH2_KNOWNHOST_CHECK_MISMATCH:
ret = -EINVAL;
session_error_setg(errp, s,
"host key does not match the one in known_hosts"
" (found key %s)", found->key);
session_error_report(s, "host key does not match the one in known_hosts (found key %s)",
found->key);
goto out;
case LIBSSH2_KNOWNHOST_CHECK_NOTFOUND:
ret = -EINVAL;
session_error_setg(errp, s, "no host key was found in known_hosts");
session_error_report(s, "no host key was found in known_hosts");
goto out;
case LIBSSH2_KNOWNHOST_CHECK_FAILURE:
ret = -EINVAL;
session_error_setg(errp, s,
"failure matching the host key with known_hosts");
session_error_report(s, "failure matching the host key with known_hosts");
goto out;
default:
ret = -EINVAL;
session_error_setg(errp, s, "unknown error matching the host key"
" with known_hosts (%d)", r);
session_error_report(s, "unknown error matching the host key with known_hosts (%d)",
r);
goto out;
}
@@ -392,20 +358,20 @@ static int compare_fingerprint(const unsigned char *fingerprint, size_t len,
static int
check_host_key_hash(BDRVSSHState *s, const char *hash,
int hash_type, size_t fingerprint_len, Error **errp)
int hash_type, size_t fingerprint_len)
{
const char *fingerprint;
fingerprint = libssh2_hostkey_hash(s->session, hash_type);
if (!fingerprint) {
session_error_setg(errp, s, "failed to read remote host key");
session_error_report(s, "failed to read remote host key");
return -EINVAL;
}
if(compare_fingerprint((unsigned char *) fingerprint, fingerprint_len,
hash) != 0) {
error_setg(errp, "remote host key does not match host_key_check '%s'",
hash);
error_report("remote host key does not match host_key_check '%s'",
hash);
return -EPERM;
}
@@ -413,7 +379,7 @@ check_host_key_hash(BDRVSSHState *s, const char *hash,
}
static int check_host_key(BDRVSSHState *s, const char *host, int port,
const char *host_key_check, Error **errp)
const char *host_key_check)
{
/* host_key_check=no */
if (strcmp(host_key_check, "no") == 0) {
@@ -423,25 +389,25 @@ static int check_host_key(BDRVSSHState *s, const char *host, int port,
/* host_key_check=md5:xx:yy:zz:... */
if (strncmp(host_key_check, "md5:", 4) == 0) {
return check_host_key_hash(s, &host_key_check[4],
LIBSSH2_HOSTKEY_HASH_MD5, 16, errp);
LIBSSH2_HOSTKEY_HASH_MD5, 16);
}
/* host_key_check=sha1:xx:yy:zz:... */
if (strncmp(host_key_check, "sha1:", 5) == 0) {
return check_host_key_hash(s, &host_key_check[5],
LIBSSH2_HOSTKEY_HASH_SHA1, 20, errp);
LIBSSH2_HOSTKEY_HASH_SHA1, 20);
}
/* host_key_check=yes */
if (strcmp(host_key_check, "yes") == 0) {
return check_host_key_knownhosts(s, host, port, errp);
return check_host_key_knownhosts(s, host, port);
}
error_setg(errp, "unknown host_key_check setting (%s)", host_key_check);
error_report("unknown host_key_check setting (%s)", host_key_check);
return -EINVAL;
}
static int authenticate(BDRVSSHState *s, const char *user, Error **errp)
static int authenticate(BDRVSSHState *s, const char *user)
{
int r, ret;
const char *userauthlist;
@@ -452,8 +418,7 @@ static int authenticate(BDRVSSHState *s, const char *user, Error **errp)
userauthlist = libssh2_userauth_list(s->session, user, strlen(user));
if (strstr(userauthlist, "publickey") == NULL) {
ret = -EPERM;
error_setg(errp,
"remote server does not support \"publickey\" authentication");
error_report("remote server does not support \"publickey\" authentication");
goto out;
}
@@ -461,18 +426,17 @@ static int authenticate(BDRVSSHState *s, const char *user, Error **errp)
agent = libssh2_agent_init(s->session);
if (!agent) {
ret = -EINVAL;
session_error_setg(errp, s, "failed to initialize ssh-agent support");
session_error_report(s, "failed to initialize ssh-agent support");
goto out;
}
if (libssh2_agent_connect(agent)) {
ret = -ECONNREFUSED;
session_error_setg(errp, s, "failed to connect to ssh-agent");
session_error_report(s, "failed to connect to ssh-agent");
goto out;
}
if (libssh2_agent_list_identities(agent)) {
ret = -EINVAL;
session_error_setg(errp, s,
"failed requesting identities from ssh-agent");
session_error_report(s, "failed requesting identities from ssh-agent");
goto out;
}
@@ -483,8 +447,7 @@ static int authenticate(BDRVSSHState *s, const char *user, Error **errp)
}
if (r < 0) {
ret = -EINVAL;
session_error_setg(errp, s,
"failed to obtain identity from ssh-agent");
session_error_report(s, "failed to obtain identity from ssh-agent");
goto out;
}
r = libssh2_agent_userauth(agent, user, identity);
@@ -498,8 +461,8 @@ static int authenticate(BDRVSSHState *s, const char *user, Error **errp)
}
ret = -EPERM;
error_setg(errp, "failed to authenticate using publickey authentication "
"and the identities held by your ssh-agent");
error_report("failed to authenticate using publickey authentication "
"and the identities held by your ssh-agent");
out:
if (agent != NULL) {
@@ -513,17 +476,13 @@ static int authenticate(BDRVSSHState *s, const char *user, Error **errp)
}
static int connect_to_ssh(BDRVSSHState *s, QDict *options,
int ssh_flags, int creat_mode, Error **errp)
int ssh_flags, int creat_mode)
{
int r, ret;
Error *err = NULL;
const char *host, *user, *path, *host_key_check;
int port;
if (!qdict_haskey(options, "host")) {
ret = -EINVAL;
error_setg(errp, "No hostname was specified");
goto err;
}
host = qdict_get_str(options, "host");
if (qdict_haskey(options, "port")) {
@@ -532,11 +491,6 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
port = 22;
}
if (!qdict_haskey(options, "path")) {
ret = -EINVAL;
error_setg(errp, "No path was specified");
goto err;
}
path = qdict_get_str(options, "path");
if (qdict_haskey(options, "user")) {
@@ -544,7 +498,6 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
} else {
user = g_get_user_name();
if (!user) {
error_setg_errno(errp, errno, "Can't get user name");
ret = -errno;
goto err;
}
@@ -561,9 +514,11 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
s->hostport = g_strdup_printf("%s:%d", host, port);
/* Open the socket and connect. */
s->sock = inet_connect(s->hostport, errp);
if (s->sock < 0) {
ret = -EIO;
s->sock = inet_connect(s->hostport, &err);
if (err != NULL) {
ret = -errno;
qerror_report_err(err);
error_free(err);
goto err;
}
@@ -571,7 +526,7 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
s->session = libssh2_session_init();
if (!s->session) {
ret = -EINVAL;
session_error_setg(errp, s, "failed to initialize libssh2 session");
session_error_report(s, "failed to initialize libssh2 session");
goto err;
}
@@ -582,18 +537,18 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
r = libssh2_session_handshake(s->session, s->sock);
if (r != 0) {
ret = -EINVAL;
session_error_setg(errp, s, "failed to establish SSH session");
session_error_report(s, "failed to establish SSH session");
goto err;
}
/* Check the remote host's key against known_hosts. */
ret = check_host_key(s, host, port, host_key_check, errp);
ret = check_host_key(s, host, port, host_key_check);
if (ret < 0) {
goto err;
}
/* Authenticate. */
ret = authenticate(s, user, errp);
ret = authenticate(s, user);
if (ret < 0) {
goto err;
}
@@ -601,7 +556,7 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
/* Start SFTP. */
s->sftp = libssh2_sftp_init(s->session);
if (!s->sftp) {
session_error_setg(errp, s, "failed to initialize sftp handle");
session_error_report(s, "failed to initialize sftp handle");
ret = -EINVAL;
goto err;
}
@@ -611,14 +566,14 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
path, ssh_flags, creat_mode);
s->sftp_handle = libssh2_sftp_open(s->sftp, path, ssh_flags, creat_mode);
if (!s->sftp_handle) {
session_error_setg(errp, s, "failed to open remote file '%s'", path);
session_error_report(s, "failed to open remote file '%s'", path);
ret = -EINVAL;
goto err;
}
r = libssh2_sftp_fstat(s->sftp_handle, &s->attrs);
if (r < 0) {
sftp_error_setg(errp, s, "failed to read file attributes");
sftp_error_report(s, "failed to read file attributes");
return -EINVAL;
}
@@ -668,7 +623,7 @@ static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
}
/* Start up SSH. */
ret = connect_to_ssh(s, options, ssh_flags, 0, errp);
ret = connect_to_ssh(s, options, ssh_flags, 0);
if (ret < 0) {
goto err;
}
@@ -687,22 +642,20 @@ static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
return ret;
}
static QemuOptsList ssh_create_opts = {
.name = "ssh-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(ssh_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{ /* end of list */ }
}
static QEMUOptionParameter ssh_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ NULL }
};
static int ssh_create(const char *filename, QemuOpts *opts, Error **errp)
static int ssh_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int r, ret;
Error *local_err = NULL;
int64_t total_size = 0;
QDict *uri_options = NULL;
BDRVSSHState s;
@@ -712,21 +665,26 @@ static int ssh_create(const char *filename, QemuOpts *opts, Error **errp)
ssh_state_init(&s);
/* Get desired file size. */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
total_size = options->value.n;
}
options++;
}
DPRINTF("total_size=%" PRIi64, total_size);
uri_options = qdict_new();
r = parse_uri(filename, uri_options, errp);
r = parse_uri(filename, uri_options, &local_err);
if (r < 0) {
qerror_report_err(local_err);
error_free(local_err);
ret = r;
goto out;
}
r = connect_to_ssh(&s, uri_options,
LIBSSH2_FXF_READ|LIBSSH2_FXF_WRITE|
LIBSSH2_FXF_CREAT|LIBSSH2_FXF_TRUNC,
0644, errp);
LIBSSH2_FXF_CREAT|LIBSSH2_FXF_TRUNC, 0644);
if (r < 0) {
ret = r;
goto out;
@@ -736,7 +694,7 @@ static int ssh_create(const char *filename, QemuOpts *opts, Error **errp)
libssh2_sftp_seek64(s.sftp_handle, total_size-1);
r2 = libssh2_sftp_write(s.sftp_handle, c, 1);
if (r2 < 0) {
sftp_error_setg(errp, &s, "truncate failed");
sftp_error_report(&s, "truncate failed");
ret = -EINVAL;
goto out;
}
@@ -784,7 +742,7 @@ static void restart_coroutine(void *opaque)
qemu_coroutine_enter(co, NULL);
}
static coroutine_fn void set_fd_handler(BDRVSSHState *s, BlockDriverState *bs)
static coroutine_fn void set_fd_handler(BDRVSSHState *s)
{
int r;
IOHandler *rd_handler = NULL, *wr_handler = NULL;
@@ -802,26 +760,24 @@ static coroutine_fn void set_fd_handler(BDRVSSHState *s, BlockDriverState *bs)
DPRINTF("s->sock=%d rd_handler=%p wr_handler=%p", s->sock,
rd_handler, wr_handler);
aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
rd_handler, wr_handler, co);
qemu_aio_set_fd_handler(s->sock, rd_handler, wr_handler, co);
}
static coroutine_fn void clear_fd_handler(BDRVSSHState *s,
BlockDriverState *bs)
static coroutine_fn void clear_fd_handler(BDRVSSHState *s)
{
DPRINTF("s->sock=%d", s->sock);
aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, NULL, NULL, NULL);
qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL);
}
/* A non-blocking call returned EAGAIN, so yield, ensuring the
* handlers are set up so that we'll be rescheduled when there is an
* interesting event on the socket.
*/
static coroutine_fn void co_yield(BDRVSSHState *s, BlockDriverState *bs)
static coroutine_fn void co_yield(BDRVSSHState *s)
{
set_fd_handler(s, bs);
set_fd_handler(s);
qemu_coroutine_yield();
clear_fd_handler(s, bs);
clear_fd_handler(s);
}
/* SFTP has a function `libssh2_sftp_seek64' which seeks to a position
@@ -851,7 +807,7 @@ static void ssh_seek(BDRVSSHState *s, int64_t offset, int flags)
}
}
static coroutine_fn int ssh_read(BDRVSSHState *s, BlockDriverState *bs,
static coroutine_fn int ssh_read(BDRVSSHState *s,
int64_t offset, size_t size,
QEMUIOVector *qiov)
{
@@ -884,7 +840,7 @@ static coroutine_fn int ssh_read(BDRVSSHState *s, BlockDriverState *bs,
DPRINTF("sftp_read returned %zd", r);
if (r == LIBSSH2_ERROR_EAGAIN || r == LIBSSH2_ERROR_TIMEOUT) {
co_yield(s, bs);
co_yield(s);
goto again;
}
if (r < 0) {
@@ -919,14 +875,14 @@ static coroutine_fn int ssh_co_readv(BlockDriverState *bs,
int ret;
qemu_co_mutex_lock(&s->lock);
ret = ssh_read(s, bs, sector_num * BDRV_SECTOR_SIZE,
ret = ssh_read(s, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE, qiov);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int ssh_write(BDRVSSHState *s, BlockDriverState *bs,
static int ssh_write(BDRVSSHState *s,
int64_t offset, size_t size,
QEMUIOVector *qiov)
{
@@ -954,7 +910,7 @@ static int ssh_write(BDRVSSHState *s, BlockDriverState *bs,
DPRINTF("sftp_write returned %zd", r);
if (r == LIBSSH2_ERROR_EAGAIN || r == LIBSSH2_ERROR_TIMEOUT) {
co_yield(s, bs);
co_yield(s);
goto again;
}
if (r < 0) {
@@ -973,7 +929,7 @@ static int ssh_write(BDRVSSHState *s, BlockDriverState *bs,
*/
if (r == 0) {
ssh_seek(s, offset + written, SSH_SEEK_WRITE|SSH_SEEK_FORCE);
co_yield(s, bs);
co_yield(s);
goto again;
}
@@ -1001,7 +957,7 @@ static coroutine_fn int ssh_co_writev(BlockDriverState *bs,
int ret;
qemu_co_mutex_lock(&s->lock);
ret = ssh_write(s, bs, sector_num * BDRV_SECTOR_SIZE,
ret = ssh_write(s, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE, qiov);
qemu_co_mutex_unlock(&s->lock);
@@ -1022,7 +978,7 @@ static void unsafe_flush_warning(BDRVSSHState *s, const char *what)
#ifdef HAS_LIBSSH2_SFTP_FSYNC
static coroutine_fn int ssh_flush(BDRVSSHState *s, BlockDriverState *bs)
static coroutine_fn int ssh_flush(BDRVSSHState *s)
{
int r;
@@ -1030,7 +986,7 @@ static coroutine_fn int ssh_flush(BDRVSSHState *s, BlockDriverState *bs)
again:
r = libssh2_sftp_fsync(s->sftp_handle);
if (r == LIBSSH2_ERROR_EAGAIN || r == LIBSSH2_ERROR_TIMEOUT) {
co_yield(s, bs);
co_yield(s);
goto again;
}
if (r == LIBSSH2_ERROR_SFTP_PROTOCOL &&
@@ -1052,7 +1008,7 @@ static coroutine_fn int ssh_co_flush(BlockDriverState *bs)
int ret;
qemu_co_mutex_lock(&s->lock);
ret = ssh_flush(s, bs);
ret = ssh_flush(s);
qemu_co_mutex_unlock(&s->lock);
return ret;
@@ -1095,7 +1051,7 @@ static BlockDriver bdrv_ssh = {
.bdrv_co_writev = ssh_co_writev,
.bdrv_getlength = ssh_getlength,
.bdrv_co_flush_to_disk = ssh_co_flush,
.create_opts = &ssh_create_opts,
.create_options = ssh_create_options,
};
static void bdrv_ssh_init(void)

View File

@@ -14,7 +14,6 @@
#include "trace.h"
#include "block/block_int.h"
#include "block/blockjob.h"
#include "qapi/qmp/qerror.h"
#include "qemu/ratelimit.h"
enum {
@@ -33,7 +32,7 @@ typedef struct StreamBlockJob {
RateLimit limit;
BlockDriverState *base;
BlockdevOnError on_error;
char *backing_file_str;
char backing_file_id[1024];
} StreamBlockJob;
static int coroutine_fn stream_populate(BlockDriverState *bs,
@@ -61,7 +60,7 @@ static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
/* Must assign before bdrv_delete() to prevent traversing dangling pointer
* while we delete backing image instances.
*/
bdrv_set_backing_hd(top, base);
top->backing_hd = base;
while (intermediate) {
BlockDriverState *unused;
@@ -73,46 +72,14 @@ static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
unused = intermediate;
intermediate = intermediate->backing_hd;
bdrv_set_backing_hd(unused, NULL);
unused->backing_hd = NULL;
bdrv_unref(unused);
}
bdrv_refresh_limits(top, NULL);
}
typedef struct {
int ret;
bool reached_end;
} StreamCompleteData;
static void stream_complete(BlockJob *job, void *opaque)
{
StreamBlockJob *s = container_of(job, StreamBlockJob, common);
StreamCompleteData *data = opaque;
BlockDriverState *base = s->base;
if (!block_job_is_cancelled(&s->common) && data->reached_end &&
data->ret == 0) {
const char *base_id = NULL, *base_fmt = NULL;
if (base) {
base_id = s->backing_file_str;
if (base->drv) {
base_fmt = base->drv->format_name;
}
}
data->ret = bdrv_change_backing_file(job->bs, base_id, base_fmt);
close_unused_images(job->bs, base, base_id);
}
g_free(s->backing_file_str);
block_job_completed(&s->common, data->ret);
g_free(data);
}
static void coroutine_fn stream_run(void *opaque)
{
StreamBlockJob *s = opaque;
StreamCompleteData *data;
BlockDriverState *bs = s->common.bs;
BlockDriverState *base = s->base;
int64_t sector_num, end;
@@ -121,11 +88,6 @@ static void coroutine_fn stream_run(void *opaque)
int n = 0;
void *buf;
if (!bs->backing_hd) {
block_job_completed(&s->common, 0);
return;
}
s->common.len = bdrv_getlength(bs);
if (s->common.len < 0) {
block_job_completed(&s->common, s->common.len);
@@ -190,14 +152,14 @@ wait:
BlockErrorAction action =
block_job_error_action(&s->common, s->common.bs, s->on_error,
true, -ret);
if (action == BLOCK_ERROR_ACTION_STOP) {
if (action == BDRV_ACTION_STOP) {
n = 0;
continue;
}
if (error == 0) {
error = ret;
}
if (action == BLOCK_ERROR_ACTION_REPORT) {
if (action == BDRV_ACTION_REPORT) {
break;
}
}
@@ -214,13 +176,20 @@ wait:
/* Do not remove the backing file if an error was there but ignored. */
ret = error;
qemu_vfree(buf);
if (!block_job_is_cancelled(&s->common) && sector_num == end && ret == 0) {
const char *base_id = NULL, *base_fmt = NULL;
if (base) {
base_id = s->backing_file_id;
if (base->drv) {
base_fmt = base->drv->format_name;
}
}
ret = bdrv_change_backing_file(bs, base_id, base_fmt);
close_unused_images(bs, base, base_id);
}
/* Modify backing chain and close BDSes in main loop */
data = g_malloc(sizeof(*data));
data->ret = ret;
data->reached_end = sector_num == end;
block_job_defer_to_main_loop(&s->common, stream_complete, data);
qemu_vfree(buf);
block_job_completed(&s->common, ret);
}
static void stream_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -228,7 +197,7 @@ static void stream_set_speed(BlockJob *job, int64_t speed, Error **errp)
StreamBlockJob *s = container_of(job, StreamBlockJob, common);
if (speed < 0) {
error_setg(errp, QERR_INVALID_PARAMETER, "speed");
error_set(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
@@ -241,9 +210,9 @@ static const BlockJobDriver stream_job_driver = {
};
void stream_start(BlockDriverState *bs, BlockDriverState *base,
const char *backing_file_str, int64_t speed,
const char *base_id, int64_t speed,
BlockdevOnError on_error,
BlockCompletionFunc *cb,
BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
{
StreamBlockJob *s;
@@ -251,7 +220,7 @@ void stream_start(BlockDriverState *bs, BlockDriverState *base,
if ((on_error == BLOCKDEV_ON_ERROR_STOP ||
on_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
error_setg(errp, QERR_INVALID_PARAMETER, "on-error");
error_set(errp, QERR_INVALID_PARAMETER, "on-error");
return;
}
@@ -261,7 +230,9 @@ void stream_start(BlockDriverState *bs, BlockDriverState *base,
}
s->base = base;
s->backing_file_str = g_strdup(backing_file_str);
if (base_id) {
pstrcpy(s->backing_file_id, sizeof(s->backing_file_id), base_id);
}
s->on_error = on_error;
s->common.co = qemu_coroutine_create(stream_run);

View File

@@ -1,501 +0,0 @@
/*
* QEMU block throttling group infrastructure
*
* Copyright (C) Nodalink, EURL. 2014
* Copyright (C) Igalia, S.L. 2015
*
* Authors:
* Benoît Canet <benoit.canet@nodalink.com>
* Alberto Garcia <berto@igalia.com>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 or
* (at your option) version 3 of the License.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include "block/throttle-groups.h"
#include "qemu/queue.h"
#include "qemu/thread.h"
#include "sysemu/qtest.h"
/* The ThrottleGroup structure (with its ThrottleState) is shared
* among different BlockDriverState and it's independent from
* AioContext, so in order to use it from different threads it needs
* its own locking.
*
* This locking is however handled internally in this file, so it's
* mostly transparent to outside users (but see the documentation in
* throttle_groups_lock()).
*
* The whole ThrottleGroup structure is private and invisible to
* outside users, that only use it through its ThrottleState.
*
* In addition to the ThrottleGroup structure, BlockDriverState has
* fields that need to be accessed by other members of the group and
* therefore also need to be protected by this lock. Once a BDS is
* registered in a group those fields can be accessed by other threads
* any time.
*
* Again, all this is handled internally and is mostly transparent to
* the outside. The 'throttle_timers' field however has an additional
* constraint because it may be temporarily invalid (see for example
* bdrv_set_aio_context()). Therefore in this file a thread will
* access some other BDS's timers only after verifying that that BDS
* has throttled requests in the queue.
*/
typedef struct ThrottleGroup {
char *name; /* This is constant during the lifetime of the group */
QemuMutex lock; /* This lock protects the following four fields */
ThrottleState ts;
QLIST_HEAD(, BlockDriverState) head;
BlockDriverState *tokens[2];
bool any_timer_armed[2];
/* These two are protected by the global throttle_groups_lock */
unsigned refcount;
QTAILQ_ENTRY(ThrottleGroup) list;
} ThrottleGroup;
static QemuMutex throttle_groups_lock;
static QTAILQ_HEAD(, ThrottleGroup) throttle_groups =
QTAILQ_HEAD_INITIALIZER(throttle_groups);
/* Increments the reference count of a ThrottleGroup given its name.
*
* If no ThrottleGroup is found with the given name a new one is
* created.
*
* @name: the name of the ThrottleGroup
* @ret: the ThrottleGroup
*/
static ThrottleGroup *throttle_group_incref(const char *name)
{
ThrottleGroup *tg = NULL;
ThrottleGroup *iter;
qemu_mutex_lock(&throttle_groups_lock);
/* Look for an existing group with that name */
QTAILQ_FOREACH(iter, &throttle_groups, list) {
if (!strcmp(name, iter->name)) {
tg = iter;
break;
}
}
/* Create a new one if not found */
if (!tg) {
tg = g_new0(ThrottleGroup, 1);
tg->name = g_strdup(name);
qemu_mutex_init(&tg->lock);
throttle_init(&tg->ts);
QLIST_INIT(&tg->head);
QTAILQ_INSERT_TAIL(&throttle_groups, tg, list);
}
tg->refcount++;
qemu_mutex_unlock(&throttle_groups_lock);
return tg;
}
/* Decrease the reference count of a ThrottleGroup.
*
* When the reference count reaches zero the ThrottleGroup is
* destroyed.
*
* @tg: The ThrottleGroup to unref
*/
static void throttle_group_unref(ThrottleGroup *tg)
{
qemu_mutex_lock(&throttle_groups_lock);
if (--tg->refcount == 0) {
QTAILQ_REMOVE(&throttle_groups, tg, list);
qemu_mutex_destroy(&tg->lock);
g_free(tg->name);
g_free(tg);
}
qemu_mutex_unlock(&throttle_groups_lock);
}
/* Get the name from a BlockDriverState's ThrottleGroup. The name (and
* the pointer) is guaranteed to remain constant during the lifetime
* of the group.
*
* @bs: a BlockDriverState that is member of a throttling group
* @ret: the name of the group.
*/
const char *throttle_group_get_name(BlockDriverState *bs)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
return tg->name;
}
/* Return the next BlockDriverState in the round-robin sequence,
* simulating a circular list.
*
* This assumes that tg->lock is held.
*
* @bs: the current BlockDriverState
* @ret: the next BlockDriverState in the sequence
*/
static BlockDriverState *throttle_group_next_bs(BlockDriverState *bs)
{
ThrottleState *ts = bs->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
BlockDriverState *next = QLIST_NEXT(bs, round_robin);
if (!next) {
return QLIST_FIRST(&tg->head);
}
return next;
}
/* Return the next BlockDriverState in the round-robin sequence with
* pending I/O requests.
*
* This assumes that tg->lock is held.
*
* @bs: the current BlockDriverState
* @is_write: the type of operation (read/write)
* @ret: the next BlockDriverState with pending requests, or bs
* if there is none.
*/
static BlockDriverState *next_throttle_token(BlockDriverState *bs,
bool is_write)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
BlockDriverState *token, *start;
start = token = tg->tokens[is_write];
/* get next bs round in round robin style */
token = throttle_group_next_bs(token);
while (token != start && !token->pending_reqs[is_write]) {
token = throttle_group_next_bs(token);
}
/* If no IO are queued for scheduling on the next round robin token
* then decide the token is the current bs because chances are
* the current bs get the current request queued.
*/
if (token == start && !token->pending_reqs[is_write]) {
token = bs;
}
return token;
}
/* Check if the next I/O request for a BlockDriverState needs to be
* throttled or not. If there's no timer set in this group, set one
* and update the token accordingly.
*
* This assumes that tg->lock is held.
*
* @bs: the current BlockDriverState
* @is_write: the type of operation (read/write)
* @ret: whether the I/O request needs to be throttled or not
*/
static bool throttle_group_schedule_timer(BlockDriverState *bs,
bool is_write)
{
ThrottleState *ts = bs->throttle_state;
ThrottleTimers *tt = &bs->throttle_timers;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
bool must_wait;
/* Check if any of the timers in this group is already armed */
if (tg->any_timer_armed[is_write]) {
return true;
}
must_wait = throttle_schedule_timer(ts, tt, is_write);
/* If a timer just got armed, set bs as the current token */
if (must_wait) {
tg->tokens[is_write] = bs;
tg->any_timer_armed[is_write] = true;
}
return must_wait;
}
/* Look for the next pending I/O request and schedule it.
*
* This assumes that tg->lock is held.
*
* @bs: the current BlockDriverState
* @is_write: the type of operation (read/write)
*/
static void schedule_next_request(BlockDriverState *bs, bool is_write)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
bool must_wait;
BlockDriverState *token;
/* Check if there's any pending request to schedule next */
token = next_throttle_token(bs, is_write);
if (!token->pending_reqs[is_write]) {
return;
}
/* Set a timer for the request if it needs to be throttled */
must_wait = throttle_group_schedule_timer(token, is_write);
/* If it doesn't have to wait, queue it for immediate execution */
if (!must_wait) {
/* Give preference to requests from the current bs */
if (qemu_in_coroutine() &&
qemu_co_queue_next(&bs->throttled_reqs[is_write])) {
token = bs;
} else {
ThrottleTimers *tt = &token->throttle_timers;
int64_t now = qemu_clock_get_ns(tt->clock_type);
timer_mod(tt->timers[is_write], now + 1);
tg->any_timer_armed[is_write] = true;
}
tg->tokens[is_write] = token;
}
}
/* Check if an I/O request needs to be throttled, wait and set a timer
* if necessary, and schedule the next request using a round robin
* algorithm.
*
* @bs: the current BlockDriverState
* @bytes: the number of bytes for this I/O
* @is_write: the type of operation (read/write)
*/
void coroutine_fn throttle_group_co_io_limits_intercept(BlockDriverState *bs,
unsigned int bytes,
bool is_write)
{
bool must_wait;
BlockDriverState *token;
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock);
/* First we check if this I/O has to be throttled. */
token = next_throttle_token(bs, is_write);
must_wait = throttle_group_schedule_timer(token, is_write);
/* Wait if there's a timer set or queued requests of this type */
if (must_wait || bs->pending_reqs[is_write]) {
bs->pending_reqs[is_write]++;
qemu_mutex_unlock(&tg->lock);
qemu_co_queue_wait(&bs->throttled_reqs[is_write]);
qemu_mutex_lock(&tg->lock);
bs->pending_reqs[is_write]--;
}
/* The I/O will be executed, so do the accounting */
throttle_account(bs->throttle_state, is_write, bytes);
/* Schedule the next request */
schedule_next_request(bs, is_write);
qemu_mutex_unlock(&tg->lock);
}
/* Update the throttle configuration for a particular group. Similar
* to throttle_config(), but guarantees atomicity within the
* throttling group.
*
* @bs: a BlockDriverState that is member of the group
* @cfg: the configuration to set
*/
void throttle_group_config(BlockDriverState *bs, ThrottleConfig *cfg)
{
ThrottleTimers *tt = &bs->throttle_timers;
ThrottleState *ts = bs->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock);
/* throttle_config() cancels the timers */
if (timer_pending(tt->timers[0])) {
tg->any_timer_armed[0] = false;
}
if (timer_pending(tt->timers[1])) {
tg->any_timer_armed[1] = false;
}
throttle_config(ts, tt, cfg);
qemu_mutex_unlock(&tg->lock);
}
/* Get the throttle configuration from a particular group. Similar to
* throttle_get_config(), but guarantees atomicity within the
* throttling group.
*
* @bs: a BlockDriverState that is member of the group
* @cfg: the configuration will be written here
*/
void throttle_group_get_config(BlockDriverState *bs, ThrottleConfig *cfg)
{
ThrottleState *ts = bs->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock);
throttle_get_config(ts, cfg);
qemu_mutex_unlock(&tg->lock);
}
/* ThrottleTimers callback. This wakes up a request that was waiting
* because it had been throttled.
*
* @bs: the BlockDriverState whose request had been throttled
* @is_write: the type of operation (read/write)
*/
static void timer_cb(BlockDriverState *bs, bool is_write)
{
ThrottleState *ts = bs->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
bool empty_queue;
/* The timer has just been fired, so we can update the flag */
qemu_mutex_lock(&tg->lock);
tg->any_timer_armed[is_write] = false;
qemu_mutex_unlock(&tg->lock);
/* Run the request that was waiting for this timer */
empty_queue = !qemu_co_enter_next(&bs->throttled_reqs[is_write]);
/* If the request queue was empty then we have to take care of
* scheduling the next one */
if (empty_queue) {
qemu_mutex_lock(&tg->lock);
schedule_next_request(bs, is_write);
qemu_mutex_unlock(&tg->lock);
}
}
static void read_timer_cb(void *opaque)
{
timer_cb(opaque, false);
}
static void write_timer_cb(void *opaque)
{
timer_cb(opaque, true);
}
/* Register a BlockDriverState in the throttling group, also
* initializing its timers and updating its throttle_state pointer to
* point to it. If a throttling group with that name does not exist
* yet, it will be created.
*
* @bs: the BlockDriverState to insert
* @groupname: the name of the group
*/
void throttle_group_register_bs(BlockDriverState *bs, const char *groupname)
{
int i;
ThrottleGroup *tg = throttle_group_incref(groupname);
int clock_type = QEMU_CLOCK_REALTIME;
if (qtest_enabled()) {
/* For testing block IO throttling only */
clock_type = QEMU_CLOCK_VIRTUAL;
}
bs->throttle_state = &tg->ts;
qemu_mutex_lock(&tg->lock);
/* If the ThrottleGroup is new set this BlockDriverState as the token */
for (i = 0; i < 2; i++) {
if (!tg->tokens[i]) {
tg->tokens[i] = bs;
}
}
QLIST_INSERT_HEAD(&tg->head, bs, round_robin);
throttle_timers_init(&bs->throttle_timers,
bdrv_get_aio_context(bs),
clock_type,
read_timer_cb,
write_timer_cb,
bs);
qemu_mutex_unlock(&tg->lock);
}
/* Unregister a BlockDriverState from its group, removing it from the
* list, destroying the timers and setting the throttle_state pointer
* to NULL.
*
* The group will be destroyed if it's empty after this operation.
*
* @bs: the BlockDriverState to remove
*/
void throttle_group_unregister_bs(BlockDriverState *bs)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
int i;
qemu_mutex_lock(&tg->lock);
for (i = 0; i < 2; i++) {
if (tg->tokens[i] == bs) {
BlockDriverState *token = throttle_group_next_bs(bs);
/* Take care of the case where this is the last bs in the group */
if (token == bs) {
token = NULL;
}
tg->tokens[i] = token;
}
}
/* remove the current bs from the list */
QLIST_REMOVE(bs, round_robin);
throttle_timers_destroy(&bs->throttle_timers);
qemu_mutex_unlock(&tg->lock);
throttle_group_unref(tg);
bs->throttle_state = NULL;
}
/* Acquire the lock of this throttling group.
*
* You won't normally need to use this. None of the functions from the
* ThrottleGroup API require you to acquire the lock since all of them
* deal with it internally.
*
* This should only be used in exceptional cases when you want to
* access the protected fields of a BlockDriverState directly
* (e.g. bdrv_swap()).
*
* @bs: a BlockDriverState that is member of the group
*/
void throttle_group_lock(BlockDriverState *bs)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock);
}
/* Release the lock of this throttling group.
*
* See the comments in throttle_group_lock().
*/
void throttle_group_unlock(BlockDriverState *bs)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
qemu_mutex_unlock(&tg->lock);
}
static void throttle_groups_init(void)
{
qemu_mutex_init(&throttle_groups_lock);
}
block_init(throttle_groups_init);

View File

@@ -31,7 +31,7 @@
* Allocation of blocks could be optimized (less writes to block map and
* header).
*
* Read and write of adjacent blocks could be done in one operation
* Read and write of adjacents blocks could be done in one operation
* (current code uses one operation per block (1 MiB).
*
* The code is not thread safe (missing locks for changes in header and
@@ -53,7 +53,6 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include "migration/migration.h"
#include "block/coroutine.h"
#if defined(CONFIG_UUID)
#include <uuid/uuid.h>
@@ -121,18 +120,8 @@ typedef unsigned char uuid_t[16];
#define VDI_IS_ALLOCATED(X) ((X) < VDI_DISCARDED)
/* The bmap will take up VDI_BLOCKS_IN_IMAGE_MAX * sizeof(uint32_t) bytes; since
* the bmap is read and written in a single operation, its size needs to be
* limited to INT_MAX; furthermore, when opening an image, the bmap size is
* rounded up to be aligned on BDRV_SECTOR_SIZE.
* Therefore this should satisfy the following:
* VDI_BLOCKS_IN_IMAGE_MAX * sizeof(uint32_t) + BDRV_SECTOR_SIZE == INT_MAX + 1
* (INT_MAX + 1 is the first value not representable as an int)
* This guarantees that any value below or equal to the constant will, when
* multiplied by sizeof(uint32_t) and rounded up to a BDRV_SECTOR_SIZE boundary,
* still be below or equal to INT_MAX. */
#define VDI_BLOCKS_IN_IMAGE_MAX \
((unsigned)((INT_MAX + 1u - BDRV_SECTOR_SIZE) / sizeof(uint32_t)))
/* max blocks in image is (0xffffffff / 4) */
#define VDI_BLOCKS_IN_IMAGE_MAX 0x3fffffff
#define VDI_DISK_SIZE_MAX ((uint64_t)VDI_BLOCKS_IN_IMAGE_MAX * \
(uint64_t)DEFAULT_CLUSTER_SIZE)
@@ -148,14 +137,12 @@ static inline int uuid_is_null(const uuid_t uu)
return memcmp(uu, null_uuid, sizeof(uuid_t)) == 0;
}
# if defined(CONFIG_VDI_DEBUG)
static inline void uuid_unparse(const uuid_t uu, char *out)
{
snprintf(out, 37, UUID_FMT,
uu[0], uu[1], uu[2], uu[3], uu[4], uu[5], uu[6], uu[7],
uu[8], uu[9], uu[10], uu[11], uu[12], uu[13], uu[14], uu[15]);
}
# endif
#endif
typedef struct {
@@ -197,8 +184,6 @@ typedef struct {
/* VDI header (converted to host endianness). */
VdiHeader header;
CoMutex write_lock;
Error *migration_blocker;
} BDRVVdiState;
@@ -254,6 +239,7 @@ static void vdi_header_to_le(VdiHeader *header)
cpu_to_le32s(&header->block_extra);
cpu_to_le32s(&header->blocks_in_image);
cpu_to_le32s(&header->blocks_allocated);
cpu_to_le32s(&header->blocks_allocated);
uuid_convert(header->uuid_image);
uuid_convert(header->uuid_last_snap);
uuid_convert(header->uuid_link);
@@ -307,12 +293,7 @@ static int vdi_check(BlockDriverState *bs, BdrvCheckResult *res,
return -ENOTSUP;
}
bmap = g_try_new(uint32_t, s->header.blocks_in_image);
if (s->header.blocks_in_image && bmap == NULL) {
res->check_errors++;
return -ENOMEM;
}
bmap = g_malloc(s->header.blocks_in_image * sizeof(uint32_t));
memset(bmap, 0xff, s->header.blocks_in_image * sizeof(uint32_t));
/* Check block map and value of blocks_allocated. */
@@ -355,7 +336,6 @@ static int vdi_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
logout("\n");
bdi->cluster_size = s->block_size;
bdi->vm_state_offset = 0;
bdi->unallocated_blocks_are_zero = true;
return 0;
}
@@ -370,23 +350,23 @@ static int vdi_make_empty(BlockDriverState *bs)
static int vdi_probe(const uint8_t *buf, int buf_size, const char *filename)
{
const VdiHeader *header = (const VdiHeader *)buf;
int ret = 0;
int result = 0;
logout("\n");
if (buf_size < sizeof(*header)) {
/* Header too small, no VDI. */
} else if (le32_to_cpu(header->signature) == VDI_SIGNATURE) {
ret = 100;
result = 100;
}
if (ret == 0) {
if (result == 0) {
logout("no vdi image\n");
} else {
logout("%s", header->text);
}
return ret;
return result;
}
static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
@@ -410,9 +390,8 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
#endif
if (header.disk_size > VDI_DISK_SIZE_MAX) {
error_setg(errp, "Unsupported VDI image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")",
header.disk_size, VDI_DISK_SIZE_MAX);
logout("disk size is 0x%" PRIx64 ", max supported is 0x%" PRIx64,
header.disk_size, VDI_DISK_SIZE_MAX);
ret = -ENOTSUP;
goto fail;
}
@@ -422,61 +401,53 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
We accept them but round the disk size to the next multiple of
SECTOR_SIZE. */
logout("odd disk size %" PRIu64 " B, round up\n", header.disk_size);
header.disk_size = ROUND_UP(header.disk_size, SECTOR_SIZE);
header.disk_size += SECTOR_SIZE - 1;
header.disk_size &= ~(SECTOR_SIZE - 1);
}
if (header.signature != VDI_SIGNATURE) {
error_setg(errp, "Image not in VDI format (bad signature %08" PRIx32
")", header.signature);
ret = -EINVAL;
logout("bad vdi signature %08x\n", header.signature);
ret = -EMEDIUMTYPE;
goto fail;
} else if (header.version != VDI_VERSION_1_1) {
error_setg(errp, "unsupported VDI image (version %" PRIu32 ".%" PRIu32
")", header.version >> 16, header.version & 0xffff);
logout("unsupported version %u.%u\n",
header.version >> 16, header.version & 0xffff);
ret = -ENOTSUP;
goto fail;
} else if (header.offset_bmap % SECTOR_SIZE != 0) {
/* We only support block maps which start on a sector boundary. */
error_setg(errp, "unsupported VDI image (unaligned block map offset "
"0x%" PRIx32 ")", header.offset_bmap);
logout("unsupported block map offset 0x%x B\n", header.offset_bmap);
ret = -ENOTSUP;
goto fail;
} else if (header.offset_data % SECTOR_SIZE != 0) {
/* We only support data blocks which start on a sector boundary. */
error_setg(errp, "unsupported VDI image (unaligned data offset 0x%"
PRIx32 ")", header.offset_data);
logout("unsupported data offset 0x%x B\n", header.offset_data);
ret = -ENOTSUP;
goto fail;
} else if (header.sector_size != SECTOR_SIZE) {
error_setg(errp, "unsupported VDI image (sector size %" PRIu32
" is not %u)", header.sector_size, SECTOR_SIZE);
logout("unsupported sector size %u B\n", header.sector_size);
ret = -ENOTSUP;
goto fail;
} else if (header.block_size != DEFAULT_CLUSTER_SIZE) {
error_setg(errp, "unsupported VDI image (block size %" PRIu32
" is not %u)", header.block_size, DEFAULT_CLUSTER_SIZE);
logout("unsupported block size %u B\n", header.block_size);
ret = -ENOTSUP;
goto fail;
} else if (header.disk_size >
(uint64_t)header.blocks_in_image * header.block_size) {
error_setg(errp, "unsupported VDI image (disk size %" PRIu64 ", "
"image bitmap has room for %" PRIu64 ")",
header.disk_size,
(uint64_t)header.blocks_in_image * header.block_size);
logout("unsupported disk size %" PRIu64 " B\n", header.disk_size);
ret = -ENOTSUP;
goto fail;
} else if (!uuid_is_null(header.uuid_link)) {
error_setg(errp, "unsupported VDI image (non-NULL link UUID)");
logout("link uuid != 0, unsupported\n");
ret = -ENOTSUP;
goto fail;
} else if (!uuid_is_null(header.uuid_parent)) {
error_setg(errp, "unsupported VDI image (non-NULL parent UUID)");
logout("parent uuid != 0, unsupported\n");
ret = -ENOTSUP;
goto fail;
} else if (header.blocks_in_image > VDI_BLOCKS_IN_IMAGE_MAX) {
error_setg(errp, "unsupported VDI image "
"(too many blocks %u, max is %u)",
header.blocks_in_image, VDI_BLOCKS_IN_IMAGE_MAX);
logout("too many blocks %u, max is %u)",
header.blocks_in_image, VDI_BLOCKS_IN_IMAGE_MAX);
ret = -ENOTSUP;
goto fail;
}
@@ -489,30 +460,23 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
s->header = header;
bmap_size = header.blocks_in_image * sizeof(uint32_t);
bmap_size = DIV_ROUND_UP(bmap_size, SECTOR_SIZE);
s->bmap = qemu_try_blockalign(bs->file, bmap_size * SECTOR_SIZE);
if (s->bmap == NULL) {
ret = -ENOMEM;
goto fail;
}
bmap_size = (bmap_size + SECTOR_SIZE - 1) / SECTOR_SIZE;
s->bmap = g_malloc(bmap_size * SECTOR_SIZE);
ret = bdrv_read(bs->file, s->bmap_sector, (uint8_t *)s->bmap, bmap_size);
if (ret < 0) {
goto fail_free_bmap;
}
/* Disable migration when vdi images are used */
error_setg(&s->migration_blocker, "The vdi format used by node '%s' "
"does not support live migration",
bdrv_get_device_or_node_name(bs));
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"vdi", bs->device_name, "live migration");
migrate_add_blocker(s->migration_blocker);
qemu_co_mutex_init(&s->write_lock);
return 0;
fail_free_bmap:
qemu_vfree(s->bmap);
g_free(s->bmap);
fail:
return ret;
@@ -644,31 +608,11 @@ static int vdi_co_write(BlockDriverState *bs,
buf, n_sectors * SECTOR_SIZE);
memset(block + (sector_in_block + n_sectors) * SECTOR_SIZE, 0,
(s->block_sectors - n_sectors - sector_in_block) * SECTOR_SIZE);
/* Note that this coroutine does not yield anywhere from reading the
* bmap entry until here, so in regards to all the coroutines trying
* to write to this cluster, the one doing the allocation will
* always be the first to try to acquire the lock.
* Therefore, it is also the first that will actually be able to
* acquire the lock and thus the padded cluster is written before
* the other coroutines can write to the affected area. */
qemu_co_mutex_lock(&s->write_lock);
ret = bdrv_write(bs->file, offset, block, s->block_sectors);
qemu_co_mutex_unlock(&s->write_lock);
} else {
uint64_t offset = s->header.offset_data / SECTOR_SIZE +
(uint64_t)bmap_entry * s->block_sectors +
sector_in_block;
qemu_co_mutex_lock(&s->write_lock);
/* This lock is only used to make sure the following write operation
* is executed after the write issued by the coroutine allocating
* this cluster, therefore we do not need to keep it locked.
* As stated above, the allocating coroutine will always try to lock
* the mutex before all the other concurrent accesses to that
* cluster, therefore at this point we can be absolutely certain
* that that write operation has returned (there may be other writes
* in flight, but they do not concern this very operation). */
qemu_co_mutex_unlock(&s->write_lock);
ret = bdrv_write(bs->file, offset, buf, n_sectors);
}
@@ -718,9 +662,11 @@ static int vdi_co_write(BlockDriverState *bs,
return ret;
}
static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
static int vdi_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int ret = 0;
int fd;
int result = 0;
uint64_t bytes = 0;
uint32_t blocks;
size_t block_size = DEFAULT_CLUSTER_SIZE;
@@ -728,54 +674,52 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
VdiHeader header;
size_t i;
size_t bmap_size;
int64_t offset = 0;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
uint32_t *bmap = NULL;
logout("\n");
/* Read out options. */
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
bytes = options->value.n;
#if defined(CONFIG_VDI_BLOCK_SIZE)
/* TODO: Additional checks (SECTOR_SIZE * 2^n, ...). */
block_size = qemu_opt_get_size_del(opts,
BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE);
} else if (!strcmp(options->name, BLOCK_OPT_CLUSTER_SIZE)) {
if (options->value.n) {
/* TODO: Additional checks (SECTOR_SIZE * 2^n, ...). */
block_size = options->value.n;
}
#endif
#if defined(CONFIG_VDI_STATIC_IMAGE)
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_STATIC, false)) {
image_type = VDI_TYPE_STATIC;
}
} else if (!strcmp(options->name, BLOCK_OPT_STATIC)) {
if (options->value.n) {
image_type = VDI_TYPE_STATIC;
}
#endif
}
options++;
}
if (bytes > VDI_DISK_SIZE_MAX) {
ret = -ENOTSUP;
error_setg(errp, "Unsupported VDI image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")",
bytes, VDI_DISK_SIZE_MAX);
result = -ENOTSUP;
logout("image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")",
bytes, VDI_DISK_SIZE_MAX);
goto exit;
}
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
fd = qemu_open(filename,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
0644);
if (fd < 0) {
result = -errno;
goto exit;
}
/* We need enough blocks to store the given disk size,
so always round up. */
blocks = DIV_ROUND_UP(bytes, block_size);
blocks = (bytes + block_size - 1) / block_size;
bmap_size = blocks * sizeof(uint32_t);
bmap_size = ROUND_UP(bmap_size, SECTOR_SIZE);
bmap_size = ((bmap_size + SECTOR_SIZE - 1) & ~(SECTOR_SIZE -1));
memset(&header, 0, sizeof(header));
pstrcpy(header.text, sizeof(header.text), VDI_TEXT);
@@ -799,20 +743,12 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
vdi_header_print(&header);
#endif
vdi_header_to_le(&header);
ret = bdrv_pwrite_sync(bs, offset, &header, sizeof(header));
if (ret < 0) {
error_setg(errp, "Error writing header to %s", filename);
goto exit;
if (write(fd, &header, sizeof(header)) < 0) {
result = -errno;
}
offset += sizeof(header);
if (bmap_size > 0) {
bmap = g_try_malloc0(bmap_size);
if (bmap == NULL) {
ret = -ENOMEM;
error_setg(errp, "Could not allocate bmap");
goto exit;
}
uint32_t *bmap = g_malloc0(bmap_size);
for (i = 0; i < blocks; i++) {
if (image_type == VDI_TYPE_STATIC) {
bmap[i] = i;
@@ -820,66 +756,59 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
bmap[i] = VDI_UNALLOCATED;
}
}
ret = bdrv_pwrite_sync(bs, offset, bmap, bmap_size);
if (ret < 0) {
error_setg(errp, "Error writing bmap to %s", filename);
goto exit;
if (write(fd, bmap, bmap_size) < 0) {
result = -errno;
}
offset += bmap_size;
g_free(bmap);
}
if (image_type == VDI_TYPE_STATIC) {
ret = bdrv_truncate(bs, offset + blocks * block_size);
if (ret < 0) {
error_setg(errp, "Failed to statically allocate %s", filename);
goto exit;
if (ftruncate(fd, sizeof(header) + bmap_size + blocks * block_size)) {
result = -errno;
}
}
if (close(fd) < 0) {
result = -errno;
}
exit:
bdrv_unref(bs);
g_free(bmap);
return ret;
return result;
}
static void vdi_close(BlockDriverState *bs)
{
BDRVVdiState *s = bs->opaque;
qemu_vfree(s->bmap);
g_free(s->bmap);
migrate_del_blocker(s->migration_blocker);
error_free(s->migration_blocker);
}
static QemuOptsList vdi_create_opts = {
.name = "vdi-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(vdi_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
static QEMUOptionParameter vdi_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
#if defined(CONFIG_VDI_BLOCK_SIZE)
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "VDI cluster (block) size",
.def_value_str = stringify(DEFAULT_CLUSTER_SIZE)
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = OPT_SIZE,
.help = "VDI cluster (block) size",
.value = { .n = DEFAULT_CLUSTER_SIZE },
},
#endif
#if defined(CONFIG_VDI_STATIC_IMAGE)
{
.name = BLOCK_OPT_STATIC,
.type = QEMU_OPT_BOOL,
.help = "VDI static (pre-allocated) image",
.def_value_str = "off"
},
{
.name = BLOCK_OPT_STATIC,
.type = OPT_FLAG,
.help = "VDI static (pre-allocated) image"
},
#endif
/* TODO: An additional option to set UUID values might be useful. */
{ /* end of list */ }
}
/* TODO: An additional option to set UUID values might be useful. */
{ NULL }
};
static BlockDriver bdrv_vdi = {
@@ -901,7 +830,7 @@ static BlockDriver bdrv_vdi = {
.bdrv_get_info = vdi_get_info,
.create_opts = &vdi_create_opts,
.create_options = vdi_create_options,
.bdrv_check = vdi_check,
};

View File

@@ -82,6 +82,8 @@ void vhdx_log_desc_le_import(VHDXLogDescriptor *d)
assert(d != NULL);
le32_to_cpus(&d->signature);
le32_to_cpus(&d->trailing_bytes);
le64_to_cpus(&d->leading_bytes);
le64_to_cpus(&d->file_offset);
le64_to_cpus(&d->sequence_number);
}
@@ -97,15 +99,6 @@ void vhdx_log_desc_le_export(VHDXLogDescriptor *d)
cpu_to_le64s(&d->sequence_number);
}
void vhdx_log_data_le_import(VHDXLogDataSector *d)
{
assert(d != NULL);
le32_to_cpus(&d->data_signature);
le32_to_cpus(&d->sequence_high);
le32_to_cpus(&d->sequence_low);
}
void vhdx_log_data_le_export(VHDXLogDataSector *d)
{
assert(d != NULL);

View File

@@ -19,7 +19,6 @@
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
#include "block/vhdx.h"
@@ -85,7 +84,6 @@ static int vhdx_log_peek_hdr(BlockDriverState *bs, VHDXLogEntries *log,
if (ret < 0) {
goto exit;
}
vhdx_log_entry_hdr_le_import(hdr);
exit:
return ret;
@@ -213,7 +211,7 @@ static bool vhdx_log_hdr_is_valid(VHDXLogEntries *log, VHDXLogEntryHeader *hdr,
{
int valid = false;
if (hdr->signature != VHDX_LOG_SIGNATURE) {
if (memcmp(&hdr->signature, "loge", 4)) {
goto exit;
}
@@ -277,12 +275,12 @@ static bool vhdx_log_desc_is_valid(VHDXLogDescriptor *desc,
goto exit;
}
if (desc->signature == VHDX_LOG_ZERO_SIGNATURE) {
if (!memcmp(&desc->signature, "zero", 4)) {
if (desc->zero_length % VHDX_LOG_SECTOR_SIZE == 0) {
/* valid */
ret = true;
}
} else if (desc->signature == VHDX_LOG_DESC_SIGNATURE) {
} else if (!memcmp(&desc->signature, "desc", 4)) {
/* valid */
ret = true;
}
@@ -329,15 +327,13 @@ static int vhdx_compute_desc_sectors(uint32_t desc_cnt)
* passed into this function. Each descriptor will also be validated,
* and error returned if any are invalid. */
static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
VHDXLogEntries *log, VHDXLogDescEntries **buffer,
bool convert_endian)
VHDXLogEntries *log, VHDXLogDescEntries **buffer)
{
int ret = 0;
uint32_t desc_sectors;
uint32_t sectors_read;
VHDXLogEntryHeader hdr;
VHDXLogDescEntries *desc_entries = NULL;
VHDXLogDescriptor desc;
int i;
assert(*buffer == NULL);
@@ -346,19 +342,14 @@ static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
if (ret < 0) {
goto exit;
}
vhdx_log_entry_hdr_le_import(&hdr);
if (vhdx_log_hdr_is_valid(log, &hdr, s) == false) {
ret = -EINVAL;
goto exit;
}
desc_sectors = vhdx_compute_desc_sectors(hdr.descriptor_count);
desc_entries = qemu_try_blockalign(bs->file,
desc_sectors * VHDX_LOG_SECTOR_SIZE);
if (desc_entries == NULL) {
ret = -ENOMEM;
goto exit;
}
desc_entries = qemu_blockalign(bs, desc_sectors * VHDX_LOG_SECTOR_SIZE);
ret = vhdx_log_read_sectors(bs, log, &sectors_read, desc_entries,
desc_sectors, false);
@@ -372,19 +363,12 @@ static int vhdx_log_read_desc(BlockDriverState *bs, BDRVVHDXState *s,
/* put in proper endianness, and validate each desc */
for (i = 0; i < hdr.descriptor_count; i++) {
desc = desc_entries->desc[i];
vhdx_log_desc_le_import(&desc);
if (convert_endian) {
desc_entries->desc[i] = desc;
}
if (vhdx_log_desc_is_valid(&desc, &hdr) == false) {
vhdx_log_desc_le_import(&desc_entries->desc[i]);
if (vhdx_log_desc_is_valid(&desc_entries->desc[i], &hdr) == false) {
ret = -EINVAL;
goto free_and_exit;
}
}
if (convert_endian) {
desc_entries->hdr = hdr;
}
*buffer = desc_entries;
goto exit;
@@ -419,7 +403,7 @@ static int vhdx_log_flush_desc(BlockDriverState *bs, VHDXLogDescriptor *desc,
buffer = qemu_blockalign(bs, VHDX_LOG_SECTOR_SIZE);
if (desc->signature == VHDX_LOG_DESC_SIGNATURE) {
if (!memcmp(&desc->signature, "desc", 4)) {
/* data sector */
if (data == NULL) {
ret = -EFAULT;
@@ -447,15 +431,10 @@ static int vhdx_log_flush_desc(BlockDriverState *bs, VHDXLogDescriptor *desc,
memcpy(buffer+offset, &desc->trailing_bytes, 4);
} else if (desc->signature == VHDX_LOG_ZERO_SIGNATURE) {
} else if (!memcmp(&desc->signature, "zero", 4)) {
/* write 'count' sectors of sector */
memset(buffer, 0, VHDX_LOG_SECTOR_SIZE);
count = desc->zero_length / VHDX_LOG_SECTOR_SIZE;
} else {
error_report("Invalid VHDX log descriptor entry signature 0x%" PRIx32,
desc->signature);
ret = -EINVAL;
goto exit;
}
file_offset = desc->file_offset;
@@ -514,13 +493,13 @@ static int vhdx_log_flush(BlockDriverState *bs, BDRVVHDXState *s,
goto exit;
}
ret = vhdx_log_read_desc(bs, s, &logs->log, &desc_entries, true);
ret = vhdx_log_read_desc(bs, s, &logs->log, &desc_entries);
if (ret < 0) {
goto exit;
}
for (i = 0; i < desc_entries->hdr.descriptor_count; i++) {
if (desc_entries->desc[i].signature == VHDX_LOG_DESC_SIGNATURE) {
if (!memcmp(&desc_entries->desc[i].signature, "desc", 4)) {
/* data sector, so read a sector to flush */
ret = vhdx_log_read_sectors(bs, &logs->log, &sectors_read,
data, 1, false);
@@ -531,7 +510,6 @@ static int vhdx_log_flush(BlockDriverState *bs, BDRVVHDXState *s,
ret = -EINVAL;
goto exit;
}
vhdx_log_data_le_import(data);
}
ret = vhdx_log_flush_desc(bs, &desc_entries->desc[i], data);
@@ -580,6 +558,9 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
goto inc_and_exit;
}
vhdx_log_entry_hdr_le_import(&hdr);
if (vhdx_log_hdr_is_valid(log, &hdr, s) == false) {
goto inc_and_exit;
}
@@ -592,13 +573,13 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
desc_sectors = vhdx_compute_desc_sectors(hdr.descriptor_count);
/* Read all log sectors, and calculate log checksum */
/* Read desc sectors, and calculate log checksum */
total_sectors = hdr.entry_length / VHDX_LOG_SECTOR_SIZE;
/* read_desc() will increment the read idx */
ret = vhdx_log_read_desc(bs, s, log, &desc_buffer, false);
/* read_desc() will incrememnt the read idx */
ret = vhdx_log_read_desc(bs, s, log, &desc_buffer);
if (ret < 0) {
goto free_and_exit;
}
@@ -621,7 +602,7 @@ static int vhdx_validate_log_entry(BlockDriverState *bs, BDRVVHDXState *s,
}
}
crc ^= 0xffffffff;
if (crc != hdr.checksum) {
if (crc != desc_buffer->hdr.checksum) {
goto free_and_exit;
}
@@ -725,8 +706,7 @@ exit:
*
* If read-only, we must replay the log in RAM (or refuse to open
* a dirty VHDX file read-only) */
int vhdx_parse_log(BlockDriverState *bs, BDRVVHDXState *s, bool *flushed,
Error **errp)
int vhdx_parse_log(BlockDriverState *bs, BDRVVHDXState *s, bool *flushed)
{
int ret = 0;
VHDXHeader *hdr;
@@ -781,16 +761,6 @@ int vhdx_parse_log(BlockDriverState *bs, BDRVVHDXState *s, bool *flushed,
}
if (logs.valid) {
if (bs->read_only) {
ret = -EPERM;
error_setg_errno(errp, EPERM,
"VHDX image file '%s' opened read-only, but "
"contains a log that needs to be replayed. To "
"replay the log, execute:\n qemu-img check -r "
"all '%s'",
bs->filename, bs->filename);
goto exit;
}
/* now flush the log */
ret = vhdx_log_flush(bs, s, &logs);
if (ret < 0) {
@@ -924,7 +894,7 @@ static int vhdx_log_write(BlockDriverState *bs, BDRVVHDXState *s,
buffer = qemu_blockalign(bs, total_length);
memcpy(buffer, &new_hdr, sizeof(new_hdr));
new_desc = buffer + sizeof(new_hdr);
new_desc = (VHDXLogDescriptor *) (buffer + sizeof(new_hdr));
data_sector = buffer + (desc_sectors * VHDX_LOG_SECTOR_SIZE);
data_tmp = data;
@@ -981,10 +951,11 @@ static int vhdx_log_write(BlockDriverState *bs, BDRVVHDXState *s,
* last data sector */
vhdx_update_checksum(buffer, total_length,
offsetof(VHDXLogEntryHeader, checksum));
cpu_to_le32s((uint32_t *)(buffer + 4));
/* now write to the log */
ret = vhdx_log_write_sectors(bs, &s->log, &sectors_written, buffer,
desc_sectors + sectors);
vhdx_log_write_sectors(bs, &s->log, &sectors_written, buffer,
desc_sectors + sectors);
if (ret < 0) {
goto exit;
}

View File

@@ -99,8 +99,7 @@ static const MSGUID logical_sector_guid = { .data1 = 0x8141bf1d,
/* Each parent type must have a valid GUID; this is for parent images
* of type 'VHDX'. If we were to allow e.g. a QCOW2 parent, we would
* need to make up our own QCOW2 GUID type */
static const MSGUID parent_vhdx_guid __attribute__((unused))
= { .data1 = 0xb04aefb7,
static const MSGUID parent_vhdx_guid = { .data1 = 0xb04aefb7,
.data2 = 0xd19e,
.data3 = 0x4a81,
.data4 = { 0xb7, 0x89, 0x25, 0xb8,
@@ -136,8 +135,10 @@ typedef struct VHDXSectorInfo {
* buf: buffer pointer
* size: size of buffer (must be > crc_offset+4)
*
* Note: The buffer should have all multi-byte data in little-endian format,
* and the resulting checksum is in little endian format.
* Note: The resulting checksum is in the CPU endianness, not necessarily
* in the file format endianness (LE). Any header export to disk should
* make sure that vhdx_header_le_export() is used to convert to the
* correct endianness
*/
uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset)
{
@@ -148,7 +149,6 @@ uint32_t vhdx_update_checksum(uint8_t *buf, size_t size, int crc_offset)
memset(buf + crc_offset, 0, sizeof(crc));
crc = crc32c(0xffffffff, buf, size);
cpu_to_le32s(&crc);
memcpy(buf + crc_offset, &crc, sizeof(crc));
return crc;
@@ -300,7 +300,7 @@ static int vhdx_write_header(BlockDriverState *bs_file, VHDXHeader *hdr,
{
uint8_t *buffer = NULL;
int ret;
VHDXHeader *header_le;
VHDXHeader header_le;
assert(bs_file != NULL);
assert(hdr != NULL);
@@ -321,12 +321,11 @@ static int vhdx_write_header(BlockDriverState *bs_file, VHDXHeader *hdr,
}
/* overwrite the actual VHDXHeader portion */
header_le = (VHDXHeader *)buffer;
memcpy(header_le, hdr, sizeof(VHDXHeader));
vhdx_header_le_export(hdr, header_le);
vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
offsetof(VHDXHeader, checksum));
ret = bdrv_pwrite_sync(bs_file, offset, header_le, sizeof(VHDXHeader));
memcpy(buffer, hdr, sizeof(VHDXHeader));
hdr->checksum = vhdx_update_checksum(buffer, VHDX_HEADER_SIZE,
offsetof(VHDXHeader, checksum));
vhdx_header_le_export(hdr, &header_le);
ret = bdrv_pwrite_sync(bs_file, offset, &header_le, sizeof(VHDXHeader));
exit:
qemu_vfree(buffer);
@@ -375,7 +374,7 @@ static int vhdx_update_header(BlockDriverState *bs, BDRVVHDXState *s,
inactive_header->log_guid = *log_guid;
}
ret = vhdx_write_header(bs->file, inactive_header, header_offset, true);
vhdx_write_header(bs->file, inactive_header, header_offset, true);
if (ret < 0) {
goto exit;
}
@@ -403,10 +402,9 @@ int vhdx_update_headers(BlockDriverState *bs, BDRVVHDXState *s,
}
/* opens the specified header block from the VHDX file header section */
static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
Error **errp)
static int vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s)
{
int ret;
int ret = 0;
VHDXHeader *header1;
VHDXHeader *header2;
bool h1_valid = false;
@@ -433,14 +431,13 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
}
/* copy over just the relevant portion that we need */
memcpy(header1, buffer, sizeof(VHDXHeader));
vhdx_header_le_import(header1);
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4)) {
vhdx_header_le_import(header1);
if (header1->signature == VHDX_HEADER_SIGNATURE &&
header1->version == 1) {
h1_seq = header1->sequence_number;
h1_valid = true;
}
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4) &&
!memcmp(&header1->signature, "head", 4) &&
header1->version == 1) {
h1_seq = header1->sequence_number;
h1_valid = true;
}
ret = bdrv_pread(bs->file, VHDX_HEADER2_OFFSET, buffer, VHDX_HEADER_SIZE);
@@ -449,14 +446,13 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
}
/* copy over just the relevant portion that we need */
memcpy(header2, buffer, sizeof(VHDXHeader));
vhdx_header_le_import(header2);
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4)) {
vhdx_header_le_import(header2);
if (header2->signature == VHDX_HEADER_SIGNATURE &&
header2->version == 1) {
h2_seq = header2->sequence_number;
h2_valid = true;
}
if (vhdx_checksum_is_valid(buffer, VHDX_HEADER_SIZE, 4) &&
!memcmp(&header2->signature, "head", 4) &&
header2->version == 1) {
h2_seq = header2->sequence_number;
h2_valid = true;
}
/* If there is only 1 valid header (or no valid headers), we
@@ -466,6 +462,7 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
} else if (!h1_valid && h2_valid) {
s->curr_header = 1;
} else if (!h1_valid && !h2_valid) {
ret = -EINVAL;
goto fail;
} else {
/* If both headers are valid, then we choose the active one by the
@@ -476,29 +473,27 @@ static void vhdx_parse_header(BlockDriverState *bs, BDRVVHDXState *s,
} else if (h2_seq > h1_seq) {
s->curr_header = 1;
} else {
/* The Microsoft Disk2VHD tool will create 2 identical
* headers, with identical sequence numbers. If the headers are
* identical, don't consider the file corrupt */
if (!memcmp(header1, header2, sizeof(VHDXHeader))) {
s->curr_header = 0;
} else {
goto fail;
}
ret = -EINVAL;
goto fail;
}
}
vhdx_region_register(s, s->headers[s->curr_header]->log_offset,
s->headers[s->curr_header]->log_length);
ret = 0;
goto exit;
fail:
error_setg_errno(errp, -ret, "No valid VHDX header found");
qerror_report(ERROR_CLASS_GENERIC_ERROR, "No valid VHDX header found");
qemu_vfree(header1);
qemu_vfree(header2);
s->headers[0] = NULL;
s->headers[1] = NULL;
exit:
qemu_vfree(buffer);
return ret;
}
@@ -522,21 +517,15 @@ static int vhdx_open_region_tables(BlockDriverState *bs, BDRVVHDXState *s)
goto fail;
}
memcpy(&s->rt, buffer, sizeof(s->rt));
vhdx_region_header_le_import(&s->rt);
offset += sizeof(s->rt);
if (!vhdx_checksum_is_valid(buffer, VHDX_HEADER_BLOCK_SIZE, 4)) {
if (!vhdx_checksum_is_valid(buffer, VHDX_HEADER_BLOCK_SIZE, 4) ||
memcmp(&s->rt.signature, "regi", 4)) {
ret = -EINVAL;
goto fail;
}
vhdx_region_header_le_import(&s->rt);
if (s->rt.signature != VHDX_REGION_SIGNATURE) {
ret = -EINVAL;
goto fail;
}
/* Per spec, maximum region table entry count is 2047 */
if (s->rt.entry_count > 2047) {
ret = -EINVAL;
@@ -639,7 +628,7 @@ static int vhdx_parse_metadata(BlockDriverState *bs, BDRVVHDXState *s)
vhdx_metadata_header_le_import(&s->metadata_hdr);
if (s->metadata_hdr.signature != VHDX_METADATA_SIGNATURE) {
if (memcmp(&s->metadata_hdr.signature, "metadata", 8)) {
ret = -EINVAL;
goto exit;
}
@@ -897,7 +886,8 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
int ret = 0;
uint32_t i;
uint64_t signature;
Error *local_err = NULL;
bool log_flushed = false;
s->bat = NULL;
s->first_visible_write = true;
@@ -920,14 +910,12 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
* header update */
vhdx_guid_generate(&s->session_guid);
vhdx_parse_header(bs, s, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
ret = -EINVAL;
ret = vhdx_parse_header(bs, s);
if (ret < 0) {
goto fail;
}
ret = vhdx_parse_log(bs, s, &s->log_replayed_on_open, errp);
ret = vhdx_parse_log(bs, s, &log_flushed);
if (ret < 0) {
goto fail;
}
@@ -959,11 +947,7 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
}
/* s->bat is freed in vhdx_close() */
s->bat = qemu_try_blockalign(bs->file, s->bat_rt.length);
if (s->bat == NULL) {
ret = -ENOMEM;
goto fail;
}
s->bat = qemu_blockalign(bs, s->bat_rt.length);
ret = bdrv_pread(bs->file, s->bat_offset, s->bat, s->bat_rt.length);
if (ret < 0) {
@@ -1002,9 +986,9 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
/* TODO: differencing files */
/* Disable migration when VHDX images are used */
error_setg(&s->migration_blocker, "The vhdx format used by node '%s' "
"does not support live migration",
bdrv_get_device_or_node_name(bs));
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"vhdx", bs->device_name, "live migration");
migrate_add_blocker(s->migration_blocker);
return 0;
@@ -1067,18 +1051,6 @@ static void vhdx_block_translate(BDRVVHDXState *s, int64_t sector_num,
}
static int vhdx_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
BDRVVHDXState *s = bs->opaque;
bdi->cluster_size = s->block_size;
bdi->unallocated_blocks_are_zero =
(s->params.data_bits & VHDX_PARAMS_HAS_PARENT) == 0;
return 0;
}
static coroutine_fn int vhdx_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
@@ -1109,9 +1081,8 @@ static coroutine_fn int vhdx_co_readv(BlockDriverState *bs, int64_t sector_num,
/* check the payload block state */
switch (s->bat[sinfo.bat_idx] & VHDX_BAT_STATE_BIT_MASK) {
case PAYLOAD_BLOCK_NOT_PRESENT: /* fall through */
case PAYLOAD_BLOCK_UNDEFINED:
case PAYLOAD_BLOCK_UNMAPPED:
case PAYLOAD_BLOCK_UNMAPPED_v095:
case PAYLOAD_BLOCK_UNDEFINED: /* fall through */
case PAYLOAD_BLOCK_UNMAPPED: /* fall through */
case PAYLOAD_BLOCK_ZERO:
/* return zero */
qemu_iovec_memset(&hd_qiov, 0, 0, sinfo.bytes_avail);
@@ -1174,18 +1145,7 @@ static void vhdx_update_bat_table_entry(BlockDriverState *bs, BDRVVHDXState *s,
{
/* The BAT entry is a uint64, with 44 bits for the file offset in units of
* 1MB, and 3 bits for the block state. */
if ((state == PAYLOAD_BLOCK_ZERO) ||
(state == PAYLOAD_BLOCK_UNDEFINED) ||
(state == PAYLOAD_BLOCK_NOT_PRESENT) ||
(state == PAYLOAD_BLOCK_UNMAPPED)) {
s->bat[sinfo->bat_idx] = 0; /* For PAYLOAD_BLOCK_ZERO, the
FileOffsetMB field is denoted as
'reserved' in the v1.0 spec. If it is
non-zero, MS Hyper-V will fail to read
the disk image */
} else {
s->bat[sinfo->bat_idx] = sinfo->file_offset;
}
s->bat[sinfo->bat_idx] = sinfo->file_offset;
s->bat[sinfo->bat_idx] |= state & VHDX_BAT_STATE_BIT_MASK;
@@ -1269,7 +1229,7 @@ static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
iov1.iov_base = qemu_blockalign(bs, iov1.iov_len);
memset(iov1.iov_base, 0, iov1.iov_len);
qemu_iovec_concat_iov(&hd_qiov, &iov1, 1, 0,
iov1.iov_len);
sinfo.block_offset);
sectors_to_write += iov1.iov_len >> BDRV_SECTOR_BITS;
}
@@ -1285,15 +1245,15 @@ static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
iov2.iov_base = qemu_blockalign(bs, iov2.iov_len);
memset(iov2.iov_base, 0, iov2.iov_len);
qemu_iovec_concat_iov(&hd_qiov, &iov2, 1, 0,
iov2.iov_len);
sinfo.block_offset);
sectors_to_write += iov2.iov_len >> BDRV_SECTOR_BITS;
}
}
/* fall through */
case PAYLOAD_BLOCK_NOT_PRESENT: /* fall through */
case PAYLOAD_BLOCK_UNMAPPED:
case PAYLOAD_BLOCK_UNMAPPED_v095:
case PAYLOAD_BLOCK_UNDEFINED:
case PAYLOAD_BLOCK_UNMAPPED: /* fall through */
case PAYLOAD_BLOCK_UNDEFINED: /* fall through */
bat_prior_offset = sinfo.file_offset;
ret = vhdx_allocate_block(bs, s, &sinfo.file_offset);
if (ret < 0) {
@@ -1394,7 +1354,7 @@ static int vhdx_create_new_headers(BlockDriverState *bs, uint64_t image_size,
int ret = 0;
VHDXHeader *hdr = NULL;
hdr = g_new0(VHDXHeader, 1);
hdr = g_malloc0(sizeof(VHDXHeader));
hdr->signature = VHDX_HEADER_SIGNATURE;
hdr->sequence_number = g_random_int();
@@ -1420,12 +1380,6 @@ exit:
return ret;
}
#define VHDX_METADATA_ENTRY_BUFFER_SIZE \
(sizeof(VHDXFileParameters) +\
sizeof(VHDXVirtualDiskSize) +\
sizeof(VHDXPage83Data) +\
sizeof(VHDXVirtualDiskLogicalSectorSize) +\
sizeof(VHDXVirtualDiskPhysicalSectorSize))
/*
* Create the Metadata entries.
@@ -1464,7 +1418,11 @@ static int vhdx_create_new_metadata(BlockDriverState *bs,
VHDXVirtualDiskLogicalSectorSize *mt_log_sector_size;
VHDXVirtualDiskPhysicalSectorSize *mt_phys_sector_size;
entry_buffer = g_malloc0(VHDX_METADATA_ENTRY_BUFFER_SIZE);
entry_buffer = g_malloc0(sizeof(VHDXFileParameters) +
sizeof(VHDXVirtualDiskSize) +
sizeof(VHDXPage83Data) +
sizeof(VHDXVirtualDiskLogicalSectorSize) +
sizeof(VHDXVirtualDiskPhysicalSectorSize));
mt_file_params = entry_buffer;
offset += sizeof(VHDXFileParameters);
@@ -1545,7 +1503,7 @@ static int vhdx_create_new_metadata(BlockDriverState *bs,
}
ret = bdrv_pwrite(bs, metadata_offset + (64 * KiB), entry_buffer,
VHDX_METADATA_ENTRY_BUFFER_SIZE);
VHDX_HEADER_BLOCK_SIZE);
if (ret < 0) {
goto exit;
}
@@ -1567,8 +1525,7 @@ exit:
*/
static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
uint64_t image_size, VHDXImageType type,
bool use_zero_blocks, uint64_t file_offset,
uint32_t length)
bool use_zero_blocks, VHDXRegionTableEntry *rt_bat)
{
int ret = 0;
uint64_t data_file_offset;
@@ -1583,7 +1540,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
/* this gives a data start after BAT/bitmap entries, and well
* past any metadata entries (with a 4 MB buffer for future
* expansion */
data_file_offset = file_offset + length + 5 * MiB;
data_file_offset = rt_bat->file_offset + rt_bat->length + 5 * MiB;
total_sectors = image_size >> s->logical_sector_size_bits;
if (type == VHDX_TYPE_DYNAMIC) {
@@ -1607,11 +1564,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
use_zero_blocks ||
bdrv_has_zero_init(bs) == 0) {
/* for a fixed file, the default BAT entry is not zero */
s->bat = g_try_malloc0(length);
if (length && s->bat == NULL) {
ret = -ENOMEM;
goto exit;
}
s->bat = g_malloc0(rt_bat->length);
block_state = type == VHDX_TYPE_FIXED ? PAYLOAD_BLOCK_FULLY_PRESENT :
PAYLOAD_BLOCK_NOT_PRESENT;
block_state = use_zero_blocks ? PAYLOAD_BLOCK_ZERO : block_state;
@@ -1626,7 +1579,7 @@ static int vhdx_create_bat(BlockDriverState *bs, BDRVVHDXState *s,
cpu_to_le64s(&s->bat[sinfo.bat_idx]);
sector_num += s->sectors_per_block;
}
ret = bdrv_pwrite(bs, file_offset, s->bat, length);
ret = bdrv_pwrite(bs, rt_bat->file_offset, s->bat, rt_bat->length);
if (ret < 0) {
goto exit;
}
@@ -1658,8 +1611,6 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
int ret = 0;
uint32_t offset = 0;
void *buffer = NULL;
uint64_t bat_file_offset;
uint32_t bat_length;
BDRVVHDXState *s = NULL;
VHDXRegionTableHeader *region_table;
VHDXRegionTableEntry *rt_bat;
@@ -1669,7 +1620,7 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
/* Populate enough of the BDRVVHDXState to be able to use the
* pre-existing BAT calculation, translation, and update functions */
s = g_new0(BDRVVHDXState, 1);
s = g_malloc0(sizeof(BDRVVHDXState));
s->chunk_ratio = (VHDX_MAX_SECTORS_PER_BLOCK) *
(uint64_t) sector_size / (uint64_t) block_size;
@@ -1708,26 +1659,19 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
rt_metadata->length = 1 * MiB; /* min size, and more than enough */
*metadata_offset = rt_metadata->file_offset;
bat_file_offset = rt_bat->file_offset;
bat_length = rt_bat->length;
vhdx_region_header_le_export(region_table);
vhdx_region_entry_le_export(rt_bat);
vhdx_region_entry_le_export(rt_metadata);
vhdx_update_checksum(buffer, VHDX_HEADER_BLOCK_SIZE,
offsetof(VHDXRegionTableHeader, checksum));
/* The region table gives us the data we need to create the BAT,
* so do that now */
ret = vhdx_create_bat(bs, s, image_size, type, use_zero_blocks,
bat_file_offset, bat_length);
if (ret < 0) {
goto exit;
}
ret = vhdx_create_bat(bs, s, image_size, type, use_zero_blocks, rt_bat);
/* Now write out the region headers to disk */
vhdx_region_header_le_export(region_table);
vhdx_region_entry_le_export(rt_bat);
vhdx_region_entry_le_export(rt_metadata);
ret = bdrv_pwrite(bs, VHDX_REGION_TABLE_OFFSET, buffer,
VHDX_HEADER_BLOCK_SIZE);
if (ret < 0) {
@@ -1740,6 +1684,7 @@ static int vhdx_create_new_region_table(BlockDriverState *bs,
goto exit;
}
exit:
g_free(s);
g_free(buffer);
@@ -1763,7 +1708,8 @@ exit:
* .---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
* 1MB
*/
static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
static int vhdx_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int ret = 0;
uint64_t image_size = (uint64_t) 2 * GiB;
@@ -1776,16 +1722,24 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
gunichar2 *creator = NULL;
glong creator_items;
BlockDriverState *bs;
char *type = NULL;
const char *type = NULL;
VHDXImageType image_type;
Error *local_err = NULL;
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
log_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_LOG_SIZE, 0);
block_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_BLOCK_SIZE, 0);
type = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
use_zero_blocks = qemu_opt_get_bool_del(opts, VHDX_BLOCK_OPT_ZERO, true);
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
image_size = options->value.n;
} else if (!strcmp(options->name, VHDX_BLOCK_OPT_LOG_SIZE)) {
log_size = options->value.n;
} else if (!strcmp(options->name, VHDX_BLOCK_OPT_BLOCK_SIZE)) {
block_size = options->value.n;
} else if (!strcmp(options->name, BLOCK_OPT_SUBFMT)) {
type = options->value.s;
} else if (!strcmp(options->name, VHDX_BLOCK_OPT_ZERO)) {
use_zero_blocks = options->value.n != 0;
}
options++;
}
if (image_size > VHDX_MAX_IMAGE_SIZE) {
error_setg_errno(errp, EINVAL, "Image size too large; max of 64TB");
@@ -1794,7 +1748,7 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
}
if (type == NULL) {
type = g_strdup("dynamic");
type = "dynamic";
}
if (!strcmp(type, "dynamic")) {
@@ -1834,15 +1788,13 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
block_size = block_size > VHDX_BLOCK_SIZE_MAX ? VHDX_BLOCK_SIZE_MAX :
block_size;
ret = bdrv_create_file(filename, opts, &local_err);
ret = bdrv_create_file(filename, options, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
}
bs = NULL;
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
NULL, &local_err);
ret = bdrv_file_open(&bs, filename, NULL, BDRV_O_RDWR, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
@@ -1855,13 +1807,13 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
creator = g_utf8_to_utf16("QEMU v" QEMU_VERSION, -1, NULL,
&creator_items, NULL);
signature = cpu_to_le64(VHDX_FILE_SIGNATURE);
ret = bdrv_pwrite(bs, VHDX_FILE_ID_OFFSET, &signature, sizeof(signature));
bdrv_pwrite(bs, VHDX_FILE_ID_OFFSET, &signature, sizeof(signature));
if (ret < 0) {
goto delete_and_exit;
}
if (creator) {
ret = bdrv_pwrite(bs, VHDX_FILE_ID_OFFSET + sizeof(signature),
creator, creator_items * sizeof(gunichar2));
bdrv_pwrite(bs, VHDX_FILE_ID_OFFSET + sizeof(signature), creator,
creator_items * sizeof(gunichar2));
if (ret < 0) {
goto delete_and_exit;
}
@@ -1890,69 +1842,45 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
}
delete_and_exit:
bdrv_unref(bs);
exit:
g_free(type);
g_free(creator);
return ret;
}
/* If opened r/w, the VHDX driver will automatically replay the log,
* if one is present, inside the vhdx_open() call.
*
* If qemu-img check -r all is called, the image is automatically opened
* r/w and any log has already been replayed, so there is nothing (currently)
* for us to do here
*/
static int vhdx_check(BlockDriverState *bs, BdrvCheckResult *result,
BdrvCheckMode fix)
{
BDRVVHDXState *s = bs->opaque;
if (s->log_replayed_on_open) {
result->corruptions_fixed++;
}
return 0;
}
static QemuOptsList vhdx_create_opts = {
.name = "vhdx-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(vhdx_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size; max of 64TB."
},
{
.name = VHDX_BLOCK_OPT_LOG_SIZE,
.type = QEMU_OPT_SIZE,
.def_value_str = stringify(DEFAULT_LOG_SIZE),
.help = "Log size; min 1MB."
},
{
.name = VHDX_BLOCK_OPT_BLOCK_SIZE,
.type = QEMU_OPT_SIZE,
.def_value_str = stringify(0),
.help = "Block Size; min 1MB, max 256MB. " \
"0 means auto-calculate based on image size."
},
{
.name = BLOCK_OPT_SUBFMT,
.type = QEMU_OPT_STRING,
.help = "VHDX format type, can be either 'dynamic' or 'fixed'. "\
"Default is 'dynamic'."
},
{
.name = VHDX_BLOCK_OPT_ZERO,
.type = QEMU_OPT_BOOL,
.help = "Force use of payload blocks of type 'ZERO'. "\
"Non-standard, but default. Do not set to 'off' when "\
"using 'qemu-img convert' with subformat=dynamic."
},
{ NULL }
}
static QEMUOptionParameter vhdx_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size; max of 64TB."
},
{
.name = VHDX_BLOCK_OPT_LOG_SIZE,
.type = OPT_SIZE,
.value.n = 1 * MiB,
.help = "Log size; min 1MB."
},
{
.name = VHDX_BLOCK_OPT_BLOCK_SIZE,
.type = OPT_SIZE,
.value.n = 0,
.help = "Block Size; min 1MB, max 256MB. " \
"0 means auto-calculate based on image size."
},
{
.name = BLOCK_OPT_SUBFMT,
.type = OPT_STRING,
.help = "VHDX format type, can be either 'dynamic' or 'fixed'. "\
"Default is 'dynamic'."
},
{
.name = VHDX_BLOCK_OPT_ZERO,
.type = OPT_FLAG,
.help = "Force use of payload blocks of type 'ZERO'. Non-standard."
},
{ NULL }
};
static BlockDriver bdrv_vhdx = {
@@ -1965,11 +1893,8 @@ static BlockDriver bdrv_vhdx = {
.bdrv_co_readv = vhdx_co_readv,
.bdrv_co_writev = vhdx_co_writev,
.bdrv_create = vhdx_create,
.bdrv_get_info = vhdx_get_info,
.bdrv_check = vhdx_check,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.create_opts = &vhdx_create_opts,
.create_options = vhdx_create_options,
};
static void bdrv_vhdx_init(void)

View File

@@ -23,7 +23,6 @@
#define GiB (MiB * 1024)
#define TiB ((uint64_t) GiB * 1024)
#define DEFAULT_LOG_SIZE 1048576 /* 1MiB */
/* Structures and fields present in the VHDX file */
/* The header section has the following blocks,
@@ -62,7 +61,7 @@
/* These structures are ones that are defined in the VHDX specification
* document */
#define VHDX_FILE_SIGNATURE 0x656C696678646876ULL /* "vhdxfile" in ASCII */
#define VHDX_FILE_SIGNATURE 0x656C696678646876 /* "vhdxfile" in ASCII */
typedef struct VHDXFileIdentifier {
uint64_t signature; /* "vhdxfile" in ASCII */
uint16_t creator[256]; /* optional; utf-16 string to identify
@@ -226,8 +225,7 @@ typedef struct QEMU_PACKED VHDXLogDataSector {
#define PAYLOAD_BLOCK_NOT_PRESENT 0
#define PAYLOAD_BLOCK_UNDEFINED 1
#define PAYLOAD_BLOCK_ZERO 2
#define PAYLOAD_BLOCK_UNMAPPED 3
#define PAYLOAD_BLOCK_UNMAPPED_v095 5
#define PAYLOAD_BLOCK_UNMAPPED 5
#define PAYLOAD_BLOCK_FULLY_PRESENT 6
#define PAYLOAD_BLOCK_PARTIALLY_PRESENT 7
@@ -240,7 +238,7 @@ typedef struct QEMU_PACKED VHDXLogDataSector {
/* upper 44 bits are the file offset in 1MB units lower 3 bits are the state
other bits are reserved */
#define VHDX_BAT_STATE_BIT_MASK 0x07
#define VHDX_BAT_FILE_OFF_MASK 0xFFFFFFFFFFF00000ULL /* upper 44 bits */
#define VHDX_BAT_FILE_OFF_MASK 0xFFFFFFFFFFF00000 /* upper 44 bits */
typedef uint64_t VHDXBatEntry;
/* ---- METADATA REGION STRUCTURES ---- */
@@ -249,7 +247,7 @@ typedef uint64_t VHDXBatEntry;
#define VHDX_METADATA_MAX_ENTRIES 2047 /* not including the header */
#define VHDX_METADATA_TABLE_MAX_SIZE \
(VHDX_METADATA_ENTRY_SIZE * (VHDX_METADATA_MAX_ENTRIES+1))
#define VHDX_METADATA_SIGNATURE 0x617461646174656DULL /* "metadata" in ASCII */
#define VHDX_METADATA_SIGNATURE 0x617461646174656D /* "metadata" in ASCII */
typedef struct QEMU_PACKED VHDXMetadataTableHeader {
uint64_t signature; /* "metadata" in ASCII */
uint16_t reserved;
@@ -396,8 +394,6 @@ typedef struct BDRVVHDXState {
Error *migration_blocker;
bool log_replayed_on_open;
QLIST_HEAD(VHDXRegionHead, VHDXRegionEntry) regions;
} BDRVVHDXState;
@@ -412,8 +408,7 @@ uint32_t vhdx_checksum_calc(uint32_t crc, uint8_t *buf, size_t size,
bool vhdx_checksum_is_valid(uint8_t *buf, size_t size, int crc_offset);
int vhdx_parse_log(BlockDriverState *bs, BDRVVHDXState *s, bool *flushed,
Error **errp);
int vhdx_parse_log(BlockDriverState *bs, BDRVVHDXState *s, bool *flushed);
int vhdx_log_write_and_flush(BlockDriverState *bs, BDRVVHDXState *s,
void *data, uint32_t length, uint64_t offset);
@@ -436,7 +431,6 @@ void vhdx_header_le_import(VHDXHeader *h);
void vhdx_header_le_export(VHDXHeader *orig_h, VHDXHeader *new_h);
void vhdx_log_desc_le_import(VHDXLogDescriptor *d);
void vhdx_log_desc_le_export(VHDXLogDescriptor *d);
void vhdx_log_data_le_import(VHDXLogDataSector *d);
void vhdx_log_data_le_export(VHDXLogDataSector *d);
void vhdx_log_entry_hdr_le_import(VHDXLogEntryHeader *hdr);
void vhdx_log_entry_hdr_le_export(VHDXLogEntryHeader *hdr);

Some files were not shown because too many files have changed in this diff Show More