Compare commits

1160 Commits

Author SHA1 Message Date
Anthony Liguori
964668b03d Update version for 1.7.0-rc0 release
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-06 21:49:39 -08:00
Lei Li
898ae2846d sdl: Reverse support for video mode setting
Currently, if setting the video mode fails, qemu exits. It should
instead go back to the previous setting if the new screen resolution
fails. This patch fixes LP#1216368 by adding support for reverting to
the existing surface when video mode setting fails.

Reported-by: Sascha Krissler <sascha@srlabs.de>
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1378285636-7091-1-git-send-email-lilei@linux.vnet.ibm.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-06 21:47:13 -08:00
Paolo Bonzini
5f3e31012e timers: fix stop/cont with -icount
Stop/cont commands are broken with -icount due to a deadlock.  The
real problem is that the computation of timers_state.cpu_ticks_offset
makes no sense with -icount enabled: we set it to an icount clock value
in cpu_disable_ticks, and subtract a TSC (or similar, whatever
cpu_get_real_ticks happens to return) value in cpu_enable_ticks.

The fix is simple.  timers_state.cpu_ticks_offset is only used
together with cpu_get_real_ticks, so we can use cpu_get_real_ticks
in cpu_disable_ticks.  There is no need to update cpu_ticks_prev
at the time cpu_disable_ticks is called; instead, we can do it
the next time cpu_get_ticks is called.

The change to cpu_disable_ticks is the important part of the patch.
The rest modifies the code to always check timers_state.cpu_ticks_prev,
even when the ticks are not advancing (i.e. the VM is stopped).  It also
makes a similar change to cpu_get_clock_locked, so that the code remains
similar for cpu_get_ticks and cpu_get_clock_locked.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1382977938-13844-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-06 21:47:05 -08:00
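
The offset bookkeeping described in the timers commit above can be modelled in a few lines. This is a minimal sketch, not QEMU's code: `tick_state`, `real_ticks()` and `fake_real_ticks` are hypothetical stand-ins. It only illustrates that the value added when ticks are disabled and the value subtracted when they are enabled must come from the same clock source, so that the visible tick count is continuous across stop/cont.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the host tick source (e.g. a TSC read). */
static int64_t fake_real_ticks;

int64_t real_ticks(void)
{
    return fake_real_ticks;
}

typedef struct {
    int64_t offset;   /* added to real_ticks() while enabled; frozen value while disabled */
    bool enabled;
} tick_state;

int64_t get_ticks(const tick_state *s)
{
    assert(s != NULL);
    return s->enabled ? real_ticks() + s->offset : s->offset;
}

void disable_ticks(tick_state *s)
{
    if (s->enabled) {
        s->offset += real_ticks();   /* freeze, using the same source as enable */
        s->enabled = false;
    }
}

void enable_ticks(tick_state *s)
{
    if (!s->enabled) {
        s->offset -= real_ticks();   /* resume from the frozen value */
        s->enabled = true;
    }
}
```

Mixing two different clocks in `disable_ticks()` and `enable_ticks()` would leave `offset` meaningless, which is the inconsistency the commit message describes.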
Amos Kong
cd5be5829c e1000/rtl8139: update HMP NIC when every bit is written
We currently update the HMP NIC info only when the last byte of the
macaddr is written. This assumes that the guest driver writes every
macaddr byte, from byte 0 to byte 5, when it changes the macaddr. That
is the current behavior of the Linux drivers (e1000/rtl8139cp), but we
cannot rely on this assumption.

The macaddr that is used for the rx-filter is updated whenever any byte
changes. This patch makes the e1000/rtl8139 NICs update the HMP NIC
info whenever any byte changes, the same as virtio-net.

Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Message-id: 1383650238-16015-1-git-send-email-akong@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-06 21:46:25 -08:00
Jason Wang
fe2dafa02d virtio-net: only delete bh that existed
We delete the bh on exit without checking whether it exists. This leads to a
NULL pointer dereference, since the bh is created conditionally depending on
guest driver status and features. So add an existence check before deleting it.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383728288-28469-1-git-send-email-jasowang@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-06 21:46:13 -08:00
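
The guard described in the virtio-net commit above follows a common cleanup pattern. A minimal sketch, assuming hypothetical types (`bottom_half`, `net_queue`) and helpers rather than QEMU's actual structures:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not QEMU's bottom-half or virtio-net types. */
typedef struct { int scheduled; } bottom_half;

typedef struct {
    bottom_half *tx_bh;   /* created only if the guest set the device up that way */
} net_queue;

static int deleted_count;

void bh_delete(bottom_half *bh)
{
    assert(bh != NULL);   /* deleting a missing bh would dereference NULL */
    bh->scheduled = 0;
    free(bh);
    deleted_count++;
}

void queue_cleanup(net_queue *q)
{
    if (q->tx_bh) {       /* only delete a bh that was actually created */
        bh_delete(q->tx_bh);
        q->tx_bh = NULL;
    }
}
```

Clearing the pointer after deletion also makes a second cleanup call harmless.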
Jan Kiszka
c2d3066776 rtc: remove dead SQW IRQ code
This was once introduced by commit 100d9891d6 but was never used in-tree
and then got broken by commit 32e0c8260d. Time to clean up.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-id: 520B6A27.4040207@siemens.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 20:04:03 -08:00
Marc-André Lureau
2c8ebac7cc vga: fix invalid read after free
After calling dpy_gfx_replace_surface(s->con, surface), the outer
surface is invalid.

==5370== Invalid read of size 4
==5370==    at 0x460229: surface_bits_per_pixel (console.h:250)
==5370==    by 0x466A81: get_depth_index (vga.c:1173)
==5370==    by 0x467EC2: vga_draw_graphic (vga.c:1718)
==5370==    by 0x4687A5: vga_update_display (vga.c:1914)
==5370==    by 0x2A782E: qxl_hw_update (qxl.c:1766)
==5370==    by 0x3EB83B: graphic_hw_update (console.c:254)
==5370==    by 0x3FBE31: qemu_spice_display_refresh (spice-display.c:418)
==5370==    by 0x2A7D01: display_refresh (qxl.c:1886)
==5370==    by 0x3EEE1C: dpy_refresh (console.c:1436)
==5370==    by 0x3EB543: gui_update (console.c:192)
==5370==    by 0x3C43B3: timerlist_run_timers (qemu-timer.c:488)
==5370==    by 0x3C4416: qemu_clock_run_timers (qemu-timer.c:499)
==5370==  Address 0x22ffb1e0 is 0 bytes inside a block of size 56 free'd
==5370==    at 0x4A074C4: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==5370==    by 0x4245FC: free_and_trace (vl.c:2771)
==5370==    by 0x50899AE: g_free (gmem.c:252)
==5370==    by 0x3EE8D3: qemu_free_displaysurface (console.c:1332)
==5370==    by 0x3EEDB7: dpy_gfx_replace_surface (console.c:1427)
==5370==    by 0x467EB6: vga_draw_graphic (vga.c:1714)
==5370==    by 0x4687A5: vga_update_display (vga.c:1914)
==5370==    by 0x2A782E: qxl_hw_update (qxl.c:1766)
==5370==    by 0x3EB83B: graphic_hw_update (console.c:254)
==5370==    by 0x3FBE31: qemu_spice_display_refresh (spice-display.c:418)
==5370==    by 0x2A7D01: display_refresh (qxl.c:1886)
==5370==    by 0x3EEE1C: dpy_refresh (console.c:1436)

Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1383664554-15248-1-git-send-email-marcandre.lureau@gmail.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 20:01:11 -08:00
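
The stale-pointer problem in the vga commit above can be shown with a small model. This is a sketch under assumptions: `surface`, `console` and `replace_surface()` are hypothetical, and the only point is that a local alias must be refreshed after a call that frees the object it pointed to.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int bits_per_pixel; } surface;   /* hypothetical */
typedef struct { surface *current; } console;     /* hypothetical */

/* Installs a new surface and frees the one it replaces. */
void replace_surface(console *con, surface *fresh)
{
    free(con->current);
    con->current = fresh;
}

int depth_after_resize(console *con, int new_bpp)
{
    surface *surf = con->current;          /* local alias of the old surface */
    surface *fresh = malloc(sizeof(*fresh));

    assert(fresh != NULL);
    fresh->bits_per_pixel = new_bpp;

    replace_surface(con, fresh);
    surf = con->current;                   /* refresh the alias; the old one is freed */
    return surf->bits_per_pixel;
}
```

Without the refresh, the final read would touch freed memory, which is the kind of access the valgrind trace reports.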
Stefan Hajnoczi
5cb6be2ca3 tests: fix 64-bit int literals for 32-bit hosts
On 32-bit hosts:

  CC    tests/test-opts-visitor.o
tests/test-opts-visitor.c: In function 'test_value':
tests/test-opts-visitor.c:128: warning: integer constant is too large for 'long' type
  CC    tests/test-bitops.o
tests/test-bitops.c:34: warning: integer constant is too large for 'long' type
tests/test-bitops.c:35: warning: integer constant is too large for 'long' type
tests/test-bitops.c:35: warning: integer constant is too large for 'long' type
  CC    tests/endianness-test.o
tests/endianness-test.c:47: warning: integer constant is too large for 'long' type

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1383669768-23926-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:59:43 -08:00
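
Warnings like the ones quoted above typically go away once 64-bit constants carry an explicit 64-bit type. A minimal sketch of two standard ways to write them; the function names are illustrative and not taken from the tests that were fixed:

```c
#include <assert.h>
#include <stdint.h>

/* UINT64_C appends the right suffix for a 64-bit constant on any host. */
uint64_t all_ones64(void)
{
    return UINT64_C(0xffffffffffffffff);
}

/* An explicit ULL suffix has the same effect for this value. */
uint64_t high_bit64(void)
{
    return 0x8000000000000000ULL;
}

/* The constant 1 must be 64-bit before shifting, or the shift overflows 'long'. */
uint64_t mask_low_bits(unsigned n)
{
    assert(n <= 64);
    return n == 64 ? all_ones64() : (UINT64_C(1) << n) - 1;
}
```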
Peter Maydell
6f1ce94a29 docs/memory.txt: Clarify and expand priority/overlap documentation
The documentation of how overlapping memory regions behave and how
the priority system works was rather brief, and confusion about
priorities seems to be quite common for developers trying to understand
how the memory region system works, so expand and clarify it.
This includes a worked example with overlaps, documentation of the
behaviour when an overlapped container has "holes", and mention
that it's valid for a region to have both MMIO callbacks and
subregions (and how this interacts with priorities when it does).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1381848154-31602-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:59:24 -08:00
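
The priority rule that the documentation commit above expands on can be illustrated with a deliberately flat toy model. It is only a sketch: the `region` struct and `resolve()` are hypothetical, and real memory regions also nest inside containers, which this model ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint64_t base, size;
    int priority;            /* higher wins where regions overlap */
} region;

/* Returns the highest-priority region covering addr, or NULL for a hole. */
const region *resolve(const region *r, size_t n, uint64_t addr)
{
    const region *best = NULL;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (addr >= r[i].base && addr - r[i].base < r[i].size &&
            (best == NULL || r[i].priority > best->priority)) {
            best = &r[i];
        }
    }
    return best;
}
```

With a priority-0 RAM region at 0x0000..0x7fff and a priority-1 MMIO region at 0x4000..0x4fff, an access at 0x4800 resolves to the MMIO region, while 0x5000 falls back to RAM.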
Mike Frysinger
61cc919f73 configure: detect endian via compile test
This avoids needing to execute a program and keeping an (incomplete)
list when cross-compiling.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: James Hogan <james.hogan@imgtec.com> [mips]
Message-id: 1372649418-4987-1-git-send-email-vapier@gentoo.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:58:48 -08:00
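
One way a compile-only endianness test can work is to embed 16-bit constants whose in-memory bytes spell a readable marker on only one kind of target; a build script can then search the compiled object file for the marker instead of running anything. The sketch below is an assumption about the general technique, not a copy of the configure change. It checks the markers in-process purely for demonstration, and it assumes a 16-bit `unsigned short`.

```c
#include <assert.h>
#include <string.h>

/*
 * Each 16-bit value packs two ASCII characters. In memory (and therefore in
 * the compiled object file) one array spells a readable marker on big-endian
 * targets, the other on little-endian targets.
 */
const unsigned short big_marker[] = {
    0x4269, 0x4765, 0x4e64, 0x4961, 0x4e00
};
const unsigned short little_marker[] = {
    0x694c, 0x7454, 0x654c, 0x6e45, 0x6944, 0x6e41, 0
};

/* Returns 1 for big-endian, 0 for little-endian, by inspecting the bytes. */
int target_is_big_endian(void)
{
    int big = memcmp(big_marker, "BiGeNdIaN", 9) == 0;
    int little = memcmp(little_marker, "LiTtLeEnDiAn", 12) == 0;

    assert(big != little);   /* exactly one marker is readable */
    return big;
}
```

When cross-compiling, the same arrays would be compiled for the target and the object file scanned for either string, so no target program has to be executed.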
Wenchao Xia
8aa15b6e52 tests: fix memleak in error path test for input visitor
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1383676551-18806-3-git-send-email-xiawenc@linux.vnet.ibm.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:58:38 -08:00
Wenchao Xia
3dce9cad5a qapi: fix memleak by adding implict struct functions in dealloc visitor
Otherwise member "base" is leaked in a qapi_free_STRUCTURE() call.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1383676551-18806-2-git-send-email-xiawenc@linux.vnet.ibm.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:58:38 -08:00
Peter Maydell
7d579514a5 bswap.h: Remove cpu_to_32wu()
Replace the legacy cpu_to_32wu() with stl_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-10-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:47 -08:00
Peter Maydell
e4ef9f465c bswap.h: Remove cpu_to_be64wu()
Replace the legacy cpu_to_be64wu() with stq_be_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-9-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:47 -08:00
Peter Maydell
6bd194ab99 bswap.h: Remove cpu_to_be32wu()
Replace the legacy cpu_to_be32wu() with stl_be_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-8-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:47 -08:00
Peter Maydell
d8ee2591e4 bswap.h: Remove cpu_to_be16wu()
Replace the legacy cpu_to_be16wu() with stw_be_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-7-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:47 -08:00
Peter Maydell
09fa843973 bswap.h: Remove be32_to_cpupu()
Replace the legacy be32_to_cpupu() with ldl_be_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-6-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:47 -08:00
Peter Maydell
f567656a67 bswap.h: Remove le32_to_cpupu()
Replace the legacy le32_to_cpupu() with ldl_le_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-5-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:46 -08:00
Peter Maydell
c65e5de94d bswap.h: Remove le16_to_cpupu()
Replace the legacy le16_to_cpupu() with lduw_le_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-4-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:46 -08:00
Peter Maydell
6e931878c1 bswap.h: Remove cpu_to_le32wu()
Replace the legacy cpu_to_le32wu() with stl_le_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-3-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:45 -08:00
Peter Maydell
587ae22760 bswap.h: Remove cpu_to_le16wu()
Replace the legacy cpu_to_le16wu() with stw_le_p().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1383669517-25598-2-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 19:57:45 -08:00
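
The series above replaces legacy byte-swapping wrappers with pointer-based load/store accessors. As an illustration of what such accessors do, here is a minimal sketch with hypothetical names (`store_be32`, `load_be32`, `load_le16`); it assumes byte-wise access, which works for any alignment and any host endianness, and is not QEMU's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value big-endian at any (possibly unaligned) address. */
void store_be32(void *ptr, uint32_t v)
{
    uint8_t *p = ptr;

    assert(p != NULL);
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Load a big-endian 32-bit value from any address. */
uint32_t load_be32(const void *ptr)
{
    const uint8_t *p = ptr;

    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Load a little-endian 16-bit value from any address. */
uint16_t load_le16(const void *ptr)
{
    const uint8_t *p = ptr;

    return (uint16_t)(p[0] | (p[1] << 8));
}
```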
Anthony Liguori
a30b377e0a Merge remote-tracking branch 'afaerber/tags/qom-devices-for-anthony' into staging
QOM device refactorings

* QTest coverage for all machines
* QOM realize for Milkymist UART
* QOM realize for ARM MPCore
* device_add bug fixes and cleanups
* QOM for PCMCIA/MicroDrive (last legacy IDE device)

# gpg: Signature made Tue 05 Nov 2013 09:07:03 AM PST using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Andreas Färber (49) and others
# Via Andreas Färber
* afaerber/tags/qom-devices-for-anthony: (54 commits)
  pcmcia/pxa2xx: QOM'ify PXA2xxPCMCIAState
  ide: Drop ide_init2_with_non_qdev_drives()
  microdrive: Coding Style cleanups
  pcmcia: QOM'ify PCMCIACardState and MicroDriveState
  pxa: Fix typo "dettach"
  qom: Fix pointer to int property helpers' documentation
  qdev-monitor: Inline qdev_init() for device_add
  qdev-monitor: Avoid qdev as variable name
  qdev: Drop misleading qdev_free() function
  qdev-monitor: Unref device when device_add fails
  qdev-monitor: Fix crash when device_add is called with abstract driver
  qdev-monitor: Clean up qdev_device_add() variable naming
  arm11mpcore: Split off RealView MPCore
  arm11mpcore: Prepare for QOM embedding
  arm11mpcore: Convert mpcore_rirq_state to QOM realize
  realview_gic: Prepare for QOM embedding
  realview_gic: Convert to QOM realize
  arm11mpcore: Convert ARM11MPCorePriveState to QOM realize
  arm11mpcore: Split off SCU device
  arm11mpcore: Create container MemoryRegion in instance_init
  ...
2013-11-05 10:33:32 -08:00
Andreas Färber
80bbaee66a pcmcia/pxa2xx: QOM'ify PXA2xxPCMCIAState
Turn it into a SysBusDevice and use a container MemoryRegion.

Add a link<pcmcia-card> property to the PCMCIACardState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:52 +01:00
Andreas Färber
e3d4d36d1b ide: Drop ide_init2_with_non_qdev_drives()
All its users have finally been converted.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:52 +01:00
Andreas Färber
a6cb20fcba microdrive: Coding Style cleanups
Add missing braces.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:52 +01:00
Andreas Färber
d1f2c96a81 pcmcia: QOM'ify PCMCIACardState and MicroDriveState
Turn PCMCIACardState into a device.
Move callbacks to new PCMCIACardClass.

Derive TYPE_MICRODRIVE from TYPE_PCMCIA_CARD.
Replace ide_init2_with_non_qdev_drives().

Signed-off-by: Othmar Pasteka <pasteka@kabsi.at>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:52 +01:00
Andreas Färber
853ca11daf pxa: Fix typo "dettach"
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:51 +01:00
Michael S. Tsirkin
a25ebcacdd qom: Fix pointer to int property helpers' documentation
Relocate to alongside the other object_property_add_* helpers while at it.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:51 +01:00
Andreas Färber
852e2c5008 qdev-monitor: Inline qdev_init() for device_add
For historic reasons, qdev_init() unparents the device on failure.
Inline this to make the error paths clearer and consistent.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:51 +01:00
Andreas Färber
2bcb0c62f6 qdev-monitor: Avoid qdev as variable name
Prepares for bringing error cleanup code into canonical QOM form.

Includes a whitespace removal after curly brace by Stefan.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:51 +01:00
Stefan Hajnoczi
02a5c4c974 qdev: Drop misleading qdev_free() function
The qdev_free() function name is misleading since all the function does
is unlink the device from its parent.  The device is not necessarily
freed.

The device will be freed when its QObject refcount reaches zero.  It is
usual for the parent (bus) to hold the final reference but there are
cases where something else holds a reference so "free" is a misleading
name.

Call object_unparent(obj) directly instead of having a qdev wrapper
function.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:38 +01:00
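
The distinction between unparenting and freeing, as described in the commit above, can be sketched with a tiny reference-counted object. All names here (`object`, `obj_new`, `obj_set_parent`, `obj_unparent`, `obj_unref`) are hypothetical and only model the idea that the parent holds one reference among possibly several.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct object {
    int ref;
    bool has_parent;
} object;   /* hypothetical */

static int freed_count;

object *obj_new(void)
{
    object *o = calloc(1, sizeof(*o));

    assert(o != NULL);
    o->ref = 1;                 /* the creator's reference */
    return o;
}

void obj_unref(object *o)
{
    assert(o->ref > 0);
    if (--o->ref == 0) {
        free(o);
        freed_count++;
    }
}

void obj_set_parent(object *o)
{
    o->has_parent = true;
    o->ref++;                   /* the parent holds a reference */
}

/* Unlinks from the parent; frees only if that was the last reference. */
void obj_unparent(object *o)
{
    if (o->has_parent) {
        o->has_parent = false;
        obj_unref(o);
    }
}
```

In this model, an error path that unparents a device but never drops the creator's reference leaks the object, which is why a "free" name for the unparent step is misleading.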
Stefan Hajnoczi
ee6abeb6ec qdev-monitor: Unref device when device_add fails
qdev_device_add() leaks the created device upon failure.  I suspect this
problem crept in because qdev_free() unparents the device but does not
drop a reference - confusing name.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 18:06:24 +01:00
Igor Mammedov
2fa4e56d88 qdev-monitor: Fix crash when device_add is called with abstract driver
A user can crash a running QEMU by issuing the following monitor
command:

 device_add intel-hda-generic

The crash is caused by an assertion in object_initialize_with_type()
when the type is abstract.

Checking whether the type is abstract before the instance is created in
qdev_device_add() prevents the crash on incorrect user input.

Cc: qemu-stable@nongnu.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
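
Rejecting an abstract type before instantiation, as the commit above describes, amounts to validating user input ahead of any code that asserts. A minimal sketch with a hypothetical type table and a hypothetical `device_add_check()`; the entries and error strings are illustrative only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    bool abstract;      /* abstract types must never be instantiated */
} type_info;            /* hypothetical */

static const type_info types[] = {
    { "intel-hda-generic", true },
    { "e1000", false },
};

const type_info *type_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(types) / sizeof(types[0]); i++) {
        if (strcmp(types[i].name, name) == 0) {
            return &types[i];
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 with *err set on invalid user input. */
int device_add_check(const char *driver, const char **err)
{
    const type_info *t;

    assert(driver != NULL && err != NULL);
    t = type_lookup(driver);
    if (t == NULL) {
        *err = "unknown driver";
        return -1;
    }
    if (t->abstract) {
        *err = "driver is abstract";
        return -1;
    }
    *err = NULL;
    return 0;
}
```

The caller gets an error it can report to the monitor user instead of tripping an assertion deeper in object creation.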
Andreas Färber
f4d8579560 qdev-monitor: Clean up qdev_device_add() variable naming
Avoid confusion between object (obj) and object class (oc).
Tidy DeviceClass variable while at it (k -> dc).

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
9c219b7be6 arm11mpcore: Split off RealView MPCore
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
7b960dc37d arm11mpcore: Prepare for QOM embedding
Move state struct, type constant and cast macro to a new header.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
306476eaec arm11mpcore: Convert mpcore_rirq_state to QOM realize
Embed ARM11MPCorePriveState and RealViewGICState and replace SysBus
initfn with realizefn.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
ce31825de6 realview_gic: Prepare for QOM embedding
Move state struct, type constant and cast macro to a new header.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
612daf0628 realview_gic: Convert to QOM realize
Embed GICState and replace SysBus initfn with realizefn.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
08602ac5bf arm11mpcore: Convert ARM11MPCorePriveState to QOM realize
Embed child devices and replace SysBus initfn with realizefn.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
53cb9a1c2f arm11mpcore: Split off SCU device
Inspired by a9scu.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
2c42c3a063 arm11mpcore: Create container MemoryRegion in instance_init
This allows mapping the region directly after object initialization.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
21ebaf1d81 arm11mpcore: Drop unused fields
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
4c14253c9e arm11mpcore: Fix typo in MemoryRegion name
"mpcode" -> "mpcore"

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
b4a37f17fe a9scu: Build only once
It does not have a target or ARMCPU dependency.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
43482f72db a15mpcore: Prepare for QOM embedding
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
7c76a48db4 a15mpcore: Convert to QOM realize
Turn SysBusDevice initfn into a QOM realizefn.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
524a2d8e26 a15mpcore: Embed GICState
This covers both emulated and KVM GIC.

Prepares for QOM realize.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
b9ed148d24 a15mpcore: Split off instance_init
Prepares for QOM realize.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
de4c2dcf7f a9mpcore: Prepare for QOM embedding
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
837cf1013e a9mpcore: Convert to QOM realize
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:30 +01:00
Andreas Färber
eb110bd843 a9mpcore: Embed ARMMPTimerState
Prepares for QOM realize.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
0aadb4909c arm_mptimer: Convert to QOM realize
Split the SysBusDevice initfn into instance_init and realizefn.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
fc719d7741 a9mpcore: Embed A9SCUState
Prepares for QOM realize.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
9eb39db520 a9scu: QOM cleanups
Rename A9SCUState::busdev field to parent_obj and turn realizefn into an
instance_init function to allow early MMIO mapping.

Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
9b5f952bb8 a9mpcore: Embed GICState
Prepares for conversion to QOM realize.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
83728796ad arm_gic: Extract headers hw/intc/arm_gic{,_common}.h
Rename NCPU to GIC_NCPU and move GICState away from gic_internal.h.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
753bc6e981 a9mpcore: Split off instance_init
Prepares for QOM realize.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2013-11-05 17:47:29 +01:00
Antony Pavlov
c77dd5f614 milkymist-uart: Use Device::realize instead of SysBusDevice::init
Use of SysBusDevice::init is deprecated. Use Device::realize instead.

Also introduce TypeInfo::instance_init milkymist_uart_init().

Reported-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Antony Pavlov <antonynpavlov@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
7c41f2177e qtest: Prepare QOM machine tests
Instantiate all [*] machines per target, so that they get at least a bit
of test coverage. This has proven helpful during QOM refactorings.

[*] ppcemb target contains some non-working non-embedded machines, and
ppc405 CPUs are not available there either.
i386 and x86_64 do not cover pc*-x.y or xenfv.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
7761254120 leon3: Don't enforce use of -bios with qtest
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
d32f7d2506 shix: Don't require firmware presence for qtest
Adopt error_report() while at it.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
b6e770ee50 shix: Drop debug output
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
c00eb5cee1 milkymist: Suppress -kernel/-bios/-drive error for qtest
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
19c82aac75 an5206: Don't enforce use of kernel for qtest
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
5c12762c2d mcf5208: Don't enforce use of kernel for qtest
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
5efe843a9a axis_dev88: Don't enforce use of kernel for qtest
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
5633b90ad4 armv7m: Don't enforce use of kernel for qtest
Adopt error_report().

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
4bd2f93ff9 exynos4_boards: Silence lack of -smp 2 warning for qtest
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
db3fd06902 omap_sx1: Don't enforce use of kernel or flash for qtest
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
1ca8334e42 palm: Don't enforce loading ROM or kernel for qtest
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:29 +01:00
Andreas Färber
e25ac5f662 z2: Don't enforce use of -pflash for qtest
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:28 +01:00
Andreas Färber
bdf921d65f gumstix: Don't enforce use of -pflash for qtest
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:28 +01:00
Andreas Färber
d2f7c496c3 mainstone: Don't enforce use of -pflash for qtest
Simply skip flash setup for now.

Also drop useless debug output.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:28 +01:00
Andreas Färber
f741a26c12 puv3: Turn puv3_load_kernel() into a no-op for qtest without -kernel
Replacing the assert() with more user-friendly error handling is left
for a follow-up.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:28 +01:00
Andreas Färber
22d5523d3f mips_mipssim: Silence BIOS loading warning for qtest
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:47:28 +01:00
Andreas Färber
6d0a373542 Merge tag 'for_anthony' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu
pci, pc, pvpanic bug fixes

This fixes strange pvpanic behaviour: you had to
pause to let VM continue (and potentially reboot on panic
if enabled).

This also fixes two bugs reported by Andreas.
One is a long-standing bug exposed by recent pci changes,
the other affects old piix machine types and was caused
by recent acpi changes.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-11-05 17:46:04 +01:00
Anthony Liguori
c905c5012a ossaudio: do not enable by default
Modern Linux distributions no longer support /dev/dsp, so enabling it by
default causes audio failures on newer Linux distros.

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Tested-by: Andreas Färber <afaerber@suse.de>
Message-id: 1383497154-9271-1-git-send-email-aliguori@amazon.com
2013-11-05 08:40:36 -08:00
Anthony Liguori
29f8f3835f Merge remote-tracking branch 'spice/spice.v76' into staging
# By Gerd Hoffmann
# Via Gerd Hoffmann
* spice/spice.v76:
  qxl: replace pipe signaling with bottom half

Message-id: 1383656322-24150-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 08:39:49 -08:00
Anthony Liguori
f772a83113 Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci, pc, pvpanic bug fixes

This fixes strange pvpanic behaviour: you had to
pause to let VM continue (and potentially reboot on panic
if enabled).

This also fixes two bugs reported by Andreas.
One is a long-standing bug exposed by recent pci changes,
the other affects old piix machine types and was caused
by recent acpi changes.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 04 Nov 2013 05:42:46 AM PST using RSA key ID D28D5469
# gpg: Can't check signature: public key not found

# By Michael S. Tsirkin (2) and Paolo Bonzini (1)
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
  vl: allow "cont" from panicked state
  exec: limit system memory size
  pc: disable acpi info for isapc and old pc machine

Message-id: 1383572851-28326-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 08:29:56 -08:00
Anthony Liguori
0d6e9a23ae Merge remote-tracking branch 'kraxel/e820.1' into staging
# By Gerd Hoffmann
# Via Gerd Hoffmann
* kraxel/e820.1:
  pc: register e820 entries for ram
  pc: add etc/e820 fw_cfg file

Message-id: 1383567431-13540-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-05 08:26:57 -08:00
Paolo Bonzini
df39076850 vl: allow "cont" from panicked state
After reporting the GUEST_PANICKED monitor event, QEMU stops the VM.
The reason for this is that events are edge-triggered, and can be lost if
management dies at the wrong time.  Stopping a panicked VM lets management
know of a panic even if it has crashed; management can learn about the
panic when it restarts and queries running QEMU processes.  The downside
is of course that the VM will be paused while management is not running,
but that is acceptable if it only happens with explicit "-device pvpanic".

Upon learning of a panic, management (if configured to do so) can pick a
variety of behaviors: leave the VM paused, reset it, destroy it.  In
addition to all of these behaviors, it is possible to dump the VM core
from the host.

However, right now, the panicked state is irreversible, and can only be
exited by resetting the machine.  This means that any policy decision
is entirely in the hands of the host.  In particular there is no way to
use the "reboot on panic" option together with pvpanic.

This patch makes the panicked state reversible (and removes various
workarounds that were there because of the state being irreversible).
With this change, management has a wider set of possible policies: it
can just log the crash and leave policy to the guest, or it can leave the
VM paused.  In particular, the "log the crash and continue" policy is
implemented simply by sending a "cont" as soon as management learns about
the panic.
Management could also implement the "irreversible paused state" itself.
And again, all such actions can be coupled with dumping the VM core.

Unfortunately we cannot change the behavior of 1.6.0.  Thus, even if
it uses "-device pvpanic", management should check for "cont" failures.
If "cont" fails, management can then log that the VM remained paused
and urge the administrator to update QEMU.
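
A minimal sketch of the "check for cont failures" advice, written from the
management side. It assumes the usual QMP reply shapes (a "return" member on
success, an "error" member on failure); the function names and the error text
in the usage note are hypothetical and not part of QEMU or any management tool.

```python
import json


def build_cont_command() -> str:
    """Serialize the QMP "cont" command as it would appear on the wire."""
    return json.dumps({"execute": "cont"})


def handle_cont_reply(reply_line: str) -> str:
    """Classify the QMP reply to "cont".

    Returns "resumed" when QEMU accepted the command, or "still-paused"
    when it answered with an error (for example, a QEMU whose panicked
    state is irreversible). The caller can then log that the VM stayed
    paused and suggest updating QEMU.
    """
    reply = json.loads(reply_line)
    if "error" in reply:
        return "still-paused"
    return "resumed"
```

With these helpers, handle_cont_reply('{"return": {}}') yields "resumed",
while a reply carrying an "error" object yields "still-paused".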

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-11-04 15:39:41 +02:00
Michael S. Tsirkin
818f86b883 exec: limit system memory size
The page table logic in exec.c assumes
that memory addresses are at most TARGET_PHYS_ADDR_SPACE_BITS.

But pci addresses are full 64 bit so if we try to render them ignoring
the extra bits, we get strange effects with sections overlapping each
other.

To fix, simply limit the system memory size to
 1 << TARGET_PHYS_ADDR_SPACE_BITS;
pci addresses will then be rendered within that range.
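
The overlap described above can be illustrated with a small sketch. It is not
the exec.c code: the value 52 for TARGET_PHYS_ADDR_SPACE_BITS is only an
assumed example (the real value is per target), and truncate() and
clamp_region() are hypothetical helpers.

```python
TARGET_PHYS_ADDR_SPACE_BITS = 52  # assumed example value, target-specific in QEMU
LIMIT = 1 << TARGET_PHYS_ADDR_SPACE_BITS


def truncate(addr: int) -> int:
    """Drop the bits above the physical address space (the problematic case)."""
    return addr & (LIMIT - 1)


def clamp_region(start: int, size: int):
    """Clip [start, start + size) to [0, LIMIT).

    Returns (start, size) of the visible part, or None if the region
    lies entirely above the limit.
    """
    if start >= LIMIT:
        return None
    return start, min(size, LIMIT - start)
```

A 64-bit address such as (1 << 60) | 0x1000 truncates to 0x1000 and so collides
with a low region at the same offset, whereas clamp_region() simply leaves it
out.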

Cc: qemu-stable@nongnu.org
Reported-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-11-04 15:38:49 +02:00
Michael S. Tsirkin
98af2ac93f pc: disable acpi info for isapc and old pc machine
Disable acpi build for isapc and no_kvmclock machine
types (used by xen), since acpi build currently expects pci.

Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2013-11-04 15:38:44 +02:00
Gerd Hoffmann
4a46c99c81 qxl: replace pipe signaling with bottom half
qxl creates a pipe, then writes something to it to wake up the iothread
from the spice server thread to raise an irq.  These days qemu bottom
halves can be scheduled from threads and signals, so there is no reason
to do this any more.  Time to clean it up.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-11-04 12:31:42 +01:00
Gerd Hoffmann
7db16f2480 pc: register e820 entries for ram
So RAM shows up in the new etc/e820 fw_cfg file.

Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-11-04 12:31:33 +01:00
Gerd Hoffmann
7d67110f2d pc: add etc/e820 fw_cfg file
Unlike the existing FW_CFG_E820_TABLE entry, which carries reservations
only, the new etc/e820 file also has entries for RAM.

The format is similar to FW_CFG_E820_TABLE: it is a simple list of
e820_entry structs.  Unlike FW_CFG_E820_TABLE, it has no count, though,
as the number of entries can be derived from the file size.
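
A sketch of how a consumer might read such a file, deriving the entry count
from the file size. The entry layout used here (64-bit address, 64-bit length,
32-bit type, packed, little-endian) and the type constants are assumptions for
illustration; parse_e820_file() is a hypothetical helper.

```python
import struct

# Assumed e820_entry layout: u64 address, u64 length, u32 type, no padding.
E820_ENTRY = struct.Struct("<QQI")
E820_RAM = 1
E820_RESERVED = 2


def parse_e820_file(blob: bytes):
    """Split a raw etc/e820 blob into (address, length, type) tuples.

    There is no count field, so the number of entries is
    len(blob) // E820_ENTRY.size.
    """
    if len(blob) % E820_ENTRY.size != 0:
        raise ValueError("file size is not a multiple of the entry size")
    return [
        E820_ENTRY.unpack_from(blob, offset)
        for offset in range(0, len(blob), E820_ENTRY.size)
    ]
```

Under these assumptions each entry occupies 20 bytes, so a 40-byte file holds
two entries.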

Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-11-04 12:24:23 +01:00
Anthony Liguori
a126050a10 Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging
Block patches for 1.7.0-rc0 (v2)

# gpg: Signature made Thu 31 Oct 2013 04:44:39 PM CET using RSA key ID C88F2FD6
# gpg: Can't check signature: public key not found

* kwolf/tags/for-anthony: (30 commits)
  vmdk: Implment bdrv_get_specific_info
  qapi: Add optional field 'compressed' to ImageInfo
  qemu-iotests: prefill some data to test image
  sheepdog: check simultaneous create in resend_aioreq
  sheepdog: cancel aio requests if possible
  sheepdog: make add_aio_request and send_aioreq void functions
  sheepdog: try to reconnect to sheepdog after network error
  coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
  sheepdog: reload inode outside of resend_aioreq
  sheepdog: handle vdi objects in resend_aio_req
  sheepdog: check return values of qemu_co_recv/send correctly
  qemu-iotests: Test case for backing file deletion
  qemu-iotests: drop duplicated "create_image"
  qemu-iotests: Fix 051 reference output
  block: Avoid unecessary drv->bdrv_getlength() calls
  block: Disable BDRV_O_COPY_ON_READ for the backing file
  ahci: fix win7 hang on boot
  sheepdog: pass copy_policy in the request
  sheepdog: explicitly set copies as type uint8_t
  block: Don't copy backing file name on error
  ...

Message-id: 1383064269-27720-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-31 17:02:26 +01:00
Anthony Liguori
ef5cfe5bbd Merge remote-tracking branch 'mjt/trivial-patches' into staging
* mjt/trivial-patches:
  audio/mixeng_template.h: fix inline declaration
  misc: Spelling and grammar fixes in comments
  docs/ccid.txt: fix the typo
  qapi: fix documentation example
  .gitignore: ignore qmp-commands.txt
  misc: New spelling fixes in comments
  configure: create fsdev/ directory

Message-id: 1382779887-15971-1-git-send-email-mjt@msgid.tls.msk.ru
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-31 17:01:43 +01:00
Anthony Liguori
1ba1905abd Merge remote-tracking branch 'agraf/ppc-for-upstream' into staging
* agraf/ppc-for-upstream: (29 commits)
  spapr: Use DeviceClass::fw_name for device tree CPU node
  target-ppc: Fill in OpenFirmware names for some PowerPCCPU families
  target-ppc: dump-guest-memory support
  dump-guest-memory: Check for the correct return value
  target-ppc: Use #define for max slb entries
  target-ppc: Check for error on address translation in memsave command
  target-ppc: Update slb array with correct index values.
  spapr-pci: enable irqfd for INTx
  xics-kvm: enable irqfd for MSI
  xics: Implement H_XIRR_X
  xics: Implement H_IPOLL
  xics-kvm: Support for in-kernel XICS interrupt controller
  xics: add cpu_setup callback
  xics: split to xics and xics-common
  xics: add missing const specifiers to TypeInfo
  xics: convert init() to realize()
  xics: add pre_save/post_load dispatchers
  xics: replace fprintf with error_report
  spapr: move cpu_setup after kvmppc_set_papr
  xics: move reset and cpu_setup
  ...

Message-id: 1382736474-32128-1-git-send-email-agraf@suse.de
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-31 17:01:12 +01:00
Anthony Liguori
e2cb2902ba Merge remote-tracking branch 'kraxel/audio.2' into staging
* kraxel/audio.2:
  audio: honor QEMU_AUDIO_TIMER_PERIOD instead of waking up every *nano* second

Message-id: 1382622110-19460-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-31 17:00:55 +01:00
Anthony Liguori
cb95ec1b83 Merge remote-tracking branch 'kraxel/usb.91' into staging
* kraxel/usb.91:
  usb-hcd-xhci: Update endpoint context dequeue pointer for streams too
  usb-hcd-xhci: Report completion of active transfer with CC_STOPPED on ep stop
  usb-hcd-xhci: Remove unused cancelled member from XHCITransfer
  usb-hcd-xhci: Remove unused sstreamsm member from XHCIStreamContext
  usb-host-libusb: Detach kernel drivers earlier
  usb-host-libusb: Configuration 0 may be a valid configuration
  usb-host-libusb: Fix reset handling

Message-id: 1382620267-18065-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-31 17:00:25 +01:00
Anthony Liguori
3fa4270a65 Merge remote-tracking branch 'luiz/queue/qmp' into staging
* luiz/queue/qmp:
  monitor: eliminate monitor_event_state_lock

Message-id: 1382121003-5211-1-git-send-email-lcapitulino@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-31 17:00:07 +01:00
Anthony Liguori
a9c78bb82e Merge remote-tracking branch 'kraxel/e820.1' into staging
* kraxel/e820.1:
  e820: pass high memory too.

Message-id: 1382008179-5968-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-31 16:58:58 +01:00
Anthony Liguori
b0eb759fb2 Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci, pc, acpi fixes, enhancements

This includes some pretty big changes:
- pci master abort support by Marcel
- pci IRQ API rework by Marcel
- acpi generation support by myself

Everything has gone through several revisions, latest versions have been on
list for a while without any more comments, tested by several
people.

Please pull for 1.7.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 15 Oct 2013 07:33:48 AM CEST using RSA key ID D28D5469
# gpg: Can't check signature: public key not found

* mst/tags/for_anthony: (39 commits)
  ssdt-proc: update generated file
  ssdt: fix PBLK length
  i386: ACPI table generation code from seabios
  pc: use new api to add builtin tables
  acpi: add interface to access user-installed tables
  hpet: add API to find it
  pvpanic: add API to access io port
  ich9: APIs for pc guest info
  piix: APIs for pc guest info
  acpi/piix: add macros for acpi property names
  i386: define pc guest info
  loader: allow adding ROMs in done callbacks
  i386: add bios linker/loader
  loader: use file path size from fw_cfg.h
  acpi: ssdt pcihp: updat generated file
  acpi: pre-compiled ASL files
  acpi: add rules to compile ASL source
  i386: add ACPI table files from seabios
  q35: expose mmcfg size as a property
  q35: use macro for MCFG property name
  ...

Message-id: 1381818560-18367-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-31 16:58:32 +01:00
Fam Zheng
f4c129a38a vmdk: Implment bdrv_get_specific_info
Implement .bdrv_get_specific_info to return the extent information.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-31 14:44:43 +01:00
Alex Bennée
b86160555f integrator: fix Linux boot failure by emulating dbg region
Commit 9b8c69243 (since reverted) broke the ability to boot the kernel:
unassigned_mem_read returned a non-zero value, which left the kernel
looping forever waiting for it to change (see integrator_led_set in the
kernel code).

Relying on a varying implementation detail is incorrect anyway so this
introduces a basic stub of a memory region for the debug/LED section
on the integrator board.

Signed-off-by: Alex Bennée <alex@bennee.com>
Message-id: 1382451366-9539-1-git-send-email-alex.bennee@linaro.org
[PMM: removed three unused fields from struct IntegratorDebugState]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-10-31 14:00:16 +01:00
Alvise Rigo
0bc2a331e4 target-arm: fix sorting issue of KVM cpreg list
The compare_u64 function was not sorting the KVM cpreg_list in the
right way because it returned the wrong value.  Since we are comparing
two 64-bit values, we can't simply return their difference if the
return type is int.
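
The pitfall can be reproduced outside C with a small model. The sketch below
assumes the common behaviour where converting an out-of-range value to a 32-bit
int wraps around; to_int32() and compare_by_difference() are hypothetical
helpers that mimic the buggy pattern, and the last two functions show a
comparison that only ever returns -1, 0 or 1.

```python
import functools


def to_int32(value: int) -> int:
    """Mimic narrowing to a 32-bit signed int with two's-complement wraparound."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def compare_by_difference(a: int, b: int) -> int:
    """Buggy pattern: return the 64-bit difference squeezed into an int."""
    return to_int32(a - b)


def compare_u64(a: int, b: int) -> int:
    """Safe pattern: derive the sign from explicit comparisons."""
    return (a > b) - (a < b)


def sort_ids(ids):
    """Sort register IDs with the safe comparison."""
    return sorted(ids, key=functools.cmp_to_key(compare_u64))
```

With the difference-based version, 1 << 32 compares equal to 0 and 0x80000000
compares as smaller than 0, which is enough to leave a list of 64-bit IDs
unsorted.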

Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Message-id: 1381513125-26802-2-git-send-email-a.rigo@virtualopensystems.com
[PMM: fixed coding style, indent and commit message formatting]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-10-31 14:00:16 +01:00
Alvise Rigo
cbf239b769 target-arm: sort TCG cpreg list by KVM-style 64 bit ID number
Both KVM and TCG populate the cpreg_list with 64-bit register IDs,
but on the TCG side the cpreg_list is sorted using the 32-bit ID
version, while on the KVM side the 64-bit ID version is used.  This
patch makes the sorting of the cpreg_list consistent between KVM and
TCG.

Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com>
Message-id: 1381513125-26802-1-git-send-email-a.rigo@virtualopensystems.com
[PMM: fixed indent, coding style and commit message formatting]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-10-31 14:00:16 +01:00
Nathan Rossi
8641136c54 target-arm: Add CP15 VBAR support
Added Vector Base Address remapping on ARM v7.

Signed-off-by: Nathan Rossi <nathan.rossi@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[PMM: removed spurious mask of value with 1<<31]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-10-31 14:00:16 +01:00
Peter Maydell
dacecf5485 hw/arm: Tidy up conditional calls to arm_load_kernel
Now that arm_load_kernel doesn't insist on a kernel filename
being present, we can remove some unnecessary conditionals
in board models.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1379980897-21277-3-git-send-email-peter.maydell@linaro.org
2013-10-31 14:00:16 +01:00
Peter Maydell
9546dbabd5 hw/arm/boot: Make user not specifying a kernel not an error
Typically ARM boards will have some kind of flash which might contain
a boot ROM; it's therefore a valid use case to provide only an
image for the boot ROM and not require QEMU's internal boot loader
at all. Remove the fatal error if -kernel isn't specified.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1379980897-21277-2-git-send-email-peter.maydell@linaro.org
2013-10-31 14:00:16 +01:00
Fam Zheng
cbe82d7fb3 qapi: Add optional field 'compressed' to ImageInfo
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 18:25:48 +01:00
Fam Zheng
7890111b64 qemu-iotests: prefill some data to test image
Case 030 occasionally fails because the block job completes too fast to be
captured by the script, and the 'unexpected qmp event' of job completion
causes the test failure.

Simply fill some data into the test image to make this false alarm less
likely to happen.

(For other benefits to prefill data to test image, see also commit
ab68cdfaa).

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:51:45 +01:00
MORITA Kazutaka
80308d33ec sheepdog: check simultaneous create in resend_aioreq
After reconnection happens, all the inflight requests are moved to the
failed request list.  As a result, sd_co_rw_vector() can send another
create request before resend_aioreq() resends a create request from
the failed list.

This patch adds a helper function check_simultaneous_create() and
checks simultaneous create requests more strictly in resend_aioreq().

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:24 +01:00
MORITA Kazutaka
35200687a1 sheepdog: cancel aio requests if possible
This patch tries to cancel aio requests in the pending queue and the
failed queue.  When the sheepdog driver cannot cancel the requests, it
waits for them to be completed.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:20 +01:00
MORITA Kazutaka
a37dcdf9ae sheepdog: make add_aio_request and send_aioreq void functions
These functions no longer return errors.  We can make them void
functions and simplify the code.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:15 +01:00
MORITA Kazutaka
011603cacf sheepdog: try to reconnect to sheepdog after network error
This introduces a failed request queue and links all the inflight
requests to it after a network error happens.  After QEMU
reconnects to the sheepdog server successfully, the sheepdog block
driver will retry all the requests in the failed queue.
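
The queueing scheme can be sketched roughly as follows. This is a simplified
model, not the sheepdog driver: FlakyLink is a hypothetical transport that
fails while its "up" flag is False, and RetryingClient is a hypothetical client
that moves in-flight requests to a failed queue and resubmits them after
reconnecting.

```python
from collections import deque


class FlakyLink:
    """Hypothetical transport: raises ConnectionError while `up` is False."""

    def __init__(self):
        self.up = True
        self.sent = []

    def send(self, request):
        if not self.up:
            raise ConnectionError("link down")
        self.sent.append(request)


class RetryingClient:
    def __init__(self, link):
        self.link = link
        self.inflight = deque()
        self.failed = deque()

    def submit(self, request):
        self.inflight.append(request)
        try:
            self.link.send(request)
        except ConnectionError:
            self.on_network_error()

    def on_network_error(self):
        # Link every in-flight request to the failed queue.
        self.failed.extend(self.inflight)
        self.inflight.clear()

    def on_reconnected(self):
        # Retry a snapshot so a repeated failure cannot loop forever.
        pending = list(self.failed)
        self.failed.clear()
        for request in pending:
            self.submit(request)
```

Retrying from a snapshot is a design choice of this sketch: requests that fail
again simply land back in the failed queue for the next reconnect.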

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:12 +01:00
MORITA Kazutaka
3ab7bd1917 coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
This helper function behaves similarly to co_sleep_ns(), but the
sleeping coroutine will be resumed when using qemu_aio_wait().

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:09 +01:00
MORITA Kazutaka
72e0996c41 sheepdog: reload inode outside of resend_aioreq
This prepares for using resend_aioreq() after reconnecting to the
sheepdog server.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:05 +01:00
MORITA Kazutaka
2412aec745 sheepdog: handle vdi objects in resend_aio_req
The current resend_aio_req() doesn't work when the request is against
vdi objects.  This fixes the problem.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:22:01 +01:00
MORITA Kazutaka
80731d9da5 sheepdog: check return values of qemu_co_recv/send correctly
If qemu_co_recv/send doesn't return the specified length, it means
that an error happened.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Tested-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:21:44 +01:00
Max Reitz
321fd7d2b8 qemu-iotests: Test case for backing file deletion
Add a test case for trying to open an image file where it is impossible
to open its backing file (in this case, because it was deleted). When
doing this, qemu (or qemu-io in this case) should not crash but rather
print an appropriate error message.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 12:16:43 +01:00
Fam Zheng
915365a9c6 qemu-iotests: drop duplicated "create_image"
The same common function already exists in iotests.py.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-30 11:58:46 +01:00
Kevin Wolf
a7cf03d4e1 qemu-iotests: Fix 051 reference output
Commit 684b254 forgot to update it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-29 17:05:35 +01:00
Kevin Wolf
b94a261057 block: Avoid unecessary drv->bdrv_getlength() calls
The block layer generally keeps the size of an image cached in
bs->total_sectors so that it doesn't have to perform expensive
operations to get the size whenever it needs it.

This doesn't work however when using a backend that can change its size
without qemu being aware of it, i.e. passthrough of removable media like
CD-ROMs or floppy disks. For this reason, the caching is disabled when a
removable device is used.

It is obvious that checking whether the _guest_ device has removable
media isn't the right thing to do when we want to know whether the size
of the host backend can change. To make things worse, non-top-level
BlockDriverStates never have any device attached, which makes qemu
assume they are removable, so drv->bdrv_getlength() is always called on
the protocol layer. In the case of raw-posix, this causes unnecessary
lseek() system calls, which turned out to be rather expensive.

This patch completely changes the logic and disables bs->total_sectors
caching only for certain block driver types, for which a size change is
expected: host_cdrom and host_floppy on POSIX, host_device on win32; also
the raw format in case it sits on top of one of these protocols, but in
the common case the nested bdrv_getlength() call on the protocol driver
will use the cache again and avoid an expensive drv->bdrv_getlength()
call.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-29 13:10:26 +01:00
Thibaut LAURENT
87a5debd31 block: Disable BDRV_O_COPY_ON_READ for the backing file
Since commit 0ebd24e0a2,
bdrv_open_common will throw an error when trying to open a file
read-only with the BDRV_O_COPY_ON_READ flag set.
Although BDRV_O_RDWR is unset for the backing files,
BDRV_O_COPY_ON_READ is still passed on if copy-on-read was requested
for the drive. Let's unset this flag too before opening the backing
file, or bdrv_open_common will fail.

Signed-off-by: Thibaut LAURENT <thibaut.laurent@gmail.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-29 13:06:39 +01:00
Alexander Graf
8464b273d6 ahci: fix win7 hang on boot
When AHCI executes an asynchronous IDE command, it checked DRDY without
checking either DRQ or BSY.  This sometimes caused the interrupt to be sent
before the command had actually completed.

This resulted in a race condition: if the guest then managed to access the
device before the command had completed, it would hang waiting for an
interrupt.
This was observed with Windows 7 guests.

To fix, check for DRQ or BSY in addition to DRDY; if either is set,
the command is asynchronous, so delay the interrupt until the
asynchronous done callback is invoked.
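
The status check can be sketched with the standard ATA status register bits
(BSY = 0x80, DRDY = 0x40, DRQ = 0x08). The function names are hypothetical and
the code only models the decision, not the AHCI emulation itself.

```python
BSY = 0x80   # device busy
DRDY = 0x40  # device ready
DRQ = 0x08   # data request pending


def old_check(status: int) -> bool:
    """Earlier behaviour as described: only DRDY is examined."""
    return bool(status & DRDY)


def command_completed(status: int) -> bool:
    """Ready, and neither busy nor waiting for data.

    If BSY or DRQ is set the command is still running asynchronously,
    so the interrupt should wait for the completion callback.
    """
    return (status & (BSY | DRQ | DRDY)) == DRDY
```

For a status of DRDY | BSY the old check would signal completion, while
command_completed() reports that the interrupt must be delayed.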

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-29 13:06:39 +01:00
Liu Yuan
1841f8801c sheepdog: pass copy_policy in the request
Currently copy_policy isn't used. Recent sheepdog supports erasure coding, which
makes use of copy_policy internally but requires the client to explicitly pass
copy_policy from the base inode to the newly created inode for snapshot-related
operations.

If the connected sheep daemon doesn't utilize copy_policy, passing it to the sheep
daemon is just one extra no-op, so there is no compatibility problem.

With this patch, sheepdog can provide erasure coded volume for QEMU VM.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:40:00 +01:00
Liu Yuan
29a67f7e92 sheepdog: explicitly set copies as type uint8_t
'copies' has actually been uint8_t since day one, but request headers and some helper
functions parameterize it as uint32_t for unknown reasons and effectively
reserve 24 bits for possible future use. This patch explicitly sets the correct
type for copies and reserves the remaining bytes.

This is a preparation patch that allows passing copy_policy in the request header.
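
Why narrowing the field does not change the bytes on the wire can be shown
with a short sketch. It assumes a little-endian layout for the field, which is
an assumption of this example rather than a statement about the sheepdog
protocol; both helper names are hypothetical.

```python
import struct


def pack_copies_as_u32(copies: int) -> bytes:
    """Former representation: one 32-bit field holding a small value."""
    return struct.pack("<I", copies)


def pack_copies_as_u8(copies: int) -> bytes:
    """Explicit representation: one byte plus three reserved zero bytes."""
    return struct.pack("<B3x", copies)
```

For any value that fits in one byte the two encodings are identical, so the
reserved bytes can later carry new fields without moving 'copies'.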

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Acked-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:39:56 +01:00
Max Reitz
61ed268453 block: Don't copy backing file name on error
bdrv_open_backing_file() tries to copy the backing file name using
pstrcpy directly after calling bdrv_open() to open the backing file
without checking whether that was actually successful. If it was not,
bs->backing_hd->file will probably be NULL and qemu will crash.

Fix this by moving pstrcpy after checking whether bdrv_open() succeeded.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:35:52 +01:00
Kevin Wolf
d1f3a23bfa tests: Multiboot mmap test case
This adds a test case for Multiboot memory map in the tests/multiboot
directory, where future i386 test kernels can be dropped. Because this
requires an x86 build host and an installed 32 bit libgcc, the test is
not part of a regular 'make check'.

The reference output for the test is verified against test runs of the
same multiboot kernel booted by some GRUB 0.97.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:34:42 +01:00
Kevin Wolf
d7b7e58009 ide-test: Check what happens with bus mastering disabled
The main goal is that qemu doesn't crash.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:34:42 +01:00
Kevin Wolf
e85d9db5f6 exec: Fix bounce buffer allocation in address_space_map()
This fixes a regression introduced by commit e3127ae0c, which kept the
allocation size of the bounce buffer limited to one page in order to
avoid unbounded allocations (as explained in the commit message of
6d16c2f88), but broke the reporting of the shortened bounce buffer to
the caller. The caller therefore assumes that the full requested size
was provided and causes memory corruption when writing beyond the end of
the actually allocated buffer.
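
The contract between the mapping function and its caller can be sketched as
below. This is a model only: PAGE_SIZE = 4096 is an assumed value, and
map_bounce() and write_all() are hypothetical stand-ins for the mapping call
and a well-behaved caller.

```python
PAGE_SIZE = 4096  # assumed page size for the example


def map_bounce(requested_len: int):
    """Allocate at most one page and report how much was actually mapped."""
    mapped_len = min(requested_len, PAGE_SIZE)
    return bytearray(mapped_len), mapped_len


def write_all(data: bytes):
    """Write `data` through bounce buffers, honouring the reported length.

    Returns the list of chunk sizes used, one per mapping.
    """
    chunks = []
    offset = 0
    while offset < len(data):
        buf, mapped_len = map_bounce(len(data) - offset)
        buf[:mapped_len] = data[offset:offset + mapped_len]
        chunks.append(mapped_len)
        offset += mapped_len
    return chunks
```

A caller that ignored the second return value and wrote the full requested
length would run past the end of the one-page buffer, which is the corruption
described above.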

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:34:42 +01:00
Max Reitz
ba2ab2f2ca qcow2: Flush image after creation
Opening the qcow2 image with BDRV_O_NO_FLUSH prevents any flushes during
the image creation. This means that the image has not yet been flushed
to disk when qemu-img create exits. This flush is delayed until the next
operation on the image involving opening it without BDRV_O_NO_FLUSH and
closing (or directly flushing) it. For large images and/or images with a
small cluster size and preallocated metadata, this flush may take a
significant amount of time and may occur unexpectedly.

Reopening the image without BDRV_O_NO_FLUSH right before the end of
qcow2_create2() results in hoisting the potentially costly flush into
the image creation, which is expected to take some time (whereas
subsequent image operations may not be).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:34:32 +01:00
Alex Bligh
203cea22a3 audio/mixeng_template.h: fix inline declaration
Fix error: ‘inline’ is not at beginning of declaration
[-Werror=old-style-declaration]

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-26 13:09:34 +04:00
Stefan Weil
59b0096213 misc: Spelling and grammar fixes in comments
* it's -> its
* grammar fix in ui/vnc-enc-zywrle.h

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Don Koch <dkoch@verizon.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-26 13:06:45 +04:00
WengFan
5f32804c79 docs/ccid.txt: fix the typo
Signed-off-by: WengFan <wengfan-fnst@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-26 13:04:12 +04:00
Eric Blake
63922c6477 qapi: fix documentation example
The QMP wire format uses "", not '', around strings.

* docs/qapi-code-gen.txt: Fix typo.
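
The quoting rule follows from JSON itself and can be checked with a short
sketch. is_valid_json() is a hypothetical helper; "query-status" is used only
as an example command name.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

A string such as {"execute": "query-status"} parses, while the same text with
single quotes is rejected by a strict JSON parser.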

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-26 13:01:58 +04:00
Fam Zheng
eb02dc0b11 .gitignore: ignore qmp-commands.txt
This file has moved from QMP/ to the build directory, so update the
ignore file too.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-26 13:01:57 +04:00
Stefan Weil
73f395fa88 misc: New spelling fixes in comments
compatiblity -> compatibility
continously -> continuously
existance -> existence
usefull -> useful
shoudl -> should

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-26 13:01:57 +04:00
Michael Tokarev
2b170effc7 configure: create fsdev/ directory
In some cases when building with parallelism (make -jN),
build fails because the directory where output files are
supposed to be does not exist.  In particular, when make
decides to build virtfs-proxy-helper.1 before other files
in fsdev/, build will fail with the following error:

perl -Ww -- BUILDDIR/scripts/texi2pod.pl BUILDDIR/fsdev/virtfs-proxy-helper.texi fsdev/virtfs-proxy-helper.pod && pod2man --utf8 --section=1 --center=" " --release=" " fsdev/virtfs-proxy-helper.pod > fsdev/virtfs-proxy-helper.1
opening "fsdev/virtfs-proxy-helper.pod": No such file or directory

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-26 13:01:37 +04:00
Andreas Färber
3bbf37f269 spapr: Use DeviceClass::fw_name for device tree CPU node
Instead of relying on cpu_model, obtain the device tree node label
per CPU. Use DeviceClass::fw_name as source.

Whenever DeviceClass::fw_name is unknown, default to "PowerPC,UNKNOWN".

As a consequence, spapr_fixup_cpu_dt() can operate on each CPU's fw_name,
obsoleting sPAPREnvironment::cpu_model, and spapr_create_fdt_skel() can
drop its cpu_model argument.

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:48 +02:00
Andreas Färber
793826cd46 target-ppc: Fill in OpenFirmware names for some PowerPCCPU families
Set the expected values for POWER7, POWER7+, POWER8 and POWER5+.
Note that POWER5+ and POWER7+ are intentionally lacking the '+', so the
lack of a POWER7P family constitutes no problem.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:48 +02:00
Aneesh Kumar K.V
e62fbc54d4 target-ppc: dump-guest-memory support
This patch adds support for dumping guest memory using the
dump-guest-memory monitor command.

Before patch:

(qemu) dump-guest-memory testcrash
this feature or command is not currently supported
(qemu)

After patch:

(qemu) dump-guest-memory testcrash
(qemu)

crash was able to read the file

crash> bt
PID: 0      TASK: c000000000c0d0d0  CPU: 0   COMMAND: "swapper/0"

 R0:  0000000028000084    R1:  c000000000cafa50    R2:  c000000000cb05b0
 R3:  0000000000000000    R4:  c000000000bc4cb0    R5:  0000000000000000
 R6:  001efe93b8000000    R7:  0000000000000000    R8:  0000000000000000
 R9:  b000000000001032    R10: 0000000000000001    R11: 0001eb2117e00d55
....
...

NOTE: Currently the crash tool doesn't look at ELF notes in the dump on ppc64.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:48 +02:00
Aneesh Kumar K.V
bb6b684363 dump-guest-memory: Check for the correct return value
We should check for error with s->note_size

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:48 +02:00
Aneesh Kumar K.V
d83af16786 target-ppc: Use #define for max slb entries
Instead of open-coding 64, use MAX_SLB_ENTRIES. We don't update the kernel
header here.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:48 +02:00
Aneesh Kumar K.V
2f4d0f5990 target-ppc: Check for error on address translation in memsave command
When we translate the virtual address to a physical one, check for errors.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:48 +02:00
Aneesh Kumar K.V
4b4d4a21b9 target-ppc: Update slb array with correct index values.
Without this, a value of rb=0 and rs=0 results in replacing the 0th
index. This can be observed when using gdb remote debugging support.

(gdb) x/10i do_fork
   0xc000000000085330 <do_fork>:        Cannot access memory at address 0xc000000000085330
(gdb)

This is because when we do the slb sync via kvm_cpu_synchronize_state,
we overwrite the slb entry (0th entry) for 0xc000000000085330

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:48 +02:00
Alexey Kardashevskiy
5cc7a967e9 spapr-pci: enable irqfd for INTx
This enables IRQFD for LSI (level triggered INTx interrupts) by adding
a spapr_route_intx_pin_to_irq() callback to the sPAPR PCI host bus. This
callback is called to know the global interrupt number to link resampling fd
with IRQFD's fd in KVM.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:47 +02:00
Alexey Kardashevskiy
9554233c9b xics-kvm: enable irqfd for MSI
This enables IRQFD support for sPAPR. The feature decreases the latency
of interrupt handling.

To enable IRQFD for MSI, this sets kvm_gsi_direct_mapping to true which
enables direct MSI mapping.

To enable IRQFD for LSI (level triggered INTx interrupts), a PCI host bus
callback is required. The patch for that is coming next.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:47 +02:00
Benjamin Herrenschmidt
5d87e4b74a xics: Implement H_XIRR_X
This implements H_XIRR_X hypercall in addition to H_XIRR as
it is mandatory for PAPR+ and there is no way for the guest to
detect whether it is supported or not so just add it.

As the Partition Adjunct Option is not supported at the moment,
the CPPR parameter of the hypercall is ignored.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:47 +02:00
Benjamin Herrenschmidt
075edbe3ba xics: Implement H_IPOLL
This adds support for the H_IPOLL hypercall which the guest
uses to poll for a pending interrupt. This hypercall is
mandatory for PAPR+ and there is no way for the guest to
detect whether it is supported or not so just add it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:47 +02:00
David Gibson
11ad93f681 xics-kvm: Support for in-kernel XICS interrupt controller
Recent (host) kernels support emulating the PAPR defined "XICS" interrupt
controller system within KVM.  This patch allows qemu to initialize and
configure the in-kernel XICS, and keep its state in sync with qemu's XICS
state as necessary.

This should give considerable performance improvements.  e.g. on a simple
IPI ping-pong test between hardware threads, using qemu XICS gives us
around 5,000 irqs/second, whereas the in-kernel XICS gives us around
70,000 irqs/s on the same hardware configuration.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[Mike Qiu <qiudayu@linux.vnet.ibm.com>: fixed mistype which caused ics_set_kvm_state() to fail]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:47 +02:00
Alexey Kardashevskiy
5eb92ccc3f xics: add cpu_setup callback
This adds a cpu_setup callback to the XICS device class (as XICS-KVM
will do it differently); xics_cpu_setup() will call it if it is set.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:47 +02:00
Alexey Kardashevskiy
5a3d7b23ba xics: split to xics and xics-common
The upcoming XICS-KVM support will use bits of emulated XICS code.
So this introduces a new level of hierarchy - "xics-common" class. Both
emulated XICS and XICS-KVM will inherit from it and override class
callbacks when required.

The new "xics-common" class implements:
1. replaces static "nr_irqs" and "nr_servers" properties with
the dynamic ones and adds callbacks to be executed when properties
are set.
2. xics_cpu_setup() callback renamed to xics_common_cpu_setup() as
it is a common part for both XICS'es
3. xics_reset() renamed to xics_common_reset() for the same reason.

The emulated XICS changes:
1. the part of xics_realize() which creates ICPs is moved to
the "nr_servers" property callback as realize() is too late to
create/initialize devices and instance_init() is too early to create
devices as the number of child devices comes via the "nr_servers"
property.
2. added ics_initfn() which does a little part of what xics_realize() did.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:47 +02:00
Alexey Kardashevskiy
456df19cf7 xics: add missing const specifiers to TypeInfo
This adds missing const specifiers to ICS and ICP TypeInfo's.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:47 +02:00
Alexey Kardashevskiy
b45ff2d942 xics: convert init() to realize()
This fixes XICS according to the new QOM rules.

This converts ICS's init() callbacks to realize().

This converts legacy qdev_init_nofail() to property_set(realized).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:47 +02:00
Alexey Kardashevskiy
d1b5682d88 xics: add pre_save/post_load dispatchers
The upcoming support of in-kernel XICS will redefine migration callbacks
for both ICS and ICP so classes and callback pointers are added.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:46 +02:00
Alexey Kardashevskiy
9ccff2a4d6 xics: replace fprintf with error_report
This replaces old-style fprintf with new style error_report.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:46 +02:00
Alexey Kardashevskiy
24408a7d2b spapr: move cpu_setup after kvmppc_set_papr
This moves the xics_cpu_setup() call after kvmppc_set_papr()
in order to get VCPUs initialized as this is required by upcoming
XICS-KVM.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:46 +02:00
Alexey Kardashevskiy
8ffe04ed2e xics: move reset and cpu_setup
This simple change makes following patches nicer.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:46 +02:00
David Gibson
feaa64c41f target-ppc: Add helper for KVM_PPC_RTAS_DEFINE_TOKEN
Recent PowerKVM allows the kernel to intercept some RTAS calls from the
guest directly.  This is used to implement the more efficient in-kernel
XICS for example.  qemu is still responsible for assigning the RTAS token
numbers however, and needs to tell the kernel which RTAS function name is
assigned to a given token value.  This patch adds a convenience wrapper for
the KVM_PPC_RTAS_DEFINE_TOKEN ioctl() which is used for this purpose.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:46 +02:00
Alexey Kardashevskiy
4fe822e075 spapr-rtas: fix h_rtas parameters reading
On real hardware, RTAS is called in real mode and therefore the
top 4 bits of the address passed in the call are ignored.
This patch does the same.

This converts h_rtas() to use existing rtas_ld() handlers.

This fixes rtas_ld()/rtas_st() to ignore the top 4 bits.
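
As a minimal sketch of what "ignore the top 4 bits" means, assuming a
64-bit guest address: the helper below is hypothetical (its name is not
taken from the patch) and only illustrates the masking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not the function from the patch. */
uint64_t real_mode_addr(uint64_t addr)
{
    /* Clear the top 4 bits of the 64-bit address and keep the low 60 bits. */
    uint64_t real = addr & ~0xF000000000000000ULL;

    /* Invariant: nothing is left in bits 60..63. */
    assert((real >> 60) == 0);
    return real;
}
```

For example, real_mode_addr(0xC000000000001000ULL) yields 0x1000.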

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:46 +02:00
Alexey Kardashevskiy
dcb861cb88 spapr: Add ibm,purr property on power7 and newer
PAPR+ says that no "ibm,purr" tells the guest that H_PURR is not
supported. However some guests still try calling H_PURR on POWER7 unless
the property is present and equal to 0. This adds the property for CPUs
supporting the PURR special register.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:46 +02:00
Alexey Kardashevskiy
3bf6eedd4b spapr: increase temporary fdt buffer size
At the moment the size of the buffer is set to 64K which is
enough for approximately 150 VCPUs which is not the limit.

This increases the buffer up to 256K which allows having
a tree for approximately 600 VCPUs which is way beyond the real
number we need.

As only the real size of the tree is copied to the guest, there
will be no impact on existing configurations.
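
The estimates above can be cross-checked with a little integer
arithmetic. This is only a sketch: the helper names are hypothetical,
and the per-VCPU size is derived from the "64K for about 150 VCPUs"
figure in this message, not measured.

```c
#include <assert.h>
#include <stddef.h>

/* Rough device tree bytes per VCPU implied by a buffer size and VCPU count. */
size_t fdt_bytes_per_vcpu(size_t buf_size, size_t vcpus)
{
    assert(vcpus > 0);
    return buf_size / vcpus;
}

/* Approximate number of VCPUs that fit in a buffer of buf_size bytes. */
size_t fdt_vcpu_capacity(size_t buf_size, size_t bytes_per_vcpu)
{
    assert(bytes_per_vcpu > 0);
    return buf_size / bytes_per_vcpu;
}
```

With 64K and 150 VCPUs this gives 436 bytes per VCPU, and a 256K buffer
then holds about 601 VCPUs, in line with the "approximately 600" above.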

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:46 +02:00
Alexander Graf
9633fcc6a0 PPC: Fix L2CR write accesses
Commit 2345f1c01 was supposed to render L2CR writes into noops. Instead,
it made them illegal instruction traps which apparently didn't confuse
XNU, but can easily confuse other OSs.

Fix it up by actually doing nothing when we write to L2CR.

Reported-by: Julio Guerra <guerr@julio.in>
Signed-off-by: Alexander Graf <agraf@suse.de>
Tested-by: Julio Guerra <guerr@julio.in>
2013-10-25 23:25:45 +02:00
Tom Musta
bbfb6f132a target-ppc: Little Endian Correction to Load/Store Vector Element
The Load Vector Element (lve*x) and Store Vector Element (stve*x)
instructions not only byte-swap in Little Endian mode, they also
invert the element that is accessed. For example, the RTL for
lvehx contains this:

     eb <-- EA[60:63]
     if Big-Endian byte ordering then
         VRT[8*eb:8*eb+15] <-- MEM(EA,2)
     else
         VRT[112-(8*eb):127-(8*eb)] <-- MEM(EA,2)

This patch adds the element inversion, as described in the last line
of the RTL.
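
The RTL above can be sketched in C as follows. The helper is
hypothetical and is not the code from this patch; it only computes the
first VRT bit written by lvehx (the last bit is the first plus 15), and
it assumes EA is already halfword aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: first VRT bit (0..127) written by lvehx. */
unsigned lvehx_first_bit(uint64_t ea, bool little_endian)
{
    unsigned eb;

    /* Assumption of this sketch: EA is halfword aligned. */
    assert((ea & 1) == 0);

    eb = (unsigned)(ea & 0xf);      /* eb <-- EA[60:63] */
    if (!little_endian) {
        return 8 * eb;              /* VRT[8*eb : 8*eb+15] */
    }
    return 112 - 8 * eb;            /* VRT[112-(8*eb) : 127-(8*eb)] */
}
```

For eb = 2 the big-endian case selects bits 16..31, while the
little-endian case selects bits 96..111, i.e. the mirrored element.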

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:45 +02:00
Tom Musta
04f1f7842e ppc: Add CFAR, DAR and DSISR to the dictionary of printable registers
The CFAR, DAR and DSISR registers are currently missing from the
dictionary of registers that may be printed in the QEMU console.
These are interesting registers when debugging.  With this patch,
the following commands work properly:

     (qemu) print $cfar
     (qemu) print $dar
     (qemu) print $dsisr

Signed-off-by: Tom Musta <tommusta@gmail.com>
Reviewed-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:45 +02:00
Benjamin Herrenschmidt
16457e7f4a pseries: Fix loading of little endian kernels
Try loading the kernel as little endian if it fails big endian.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:45 +02:00
Alexey Kardashevskiy
09b04845a7 pseries: Update SLOF firmware image
This has reworked USB OHCI and adds support of USB EHCI,
VIRTIO-SCSI and various fixes (IBM VSCSI, VGA and more).

The full list of fixes is:
*  usb-ohci: Convert td-phys every time to td-virt
*  usb-storage: Fix cbwflags field
*  Add -fno-strict-aliasing in global CFLAGS
*  usb: fix various issues found with js2x
*  Move hex64-{decode,encode}-unit to node.fs
*  usb: Use separate in-memory endian swap
*  usb-ohci: collect TDs from done list
*  js2x: more fixes
*  js2x: Fix build of takeover image
*  js2x: use new usb stack
*  usb-ohci: Use proper memory barriers always
*  usb: Fix a couple of warnings
*  Fix $cat-instance-unit
*  Cache phandle of /chosen
*  Use root.fs on qemu as well
*  usb-ehci: Add ehci handshake
*  usb: add mb for write accessors
*  usb-ohci: add missing memory barriers
*  usb-ohci: suspend the controller in exit code path
*  usb-ohci: Add a reset when closing the OHCI
*  usb: Use proper accessors for MMIO and separate in-memory endian swap
*  Use a global definition of sync() and mb()
*  net-snk: Remove exception handling
*  usb: unmap buffers
*  slof: call quiesce on closing of stdin
*  usb-kbd: accept "s" to drop to OF prompt
*  USB storage driver
*  usb-ohci: add Bulk transfer support
*  usb-ehci: Add bulk support
*  usb-core: add usb bulk support
*  USB generic hub device driver
*  usb-ehci: setup new device
*  usb-ehci: Check ehci ports
*  usb-ehci: initialize controller
*  USB keyboard driver
*  usb-core: setup new device
*  usb-core: create dev pool allocation
*  usb-ohci: implement ohci send control
*  usb-core: usb send control
*  usb-core: implement usb_{get,put}_pipe routines
*  usb-ohci: allocate pipe pool
*  usb-ohci: reset, init and check-ports
*  Add standard header stdbool.h
*  usb-slof: forth support routines for C
*  usb-ehci: Add USB EHCI skeleton
*  usb-core: Add register accessor functions
*  Use __builtin_bswap routines for endianness swapping
*  usb-core: hcd registration and query routines
*  usb-core: adding generic dev-hci.fs
*  usb-core: registration and makefiles
*  Add new USB code
*  Remove old usb code
*  vga: fix hcall-invert-screen and hcall-blink-screen
*  Enumerate disk/cdrom aliases for multiple disks or cdroms
*  scsi: unify scsi probing code
*  vscsi: generalizing probe code
*  virtio-scsi: iterate through targets
*  scsi: unify and use make-disk-alias
*  nvram: remove unnecessary prints
*  Add hack to client interface finddevice of "/memory"
*  scsi: Fix cdrom boot crash when no medium present
*  Look for /memory@0, not just /memory
*  Fix instance>qname crashing when displaying instance arguments
*  Fix js2x build
*  scsi-disk: Bound check read-blocks
*  Fix off by one error in scsi-disk get-capacity
*  scsi: fix report-luns handling
*  SLOF: virtio-scsi block driver code
*  scsi: Move bits of vio-vscsi.fs to a common helpers file
*  scsi: Move scsi-disk.fs to a generic place
*  SLOF: virtio-scsi helper routines
*  SLOF: virtio-scsi - add pci device file
*  iso9660: Don't constantly reallocate the read buffer
*  vscsi: Sanitize interface between scsi-disk.fs and vio-vscsi.fs
*  vio-vscsi: Rework vio-vscsi support
*  virtio: Add a virtio-set-qaddr helper
*  disk-label: Allocate 4096 bytes for 4k block devices
*  disk-label: Increase the max size of the PReP boot partition
*  Make load-base a real environment variable
*  vio-vscsi: Switch to using a wildcard "disk" node and make scsi-disk generic
*  Fix disk-label package to use proper instance path
*  Increase size of catpad
*  Fix instance>path to contain unit address for wildcard nodes
*  Fix handling of wildcard nodes in open-dev
*  vio-vscsi: Get CRQ on open and release on close

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:45 +02:00
Max Reitz
ab6f2bbb28 qemu-iotests: Test for loading VM state from qcow2
Add a test for saving a VM state to a qcow2 image and loading it back
(with qemu restarted in between); this should work without any
problems.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-25 11:08:20 +02:00
Edgar E. Iglesias
ec426ff808 hw/microblaze: Add support for loading initrd images
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:56:48 +02:00
Edgar E. Iglesias
d0b022a0e9 hw/microblaze: Indentation cleanups
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:32:56 +02:00
Edgar E. Iglesias
11a7621763 microblaze: At swx, check that the reserved word is unmodified
This improves the reservation check for system emulation, making
it possible to catch stores that modify the reserved word.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:32:56 +02:00
Edgar E. Iglesias
4a53627045 microblaze: Turn res_addr into a tcg global
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:32:56 +02:00
Edgar E. Iglesias
536446e914 microblaze: Move the saving of the reservation addr into gen_load
No functional change.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:32:56 +02:00
Edgar E. Iglesias
09b9f113ad microblaze: Improve src
Microblaze carry is mirrored in MSR[31], pick it directly from
there. Also, no need to mask cpu_R[dc->ra] when calling
write_carry.

15% improvement in linux-user src loops.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:32:56 +02:00
Edgar E. Iglesias
bb3cb951ef microblaze: Improve srl
write_carry only looks at bit zero, no need to mask out the others.

Measured a 12% speed improvement in linux-user srl loops.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:32:56 +02:00
Edgar E. Iglesias
a235900e22 microblaze: Simplify andn by using tcg_gen_andc
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:32:56 +02:00
Edgar E. Iglesias
65ab5eb4ed microblaze: Make write_carryi input a boolean
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:32:55 +02:00
Edgar E. Iglesias
04ec7df708 microblaze: Clarify expected input of write_carry
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-10-24 22:32:55 +02:00
Peter Lieven
fb8fe35f63 block/vpc: check that the image has not been truncated
This adds a check that a dynamic VHD file has not been
accidentally truncated (e.g. during transfer or upload).

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 17:34:48 +02:00
Peter Lieven
fefddf951b qemu-img: add special exit code if bdrv_check is not supported
Currently it is not possible to tell from the exit code whether there
has been an error or whether bdrv_check is not supported by the image
format. Change the exit code from 1 to 63 for the latter case.
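
A caller could use the new value roughly as sketched below. The enum
and function names are hypothetical; only the value 63 comes from this
message, and all other non-zero codes are simply treated as failures.

```c
#include <assert.h>

/* Hypothetical classification of an exit status; names are illustrative. */
enum check_outcome {
    CHECK_OK,
    CHECK_FAILED,
    CHECK_UNSUPPORTED
};

enum check_outcome classify_check_exit(int code)
{
    /* Assumption: exit statuses are in the usual 0..255 range. */
    assert(code >= 0 && code <= 255);

    if (code == 0) {
        return CHECK_OK;
    }
    if (code == 63) {
        /* Value from this commit: format does not support bdrv_check. */
        return CHECK_UNSUPPORTED;
    }
    return CHECK_FAILED;    /* any other non-zero code */
}
```

A script wrapping the check can then skip unsupported formats instead
of reporting them as errors.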

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 12:03:18 +02:00
Max Reitz
6e13610aa4 qcow2: Unset zero_beyond_eof in save_vmstate
Saving the VM state is done using bdrv_pwrite. This function may perform
a read-modify-write, which in this case results in data being read from
beyond the end of the virtual disk. Since we are actually trying to
access an area which is not a part of the virtual disk, zero_beyond_eof
has to be set to false before performing the partial write, otherwise
the VM state may become corrupted.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 11:50:51 +02:00
Max Reitz
eedff66f21 qcow2: Restore total_sectors value in save_vmstate
Since df2a6f29a5, bdrv_co_do_writev increases the total_sectors value of
a growable block device on writes after the current end. This leads to
the virtual disk apparently growing in qcow2_save_vmstate, which in turn
affects the disk size captured by the internal snapshot taken directly
afterwards through e.g. the HMP savevm command. Such a "grown" snapshot
cannot be loaded after reopening the qcow2 image, since its disk size
differs from the actual virtual disk size (writing a VM state does not
actually increase the virtual disk size).

Fix this by restoring total_sectors at the end of qcow2_save_vmstate.
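
The save/restore pattern can be sketched as below. The struct and
function names are hypothetical stand-ins, not the block layer types;
the write helper only models "writing past the end grows
total_sectors".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a block device state. */
struct fake_bds {
    int64_t total_sectors;
};

/* Models a growable write: writing past the end extends the device. */
void fake_growing_write(struct fake_bds *bs, int64_t end_sector)
{
    assert(bs != 0);
    if (end_sector > bs->total_sectors) {
        bs->total_sectors = end_sector;
    }
}

/* Save the VM state past the end, then restore the original disk size. */
void fake_save_vmstate(struct fake_bds *bs, int64_t end_sector)
{
    int64_t saved_total_sectors = bs->total_sectors;

    fake_growing_write(bs, end_sector);
    bs->total_sectors = saved_total_sectors;
}
```

After fake_save_vmstate() the reported disk size is unchanged, so a
snapshot taken right afterwards records the real virtual disk size.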

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 11:45:06 +02:00
Hans de Goede
b4350deed6 audio: honor QEMU_AUDIO_TIMER_PERIOD instead of waking up every *nano* second
Now that we no longer have MIN_REARM_TIMER_NS, a bug in the audio subsystem
has clearly shown itself: it tries to make a timer fire every nanosecond.

Note we have a similar problem in 1.6, 1.5 and older but there
MIN_REARM_TIMER_NS limits the wakeups caused by audio being active to
4000 times / second. This still causes a host cpu load of 50 % for simply
playing audio, whereas with this patch git master is at 13%, so we should
backport this to 1.5 and 1.6 too.

Note this will not apply to 1.5 and 1.6 as is.

Cc: qemu-stable@nongnu.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-23 10:37:27 +02:00
Hans de Goede
c90daa1c10 usb-hcd-xhci: Update endpoint context dequeue pointer for streams too
With streams the endpoint context dequeue pointer should point to the
dequeue value for the currently active stream.

At least Linux guests expect it to point to the value set by a set_ep_dequeue
upon completion of the set_ep_dequeue (before kicking the ep).

Otherwise the Linux kernel will complain (and things won't work):

xhci_hcd 0000:00:05.0: Mismatch between completed Set TR Deq Ptr command & xHCI internal state.
xhci_hcd 0000:00:05.0: ep deq seg = ffff8800366f0880, deq ptr = ffff8800366ec010

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-22 16:28:49 +02:00
Hans de Goede
582d6f4aba usb-hcd-xhci: Report completion of active transfer with CC_STOPPED on ep stop
As we should per the XHCI spec "4.6.9 Stop Endpoint".

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-22 16:28:49 +02:00
Hans de Goede
8de1838afe usb-hcd-xhci: Remove unused cancelled member from XHCITransfer
Since qemu's USB model is geared towards emulated devices, cancellation
is instantaneous, so there is no need to wait for cancellation to complete.
As such there is no wait-for-cancellation code, and the cancelled bool
as well as the bogus comment about it can be removed.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-22 16:28:49 +02:00
Hans de Goede
946ff2c0c3 usb-hcd-xhci: Remove unused sstreamsm member from XHCIStreamContext
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-22 16:28:49 +02:00
Hans de Goede
f34d5c7508 usb-host-libusb: Detach kernel drivers earlier
If we detach the kernel drivers on the first set_config, then they will
still be attached when the device gets its initial reset, causing the drivers
to re-initialize the device after the reset and dirtying the device state.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-22 16:28:49 +02:00
Hans de Goede
1294ca797c usb-host-libusb: Configuration 0 may be a valid configuration
Quoting from: linux/Documentation/ABI/stable/sysfs-bus-usb:

	Note that some devices, in violation of the USB spec, have a
	configuration with a value equal to 0. Writing 0 to
	bConfigurationValue for these devices will install that
	configuration, rather then unconfigure the device.

So don't compare the configuration value against 0 to check for unconfigured
devices, instead check for a LIBUSB_ERROR_NOT_FOUND return from
libusb_get_active_config_descriptor().

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-22 16:28:49 +02:00
Hans de Goede
5af35d7fec usb-host-libusb: Fix reset handling
The guest will issue an initial device reset when the device is attached, but
since the current usb-host-libusb code only actually does the reset when
udev->configuration != 0, and on attach the device is not yet configured,
the reset gets ignored. This means that the device gets passed to the guest
in an unknown state, which is not good.

The udev->configuration check is there because of the release / claim
interfaces done around the libusb_device_reset call, but these are not
necessary. If interfaces are claimed when libusb_device_reset gets called
libusb will release + reclaim them itself.

The usb_host_ep_update call also is not necessary. If the reset succeeds the
original config and interface alt settings will be restored.

Lastly, if the reset fails, that means the device has either disconnected or
morphed into another device and has been completely re-enumerated,
so it is treated by the host as a new device and our handle is invalid,
so on reset failure we need to call usb_host_nodev().

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-22 16:28:48 +02:00
Eric Blake
cc94712b9e qapi: fix documentation example
The QMP wire format uses "", not '', around strings.

* docs/qapi-code-gen.txt: Fix typo.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-21 16:49:35 +02:00
Paolo Bonzini
c20b7fa4b2 monitor: eliminate monitor_event_state_lock
This lock does not protect anything that the BQL does not already
protect.  Furthermore, with -nodefaults and no monitor, the mutex
is not initialized but monitor_protocol_event_queue is called
anyway, which causes a crash under mingw (and only works by luck
under Linux or other POSIX OSes).

Reported-by: Orx Goshen <orx.goshen@intel.com>
Cc: Daniel Berrange <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-10-18 14:23:00 -04:00
Anthony Liguori
fc8ead7467 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
# By Paolo Bonzini (2) and Jan Kiszka (1)
# Via Gleb Natapov
* qemu-kvm/uq/master:
  kvmvapic: Prevent reading beyond the end of guest RAM
  x86: cpuid: reconstruct leaf 0Dh data
  x86: fix migration from pre-version 12

Message-id: 1382108641-4862-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-18 10:03:24 -07:00
Anthony Liguori
3551643eb7 Merge remote-tracking branch 'stefanha/net' into staging
# By Amos Kong
# Via Stefan Hajnoczi
* stefanha/net:
  net/rtl8139: update network information when macaddr is changed in guest
  net/e1000: update network information when macaddr is changed in guest
  net: update nic info during device reset

Message-id: 1382103314-21608-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-18 10:02:48 -07:00
Anthony Liguori
1da9772d83 Merge remote-tracking branch 'stefanha/block' into staging
# By Fam Zheng (3) and others
# Via Stefan Hajnoczi
* stefanha/block:
  vmdk: fix VMFS extent parsing
  vmdk: Only read cid from image file when opening
  virtio: Remove unneeded memcpy
  block/raw-win32: Always use -errno in hdev_open
  blockdev: fix cdrom read_only flag
  sd: Avoid access to NULL BlockDriverState
  hmp: drop bogus "[not inserted]"

Message-id: 1382105915-27735-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-18 10:02:14 -07:00
Anthony Liguori
989644915c Merge remote-tracking branch 'bonzini/iommu-for-anthony' into staging
# By Paolo Bonzini (10) and others
# Via Paolo Bonzini
* bonzini/iommu-for-anthony:
  exec: remove qemu_safe_ram_ptr
  icount: make it thread-safe
  icount: document (future) locking rules for icount
  icount: prepare the code for future races in calling qemu_clock_warp
  icount: reorganize icount_warp_rt
  icount: use cpu_get_icount() directly
  timer: add timer_mod_anticipate and timer_mod_anticipate_ns
  timer: extract timer_mod_ns_locked and timerlist_rearm
  timer: make qemu_clock_enable sync between disable and timer's cb
  qemu-thread: add QemuEvent
  timer: protect timers_state's clock with seqlock
  seqlock: introduce read-write seqlock
  vga: Mark relevant portio lists regions as coalesced MMIO flushing
  cirrus: Mark vga io region as coalesced MMIO flushing
  portio: Allow to mark portio lists as coalesced MMIO flushing
  compatfd: switch to QemuThread
  memory: fix 128 arithmetic in info mtree

Message-id: 1382024935-28297-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-18 10:01:49 -07:00
Anthony Liguori
1cb9b64df3 Merge remote-tracking branch 'bonzini/configure' into staging
# By Peter Maydell (3) and Ákos Kovács (2)
# Via Paolo Bonzini
* bonzini/configure:
  ui/Makefile.objs: delete unnecessary cocoa.o dependency
  default-configs/: CONFIG_GDBSTUB_XML removed
  Makefile.target: CONFIG_NO_* variables removed
  rules.mak: New string testing functions
  rules.mak: New logical functions for handling y/n values
2013-10-18 10:01:37 -07:00
Anthony Liguori
c21611ab8d Merge remote-tracking branch 'spice/spice.v75' into staging
# By Gerd Hoffmann (2) and others
# Via Gerd Hoffmann
* spice/spice.v75:
  spice: fix multihead support
  spice-display: add display channel id to the debug messages.
  Fix VNC SASL authentication when using a QXL device
  spice: replace use of deprecated API

Message-id: 1382006760-19388-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-18 10:01:21 -07:00
Anthony Liguori
cd22e320a0 Merge remote-tracking branch 'filippov/tags/20131015-xtensa' into staging
xtensa queue 2013-10-15

# gpg: Signature made Tue 15 Oct 2013 06:27:41 AM PDT using RSA key ID F83FA044
# gpg: Can't check signature: public key not found

# By Max Filippov
# Via Max Filippov
* filippov/tags/20131015-xtensa:
  target-xtensa: add in_asm logging

Message-id: 1381844297-1728-1-git-send-email-jcmvbkbc@gmail.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-18 10:01:08 -07:00
Fam Zheng
dbbcaa8d43 vmdk: fix VMFS extent parsing
The VMFS extent line in the description file doesn't have a start offset as
FLAT lines do, so it should default to 0. The flat_offset
variable is initialized to -1, so we need to set it in this case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-18 13:41:36 +02:00
Fam Zheng
c338b6ad60 vmdk: Only read cid from image file when opening
Previously the cid of the parent was parsed from the image file for every IO
request. We already have an L1/L2 cache and do not assume that the parent
image can be updated behind us, so remove this for better efficiency.

The parent CID is now checked only once, after opening.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-18 13:39:59 +02:00
Amos Kong
23c37c37f0 net/rtl8139: update network information when macaddr is changed in guest
rtl8139 has the same problem as e1000: nic info isn't updated when macaddr
is changed in guest.

This patch updates the nic info when the last bit of macaddr is written.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-18 13:28:09 +02:00
Amos Kong
7c36507c2b net/e1000: update network information when macaddr is changed in guest
If we change macaddr in guest by 'ifconfig eth0 hw ether 12:12:12:34:35:36',
the mac register of e1000 is already updated, but we don't update
network information in qemu. Therefore, the information in monitor
is wrong.

This patch updates nic info when the second part of macaddr is written.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-18 13:28:09 +02:00
Amos Kong
655d3b63b0 net: update nic info during device reset
macaddr is reset during device reset, but nic info
isn't updated. This problem exists in e1000 and rtl8139.

Signed-off-by: Amos Kong <akong@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-18 13:28:09 +02:00
Stefan Weil
b432779a9f virtio: Remove unneeded memcpy
Report from valgrind:

==19521== Source and destination overlap in memcpy(0x31d38938, 0x31d38938, 64)
==19521==    at 0x4A0A343: memcpy@@GLIBC_2.14 (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19521==    by 0x42774E: virtio_blk_device_init (virtio-blk.c:686)
==19521==    by 0x46EE9E: virtio_device_init (virtio.c:1158)
==19521==    by 0x25405E: device_realize (qdev.c:178)
==19521==    by 0x2559B5: device_set_realized (qdev.c:699)
==19521==    by 0x3A819B: property_set_bool (object.c:1315)
==19521==    by 0x3A6CE0: object_property_set (object.c:803)

Valgrind is right: blk == &s->blks, so it is a memcpy of 64 bytes with
source == destination which can be removed.
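
This patch simply drops the call. Where a copy cannot be removed
outright, a guard like the hypothetical helper below avoids calling
memcpy() with identical pointers; it is a general sketch, not code from
this patch, and it does not handle partial overlap (memmove() is the
tool for that).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes unless dst and src are the same object. */
void copy_unless_same(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    if (dst == src) {
        return;             /* nothing to do, and memcpy() must not overlap */
    }
    memcpy(dst, src, n);    /* caller guarantees the ranges do not overlap */
}
```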

Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-18 13:02:57 +02:00
Paolo Bonzini
041603fe5d exec: remove qemu_safe_ram_ptr
This is not needed since the RAM list is not modified anymore by
qemu_get_ram_ptr.  Replace it with qemu_get_ram_block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:31:00 +02:00
Paolo Bonzini
17a15f1b76 icount: make it thread-safe
This lets threads other than the I/O thread use vm_clock even in -icount mode.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:31:00 +02:00
Paolo Bonzini
a3270e19cc icount: document (future) locking rules for icount
Reviewed-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:31:00 +02:00
Paolo Bonzini
ce78d18ced icount: prepare the code for future races in calling qemu_clock_warp
Computing the deadline of all vm_clocks is somewhat expensive and calls
out to qemu-timer.c; two reasons not to do it in the seqlock's write-side
critical section.  This however opens the door for races in setting and
reading vm_clock_warp_start.

To plug them, we need to cover the case where a new deadline slips in
between the call to qemu_clock_deadline_ns_all and the actual modification
of the icount_warp_timer.  Restrict changes to vm_clock_warp_start and
the icount_warp_timer's expiration time, to only move them back (which
would simply cause an early wakeup).

If a vm_clock timer is cancelled while CPUs are idle, this might cause the
icount_warp_timer to fire unnecessarily.  This is not a problem: after it
fires, the timer becomes inactive and the next call to timer_mod_anticipate
will be precise.

In addition to this, we must deactivate the icount_warp_timer _before_
checking whether CPUs are idle.  This way, if the "last" CPU becomes idle
during the call to timer_del we will still set up the icount_warp_timer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:31:00 +02:00
Paolo Bonzini
8ed961d957 icount: reorganize icount_warp_rt
To prepare for future code changes, move the increment of qemu_icount_bias
outside the "if" statement.

Also, hoist outside the if the check for timers that expired due to the
"warping".  The check is redundant when !runstate_is_running(), but
doing it this way helps because the code that increments qemu_icount_bias
will be a critical section.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:31:00 +02:00
Paolo Bonzini
468cc7cf3b icount: use cpu_get_icount() directly
This will help later when we will have to place these calls in
a critical section, and thus call a version of cpu_get_icount()
that does not take the lock.

Reviewed-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:31:00 +02:00
Paolo Bonzini
add40e9777 timer: add timer_mod_anticipate and timer_mod_anticipate_ns
These let a user anticipate the deadline of a timer, atomically with
other sites that call the function.  This helps avoid complicated
lock hierarchies.
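
A rough sketch of the "anticipate" idea, assuming it means "move the deadline only if the new one is earlier, under a single lock". SketchTimer and the sketch_* functions are hypothetical names, not the qemu-timer.c code.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a timer with a single deadline. */
typedef struct {
    pthread_mutex_t lock;
    bool pending;
    int64_t expire_ns;
} SketchTimer;

static void sketch_timer_init(SketchTimer *t)
{
    pthread_mutex_init(&t->lock, NULL);
    t->pending = false;
    t->expire_ns = 0;
}

/*
 * Arm the timer if it is idle, or pull its deadline forward if expire_ns
 * is earlier than the current one.  A later deadline is ignored.  The
 * check and the update happen under one lock, so concurrent callers
 * always end up with the earliest requested deadline.
 * Returns true if the deadline was changed.
 */
static bool sketch_timer_anticipate_ns(SketchTimer *t, int64_t expire_ns)
{
    bool moved = false;

    pthread_mutex_lock(&t->lock);
    if (!t->pending || expire_ns < t->expire_ns) {
        t->pending = true;
        t->expire_ns = expire_ns;
        moved = true;
    }
    pthread_mutex_unlock(&t->lock);
    return moved;
}
```

Because callers never need to read the old deadline and then decide, they do not have to hold an outer lock around the call.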

Reviewed-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:31:00 +02:00
Paolo Bonzini
0f809e5fbe timer: extract timer_mod_ns_locked and timerlist_rearm
These will be reused in timer_mod_anticipate functions.

Reviewed-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:30:59 +02:00
Liu Ping Fan
3c05341157 timer: make qemu_clock_enable sync between disable and timer's cb
After disabling the QemuClock, we should make sure that no QemuTimers
are still in flight. To implement that with light overhead, we resort
to QemuEvent. The caller that disables the clock waits on the QemuEvent
of each timerlist.

Note, qemu_clock_enable(foo,false) can _not_ be called from timer's cb.
Also, the callers of qemu_clock_enable() should be protected by the BQL.

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:30:56 +02:00
Paolo Bonzini
c7c4d063f5 qemu-thread: add QemuEvent
This emulates Win32 manual-reset events using futexes or condition
variables.  Typical ways to use them are with multi-producer,
single-consumer data structures, to test for a complex condition whose
elements come from different threads:

    for (;;) {
        qemu_event_reset(ev);
        ... test complex condition ...
        if (condition is true) {
            break;
        }
        qemu_event_wait(ev);
    }

Or more efficiently (but with some duplication):

    ... evaluate condition ...
    while (!condition) {
        qemu_event_reset(ev);
        ... evaluate condition ...
        if (!condition) {
            qemu_event_wait(ev);
            ... evaluate condition ...
        }
    }

QemuEvent provides a very fast userspace path in the common case when
no other thread is waiting, or the event is not changing state.
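
For readers unfamiliar with manual-reset events, here is a minimal sketch built on a pthread mutex and condition variable. It only illustrates the semantics (set stays set until reset) and omits the fast userspace path described above. SketchEvent and the sketch_event_* names are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical manual-reset event; not the QemuEvent implementation. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool set;
} SketchEvent;

static void sketch_event_init(SketchEvent *ev, bool initially_set)
{
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->set = initially_set;
}

/* Mark the event as set and wake every waiter. */
static void sketch_event_set(SketchEvent *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->set = true;
    pthread_cond_broadcast(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

/* Clear the event; later waits block until the next set. */
static void sketch_event_reset(SketchEvent *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->set = false;
    pthread_mutex_unlock(&ev->lock);
}

/* Block until the event is set; return at once if it already is. */
static void sketch_event_wait(SketchEvent *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->set) {
        pthread_cond_wait(&ev->cond, &ev->lock);
    }
    pthread_mutex_unlock(&ev->lock);
}
```

In the loops shown above, the reset happens before the condition is tested, so a set that arrives between the test and the wait is not lost.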

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:30:55 +02:00
Liu Ping Fan
cb365646a9 timer: protect timers_state's clock with seqlock
QEMU_CLOCK_VIRTUAL may be read outside the BQL, which exposes its
foundation, i.e. cpu_clock_offset, to race conditions.
Use a private lock to protect it.

After this patch, reading QEMU_CLOCK_VIRTUAL is thread safe
unless use_icount is true, in which case the existing callers
still rely on the BQL.

Lock rule: private lock innermost, ie BQL->"this lock"

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:30:52 +02:00
Paolo Bonzini
ea753d81e8 seqlock: introduce read-write seqlock
Seqlock implementation for QEMU. Usage idiom

reader:
    do {
        start = seqlock_read_begin(&sl);
        ...
    } while (seqlock_read_retry(&sl, start));

writer:
    seqlock_write_lock(&sl);
    ...
    seqlock_write_unlock(&sl);

initialization:
    seqlock_init(QemuSeqLock *sl, QemuMutex *mutex)

    mutex could be NULL if the caller will provide its own protection
    for concurrent write sides (typically using the BQL).
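
The idiom above can be sketched with C11 atomics as follows. This is an illustration under stated assumptions, not the QEMU header: it assumes a single writer (or external write-side locking such as the BQL), and all Sketch*/sketch_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    atomic_uint sequence;   /* odd while a write is in progress */
} SketchSeqLock;

static void sketch_seqlock_write_lock(SketchSeqLock *sl)
{
    unsigned seq = atomic_load_explicit(&sl->sequence, memory_order_relaxed);
    atomic_store_explicit(&sl->sequence, seq + 1, memory_order_relaxed);
    /* Keep the data stores from moving before the sequence store. */
    atomic_thread_fence(memory_order_release);
}

static void sketch_seqlock_write_unlock(SketchSeqLock *sl)
{
    unsigned seq = atomic_load_explicit(&sl->sequence, memory_order_relaxed);
    /* Keep the data stores from moving after the sequence store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&sl->sequence, seq + 1, memory_order_relaxed);
}

static unsigned sketch_seqlock_read_begin(SketchSeqLock *sl)
{
    unsigned seq = atomic_load_explicit(&sl->sequence, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    /* Clearing the low bit forces a retry if a write was in progress. */
    return seq & ~1u;
}

static bool sketch_seqlock_read_retry(SketchSeqLock *sl, unsigned start)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&sl->sequence, memory_order_relaxed) != start;
}

/* Hypothetical protected state: two values that must be read together. */
typedef struct {
    SketchSeqLock sl;
    _Atomic int64_t offset;
    _Atomic int64_t ticks;
} SketchClockState;

static void sketch_clock_init(SketchClockState *s)
{
    atomic_init(&s->sl.sequence, 0);
    atomic_init(&s->offset, 0);
    atomic_init(&s->ticks, 0);
}

static void sketch_clock_update(SketchClockState *s, int64_t offset,
                                int64_t ticks)
{
    sketch_seqlock_write_lock(&s->sl);
    atomic_store_explicit(&s->offset, offset, memory_order_relaxed);
    atomic_store_explicit(&s->ticks, ticks, memory_order_relaxed);
    sketch_seqlock_write_unlock(&s->sl);
}

static void sketch_clock_read(SketchClockState *s, int64_t *offset,
                              int64_t *ticks)
{
    unsigned start;

    do {
        start = sketch_seqlock_read_begin(&s->sl);
        *offset = atomic_load_explicit(&s->offset, memory_order_relaxed);
        *ticks = atomic_load_explicit(&s->ticks, memory_order_relaxed);
    } while (sketch_seqlock_read_retry(&s->sl, start));
}
```

Readers never block the writer; they simply repeat the read if the sequence number changed underneath them.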

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:24:15 +02:00
Jan Kiszka
c46860ea53 vga: Mark relevant portio lists regions as coalesced MMIO flushing
This allows removing the explicit qemu_flush_coalesced_mmio_buffer
calls.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:24:15 +02:00
Jan Kiszka
eb25a1d9d4 cirrus: Mark vga io region as coalesced MMIO flushing
This allows removing the explicit qemu_flush_coalesced_mmio_buffer
calls - the memory core will invoke them now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:24:15 +02:00
Jan Kiszka
c76bc480e2 portio: Allow to mark portio lists as coalesced MMIO flushing
This will enable us to remove all remaining explicit calls of
qemu_flush_coalesced_mmio_buffer in IO handlers.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:24:15 +02:00
Jan Kiszka
518420dfec compatfd: switch to QemuThread
qemu_thread_create already does signal blocking and detaching for us.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:24:14 +02:00
Alexey Kardashevskiy
a66670c79c memory: fix 128 arithmetic in info mtree
mtree_print_mr() calls int128_get64() in 3 places but only 2 places
handle 2^64 correctly.

This fixes the third call of int128_get64().
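
To see why 2^64 needs care, consider a region whose size is exactly 2^64 bytes: the size itself does not fit in 64 bits, but its last offset (size - 1) does. A minimal sketch, assuming the safe order is "subtract in 128 bits first, then narrow"; the Sketch128 type and helpers are hypothetical and are not QEMU's Int128 API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical unsigned 128-bit value as two 64-bit words. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} Sketch128;

static Sketch128 sketch128_make(uint64_t lo, uint64_t hi)
{
    Sketch128 v = { lo, hi };
    return v;
}

/* a - b with a borrow from the high word when the low word wraps. */
static Sketch128 sketch128_sub(Sketch128 a, Sketch128 b)
{
    Sketch128 r;
    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - (a.lo < b.lo);
    return r;
}

/* True if the value can be narrowed to 64 bits without loss. */
static bool sketch128_fits64(Sketch128 v)
{
    return v.hi == 0;
}

/*
 * Last offset covered by a region of `size` bytes, for size in [1, 2^64].
 * Subtracting 1 before narrowing keeps the result within 64 bits even
 * when size itself is 2^64.
 */
static uint64_t sketch_region_last_offset(Sketch128 size)
{
    Sketch128 last = sketch128_sub(size, sketch128_make(1, 0));
    return last.lo;
}
```

Narrowing first and subtracting afterwards would have to convert 2^64 to a 64-bit value, which is exactly the case that cannot be represented.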

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:24:14 +02:00
Max Reitz
45d57f6e71 block/raw-win32: Always use -errno in hdev_open
On one occasion, hdev_open() returned -1 in case of an unknown error
instead of a proper -errno value. Adjust this to match the behavior of
raw_open() (in raw-win32), which is to return -EINVAL in this case.
Also, change the call to error_setg*() to match the one in raw_open() as
well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-17 14:55:46 +02:00
Gerd Hoffmann
0624c7f916 e820: pass high memory too.
We have a fw_cfg entry to pass e820 entries from qemu to the firmware.
Today it's used to pass reservations only.  This patch makes qemu pass
entries for RAM too.

This allows passing RAM sizes larger than 1TB to the firmware and it
will also allow passing non-contiguous memory ranges should we decide
to implement that some day, say for our virtual numa nodes.

Obviously this needs some extra care to not break existing firmware.

SeaBIOS loads the entries and happily adds them without looking at the
type, which is problematic for memory below 4g as this will overwrite
reservations added for bios memory etc.  For memory above 4g it works
just fine: seabios will merge the entry derived from cmos with the one
loaded from fw_cfg.

OVMF doesn't look at the fw_cfg e820 table.
coreboot doesn't look at the fw_cfg e820 table.

Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
2013-10-17 13:06:11 +02:00
Gerd Hoffmann
9fa032866d spice: fix multihead support
This patch fixes spice display initialization to handle
multihead properly.

spice-core now keeps track of which QemuConsole has a spice
display channel attached to it and which has not.  It also
manages display channel ids.

spice-display looks at all QemuConsoles and will pick up any
graphic console not yet bound to a spice channel (which in practice
are all non-qxl graphic devices).

Result is that
 (a) you'll get a spice client window for each graphical device
     now (first only without this patch), and
 (b) mixing qxl and non-qxl vga cards works properly.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-17 12:42:54 +02:00
Gerd Hoffmann
35b2122db4 spice-display: add display channel id to the debug messages.
And s/__FUNCTION__/__func__/ while being at it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-17 12:41:03 +02:00
Christophe Fergeau
764eb39d1b Fix VNC SASL authentication when using a QXL device
ui/vnc.c:vnc_display_open() and spice-server/server/reds.c:do_spice_init()
are both calling sasl_server_init(). If spice_server_set_sasl_appname()
hasn't been called, spice-server will call it with "spice" as an appname,
causing cyrus-sasl to try to use a /etc/sasl2/spice.conf config file rather
than the /etc/sasl2/qemu.conf file that QEMU uses.

When using -spice sasl on the command line, QEMU properly calls
spice_server_set_sasl_appname() to set the SASL appname as "qemu",
but when using a QXL device without using SPICE, spice_server_init()
is called from qemu_spice_add_interface() without setting the appname
to "qemu", which then causes the VNC code to try to use spice.conf
instead of qemu.conf.

Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-17 12:25:25 +02:00
Marc-André Lureau
26defe81f6 spice: replace use of deprecated API
Those APIs have been deprecated since 0.11, and qemu depends on 0.12 already.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-17 12:25:25 +02:00
Fam Zheng
a7fdbcf0e6 blockdev: fix cdrom read_only flag
Since 0ebd24e0, cdrom doesn't have read-only on by default, which will
error out when using a read-only image. Fix it by setting the default
value when parsing opts.

Reported-by: Edivaldo de Araujo Pereira <edivaldoapereira@yahoo.com.br>
Signed-off-by: Fam Zheng <famz@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-17 10:19:59 +02:00
Andreas Färber
794cbc26eb sd: Avoid access to NULL BlockDriverState
Commit 4f8a066b5f (blockdev: Remove IF_*
check for read-only blockdev_init) added a usage of bdrv_is_read_only()
to sd_init(), which is called for versatilepb, versatileab and
xilinx-zynq-a9 machines among others with NULL argument by default,
causing the new qom-test to fail.

Add a check to prevent this.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-17 10:15:18 +02:00
Mike Qiu
684b25447c hmp: drop bogus "[not inserted]"
Commit 3e9fab690d ("block: Add support for
throttling burst max in QMP and the command line.") introduced bogus
"[not inserted]" output, possibly due to a merge failure.  Remove this
artifact.

Output of 'info block'

scsi0-hd0: /images/f18-ppc64.qcow2 (qcow2)
 [not inserted]
scsi0-cd2: [not inserted]
    Removable device: not locked, tray closed

floppy0: [not inserted]
    Removable device: not locked, tray closed

sd0: [not inserted]
    Removable device: not locked, tray closed

There should be no additional line between scsi0-hd0 and
scsi0-cd2.

scsi0-hd0 is already inserted, yet it still carries the
'[not inserted]' flag. This line should be removed.

This patch removes it.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-17 10:10:14 +02:00
Peter Maydell
2324841c02 ui/Makefile.objs: delete unnecessary cocoa.o dependency
Delete an unnecessary dependency for cocoa.o; we already have
a general rule that tells Make that we can build a .o file
from a .m source using an ObjC compiler, so this specific
rule is unnecessary. Further, it is using the dubious construct
"$(SRC_PATH)/$(obj)" to get at the source directory, which will
break when $(obj) is redefined as part of the preparation for
per-object library support.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-16 18:21:01 +02:00
Ákos Kovács
b77abd95a9 default-configs/: CONFIG_GDBSTUB_XML removed
Makefile.target: Build gdbstub-xml.o only when
TARGET_XML_FILES is not empty.

Signed-off-by: Ákos Kovács <akoskovacs@gmx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-16 18:21:01 +02:00
Ákos Kovács
cf01ba9eef Makefile.target: CONFIG_NO_* variables removed
CONFIG_NO_* variables replaced with the lnot logical function

Signed-off-by: Ákos Kovács <akoskovacs@gmx.com>
[PMM: fixed a few CONFIG_NO_* uses that were missed]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-16 18:21:00 +02:00
Peter Maydell
9ef622e31e rules.mak: New string testing functions
Add new string testing functions which return a y/n result:
 eq : are two strings equal (ignoring leading/trailing space)?
 ne : are two strings unequal?
 isempty : is a string empty?
 notempty : is a string non-empty?

Based on an idea by Ákos Kovács <akoskovacs@gmx.com>.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-16 18:21:00 +02:00
Peter Maydell
837a2e267f rules.mak: New logical functions for handling y/n values
Add new logical functions for handling y/n values like those we
use in CONFIG_FOO variables:
 lnot : logical NOT
 land : logical AND
 lor : logical OR
 lxor : logical XOR
 leqv : logical equality, inverse of lxor
 lif : like Make's $(if) but with an eq-like test

Based on an idea by Ákos Kovács <akoskovacs@gmx.com>.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-16 18:21:00 +02:00
Max Filippov
ca529f8e13 target-xtensa: add in_asm logging
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2013-10-15 17:23:46 +04:00
Anthony Liguori
1680d48577 Merge remote-tracking branch 'rth/tcg-ldst-6' into staging
# By Richard Henderson
# Via Richard Henderson
* rth/tcg-ldst-6:
  target-alpha: Convert to new ldst opcodes
  tcg-ppc64: Support new ldst opcodes
  tcg-ppc: Support new ldst opcodes
  tcg-ppc64: Convert to le/be ldst helpers
  tcg-ppc: Convert to le/be ldst helpers
  tcg-ppc64: Use TCGMemOp within qemu_ldst routines
  tcg-ppc: Use TCGMemOp within qemu_ldst routines
  tcg-arm: Improve GUEST_BASE qemu_ld/st
  tcg-arm: Convert to new ldst opcodes
  tcg-arm: Tidy variable naming convention in qemu_ld/st
  tcg-arm: Convert to le/be ldst helpers
  tcg-arm: Use TCGMemOp within qemu_ldst routines
  tcg-i386: Support new ldst opcodes
  tcg-i386: Remove "cb" output restriction from qemu_st8 for i386
  tcg-i386: Tidy softmmu routines
  tcg-i386: Use TCGMemOp within qemu_ldst routines
  tcg: Use TCGMemOp for TCGLabelQemuLdst.opc

Message-id: 1381620683-4568-1-git-send-email-rth@twiddle.net
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-14 09:59:59 -07:00
Anthony Liguori
ded77da3cd Merge remote-tracking branch 'jliu/or32' into staging
# By Sebastian Macke
# Via Jia Liu
* jliu/or32:
  target-openrisc: Removes a non-conforming behavior for the first page of the memory
  target-openrisc: Correct handling of page faults.

Message-id: 1380789702-18935-1-git-send-email-proljc@gmail.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-14 09:15:47 -07:00
Anthony Liguori
08683cb532 Merge remote-tracking branch 'awilliam/tags/vfio-pci-for-qemu-20131010.0' into staging
vfio-pci updates include:
 - Forgotten MSI affinity patch posted several months ago
 - Lazy option ROM loading to delay load until after device/bus resets
 - Error reporting cleanups
 - PCI hot reset support introduced with Linux v3.12 development kernels
 - Debug build fix for int128

The lazy ROM loading and hot reset should help VGA assignment as we can
now do a bus reset when there are multiple devices on the bus, ex.
multi-function graphics and audio cards.

# gpg: Signature made Thu 10 Oct 2013 11:26:39 AM PDT using RSA key ID 3BB08B22
# gpg: Can't check signature: public key not found

# By Alex Williamson (7) and Alexey Kardashevskiy (1)
# Via Alex Williamson
* awilliam/tags/vfio-pci-for-qemu-20131010.0:
  vfio-pci: Fix endian issues in vfio_pci_size_rom()
  vfio-pci: Add dummy PCI ROM write accessor
  vfio: Fix debug output for int128 values
  vfio-pci: Implement PCI hot reset
  vfio-pci: Cleanup error_reports
  vfio-pci: Lazy PCI option ROM loading
  vfio-pci: Test device reset capabilities
  vfio-pci: Add support for MSI affinity

Message-id: 20131010184122.31667.28382.stgit@bling.home
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-14 09:14:30 -07:00
Stefan Weil
575ddeb459 exec: Fix prototype of phys_mem_set_alloc and related functions
phys_mem_alloc and its assigned values qemu_anon_ram_alloc and
legacy_s390_alloc must have identical argument lists.

legacy_s390_alloc uses the size parameter to call mmap, so size_t is
good enough for all of them.
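
As a sketch of the constraint, a function pointer and every function assigned to it must share one prototype. Here a single typedef with a size_t parameter is used everywhere. All names below are hypothetical stand-ins, not the exec.c code.

```c
#include <stddef.h>
#include <stdlib.h>

/* One prototype shared by the hook and every allocator assigned to it. */
typedef void *(*SketchAllocFn)(size_t size);

static void *sketch_anon_alloc(size_t size)
{
    return malloc(size);
}

/* Records the last requested size so the hook can be observed in a test. */
static size_t sketch_legacy_last_size;

static void *sketch_legacy_alloc(size_t size)
{
    sketch_legacy_last_size = size;
    return malloc(size);
}

/* The hook defaults to the anonymous allocator. */
static SketchAllocFn sketch_mem_alloc = sketch_anon_alloc;

/* Replace the allocator; the typedef makes a mismatched prototype an error. */
static void sketch_mem_set_alloc(SketchAllocFn alloc)
{
    sketch_mem_alloc = alloc;
}
```

If one of the allocators took a differently sized integer type, the assignment would be between incompatible pointer types, which is the class of diagnostic quoted below.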

This patch fixes compiler errors on i686 Linux hosts:

  CC    alpha-softmmu/exec.o
exec.c:752:51: error:
 initialization from incompatible pointer type [-Werror]
exec.c: In function 'qemu_ram_alloc_from_ptr':
exec.c:1139:32: error:
 comparison of distinct pointer types lacks a cast [-Werror]
exec.c: In function 'qemu_ram_remap':
exec.c:1283:21: error:
 comparison of distinct pointer types lacks a cast [-Werror]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1380481005-32399-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-14 08:50:34 -07:00
Michael S. Tsirkin
742f5d2ed5 ssdt-proc: update generated file
Update generated ssdt proc hex file (used for systems
lacking IASL) after P_BLK length change.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:57 +03:00
Michael S. Tsirkin
6ec80ef150 ssdt: fix PBLK length
We don't really support CPU throttling, so supply 0 PBLK length.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:57 +03:00
Michael S. Tsirkin
72c194f7e7 i386: ACPI table generation code from seabios
This adds C code for generating ACPI tables at runtime,
imported from seabios git tree
    commit 51684b7ced75fb76776e8ee84833fcfb6ecf12dd

Although ACPI tables come from a system BIOS on real hw,
it makes sense that the ACPI tables are coupled with the
virtual machine, since they have to abstract the x86 machine to
the OS's.

This is widely desired as a way to avoid the churn
and proliferation of QEMU-specific interfaces
associated with ACPI tables in bios code.

Notes:
As BIOS can reprogram devices prior to loading
ACPI tables, we pre-format ACPI tables but defer loading
hardware configuration there until tables are loaded.

The code structure was intentionally kept as close
to the seabios original as possible, to simplify
comparison and making sure we didn't lose anything
in translation.

Minor code duplication results. To help ensure there are no functional
regressions, I think it's better to merge it like this and do more code
changes in follow-up patches.

Cross-version compatibility concerns have been addressed:
    ACPI tables are exposed to guest as FW_CFG entries.
    When running with -M 1.5 and older, this patch disables ACPI
    table generation, and doesn't expose ACPI
    tables to guest.

    As table content is likely to change over time,
    the following measures are taken to simplify
    cross-version migration:
    - All tables besides the RSDP are packed in a single FW CFG entry.
      This entry size is currently 23K. We round it up to 64K
      to avoid too much churn there.
    - Tables are placed in special ROM blob (not mapped into guest memory)
      which is automatically migrated together with the guest, same
      as BIOS code.
    - Offsets where hardware configuration is loaded in ACPI tables
      are also migrated, this is in case future ACPI changes make us
      rearrange the tables in memory.

This patch reuses some code from SeaBIOS, which was originally under
LGPLv2 and then relicensed to GPLv3 or LGPLv3, in QEMU under GPLv2+. This
relicensing has been acked by all contributors that had contributed to the
code since the v2->v3 relicense. ACKs approving the v2+ relicensing are
listed below. The list might include ACKs from people not holding
copyright on any parts of the reused code, but it's better to err on the
side of caution and include them.

Affected SeaBIOS files (GPLv2+ license headers added)
<http://thread.gmane.org/gmane.comp.bios.coreboot.seabios/5949>:

 src/acpi-dsdt-cpu-hotplug.dsl
 src/acpi-dsdt-dbug.dsl
 src/acpi-dsdt-hpet.dsl
 src/acpi-dsdt-isa.dsl
 src/acpi-dsdt-pci-crs.dsl
 src/acpi.c
 src/acpi.h
 src/ssdt-misc.dsl
 src/ssdt-pcihp.dsl
 src/ssdt-proc.dsl
 tools/acpi_extract.py
 tools/acpi_extract_preprocess.py

Each one of the listed people agreed to the following:

> If you allow the use of your contribution in QEMU under the
> terms of GPLv2 or later as proposed by this patch,
> please respond to this mail including the line:
>
> Acked-by: Name <email address>

  Acked-by: Gerd Hoffmann <kraxel@redhat.com>
  Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
  Acked-by: Jason Baron <jbaron@akamai.com>
  Acked-by: David Woodhouse <David.Woodhouse@intel.com>
  Acked-by: Gleb Natapov <gleb@redhat.com>
  Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
  Acked-by: Dave Frodin <dave.frodin@se-eng.com>
  Acked-by: Paolo Bonzini <pbonzini@redhat.com>
  Acked-by: Kevin O'Connor <kevin@koconnor.net>
  Acked-by: Laszlo Ersek <lersek@redhat.com>
  Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
  Acked-by: Isaku Yamahata <yamahata@valinux.co.jp>
  Acked-by: Magnus Christensson <magnus.christensson@intel.com>
  Acked-by: Hu Tao <hutao@cn.fujitsu.com>
  Acked-by: Eduardo Habkost <ehabkost@redhat.com>

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:57 +03:00
Michael S. Tsirkin
1a4b2666df pc: use new api to add builtin tables
At this point the only builtin table we have is
the DSDT used for Q35.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:57 +03:00
Michael S. Tsirkin
60de1163d5 acpi: add interface to access user-installed tables
Also add a new API to install builtin tables, so
that we can distinguish between the two.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:53 +03:00
Michael S. Tsirkin
64e9df8d34 hpet: add API to find it
Add API to find HPET using QOM.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
309cd62d6b pvpanic: add API to access io port
Add API to find pvpanic device and get its io port.
Will be used to fill in guest info structure.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
6f1426ab0f ich9: APIs for pc guest info
This adds APIs that will be used to fill in
acpi tables, implemented using QOM,
to various ich9 components.
Some information is still missing in QOM,
so we fall back on lookups by type instead.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
277e9340e6 piix: APIs for pc guest info
This adds APIs that will be used to fill in guest acpi tables.
Some required information is still lacking in QOM, so we
fall back on lookups by type and returning explicit types.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
f854ecc799 acpi/piix: add macros for acpi property names
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
b20c9bd5f6 i386: define pc guest info
This defines a structure that will be used to fill in acpi tables
where relevant properties are not yet available using QOM.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
d916b46494 loader: allow adding ROMs in done callbacks
Don't abort if machine done callbacks add ROMs.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
bc70232918 i386: add bios linker/loader
This adds a dynamic bios linker/loader.
This will be used by acpi table generation
code to:
    - load each table in the appropriate memory segment
    - link tables to each other
    - fix up checksums after said linking
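
The checksum step can be illustrated as follows. ACPI tables carry a one-byte checksum chosen so that all bytes of the table sum to zero modulo 256, and patching a pointer into a table invalidates it. This is a sketch only; the function names and the csum_off parameter are hypothetical and do not reflect the linker/loader command format.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of all bytes in the buffer, modulo 256. */
static uint8_t sketch_byte_sum(const uint8_t *buf, size_t len)
{
    uint8_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        sum = (uint8_t)(sum + buf[i]);
    }
    return sum;
}

/*
 * Recompute the checksum byte at csum_off so that the whole table sums
 * to zero.  The byte is zeroed first so the old value does not count.
 */
static void sketch_fixup_checksum(uint8_t *table, size_t len, size_t csum_off)
{
    table[csum_off] = 0;
    table[csum_off] = (uint8_t)(0u - sketch_byte_sum(table, len));
}
```

The fix-up has to run after linking because the linked addresses are part of the summed bytes.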

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
35c12e60c8 loader: use file path size from fw_cfg.h
Avoid a bit of code duplication, make
max file path constant reusable.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
544d2bfa84 acpi: ssdt pcihp: update generated file
update generated file, not sure what changed

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:52 +03:00
Michael S. Tsirkin
d512d0d723 acpi: pre-compiled ASL files
Add pre-compiled ASL files. Useful for systems that
do not have IASL.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:51 +03:00
Michael S. Tsirkin
a31a864273 acpi: add rules to compile ASL source
Detect presence of IASL compiler and use it
to process ASL source. If not there, use pre-compiled
files in-tree. Add script to update the in-tree files.

Note: distros are known to silently update iasl
so detect correct iasl flags for the installed version on each run as
opposed to at configure time.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:51 +03:00
Michael S. Tsirkin
74523b8501 i386: add ACPI table files from seabios
This adds ASL code as well as scripts for processing it,
imported from seabios git tree
commit 51684b7ced75fb76776e8ee84833fcfb6ecf12dd

Will be used for runtime acpi table generation.

Note:
This patch reuses some code from SeaBIOS, which was originally under
LGPLv2 and then relicensed to GPLv3 or LGPLv3, in QEMU under GPLv2+. This
relicensing has been acked by all contributors that had contributed to the
code since the v2->v3 relicense. ACKs approving the v2+ relicensing are
listed below. The list might include ACKs from people not holding
copyright on any parts of the reused code, but it's better to err on the
side of caution and include them.

Affected SeaBIOS files (GPLv2+ license headers added)
<http://thread.gmane.org/gmane.comp.bios.coreboot.seabios/5949>:

 src/acpi-dsdt-cpu-hotplug.dsl
 src/acpi-dsdt-dbug.dsl
 src/acpi-dsdt-hpet.dsl
 src/acpi-dsdt-isa.dsl
 src/acpi-dsdt-pci-crs.dsl
 src/acpi.c
 src/acpi.h
 src/ssdt-misc.dsl
 src/ssdt-pcihp.dsl
 src/ssdt-proc.dsl
 tools/acpi_extract.py
 tools/acpi_extract_preprocess.py

Each one of the listed people agreed to the following:

> If you allow the use of your contribution in QEMU under the
> terms of GPLv2 or later as proposed by this patch,
> please respond to this mail including the line:
>
> Acked-by: Name <email address>

  Acked-by: Gerd Hoffmann <kraxel@redhat.com>
  Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
  Acked-by: Jason Baron <jbaron@akamai.com>
  Acked-by: David Woodhouse <David.Woodhouse@intel.com>
  Acked-by: Gleb Natapov <gleb@redhat.com>
  Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
  Acked-by: Dave Frodin <dave.frodin@se-eng.com>
  Acked-by: Paolo Bonzini <pbonzini@redhat.com>
  Acked-by: Kevin O'Connor <kevin@koconnor.net>
  Acked-by: Laszlo Ersek <lersek@redhat.com>
  Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
  Acked-by: Isaku Yamahata <yamahata@valinux.co.jp>
  Acked-by: Magnus Christensson <magnus.christensson@intel.com>
  Acked-by: Hu Tao <hutao@cn.fujitsu.com>
  Acked-by: Eduardo Habkost <ehabkost@redhat.com>

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:51 +03:00
Michael S. Tsirkin
cbcaf79e3c q35: expose mmcfg size as a property
Address is already exposed, expose size for symmetry.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:51 +03:00
Michael S. Tsirkin
87f65245db q35: use macro for MCFG property name
Useful to make it accessible through QOM.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:51 +03:00
Michael S. Tsirkin
6f6d282330 pcie_host: expose address format
Callers pass in the address so it's helpful for
them to be able to decode it.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:51 +03:00
Michael S. Tsirkin
079e3e7012 pcie_host: expose UNMAPPED macro
Make it possible to test unmapped status through QMP.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:51 +03:00
Michael S. Tsirkin
48354cc5a3 loader: support for unmapped ROM blobs
Support ROM blobs not mapped into guest memory:
same as ROM files really but use caller's buffer.

Support invoking callback on access and
return memory pointer making it easier
for caller to update memory if necessary.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:51 +03:00
Michael S. Tsirkin
d87072ceec fw_cfg: interface to trigger callback on read
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:51 +03:00
Michael S. Tsirkin
77d6f4ea76 pci: fix up w64 size calculation helper
BAR base was calculated incorrectly.
Use existing pci_bar_address to get it right.

Tested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:48:45 +03:00
Michael S. Tsirkin
e732ea6387 qom: add pointer to int property helpers
Make it easy to add read-only helpers for simple
integer properties in memory.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:46:00 +03:00
Michael S. Tsirkin
e82df24873 qom: cleanup struct Error references
now that a typedef for struct Error is available,
use it in qom/object.h to match coding style rules.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:45:16 +03:00
Igor Mammedov
008e05662a cleanup object.h: include error.h directly
qapi/error.h is simple enough to be included in qom/object.h
directly and prepares qom/object.h to use Error typedef.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:45 +03:00
Marcel Apfelbaum
c31d04b516 hw/pci: removed irq field from PCIDevice
Instead of exposing the irq field,
pci wrappers to qemu_set_irq or qemu_irq_*
can be used.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:45 +03:00
Marcel Apfelbaum
5a03e708f2 hw/pcie: AER and hot-plug events must use device's interrupt
The fields hpev_intx and aer_intx were removed because
both AER and hot-plug events must use device's interrupt.
Assert/deassert interrupts using pci irq wrappers instead.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:45 +03:00
Marcel Apfelbaum
9e64f8a3fc hw: set interrupts using pci irq wrappers
pci_set_irq and the other pci irq wrappers use
PCI_INTERRUPT_PIN config register to compute device
INTx pin to assert/deassert.

An irq is allocated using the pci_allocate_irq wrapper
only if it is needed by non-PCI devices.

Removed irq related fields from state if not used anymore.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:45 +03:00
Marcel Apfelbaum
68919cace8 hw/vfio: set interrupts using pci irq wrappers
pci_set_irq and the other pci irq wrappers use
PCI_INTERRUPT_PIN config register to compute device
INTx pin to assert/deassert.

Save the INTx pin into the config register before calling
pci_set_irq.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:45 +03:00
Marcel Apfelbaum
4c89e3e593 hw/vmxnet3: set interrupts using pci irq wrappers
pci_set_irq uses PCI_INTERRUPT_PIN config register
to compute device INTx pin to assert/deassert.

An assert is used to ensure that intx received
from the guest OS corresponds to PCI_INTERRUPT_PIN.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:45 +03:00
Marcel Apfelbaum
c008ac0c1c hw/pci-bridge: set PCI_INTERRUPT_PIN register before shpc init
PCI_INTERRUPT_PIN will be used by shpc init, so setting it
was moved before the call to shpc_init.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:45 +03:00
Marcel Apfelbaum
d98f08f54e hw/pci: add pci wrappers for allocating and asserting irqs
Interrupt pin is selected and saved into PCI_INTERRUPT_PIN
register during device initialization. Devices should not call
qemu_set_irq directly and specify the INTx pin on each call.

Added pci_* wrappers to replace qemu_set_irq, qemu_irq_raise,
qemu_irq_lower and qemu_irq_pulse, setting the irq
based on PCI_INTERRUPT_PIN.

Added pci_allocate_irq wrapper to be used by devices that
still need PCIDevice infrastructure to assert irqs.

Renamed a static function that was already named pci_set_irq.
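
A minimal sketch of the idea, not the QEMU implementation: the INTx pin is
derived from the PCI_INTERRUPT_PIN config byte (1 = INTA ... 4 = INTD,
0 = no pin) instead of being passed on each call. SketchPCIDevice and the
sketch_* names are hypothetical; only the config-space offset 0x3d is taken
from the PCI specification.

```c
#include <stdint.h>

#define PCI_INTERRUPT_PIN 0x3d  /* standard PCI config-space offset */

/* Hypothetical stand-in for a PCI device; this is not QEMU's PCIDevice. */
typedef struct {
    uint8_t config[256];   /* config space, holds the interrupt pin byte */
    int intx_level[4];     /* last level driven on INTA..INTD */
} SketchPCIDevice;

/* Map the 1-based pin register to a 0-based INTx index (-1 if no pin). */
static int sketch_pci_intx(const SketchPCIDevice *d)
{
    return d->config[PCI_INTERRUPT_PIN] - 1;
}

/* Drive the device's own INTx line; the caller never names the pin.
 * Returns the INTx index used, or -1 if the register is out of range. */
static int sketch_pci_set_irq(SketchPCIDevice *d, int level)
{
    int intx = sketch_pci_intx(d);

    if (intx < 0 || intx > 3) {
        return -1;
    }
    d->intx_level[intx] = level;
    return intx;
}
```

With this shape, a device that stored 2 (INTB) in the register at init time
always asserts index 1, regardless of what the call site believes.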

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:44 +03:00
Marcel Apfelbaum
a8a9d30bab hw/core: Add interface to allocate and free a single IRQ
qemu_allocate_irq returns a single qemu_irq.
The interface allows specifying an interrupt number.

qemu_free_irq frees it.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:44 +03:00
Marcel Apfelbaum
a53ae8e934 hw/pci: partially handle pci master abort
A MemoryRegion with negative priority was created and
it spans over all the pci address space.
It "intercepts" the accesses to unassigned pci
address space and will follow the pci spec:
 1. returns -1 on read
 2. does nothing on write

Note: setting the RECEIVED MASTER ABORT bit in the STATUS register
      of the device that initiated the transaction will be
      implemented in another series
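
The two behaviours listed above could be sketched as a pair of access
callbacks. This is an illustration under assumptions: the function names
and signatures are hypothetical and only mimic the general shape of
memory-region read/write handlers.

```c
#include <stdint.h>

/* Hypothetical read handler for unassigned PCI address space:
 * every read returns all-ones, i.e. -1. */
static uint64_t sketch_master_abort_read(void *opaque, uint64_t addr,
                                         unsigned size)
{
    (void)opaque;
    (void)addr;
    (void)size;
    return (uint64_t)-1;
}

/* Hypothetical write handler: the write is silently dropped. */
static void sketch_master_abort_write(void *opaque, uint64_t addr,
                                      uint64_t val, unsigned size)
{
    (void)opaque;
    (void)addr;
    (void)val;
    (void)size;
}
```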

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:44 +03:00
Marcel Apfelbaum
8002ccd6e4 docs/memory: Explicitly state that MemoryRegion priority is signed
When memory regions overlap, priority can be used to specify
which of them takes priority. By making the priority values signed
rather than unsigned, we make it more convenient to implement
a situation where one "background" region should appear only
where no other region exists: rather than having to explicitly
specify a high priority for all the other regions, we can let them take
the default (zero) priority and specify a negative priority for the
background region.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:44 +03:00
Marcel Apfelbaum
a1ff8ae066 memory: Change MemoryRegion priorities from unsigned to signed
When memory regions overlap, priority can be used to specify
which of them takes priority. By making the priority values signed
rather than unsigned, we make it more convenient to implement
a situation where one "background" region should appear only
where no other region exists: rather than having to explicitly
specify a high priority for all the other regions, we can let them take
the default (zero) priority and specify a negative priority for the
background region.
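
A small sketch of why signed priorities help, under stated assumptions:
SketchRegion and sketch_resolve are hypothetical and resolve overlaps with
a flat linear scan, which is not how the memory core is implemented. The
point is only that a region with priority -1 loses to every region left at
the default priority 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat description of a region; not QEMU's MemoryRegion. */
typedef struct {
    uint64_t start;
    uint64_t size;
    int priority;          /* signed: negative means "background" */
} SketchRegion;

/* Return the highest-priority region covering addr, or NULL if none. */
static const SketchRegion *sketch_resolve(const SketchRegion *regions,
                                          size_t n, uint64_t addr)
{
    const SketchRegion *best = NULL;

    for (size_t i = 0; i < n; i++) {
        const SketchRegion *r = &regions[i];

        if (addr < r->start || addr - r->start >= r->size) {
            continue;      /* addr is outside this region */
        }
        if (best == NULL || r->priority > best->priority) {
            best = r;
        }
    }
    return best;
}
```

A background region spanning the whole range with priority -1 is then
visible only at addresses that no default-priority region claims.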

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-10-14 17:11:44 +03:00
Richard Henderson
f8da40aefb target-alpha: Convert to new ldst opcodes
Or, partially.  The fundamental primitives for the port are gen_load_mem
and gen_store_mem, which take a callback to emit the memory operation.
For that, we continue to use the original inline functions that forward
to the new ops, rather than replicate the same thing privately.

That said, all free-standing calls to tcg_gen_qemu_* have been converted.
The 32-bit floating-point references now use _i32 opcodes, eliminating
a truncate or extension.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
1768ec0623 tcg-ppc64: Support new ldst opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
5dd391604f tcg-ppc: Support new ldst opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
e349a8d4ff tcg-ppc64: Convert to le/be ldst helpers
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
92d0acda27 tcg-ppc: Convert to le/be ldst helpers
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
a058557381 tcg-ppc64: Use TCGMemOp within qemu_ldst routines
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
f1a16dcdd5 tcg-ppc: Use TCGMemOp within qemu_ldst routines
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
091d567771 tcg-arm: Improve GUEST_BASE qemu_ld/st
If we pull the code to emit the actual load/store into a subroutine,
we can share the reg+reg addressing mode code between softmmu and
usermode.  This lets us load GUEST_BASE into a temporary register
rather than attempting to add it piece-wise to the address.

Which lets us use movw+movt for armv7, rather than (up to) 4 adds.
Code size for pre-armv7 stays the same.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
15ecf6e394 tcg-arm: Convert to new ldst opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
a485cff09c tcg-arm: Tidy variable naming convention in qemu_ld/st
s/addr_reg2/addrhi/
s/addr_reg/addrlo/
s/data_reg2/datahi/
s/data_reg/datalo/

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson
0315c51ea9 tcg-arm: Convert to le/be ldst helpers
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:19 -07:00
Richard Henderson
099fcf2e36 tcg-arm: Use TCGMemOp within qemu_ldst routines
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:19 -07:00
Richard Henderson
8221a267fd tcg-i386: Support new ldst opcodes
No support for helpers with non-default endianness yet,
but good enough to test the opcodes.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:19 -07:00
Richard Henderson
b3e2bc500f tcg-i386: Remove "cb" output restriction from qemu_st8 for i386
Once we form a combined qemu_st_i32 opcode, we won't be able to
have separate constraints based on size.  This one is fairly easy
to work around, since eax is available as a scratch register.

When storing variable data, this tends to merely exchange one mov
for another.  E.g.

-:  mov    %esi,%ecx
...
-:  mov    %cl,(%edx)
+:  mov    %esi,%eax
+:  mov    %al,(%edx)

Where we do have a regression is when storing constant data, in which
case we may load the constant into edi, when only ecx/ebx ought to be used.

The proper way to recover this regression is to allow constants as
arguments to qemu_st_i32, so that we never load the constant data into
a register at all, much less the wrong register.  TBD.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:19 -07:00
Richard Henderson
7352ee546c tcg-i386: Tidy softmmu routines
Pass two TCGReg to tcg_out_tlb_load, rather than idx+args.

Move ldst_optimization routines just below tcg_out_tlb_load to avoid
the need for forward declarations.

Use TCGReg enum in preference to int where appropriate.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:19 -07:00
Richard Henderson
37c5d0d5d1 tcg-i386: Use TCGMemOp within qemu_ldst routines
Step one in the transition, with constants passed down from tcg_out_op.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:19 -07:00
Richard Henderson
d257e0d7ae tcg: Use TCGMemOp for TCGLabelQemuLdst.opc
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:19 -07:00
Anthony Liguori
1cdae4573d Merge remote-tracking branch 'mdroth/qga-pull-2013-10-10' into staging
# By Mark Wu (2) and Tomoki Sekiyama (1)
# Via Michael Roth
* mdroth/qga-pull-2013-10-10:
  qemu-ga: Extend 'guest-info' command to expose flag 'success-response'
  qemu-ga: Add interface to traverse the qmp command list by QmpCommand
  qemu-ga: execute fsfreeze-freeze in reverse order of mounts

Message-id: 1381435782-25524-1-git-send-email-mdroth@linux.vnet.ibm.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-11 09:38:07 -07:00
Anthony Liguori
ab1eb72b1d Merge remote-tracking branch 'rth/tcg-pull' into staging
# By Richard Henderson
# Via Richard Henderson
* rth/tcg-pull:
  exec: Add both big- and little-endian memory helpers
  tcg: Add qemu_ld_st_i32/64
  tcg: Add TCGMemOp
  configure: Remove CONFIG_QEMU_LDST_OPTIMIZATION
  tcg: Add tcg-be-ldst.h
  tcg: Add tcg-be-null.h
  exec: Delete is_tcg_gen_code and GETRA_EXT
  tcg-aarch64: Update to helper_ret_*_mmu routines
  tcg: Merge tcg_register_helper into tcg_context_init
  tcg: Add tcg-runtime.c helpers to all_helpers
  tcg: Put target helper data into an array.
  tcg: Remove stray semi-colons from target-*/helper.h
  tcg: Move helper registration into tcg_context_init
  target-m68k: Rename helpers.h to helper.h
  tcg: Use a GHashTable for tcg_find_helper
  tcg: Delete tcg_helper_get_name declaration
  tcg-hppa: Remove tcg backend

Message-id: 1381440525-6666-1-git-send-email-rth@twiddle.net
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-11 09:36:52 -07:00
Markus Armbruster
a3400aeede qdev-monitor: Group "device_add help" and "info qdm" by category
Output is a long, unsorted list.  Not very helpful.  Print one list
per device category instead, with a header line identifying the
category, plus a list of uncategorized devices.  Print each list in
case-insensitive alphabetical order.

Devices with multiple categories are listed multiple times.
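
The per-category ordering could be sketched as follows. This is an
assumption-laden illustration: sketch_casecmp and sketch_sort_names are
hypothetical names, and the actual patch may rely on a different
comparison helper.

```c
#include <ctype.h>
#include <stdlib.h>

/* qsort comparator: compare two C strings ignoring ASCII case. */
static int sketch_casecmp(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    const unsigned char *x = (const unsigned char *)*pa;
    const unsigned char *y = (const unsigned char *)*pb;

    while (*x && tolower(*x) == tolower(*y)) {
        x++;
        y++;
    }
    return tolower(*x) - tolower(*y);
}

/* Sort one category's device names in case-insensitive order. */
static void sketch_sort_names(const char **names, size_t n)
{
    qsort(names, n, sizeof(*names), sketch_casecmp);
}
```

A plain byte-wise sort would place "VGA" before "e1000"; the
case-insensitive comparator places it after "usb-hub".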

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Message-id: 1381410021-1538-3-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-11 09:36:29 -07:00
Markus Armbruster
1fc224b4b6 Mostly revert "qemu-help: Sort devices by logical functionality"
This reverts most of commit 3d1237fb2a.

The commit claims to sort the output of "-device help" "by
functionality rather than alphabetical".  Issues:

* The output was unsorted before, not alphabetically sorted.
  Misleading, but harmless enough.

* The commit doesn't just sort the output of "-device help" as it
  claims, it adds categories to each line of "-device help", and it
  prints devices once per category.  In particular, devices without a
  category aren't shown anymore.  Maybe such devices should not exist,
  but they do.  Regression.

* Categories are also added to the output of "info qdm".  Silent
  change, not nice.  Output remains unsorted, unlike "-device help".

I'm going to reimplement the feature we actually want, without the
warts.  Reverting the flawed commit first should make it easier to
review.  However, I can't revert it completely, since DeviceClass
member categories has been put to use.  So leave that part in.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.a@redhat.com>
Message-id: 1381410021-1538-2-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-11 09:36:29 -07:00
Stefan Hajnoczi
8593898109 Use qemu-project.org domain name
qemu.org is held by a third-party and no core community contributor has
access to the DNS configuration.  This leaves the website exposed to
outages due to DNS issues or IP address changes.  For example, if the
web server IP address needs to change we cannot guarantee qemu.org will
point to it!

The newer qemu-project.org domain name is owned by Anthony Liguori
<anthony@codemonkey.ws>.  You can confirm this by querying the whois
information.  Also note that the #qemu IRC channel topic already
references qemu-project.org.

Short of having a dedicated legal entity to hold the domain name on
behalf of the community, qemu-project.org seems like the safest bet.

Let's replace references to qemu.org with qemu-project.org.

Note that git-submodule(1) does not detect URL changes.  The following
commands clear out and re-initialize all submodules to ensure you are
using the latest URLs:

  $ git submodule deinit . # you'll be warned if you have local changes
  $ rm -rf .git/modules    # also clear cached .git/ directories
  $ git submodule update --init

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1381495958-8306-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-11 09:34:56 -07:00
Anthony Liguori
33c6cae44e Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Max Reitz (30) and others
# Via Kevin Wolf
* kwolf/for-anthony: (61 commits)
  qemu-iotests: Add test for inactive L2 overlap
  qemu-io: Let "open" pass options to block driver
  vmdk: Fix vmdk_parse_extents
  blockdev: blockdev_init() error conversion
  blockdev: Don't disable COR automatically with blockdev-add
  blockdev: Remove 'media' parameter from blockdev_init()
  qemu-iotests: Check autodel behaviour for device_del
  blockdev: Remove IF_* check for read-only blockdev_init
  blockdev: Move virtio-blk device creation to drive_init
  blockdev: Move bus/unit/index processing to drive_init
  blockdev: Move parsing of 'boot' option to drive_init
  blockdev: Moving parsing of geometry options to drive_init
  blockdev: Move parsing of 'if' option to drive_init
  blockdev: Move parsing of 'media' option to drive_init
  blockdev: Pass QDict to blockdev_init()
  blockdev: Separate ID generation from DriveInfo creation
  blockdev: 'blockdev-add' QMP command
  blockdev: Introduce DriveInfo.enable_auto_del
  qapi-types/visit.py: Inheritance for structs
  qapi-types/visit.py: Pass whole expr dict for structs
  ...

Message-id: 1381503951-27985-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-11 09:29:58 -07:00
Max Reitz
34eeb82de6 qemu-iotests: Add test for inactive L2 overlap
Extend 060 by a test which creates a corrupted image with an active L2
entry pointing to an inactive L2 table and writes to the corresponding
guest offset.

Also, use overlap-check=all for all tests in 060.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:02 +02:00
Max Reitz
b543c5cdcb qemu-io: Let "open" pass options to block driver
Add an option to the open command to specify runtime options for the
block driver used.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:02 +02:00
Fam Zheng
899f1ae219 vmdk: Fix vmdk_parse_extents
An extra 'p++' after while loop when *p == '\n' will move p to unknown
data position, risking parsing junk data or memory access violation.
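
To illustrate the class of bug (this is a sketch, not the patch itself):
when skipping to the next line of a NUL-terminated buffer, the pointer may
step over a newline but must never step over the terminator.
sketch_next_line is a hypothetical helper.

```c
/* Advance p to the start of the next line.  If no newline remains,
 * return a pointer to the terminating NUL rather than one byte past it. */
static const char *sketch_next_line(const char *p)
{
    while (*p) {
        if (*p == '\n') {
            return p + 1;  /* consume the newline exactly once */
        }
        p++;
    }
    return p;              /* stopped at '\0'; do not increment again */
}
```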

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:02 +02:00
Kevin Wolf
b681072d20 blockdev: blockdev_init() error conversion
This gives us meaningful error messages for the blockdev-add QMP
command.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:02 +02:00
Kevin Wolf
0ebd24e0a2 blockdev: Don't disable COR automatically with blockdev-add
If a read-only device is configured with copy-on-read=on, the old code
only prints a warning and automatically disables copy on read. Make it
a real error for blockdev-add.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:02 +02:00
Kevin Wolf
e34ef04641 blockdev: Remove 'media' parameter from blockdev_init()
The remaining users shouldn't be there with blockdev-add and are easy to
move to drive_init().

Bonus bug fix: As a side effect, CD-ROM drives can now use block drivers
on the read-only whitelist without explicitly specifying read-only=on,
even if a format is explicitly specified.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:02 +02:00
Kevin Wolf
a9b43397a9 qemu-iotests: Check autodel behaviour for device_del
Block devices created with -drive and drive_add should automatically
disappear if the guest device is unplugged. blockdev-add ones shouldn't.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:02 +02:00
Kevin Wolf
4f8a066b5f blockdev: Remove IF_* check for read-only blockdev_init
IF_NONE allows read-only, which makes forbidding it in this place
for other types pretty much pointless.

Instead, make sure that all devices for which the check would have
errored out check in their init function that they don't get a read-only
BlockDriverState. This catches even cases where IF_NONE and -device is
used.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
394c7d4d6b blockdev: Move virtio-blk device creation to drive_init
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
87a899c509 blockdev: Move bus/unit/index processing to drive_init
This requires moving the automatic ID generation at the same time, so
let's do that as well.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
2692929802 blockdev: Move parsing of 'boot' option to drive_init
It's already ignored and only prints a deprecation message. No use in
making it available in new interfaces.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
b41a7338cf blockdev: Moving parsing of geometry options to drive_init
This moves all of the geometry options (cyls/heads/secs/trans) to
drive_init so that they can only be accessed using legacy functions, but
never with anything blockdev-add related.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
593d464bd4 blockdev: Move parsing of 'if' option to drive_init
It's always IF_NONE for blockdev-add.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
33cb7dc8b7 blockdev: Move parsing of 'media' option to drive_init
This moves as much as possible of the processing of the 'media' option
to drive_init so that it can only be accessed using legacy functions,
but never with anything blockdev-add related.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
f298d07166 blockdev: Pass QDict to blockdev_init()
Working on a QDict instead of a QemuOpts that accepts anything is more
in line with bdrv_open(). A QDict is what qmp_blockdev_add() already has
anyway, so this saves additional conversions. And last, but not least,
it allows later patches to easily extract legacy options into a
separate, typed QemuOpts for drive_init() (the untyped QemuOpts that
drive_init already has doesn't allow access to numbers, only strings,
and is therefore useless without conversion).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
326642bc7f blockdev: Separate ID generation from DriveInfo creation
blockdev-add shouldn't automatically generate IDs, but will keep most of
the DriveInfo creation code.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
d26c9a1573 blockdev: 'blockdev-add' QMP command
For examples see the changes to qmp-commands.hx.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
2d246f01d3 blockdev: Introduce DriveInfo.enable_auto_del
BlockDriverStates shouldn't be affected by an unplugged guest device,
except if created with the legacy -drive command line option or the
drive_add HMP command.

Make the automatic deletion as well as cancelling of jobs conditional on
an enable_auto_del boolean that is only set in drive_init().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
622f557f5a qapi-types/visit.py: Inheritance for structs
This introduces a new 'base' key for struct definitions that refers to
another struct type. On the JSON level, the fields of the base type are
included directly into the same namespace as the fields of the defined
type, like with unions. On the C level, a pointer to a struct of the
base type is included.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Kevin Wolf
14d36307ff qapi-types/visit.py: Pass whole expr dict for structs
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-10-11 16:50:01 +02:00
Fam Zheng
52c8d629ca vmdk: refuse enabling zeroed grain with flat images
This is a header flag and we need sparse for the header.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:01 +02:00
Fam Zheng
4823970bcb vmdk: convert error code to use errp
Convert "fprintf(stderr,..." and standardize error messages:

Remove a few local_error's and use errp.

Remove "VMDK:" or "Vmdk:" prefixes in error message and fix to upper
case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:01 +02:00
Wenchao Xia
2cdfb12332 build: add command check-clean
This command packages the clean operations in tests. The root Makefile
now simply calls the command and no longer cares about its details.
Previously the binaries built for tests were not removed; now they are
deleted by the clean operation.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:01 +02:00
Wenchao Xia
22ee5a557a tests: build the helper program by default
Usually we configure and make, then go to ./tests/qemu-iotest and run
check. In this case an error occurs because the helper program
was not built. This patch simply builds it by default. A better way
may be to introduce a Makefile in ./tests/qemu-iotest, but that is more
complicated to handle in the out-of-tree case and a bit of overkill
for a single file now; we can do that when more files come.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
e428e439df block/raw-posix: Employ error parameter
Make use of the error parameter in the opening and creating functions in
block/raw-posix.c.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Fam Zheng
5dd75f9afb qemu-iotests: move blank lines of output in case 059
Move the blank line to above the test step banner, so it looks clearer
in blocks.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
ca2884087a blkverify: Employ error parameter
Make use of the error parameter in blkverify_open.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
10ffa72fae blkdebug: Employ error parameter
Make use of the error parameter in blkdebug_open.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
c6252b7cea block/raw-win32: Employ error parameter
Make use of the error parameter in the opening and creating functions in
block/raw-win32.c.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
92f1deec31 block/raw_bsd: Employ error parameter
Propagate errors in raw_create rather than directly reporting and
afterwards discarding them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
1fa5cc839a qcow2: Evaluate overlap check options
Evaluate the runtime overlap check options and set
BDRVQcowState.overlap_check appropriately.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
4a273c398b qcow2: Add more overlap check bitmask macros
Introduces the macros QCOW2_OL_CONSTANT and QCOW2_OL_ALL in addition to
the already existing QCOW2_OL_CACHED, signifying all metadata overlap
checks that can be performed in constant time (regardless of image size
etc.) and truly all available overlap checks, respectively.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
4092e99d93 qcow2: Array assigning options to OL check bits
Add an array which assigns the option string to its corresponding
overlap check bit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
05de7e86ca qcow2: Add overlap-check options
Add runtime options to tune the overlap checks to be performed before
write accesses.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
3e3553905c qcow2: Make overlap check mask variable
Replace the QCOW2_OL_DEFAULT macro by a variable overlap_check in
BDRVQcowState.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
231bb26764 qcow2: Use negated overflow check mask
In qcow2_check_metadata_overlap and qcow2_pre_write_overlap_check,
change the parameter signifying the checks to perform from its current
positive form to a negative one, i.e., it will no longer explicitly
specify every check to perform but rather a mask of checks not to
perform.
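
As a rough illustration of the negated form (a sketch only; the flag
names and the helper below are hypothetical and are not the identifiers
used in QEMU), the caller passes the checks to skip, and the set of
checks actually run is derived from it:

```c
/* Hypothetical check bits; QEMU's real flags are named differently. */
enum {
    OL_MAIN_HEADER = 1 << 0,
    OL_ACTIVE_L1   = 1 << 1,
    OL_ACTIVE_L2   = 1 << 2,
    OL_ALL         = OL_MAIN_HEADER | OL_ACTIVE_L1 | OL_ACTIVE_L2
};

/* 'ign' is the negative mask: every enabled check runs unless listed. */
static int checks_to_run(int enabled, int ign)
{
    return enabled & ~ign;
}
```

With this shape, a check added later runs by default at every call site
that does not explicitly ignore it.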

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Kevin Wolf
8f94a6e40e block: Improve driver whitelist checks
The main intent of this patch is to consolidate the whitelist checks to
a single point in the code instead of spreading it everywhere. This adds
a nicer error message for read-only whitelisting, too, in places where
it was still missing.

The patch also contains a bonus bug fix: By finding the format first in
bdrv_open() and then independently checking against the whitelist only
later, we avoid the case that use of a non-whitelisted format results in
probing rather than an error message. Previously, this could happen when
using the driver=... option.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
00c49b21e7 qcow2: Use better type for numerical snapshot ID
When trying to find a new snapshot ID, the existing ones are converted
to integers using strtoul. This function returns an unsigned long,
therefore its result should be saved in an unsigned long as well.
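
A minimal sketch of the idea, assuming a standalone helper
(parse_snapshot_id is a hypothetical name, not a function in qcow2):
the result of strtoul is kept in an unsigned long so no narrowing
conversion takes place.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal snapshot ID string.
 * Returns 0 on success and stores the value in *out, -1 otherwise. */
static int parse_snapshot_id(const char *id_str, unsigned long *out)
{
    char *end;
    unsigned long id;

    errno = 0;
    id = strtoul(id_str, &end, 10);   /* same type as the return value */
    if (errno != 0 || end == id_str || *end != '\0') {
        return -1;
    }
    *out = id;
    return 0;
}
```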

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
84757f7e67 qcow2: Fix snapshot restoration in snapshot_create
If the new snapshot table could not be written in qcow2_snapshot_create,
the old snapshot table has to be restored in memory and the new one
released. This should include restoration of the old snapshot count as
well, which is added by this patch.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz
f9bff97143 qcow2: Remove wrong metadata overlap check
In qcow2_write_compressed, if the compression fails, a normal cluster is
written to disk. This is done through bdrv_write on the qcow2 BDS
itself (using the guest offset), thus it is wrong to do a metadata
overlap check before.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
9e3f08923a qcow2: Add missing space in error message
The error message in qcow2_downgrade about an unsupported refcount
order is missing a space. This patch adds it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Jeff Cody
89e911816a block: qemu-iotests for vhdx, read sample dynamic image
This adds the VHDX format to the qemu-iotests format, and adds
a read test.  The test reads from an existing sample image, that
was created with Hyper-V under Windows Server 2012.

The image file is a 1GB dynamic image, with 32MB blocks.

The pattern 0xa5 exists from 0MB-33MB (past a block size boundary)

The pattern 0x96 exists from 33MB-66MB (past another block boundary,
and leaving a partial blank block)

From 66MB-1024MB, all reads should return 0.

Although this is a 1GB dynamic image with 66MB of data, the bzip2'ed
image file is only 874 bytes.

This also adds in the IMGFMT_GENERIC flag, so r/o images can be
tested (e.g. ./check -vhdx) without failing tests that assume
r/w support.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Michael S. Tsirkin
13164591f3 ahci: set ahci mode on reset
Currently we set AHCI mode on the first GHC write.
The spec says we should set it on reset.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
975a93c082 qemu-iotests: Discard preallocated zero clusters
Add a new test case for discarding preallocated zero clusters; doing
this should not result in any leaks.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Benoît Canet
f6186f49e2 block: Add BlockDriver.bdrv_check_ext_snapshot.
This field is used by blkverify to disable external snapshot creation.
It will also be used by block filters like quorum to disable external
snapshot creation.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Peter Lieven
92bc50a5ad block/get_block_status: avoid redundant callouts on raw devices
If a raw device like an iscsi target or host device is used,
the current implementation makes a second call out to get
the block status of bs->file.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
88fb153512 qcow2: Assert against snapshot name/ID overflow
qcow2_write_snapshots relies on the length of every snapshot ID and name
fitting into an unsigned 16 bit integer. This is currently ensured by
QEMU through generally only allowing 128 byte IDs and 256 byte names.
However, if this should change in the future, the length written to the
image file should not be silently truncated (though the name itself
would be written completely).

Since this is currently not an issue but might require attention due to
internal QEMU changes in the future, an assert ensuring sanity is enough
for now.
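
The kind of sanity check described here might look like the following
sketch (checked_len16 is a hypothetical helper, not the code added by
the patch):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return strlen(s) as a 16-bit value, asserting
 * that it fits instead of silently truncating it. */
static uint16_t checked_len16(const char *s)
{
    size_t n = strlen(s);

    assert(n <= UINT16_MAX);
    return (uint16_t)n;
}
```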

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
9186ad9658 qcow2: Free allocated snapshot table on error
If an error occurs during qcow2_write_snapshots, the newly allocated
snapshot table clusters are leaked and should thus be freed.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
37d41f0a04 qcow2: Always use error path on writing snapshots
qcow2_write_snapshots does contain a fail label and there is no reason
not to use it on some errors; therefore, we should always jump there on
error.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
8f730dd24e qcow2: Free preallocated zero clusters
In qcow2_free_any_clusters, preallocated zero clusters should be freed
just as normal clusters are.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
998b959c1e qcow2: Use pread for inactive L1 in overlap check
Currently, qcow2_check_metadata_overlap uses bdrv_read to read inactive
L1 tables from disk. The number of sectors to read is calculated through
a truncating integer division, therefore, if the L1 table size is not a
multiple of the sector size, the final entries will not be read and
their entries in memory remain undefined (from the g_malloc).
Using bdrv_pread fixes this.
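
The arithmetic can be sketched as follows, assuming 512-byte sectors
and 8-byte L1 entries (the function names are illustrative only):

```c
#include <stddef.h>

#define SECTOR_SIZE   512u  /* assumed sector size */
#define L1_ENTRY_SIZE 8u    /* assumed size of one L1 entry */

/* Bytes actually read when the sector count is bytes / SECTOR_SIZE. */
static size_t bytes_read_by_sectors(size_t l1_entries)
{
    size_t bytes = l1_entries * L1_ENTRY_SIZE;

    return (bytes / SECTOR_SIZE) * SECTOR_SIZE;  /* truncating division */
}

/* Trailing L1 entries that a sector-granular read never fills in. */
static size_t entries_left_unread(size_t l1_entries)
{
    size_t bytes = l1_entries * L1_ENTRY_SIZE;

    return (bytes - bytes_read_by_sectors(l1_entries)) / L1_ENTRY_SIZE;
}
```

Under these assumptions a table of 65 entries occupies 520 bytes, so a
one-sector read leaves the last entry untouched; a byte-granular read
has no such remainder.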

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz
f252080453 qcow2: Alignment of snapshot table entries
The qcow2 specification does not explicitly state so far that every
snapshot table entry is aligned to 8 bytes. QEMU, in contrast, does this
alignment, thus it should be properly documented (which this patch
does).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:58 +02:00
Max Reitz
3677e6f625 qemu-iotests: Additional info from qemu-img info
Add a test for the additional information now provided by qemu-img info
when used on qcow2 images. It also tests the qemu QMP output from the
query-block command when running qemu with different runtime options
than specified in the image (ImageInfoSpecific should always refer to
the image).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:50 +02:00
Max Reitz
4c2e946500 qemu-iotests: Discard specific info in _img_info
In _img_info, filter out additional information specific to the image
format provided by qemu-img info, since tests designed for multiple
image formats would produce different outputs for every image format
otherwise.

In a human-readable dump, that new information will always be last for
each "image information block" (multiple blocks are emitted when
inspecting the backing file chain). Every block is separated by an empty
line. Therefore, in this case, everything starting with the line "Format
specific information:" up to that empty line (or EOF, if it is the last
block) has to be stripped.

The JSON dump will always emit pretty JSON data. Therefore, the opening
and closing braces of every object will be on lines which are indented
by exactly the same amount, and all lines in between will have more
indentation. Thus, in this case, everything starting with a line
matching the regular expression /^ *"format-specific": {/ until /^ *},?/
has to be stripped, where the number of spaces at the beginning of the
respective lines is equal.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 14:04:58 +02:00
Max Reitz
37764dfb71 qcow2: Add support for ImageInfoSpecific
Add a new ImageInfoSpecificQCow2 type as a subtype of ImageInfoSpecific.
This contains the compatibility level as a string and an optional
lazy_refcounts boolean (optional means mandatory for compat >= 1.1 and
not available for compat == 0.10).

Also, add qcow2_get_specific_info, which returns this information.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 14:03:57 +02:00
Max Reitz
a8d8ecb77f block/qapi: Human-readable ImageInfoSpecific dump
Add a function for generically dumping the ImageInfoSpecific information
in a human-readable format to block/qapi.c.

Use this function in bdrv_image_info_dump and qemu-io-cmds.c:info_f to
allow qemu-img info and qemu-io -c info, respectively, to print that
format-specific information.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Max Reitz
eae041fe6f block: Add bdrv_get_specific_info
Add a function for retrieving an ImageInfoSpecific object from a block
driver.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Max Reitz
f2bb8a8a47 qapi: Add ImageInfoSpecific type
Add a new type ImageInfoSpecific as a union for image format specific
information in ImageInfo.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Fam Zheng
79e14bf778 qapi: make use of new BlockJobType
Switch the string to enum type BlockJobType in BlockJobDriver.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Fam Zheng
2cb5b22286 qapi: Introduce enum BlockJobType
This will replace the open coded block job type string for mirror,
commit and backup.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Fam Zheng
3fc4b10af0 blockjob: rename BlockJobType to BlockJobDriver
We will use BlockJobType as the enum type name of block jobs in QAPI,
rename current BlockJobType to BlockJobDriver, which will eventually
become a set of operations, similar to block drivers.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 10:52:54 +02:00
Richard Henderson
867b3201a3 exec: Add both big- and little-endian memory helpers
Step three in the transition: helpers not tied to the target
"default" endianness.  To be used when the guest uses a memory
operation with non-default endianness.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 13:19:21 -07:00
Richard Henderson
f713d6ad7b tcg: Add qemu_ld_st_i32/64
Step two in the transition, adding the new ldst opcodes.  Keep the old
opcodes around until all backends support the new opcodes.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 13:19:21 -07:00
Anthony Liguori
39c153b80f Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging
QOM CPUState refactorings / X86CPU

* Fix for X86CPU model field of qemu32/qemu64 CPU models
* Bug fix for longjmp on FreeBSD
* Removal of unused function
* Confinement of clone syscall infrastructure to linux-user

# gpg: Signature made Wed 09 Oct 2013 03:40:51 AM PDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Andreas Färber (2) and others
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony:
  cpu: Drop cpu_model_str from CPU_COMMON
  cpu: Move cpu_copy() into linux-user
  cputlb: Remove dead function tlb_update_dirty()
  cpu-exec: Also reload CPUClass *cc after longjmp return in cpu_exec()
  target-i386: Set model=6 on qemu64 & qemu32 CPU models
2013-10-10 13:16:25 -07:00
Anthony Liguori
e8f2f59aaf Merge remote-tracking branch 'amit/char-remove-watch-on-unplug' into staging
# By Amit Shah
# Via Amit Shah
* amit/char-remove-watch-on-unplug:
  char: remove watch callback on chardev detach from frontend
  char: use common function to disable callbacks on chardev close
  char: move backends' io watch tag to CharDriverState

Message-id: 20131004154802.GA25646@grmbl.mre
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-10 13:16:02 -07:00
Anthony Liguori
88b70e56b9 Merge remote-tracking branch 'otubo/seccomp' into staging
# By Eduardo Otubo
# Via Eduardo Otubo
* otubo/seccomp:
  seccomp: fine tuning whitelist by adding times()

Message-id: 1380047458-21673-1-git-send-email-otubo@linux.vnet.ibm.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-10 13:15:46 -07:00
Anthony Liguori
e572398de1 Merge remote-tracking branch 'mcayland/qemu-openbios' into staging
* mcayland/qemu-openbios:
  Update OpenBIOS images

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-10 13:00:03 -07:00
Mark Wu
0106dc4f05 qemu-ga: Extend 'guest-info' command to expose flag 'success-response'
Now we have several qemu-ga commands not returning response on success.
It has been documented in qga/qapi-schema.json already. This patch exposes
the 'success-response' flag by extending 'guest-info' command. With this
change, the clients can handle the command response more flexibly.

Signed-off-by: Mark Wu <wudxw@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
*fixed up commit subject
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-10-10 14:52:37 -05:00
Mark Wu
8dc4d915dd qemu-ga: Add interface to traverse the qmp command list by QmpCommand
In the original code, qmp_get_command_list is used to construct
a list of all command names. To get the information of all qga
commands, the caller traverses the name list and looks up each command's
info by name, so the cost is O(n^2) in the number of commands.

This patch adds an interface to traverse the qmp command list by
QmpCommand to replace qmp_get_command_list. It can decrease the
complexity from O(n^2) to O(n).
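
The shape of such an interface can be sketched like this (a simplified
stand-in with hypothetical types and names, not the QmpCommand code
itself): the registry walks its own list once and hands each entry to a
callback, so no per-name lookup is needed.

```c
#include <stddef.h>

/* Hypothetical, simplified command record. */
typedef struct Cmd {
    const char *name;
    int enabled;
    struct Cmd *next;
} Cmd;

typedef void (*cmd_cb)(const Cmd *cmd, void *opaque);

/* Visit every command exactly once: O(n) in the number of commands. */
static void for_each_cmd(const Cmd *head, cmd_cb fn, void *opaque)
{
    for (const Cmd *c = head; c != NULL; c = c->next) {
        fn(c, opaque);
    }
}

/* Example callback: count the enabled commands. */
static void count_enabled(const Cmd *cmd, void *opaque)
{
    if (cmd->enabled) {
        ++*(int *)opaque;
    }
}
```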

Signed-off-by: Mark Wu <wudxw@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
*fix up commit subject
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-10-10 14:52:37 -05:00
Tomoki Sekiyama
e5d9adbdab qemu-ga: execute fsfreeze-freeze in reverse order of mounts
Currently, fsfreeze-freeze may cause deadlock if a guest has loopback mounts
of image files in its disk; e.g.:

    # mount | grep ^/
    /dev/vda1 / type ext4 (rw,noatime,seclabel,data=ordered)
    /tmp/disk.img on /mnt type ext4 (rw,relatime,seclabel)

To avoid the deadlock, this freezes filesystems in reverse order of mounts.
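
A sketch of the ordering only (the types and callback are hypothetical,
and no real freeze is performed): given the mounts in mount order, the
most recently mounted filesystem is handled first, so in the example
above /mnt would be frozen before /.

```c
#include <stddef.h>

/* Hypothetical log used to record the order in which mounts are visited. */
struct order_log {
    const char *seen[8];
    size_t count;
};

/* Stand-in for the real freeze operation: only records the mount point. */
static void log_freeze(const char *mount_point, void *opaque)
{
    struct order_log *log = opaque;

    if (log->count < 8) {
        log->seen[log->count++] = mount_point;
    }
}

/* Walk the mount list from the last entry to the first. */
static void freeze_in_reverse(const char *const *mounts, size_t n,
                              void (*freeze)(const char *, void *),
                              void *opaque)
{
    for (size_t i = n; i-- > 0;) {
        freeze(mounts[i], opaque);
    }
}
```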

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
*fix up commit msg
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-10-10 14:52:37 -05:00
Richard Henderson
6c5f4ead64 tcg: Add TCGMemOp
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 12:20:59 -07:00
Richard Henderson
ec9135cd6e configure: Remove CONFIG_QEMU_LDST_OPTIMIZATION
No longer used.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:44:26 -07:00
Richard Henderson
9ecefc84dd tcg: Add tcg-be-ldst.h
Move TCGLabelQemuLdst and related stuff out of tcg.h.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:44:26 -07:00
Richard Henderson
3cf246f0d4 tcg: Add tcg-be-null.h
This is a no-op backend data implementation, for those targets that
are not currently using the load/store optimization path.

This is preparatory to always requiring these functions in all backends.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:44:26 -07:00
Richard Henderson
dbdbe0cd31 exec: Delete is_tcg_gen_code and GETRA_EXT
All implementations now boil down to GETRA.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:44:25 -07:00
Richard Henderson
023261ef85 tcg-aarch64: Update to helper_ret_*_mmu routines
A minimal update to use the new helpers with the return address argument.

Tested-by: Claudio Fontana <claudio.fontana@linaro.org>
Reviewed-by: Claudio Fontana <claudio.fontana@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:44:25 -07:00
Richard Henderson
84fd9dd3f7 tcg: Merge tcg_register_helper into tcg_context_init
Eliminates the repeated checks for having created
the s->helpers hash table.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:44:25 -07:00
Richard Henderson
4953ee6271 tcg: Add tcg-runtime.c helpers to all_helpers
For the few targets that actually use these, we'd not report
them symbolically in the tcg opcode logs.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:44:25 -07:00
Richard Henderson
100b5e0170 tcg: Put target helper data into an array.
One call inside of a loop to tcg_register_helper instead of hundreds
of sequential calls.

Presumably more icache and branch prediction friendly; resulting binary
size mostly unchanged on x86_64, as we're trading 32-bit rip-relative
references in .text for full 64-bit pointers in .rodata.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:44:25 -07:00
Richard Henderson
f5daeec412 tcg: Remove stray semi-colons from target-*/helper.h
During GEN_HELPER=1, these are actually stray top-level semi-colons
which are technically invalid ISO C, but GCC accepts as an extension.
If we added enough __extension__ markers that we could dare use
-Wpedantic, we'd see

  warning: ISO C does not allow extra ‘;’ outside of a function

This will become a hard error in the next patch, wherein those ; will
appear in the middle of a data structure.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:43:37 -07:00
Richard Henderson
5cd8f6210f tcg: Move helper registration into tcg_context_init
No longer needs to be done on a per-target basis.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:43:37 -07:00
Richard Henderson
e5e84d22a3 target-m68k: Rename helpers.h to helper.h
This brings the m68k target in line with all other targets.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:41:54 -07:00
Richard Henderson
6e085f72c6 tcg: Use a GHashTable for tcg_find_helper
Slightly changes the interface, in that we now return name
instead of a TCGHelperInfo structure, which goes away.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:41:36 -07:00
Richard Henderson
7c57df0d85 tcg: Delete tcg_helper_get_name declaration
The function was deleted in 4dc81f2822.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:41:15 -07:00
Richard Henderson
802b508123 tcg-hppa: Remove tcg backend
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-10 11:31:06 -07:00
Anthony Liguori
f2c6bcfc2e Merge remote-tracking branch 'sstabellini/xen-2013-10-10' into staging
# By Matthew Daley (1) and Roger Pau Monné (1)
# Via Stefano Stabellini
* sstabellini/xen-2013-10-10:
  qemu/xen: make use of xenstore relative paths
  xen_disk: mark ioreq as mapped before unmapping in error case
2013-10-10 10:03:38 -07:00
Anthony Liguori
634ebf4b17 Merge remote-tracking branch 'bonzini/scsi-next' into staging
# By Asias He (1) and Peter Lieven (1)
# Via Paolo Bonzini
* bonzini/scsi-next:
  scsi: Allocate SCSITargetReq r->buf dynamically [CVE-2013-4344]
  block/iscsi: reenable iscsi_co_get_block_status

Message-id: 1381332391-8781-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-10 10:03:00 -07:00
Anthony Liguori
c4ca690158 Update email address
Amazon is now funding my work as QEMU maintainer so update addresses
accordingly.

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-10 09:56:25 -07:00
Roger Pau Monné
33876dfad6 qemu/xen: make use of xenstore relative paths
Qemu has several hardcoded xenstore paths that are only valid on Dom0.
Attempts to launch a Qemu instance (to act as a userspace backend for
PV disks) will fail because Qemu is not able to access those paths
when running on a domain different than Dom0.

Instead make the xenstore paths relative to the domain where Qemu is
actually running.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: Anthony PERARD <anthony.perard@citrix.com>
2013-10-10 14:25:52 +00:00
Matthew Daley
a76f48e533 xen_disk: mark ioreq as mapped before unmapping in error case
Commit 4472beae modified the semantics of ioreq_{un,}map so that they are
idempotent if called when they're not needed (ie., twice in a row). However,
it neglected to handle the case where batch mapping is not being used (the
default), and one of the grants fails to map. In this case, ioreq_unmap will
be called to unwind and unmap any mappings already performed, but ioreq_unmap
simply returns due to the aforementioned change (the ioreq has not already
been marked as mapped).

The frontend user can therefore force xen_disk to leak grant mappings, a
per-domain limited resource.

Fix by marking the ioreq as mapped before calling ioreq_unmap in this
situation.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2013-10-10 14:23:45 +00:00
Asias He
846424350b scsi: Allocate SCSITargetReq r->buf dynamically [CVE-2013-4344]
r->buf is hardcoded to 2056 which is (256 + 1) * 8, allowing 256 luns at
most. If more than 256 luns are specified by user, we have buffer
overflow in scsi_target_emulate_report_luns.

To fix, we allocate the buffer dynamically.
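
The sizing can be sketched as below, using the same formula as the
hardcoded value ((256 + 1) * 8 = 2056); the helper names are
hypothetical and this is not the code from the patch:

```c
#include <stddef.h>
#include <stdlib.h>

/* One 8-byte slot per LUN plus one extra 8-byte slot, as in (256 + 1) * 8. */
static size_t report_luns_len(size_t num_luns)
{
    return (num_luns + 1) * 8;
}

/* Allocate a zeroed buffer sized for the actual number of LUNs.
 * Returns NULL on allocation failure; the caller frees the buffer. */
static unsigned char *alloc_report_luns_buf(size_t num_luns, size_t *len)
{
    *len = report_luns_len(num_luns);
    return calloc(1, *len);
}
```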

Signed-off-by: Asias He <asias@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-09 17:24:18 +02:00
Anthony Liguori
a107170537 Merge remote-tracking branch 'stefanha/block' into staging
# By Max Reitz (5) and others
# Via Stefan Hajnoczi
* stefanha/block:
  block: use correct filename
  qemu-iotests: Correct 026 output
  qcow2: Free allocated L2 cluster on error
  qcow2: Switch L1 table in a single sequence
  block: vhdx - add migration blocker
  block: use correct filename for error report
  qcow2: CHECK_OFLAG_COPIED is obsolete
  qcow2: Correct endianness in overlap check

Message-id: 1381145289-6591-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-09 07:54:42 -07:00
Anthony Liguori
80dfc87394 Merge remote-tracking branch 'mjt/trivial-patches' into staging
# By Stefan Weil (5) and others
# Via Michael Tokarev
* mjt/trivial-patches:
  migration: Fix compiler warning ('caps' may be used uninitialized)
  util/path: Fix type which is longer than 8 bit for MinGW
  hw/9pfs: Fix errno value for xattr functions
  vl: Clean up unnecessary boot_order complications
  qemu-char: Fix potential out of bounds access to local arrays
  pci-ohci: Add missing 'break' in ohci_service_td
  sh4: Fix serial line access for Linux kernels later than 3.2
  hw/alpha: Fix compiler warning (integer constant is too large)
  target-i386: Fix compiler warning (integer constant is too large)
  block: Remove unused assignment (fixes warning from clang)
  exec: cleanup DEBUG_SUBPAGE
  tests: Fix schema parser test for in-tree build
  tests: Update .gitignore for test-int128 and test-bitops
  .gitignore: ignore tests/qemu-iotests/socket_scm_helper

Message-id: 1381051979-25742-1-git-send-email-mjt@msgid.tls.msk.ru
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-09 07:54:21 -07:00
Anthony Liguori
576e81be39 Merge remote-tracking branch 'rth/tcg-arm-pull' into staging
# By Richard Henderson
# Via Richard Henderson
* rth/tcg-arm-pull:
  tcg-arm: Move the tlb addend load earlier
  tcg-arm: Remove restriction on qemu_ld output register
  tcg-arm: Return register containing tlb addend
  tcg-arm: Move load of tlb addend into tcg_out_tlb_read
  tcg-arm: Use QEMU_BUILD_BUG_ON to verify constraints on tlb
  tcg-arm: Use strd for tcg_out_arg_reg64
  tcg-arm: Rearrange slow-path qemu_ld/st
  tcg-arm: Use ldrd/strd for appropriate qemu_ld/st64

Message-id: 1380663109-14434-1-git-send-email-rth@twiddle.net
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-09 07:52:57 -07:00
Anthony Liguori
9e8f8b1cd8 Merge remote-tracking branch 'sweil/mingw' into staging
# By Sebastian Ottlik
# Via Stefan Weil
* sweil/mingw:
  util: call socket_set_fast_reuse instead of setting SO_REUSEADDR
  slirp: call socket_set_fast_reuse instead of setting SO_REUSEADDR
  net: call socket_set_fast_reuse instead of setting SO_REUSEADDR
  gdbstub: call socket_set_fast_reuse instead of setting SO_REUSEADDR
  util: add socket_set_fast_reuse function which will replace setting SO_REUSEADDR

Message-id: 1380735690-24009-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-09 07:52:21 -07:00
Anthony Liguori
dfe2279975 Merge remote-tracking branch 'kraxel/chardev.8' into staging
# By Gerd Hoffmann
# Via Gerd Hoffmann
* kraxel/chardev.8:
  chardev: handle qmp_chardev_add(KIND_MUX) failure

Message-id: 1380708925-6721-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-09 07:52:11 -07:00
Anthony Liguori
ce079abb41 Merge remote-tracking branch 'sweil/tci' into staging
# By Stefan Weil
# Via Stefan Weil
* sweil/tci:
  misc: Use new rotate functions
  bitops: Add rotate functions (rol8, ror8, ...)
  tci: Add implementation of rotl_i64, rotr_i64

Message-id: 1380137693-3729-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-09 07:51:23 -07:00
Peter Lieven
24c7608a5d block/iscsi: reenable iscsi_co_get_block_status
Commit f35c934a accidentally disabled iscsi_co_get_block_status for all
libiscsi versions. It's not possible to check for enumeration constants
in the C preprocessor. This patch changes the check to the preprocessor
constant LIBISCSI_FEATURE_IOVECTOR which was introduced shortly after
get_lba_status support was added to libiscsi.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-09 10:43:42 +02:00
Dunrong Huang
d4cea8dfb9 block: use correct filename
The content that filename points to may be erased by
qemu_opts_absorb_qdict() in raw_open_common() in drv->bdrv_file_open().

So it's better to use bs->filename.

Signed-off-by: Dunrong Huang <riegamaths@gmail.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-07 13:23:19 +02:00
Max Reitz
5c1fa87708 qemu-iotests: Correct 026 output
Because l2_allocate now frees the unused L2 cluster on error, the
corresponding test cases in 026 don't result in one leaked cluster anymore.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-07 13:23:19 +02:00
Max Reitz
e3b21ef9e0 qcow2: Free allocated L2 cluster on error
If an error occurs in l2_allocate, the allocated (but unused) L2 cluster
should be freed.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-07 13:23:19 +02:00
Andreas Färber
51fb256ab5 cpu: Drop cpu_model_str from CPU_COMMON
Since this is only read in cpu_copy() and linux-user has a global
cpu_model, drop the field from generic code.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 11:48:47 +02:00
Andreas Färber
30ba0ee52d cpu: Move cpu_copy() into linux-user
It is only used there and is deemed very fragile if not incorrect in its
current memcpy() form. Moving it into linux-user will allow moving
parts into target_cpu.h headers and only copying what the ABI mandates.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 11:48:39 +02:00
liguang
812586405c cputlb: Remove dead function tlb_update_dirty()
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 11:48:03 +02:00
Juergen Lock
6c78f29a24 cpu-exec: Also reload CPUClass *cc after longjmp return in cpu_exec()
Local variable CPUClass *cc needs to be reloaded after return from longjmp,
too.  (This fixes a mips-softmmu crash observed on FreeBSD when QEMU is
built with clang.)

Reported-by: Dimitry Andric <dim@FreeBSD.org>
Signed-off-by: Juergen Lock <nox@jelal.kn-bremen.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 11:46:58 +02:00
Michael Tokarev
387eedebf6 migration: Fix compiler warning ('caps' may be used uninitialized)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
2013-10-05 14:02:29 +04:00
Stefan Weil
ddd23638d7 util/path: Fix type which is longer than 8 bit for MinGW
While dirent->d_type is 8 bit for most systems, it is 32 bit for MinGW.
Reducing it to 8 bit results in a compiler warning because the macro
is_dir_maybe compares that 8 bit value with 32 bit constants.

Using 'unsigned' instead of 'unsigned char' matches the declaration for
MinGW and does not harm the other systems.

MinGW-w64 is not affected: it does not declare d_type.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-05 14:01:47 +04:00
Daniel P. Berrange
8af0020544 hw/9pfs: Fix errno value for xattr functions
If there is no operation driver for the xattr type the
functions return '-1' and set errno to '-EOPNOTSUPP'.
When the calling code sets 'ret = -errno' this turns
into a large positive number.

In Linux 3.11, the kernel has switched to using 9p
version 9p2000.L, instead of 9p2000.u, which enables
support for xattr operations. This on its own is harmless,
but for another change which makes it request the xattr
with a name 'security.capability'.

The result is that the guest sees a succesful return
of 95 bytes of data, instead of a failure with errno
set to 95. Since the kernel expects a maximum of 20
bytes for an xattr return this gets translated to the
unexpected errno ERANGE.

This all means that when running a binary off a 9p fs
in 3.11 kernels you get a fun result of:

  # ./date
  sh: ./date: Numerical result out of range

The only workaround is to pass 'version=9p2000.u' when
mounting the 9p fs in the guest, to disable all use of
xattrs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-05 13:05:28 +04:00
Markus Armbruster
e3fdc535f2 vl: Clean up unnecessary boot_order complications
Messed up in commit 8281abd.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-05 13:05:28 +04:00
Stefan Weil
49aa4058ac qemu-char: Fix potential out of bounds access to local arrays
Latest gcc-4.8 supports a new option -fsanitize=address which activates
an AddressSanitizer. This AddressSanitizer stops the QEMU system emulation
very early because two character arrays of size 8 are potentially written
with 9 bytes.

Commit 6ea314d914 added the code.

There is no obvious reason why width or height could need 8 characters,
so reduce it to 7 characters which together with the terminating '\0'
fit into the arrays.
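
The sizing rule can be illustrated with a small sketch. The function name `parse_geometry` and the `WIDTHxHEIGHT` input format are assumptions made for the example and are not taken from the patch; what matters is that a `%7[...]` conversion stores at most 7 characters plus the terminating '\0', which fits an 8-byte array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse "WIDTHxHEIGHT" into two 8-byte buffers. A field width of 7
 * leaves room for the terminating '\0' that sscanf always appends. */
static int parse_geometry(const char *s, char width[8], char height[8])
{
    assert(s != NULL);
    return sscanf(s, "%7[0-9]x%7[0-9]", width, height) == 2;
}
```

A field width of 8 on the same buffers would allow 8 digits plus the terminator, which is the 9-byte write the sanitizer reports.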

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Alex Bennée <alex@bennee.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-05 13:05:15 +04:00
Alex Williamson
b1c50c5f24 vfio-pci: Fix endian issues in vfio_pci_size_rom()
VFIO is always little endian so do byte swapping of our mask on the
way in and byte swapping of the size on the way out.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2013-10-04 12:50:51 -06:00
Alex Williamson
64fa25a0ef vfio-pci: Add dummy PCI ROM write accessor
Just to be sure we don't jump off any NULL pointer cliffs.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-04 08:51:36 -06:00
Jan Kiszka
7174e54cf1 kvmvapic: Prevent reading beyond the end of guest RAM
rom_state_paddr is guest provided (caller address of outw(VAPIC_PORT) +
written 16-bit value) and can be influenced to point beyond the end of
the host memory backing the guest's RAM. Make sure we do not use this
pointer to actually read beyond the limits.

Reading arbitrary guest bytes is harmless, the guest kernel has to
manage access to this I/O port anyway.
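
A bounds check of this kind might look like the following sketch. `guest_read` and its parameters are hypothetical and do not mirror the kvmvapic code; the comparison is written so that `addr + len` cannot wrap around.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy len bytes starting at guest offset addr out of a RAM buffer of
 * ram_size bytes; refuse any request that would run past the end. */
static int guest_read(const uint8_t *ram, uint64_t ram_size,
                      uint64_t addr, void *dst, uint64_t len)
{
    assert(ram != NULL && dst != NULL);

    /* Compare against ram_size - len so that addr + len never overflows. */
    if (len > ram_size || addr > ram_size - len) {
        return -1;
    }
    memcpy(dst, ram + addr, (size_t)len);
    return 0;
}
```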

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-04 13:13:16 +03:00
Alexey Kardashevskiy
1d5bf692e5 vfio: Fix debug output for int128 values
Memory regions can easily be 2^64 bytes long, which overflows 64 bits by
just one bit, but that is enough for int128_get64() to assert.

This takes care of debug printing of huge section sizes.
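
To see why a size of 2^64 cannot be narrowed safely, consider a 128-bit value split into two 64-bit halves. The sketch below is an illustration only and not what the commit does; `struct size128` and `format_size128` are hypothetical names. It prints both halves instead of narrowing to 64 bits first.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 128-bit size split into two 64-bit halves. */
struct size128 {
    uint64_t hi;
    uint64_t lo;
};

/* Format the full value in hex; nothing is truncated to 64 bits. */
static int format_size128(char *buf, size_t len, struct size128 v)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%" PRIx64 "%016" PRIx64, v.hi, v.lo);
}
```

For a region of exactly 2^64 bytes the high half is 1 and the low half is 0, so any accessor that insists on a zero high half has to fail.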

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-03 09:10:09 -06:00
Sebastian Macke
6ef8263ead target-openrisc: Removes a non-conforming behavior for the first page of the memory
Where *software* leaves 0x0000 - 0x2000 unmapped, the hardware should
still allow for this area to be mapped.

Signed-off-by: Sebastian Macke <sebastian@macke.de>
Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Reviewed-by: Jia Liu <proljc@gmail.com>
2013-10-03 16:24:44 +08:00
Sebastian Macke
bf961b5278 target-openrisc: Correct handling of page faults.
The result of (rw & 0) is always zero and therefore logically false.
The rest of the comparison is therefore never evaluated. This is an obvious bug;
we should use !(rw & 1) here.
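
The difference between the two expressions is easy to check in isolation. The sketch assumes, as the message implies, that bit 0 of `rw` is set for writes and clear for reads; the function names and the `readable` flag are hypothetical.

```c
#include <assert.h>

/* Buggy form: (rw & 0) is always 0, so the fault is never reported. */
static int read_fault_buggy(int rw, int readable)
{
    return (rw & 0) && !readable;
}

/* Fixed form: a read (bit 0 clear) of a non-readable page faults. */
static int read_fault_fixed(int rw, int readable)
{
    return !(rw & 1) && !readable;
}
```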

Signed-off-by: Sebastian Macke <sebastian@macke.de>
Reviewed-by: Jia Liu <proljc@gmail.com>
2013-10-03 16:24:24 +08:00
Mark Cave-Ayland
ad98acb9b1 Update OpenBIOS images
Update OpenBIOS images to SVN r1229 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2013-10-03 00:04:20 +01:00
Alex Williamson
f16f39c3fc vfio-pci: Implement PCI hot reset
Now that VFIO has a PCI hot reset interface, take advantage of it.
There are two modes that we need to consider.  The first is when only
one device within the set of devices affected is actually assigned to
the guest.  In this case the other devices are just held by VFIO
for isolation and we can pretend they're not there, doing an entire
bus reset whenever the device reset callback is triggered.  Supporting
this case separately allows us to do the best reset we can do of the
device even if the device is hotplugged.

The second mode is when multiple affected devices are all exposed to
the guest.  In this case we can only do a hot reset when the entire
system is being reset.  However, this also allows us to track which
individual devices are affected by a reset and only do them once.

We split our reset function into pre- and post-reset helper functions,
prioritize the types of device resets available to us, and create
separate _one vs _multi reset interfaces to handle the distinct cases
above.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02 13:51:00 -06:00
Ján Veselý
4b351a0f21 pci-ohci: Add missing 'break' in ohci_service_td
Device communication errors need to be reported to driver.
Add a debug message while at it.

Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Acked-by: Gerd Hoffmann <kraxel@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-02 22:55:28 +04:00
Guenter Roeck
84faf7c392 sh4: Fix serial line access for Linux kernels later than 3.2
With Linux kernel version 3.3 or later, qemu fails with the following message:

sh_serial: unsupported read from 0x18
  Aborted

Reported-and-analyzed-by: Rob Landley <rob@landley.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-02 22:55:28 +04:00
Stefan Weil
9b2caaf40b hw/alpha: Fix compiler warning (integer constant is too large)
From buildbot default_i386_rhel61:

  CC    alpha-softmmu/hw/alpha/typhoon.o
hw/alpha/typhoon.c: In function 'typhoon_translate_iommu':
hw/alpha/typhoon.c:703: warning: integer constant is too large for 'long' type
hw/alpha/typhoon.c:703: warning: integer constant is too large for 'long' type
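
Warnings like these typically appear on hosts where `long` is 32 bits wide and a 64-bit literal carries no suffix. The log does not show the change itself, so the constant below is hypothetical; the sketch only illustrates the usual remedy of giving the literal a `ULL` suffix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit mask. The ULL suffix gives the literal the type
 * unsigned long long even where long is only 32 bits wide. */
#define HIGH_HALF_MASK 0xffffffff00000000ULL

static uint64_t high_half(uint64_t value)
{
    return (value & HIGH_HALF_MASK) >> 32;
}
```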

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-02 22:55:28 +04:00
Stefan Weil
00fdef6586 target-i386: Fix compiler warning (integer constant is too large)
From buildbot default_i386_rhel61:

  CC    i386-softmmu/target-i386/arch_memory_mapping.o
target-i386/arch_memory_mapping.c: In function 'walk_pde':
target-i386/arch_memory_mapping.c:110: warning:
 integer constant is too large for 'long' type

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-02 22:55:28 +04:00
Stefan Weil
3a6f270326 block: Remove unused assignment (fixes warning from clang)
blockdev.c:1929:13: warning: Value stored to 'ret' is never read
            ret = 0;
            ^     ~

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-02 22:55:28 +04:00
Amos Kong
016e9d62fe exec: cleanup DEBUG_SUBPAGE
Fixed some errors encountered after enabling DEBUG_SUBPAGE.

Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-02 22:55:28 +04:00
Markus Armbruster
d8039e58b1 tests: Fix schema parser test for in-tree build
Commit 4f193e3 added the test, but screwed up in-tree builds
(SRCDIR=.): the test's output overwrites the expected output, and is
thus compared to itself.

Cc: qemu-stable@nongnu.org
Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-02 22:55:27 +04:00
Alex Williamson
8fbf47c3a8 vfio-pci: Cleanup error_reports
Remove carriage returns and tweak formatting for error_reports.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02 12:52:38 -06:00
Alex Williamson
6f864e6ec8 vfio-pci: Lazy PCI option ROM loading
During vfio-pci initfn, the device is not always in a state where the
option ROM can be read.  In the case of graphics cards, there's often
no per function reset, which means we have host driver state affecting
whether the option ROM is usable.  Ideally we want to move reading the
option ROM past any co-assigned device resets to the point where the
guest first tries to read the ROM itself.

To accomplish this, we switch the memory region for the option rom to
an I/O region rather than a memory mapped region.  This has the side
benefit that we don't waste KVM memory slots for a BAR where we don't
care about performance.  This also allows us to delay loading the ROM
from the device until the first read by the guest.  We then use the
PCI config space size of the ROM BAR when setting up the BAR through
QEMU PCI.

Another benefit of this approach is that previously when a user set
the ROM to a file using the romfile= option, we still probed VFIO for
the parameters of the ROM, which can result in dmesg errors about an
invalid ROM.  We now only probe VFIO to get the ROM contents if the
guest actually tries to read the ROM.
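
The deferred loading described above follows a common lazy-initialization pattern. The sketch below uses hypothetical names (`struct lazy_rom`, `rom_load`, `rom_read`) and a fake 16-byte ROM; it is not the vfio-pci implementation, only an outline of loading on the first read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct lazy_rom {
    uint8_t data[16];
    size_t size;
    int loaded;
    int load_count;   /* counts how often the device was actually read */
};

/* Stand-in for the expensive read of the ROM from the device. */
static void rom_load(struct lazy_rom *rom)
{
    memset(rom->data, 0xAA, sizeof(rom->data));
    rom->size = sizeof(rom->data);
    rom->loaded = 1;
    rom->load_count++;
}

/* Read accessor: the ROM contents are fetched on the first access only. */
static uint8_t rom_read(struct lazy_rom *rom, size_t offset)
{
    assert(rom != NULL);
    if (!rom->loaded) {
        rom_load(rom);
    }
    return offset < rom->size ? rom->data[offset] : 0xFF;
}
```

If the guest never touches the ROM, `rom_load` never runs, which matches the point about not probing the device when it is not needed.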

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02 12:52:38 -06:00
Alex Williamson
befe5176ef vfio-pci: Test device reset capabilities
Not all resets are created equal.  PM reset is not very reliable,
especially for GPUs, so we might want to opt for a bus reset if a
standard reset will only do a D3hot->D0 transition.  We can also
use this to tell if the standard reset will do a bus reset (if
neither has_pm_reset or has_flr is probed, but the device still
supports reset).

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02 12:52:38 -06:00
Alex Williamson
c7679d450e vfio-pci: Add support for MSI affinity
When MSI is accelerated through KVM the vectors are only programmed
when the guest first enables MSI support.  Subsequent writes to the
vector address or data fields are ignored.  Unfortunately that means
we ignore updates done to adjust SMP affinity of the vectors.
MSI SMP affinity already works in non-KVM mode because the address
and data fields are read from their backing store on each interrupt.

This patch stores the MSIMessage programmed into KVM so that we can
determine when changes are made and update the routes.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2013-10-02 12:52:38 -06:00
Sebastian Ottlik
04fd1c7896 util: call socket_set_fast_reuse instead of setting SO_REUSEADDR
SO_REUSEADDR should be avoided on Windows but is desired on other operating
systems. So instead of setting it we call socket_set_fast_reuse that will result
in the appropriate behaviour on all operating systems.

Signed-off-by: Sebastian Ottlik <ottlik@fzi.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-10-02 19:20:31 +02:00
Sebastian Ottlik
aad1239a7e slirp: call socket_set_fast_reuse instead of setting SO_REUSEADDR
SO_REUSEADDR should be avoided on Windows but is desired on other operating
systems. So instead of setting it we call socket_set_fast_reuse that will result
in the appropriate behaviour on all operating systems.

Signed-off-by: Sebastian Ottlik <ottlik@fzi.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-10-02 19:20:31 +02:00
Sebastian Ottlik
bcbe92fb08 net: call socket_set_fast_reuse instead of setting SO_REUSEADDR
SO_REUSEADDR should be avoided on Windows but is desired on other operating
systems. So instead of setting it we call socket_set_fast_reuse that will result
in the appropriate behaviour on all operating systems.

An exception to this rule is multicast sockets, where it is sensible to have
multiple sockets listen on the same IP and port, so we should set SO_REUSEADDR
on Windows.

Signed-off-by: Sebastian Ottlik <ottlik@fzi.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-10-02 19:20:31 +02:00
Sebastian Ottlik
6669ca13c3 gdbstub: call socket_set_fast_reuse instead of setting SO_REUSEADDR
SO_REUSEADDR should be avoided on Windows but is desired on other operating
systems. So instead of setting it we call socket_set_fast_reuse that will result
in the appropriate behaviour on all operating systems.

Signed-off-by: Sebastian Ottlik <ottlik@fzi.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-10-02 19:20:31 +02:00
Sebastian Ottlik
606600a176 util: add socket_set_fast_reuse function which will replace setting SO_REUSEADDR
If a socket is closed it remains in TIME_WAIT state for some time. On operating
systems using BSD sockets the endpoint of the socket may not be reused while in
this state unless SO_REUSEADDR was set on the socket. On Windows on the other
hand the default behaviour is to allow reuse (i.e. identical to SO_REUSEADDR on
other operating systems) and setting SO_REUSEADDR on a socket allows it to be
bound to an endpoint even if the endpoint is already used by another socket
independently of the other socket's state. This can even result in undefined
behaviour.

Many sockets used by QEMU should not block the use of their endpoint after being
closed while they are still in TIME_WAIT state. Currently QEMU sets SO_REUSEADDR
for such sockets, which can lead to problems on Windows. This patch introduces
the function socket_set_fast_reuse that should be used instead of setting
SO_REUSEADDR when fast socket reuse is desired and behaves correctly on all
operating systems.

As a failure of this function can only be caused by bad QEMU internal errors, an
assertion handles these situations. The return value is still passed on, to
minimize changes in client code and prevent unused variable warnings if NDEBUG
is defined.
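
A helper with the behaviour described above could be sketched like this. The name `sketch_set_fast_reuse` is deliberately different from the real function, and the body is an assumption based on this message rather than a copy of the patch: set SO_REUSEADDR on POSIX systems, do nothing on Windows, assert on failure, and still return the result.

```c
#include <assert.h>
#ifndef _WIN32
#include <sys/socket.h>
#endif

/* Request fast endpoint reuse in a platform-appropriate way. */
static int sketch_set_fast_reuse(int fd)
{
#ifndef _WIN32
    int val = 1;
    int ret = setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &val, sizeof(val));

    /* A failure here would indicate an internal error such as a bad fd. */
    assert(ret == 0);
    /* Returning ret keeps it "used" even when NDEBUG removes the assert. */
    return ret;
#else
    /* Windows already permits reuse by default, so nothing is set. */
    (void)fd;
    return 0;
#endif
}
```

Callers can keep their existing `ret = ...` checks, which is the reason the message gives for passing the return value on.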

Signed-off-by: Sebastian Ottlik <ottlik@fzi.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-10-02 19:20:31 +02:00
Anthony Liguori
0e19885e73 Update MAINTAINERS
All of Paul's emails are bouncing and he hasn't been active for
some time.

Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-10-02 12:09:12 -05:00
Paolo Bonzini
2560f19f42 x86: cpuid: reconstruct leaf 0Dh data
The data in leaf 0Dh depends on information from other feature bits.
Instead of passing it blindly from the host, compute it based on
whether these feature bits are enabled.
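
As a rough illustration of deriving the leaf from feature bits, the sketch below builds only EAX and ECX of sub-leaf 0 and considers only the x87, SSE and AVX state components. The function and structure names are hypothetical and the real patch covers more than this; the sizes used (512-byte legacy area, 64-byte header, 256-byte AVX area) are the architectural XSAVE layout.

```c
#include <assert.h>
#include <stdint.h>

#define XSTATE_FP   (1u << 0)   /* x87 state */
#define XSTATE_SSE  (1u << 1)   /* SSE state */
#define XSTATE_YMM  (1u << 2)   /* AVX (upper YMM) state */

struct leaf_0d {
    uint32_t eax;   /* supported XCR0 bits (low 32 bits) */
    uint32_t ecx;   /* size of the XSAVE area for those bits */
};

/* Derive the leaf from the guest's feature flags instead of copying
 * whatever the host reports. */
static struct leaf_0d build_leaf_0d(int has_xsave, int has_avx)
{
    struct leaf_0d leaf = { 0, 0 };

    if (!has_xsave) {
        return leaf;            /* leaf is empty without XSAVE */
    }
    leaf.eax = XSTATE_FP | XSTATE_SSE;
    leaf.ecx = 512 + 64;        /* legacy region plus XSAVE header */
    if (has_avx) {
        leaf.eax |= XSTATE_YMM;
        leaf.ecx += 256;        /* AVX state area at offset 576 */
    }
    return leaf;
}
```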

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-02 18:58:27 +03:00
Paolo Bonzini
c74f41bbcc x86: fix migration from pre-version 12
On KVM, the KVM_SET_XSAVE would be executed with a 0 xstate_bv,
and not restore anything.

Since FP and SSE data are always valid, set them in xstate_bv at reset
time.  In fact, that value is the same that KVM_GET_XSAVE returns on
pre-XSAVE hosts.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-02 18:58:23 +03:00
Eduardo Habkost
f8e6a11aec target-i386: Set model=6 on qemu64 & qemu32 CPU models
There's no Intel CPU with family=6,model=2, and Linux and Windows guests
disable SEP when seeing that combination due to Pentium Pro erratum #82.

In addition to just having SEP ignored by guests, Skype (and maybe other
applications) runs sysenter directly without passing through ntdll on
Windows, and crashes because Windows ignored the SEP CPUID bit.

So, having model > 2 is a better default on qemu64 and qemu32 for two
reasons: making SEP really available for guests, and avoiding crashing
applications that work on bare metal.

model=3 would fix the problem, but it causes CPU enumeration problems
for Windows guests[1]. So let's set model=6, that matches "Athlon
(PM core)" on AMD and "P2 with on-die L2 cache" on Intel and it allows
Windows to use all CPUs as well as fixing sysenter.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=508623
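
The family and model values discussed here live in CPUID leaf 1 EAX: stepping in bits 3:0, model in bits 7:4 and family in bits 11:8 (extended fields ignored). The sketch below encodes and decodes that layout. The `guest_would_ignore_sep` check is a simplified assumption based on the description above, not the exact test any guest kernel performs.

```c
#include <assert.h>
#include <stdint.h>

/* Build a CPUID.1 EAX signature from base family, model and stepping. */
static uint32_t cpuid_signature(unsigned family, unsigned model,
                                unsigned stepping)
{
    assert(family < 16 && model < 16 && stepping < 16);
    return (family << 8) | (model << 4) | stepping;
}

static unsigned sig_family(uint32_t eax) { return (eax >> 8) & 0xf; }
static unsigned sig_model(uint32_t eax)  { return (eax >> 4) & 0xf; }

/* Simplified guest-side heuristic: family 6 with a model below 3 is
 * treated as a Pentium Pro without usable SEP. */
static int guest_would_ignore_sep(uint32_t eax)
{
    return sig_family(eax) == 6 && sig_model(eax) < 3;
}
```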

Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-02 16:51:12 +02:00
Max Reitz
fda74f826b qcow2: Switch L1 table in a single sequence
Switching the L1 table in memory should be an atomic operation, as far
as possible. Calling qcow2_free_clusters on the old L1 table on disk is
not a good idea when the old L1 table is no longer valid and the address
to the new one hasn't yet been written into the corresponding
BDRVQcowState field. To be more specific, this can lead to segfaults due
to qcow2_check_metadata_overlap trying to access the L1 table during the
free operation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-02 15:38:29 +02:00
Jeff Cody
5641bf4056 block: vhdx - add migration blocker
This blocks migration for VHDX image files, until the
functionality can be supported.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-02 15:24:39 +02:00
Dunrong Huang
2fa9aa59cf block: use correct filename for error report
The content that filename points to will be erased by qemu_opts_absorb_qdict()
in raw_open_common() in drv->bdrv_file_open().

So it's better to use bs->filename.

Signed-off-by: Dunrong Huang <riegamaths@gmail.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-02 11:41:50 +02:00
Max Reitz
db0749012b qcow2: CHECK_OFLAG_COPIED is obsolete
CHECK_OFLAG_COPIED as a parameter to check_refcounts_l1 and
check_refcounts_l2 is obsolete now, since the OFLAG_COPIED consistency
check is actually no longer performed by these functions (but by
check_oflag_copied).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-02 11:40:41 +02:00
Max Reitz
1e242b5544 qcow2: Correct endianness in overlap check
If an inactive L1 table is loaded from disk, its entries are in big
endian and have to be converted to host byte order before using them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-10-02 11:06:35 +02:00
Richard Henderson
ee06e23051 tcg-arm: Move the tlb addend load earlier
There are free scheduling slots between the sequence of
comparison instructions.  This requires changing the
register in use to avoid conflict with those compares.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-01 10:20:33 -07:00
Richard Henderson
66c2056fb8 tcg-arm: Remove restriction on qemu_ld output register
The main intent of the patch is to allow the tlb addend register
to be changed, without tying that change to the constraint.  But
the most common side-effect seems to be to enable usage of ldrd
with the r0,r1 pair.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-01 10:20:33 -07:00
Richard Henderson
d3e440bef2 tcg-arm: Return register containing tlb addend
Preparatory to rescheduling the tlb load, and changing said register.
Continues to use R1 for now.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-01 10:20:33 -07:00
Richard Henderson
d0ebde2284 tcg-arm: Move load of tlb addend into tcg_out_tlb_read
This allows us to make more intelligent decisions about the relative
offsets of the tlb comparator and the addend, avoiding any need of
writeback addressing.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-01 10:20:33 -07:00
Richard Henderson
f248873637 tcg-arm: Use QEMU_BUILD_BUG_ON to verify constraints on tlb
One of the two constraints we already checked via #if, but
the tlb offset distance was only checked at runtime.
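
A compile-time check of this kind can be built from an array typedef whose size turns negative when the condition holds. The macro below is a sketch with a hypothetical name, and the structure and the 255-byte limit are invented for the example; they are not the constraints verified by the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_GLUE_(a, b) a##b
#define SKETCH_GLUE(a, b) SKETCH_GLUE_(a, b)

/* Compilation fails if cond is true: the array size becomes -1. */
#define SKETCH_BUILD_BUG_ON(cond) \
    typedef char SKETCH_GLUE(sketch_build_bug_, __LINE__)[(cond) ? -1 : 1]

/* Hypothetical table entry whose field offsets must stay small. */
struct tlb_entry_sketch {
    unsigned long addr_read;
    unsigned long addr_write;
    unsigned long addend;
};

/* Checked by the compiler, so no runtime test is needed. */
SKETCH_BUILD_BUG_ON(offsetof(struct tlb_entry_sketch, addend) > 255);

static size_t addend_offset(void)
{
    return offsetof(struct tlb_entry_sketch, addend);
}
```

Moving the check to build time means a violated layout assumption stops the build instead of surfacing as a runtime failure.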

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-01 10:20:33 -07:00
Richard Henderson
e5e2e4a74b tcg-arm: Use strd for tcg_out_arg_reg64
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-01 10:20:33 -07:00
Richard Henderson
d9f4dde4a6 tcg-arm: Rearrange slow-path qemu_ld/st
Use the new helper_ret_*_mmu routines.  Use a conditional call
to arrange for a tail-call from the store path, and to load the
return address for the helper for the load path.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-01 10:20:33 -07:00
Richard Henderson
23bbc25085 tcg-arm: Use ldrd/strd for appropriate qemu_ld/st64
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-01 10:20:33 -07:00
Markus Armbruster
9dbb52e862 tests: Update .gitignore for test-int128 and test-bitops
Forgotten in commits 6046c62 and 3464700.

Cc: qemu-stable@nongnu.org
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-01 16:06:07 +04:00
Fam Zheng
d1c295f572 .gitignore: ignore tests/qemu-iotests/socket_scm_helper
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-01 16:06:07 +04:00
Gerd Hoffmann
ee6ee83de2 chardev: handle qmp_chardev_add(KIND_MUX) failure
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-10-01 10:16:04 +02:00
Anthony Liguori
a684f3cf9b Merge remote-tracking branch 'kraxel/seabios-1.7.3.2' into staging
# By Gerd Hoffmann
# Via Gerd Hoffmann
* kraxel/seabios-1.7.3.2:
  update seabios from 1.7.2.2 to 1.7.3.2

Message-id: 1380533055-24960-1-git-send-email-kraxel@redhat.com
2013-09-30 17:15:27 -05:00
Anthony Liguori
349cd52c70 Merge remote-tracking branch 'kraxel/roms.1' into staging
# By Gerd Hoffmann
# Via Gerd Hoffmann
* kraxel/roms.1:
  roms: add support for building sgabios
  roms: enable parallel seabios / seavgabios builds
  roms: enable ipxe cross builds
  roms: add rules to build slof
  roms: rewrite scripts/refresh-pxe-roms.sh
  roms: parallel ipxe builds
  roms: build lgplvgabios isavga variant
  roms: enable parallel builds for 'make lgplvgabios'
  roms: add 'make clean'

Message-id: 1380532378-22138-1-git-send-email-kraxel@redhat.com
2013-09-30 17:15:18 -05:00
Anthony Liguori
eb322b8155 Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pc,pci,virtio fixes and cleanups

This includes pc and pci cleanups and enhancements,
and a virtio-net bugfix related to softmac programming.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 29 Sep 2013 01:51:16 AM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found

# By Michael S. Tsirkin (8) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
  smbios: Factor out smbios_maybe_add_str()
  smbios: Make multiple -smbios type= accumulate sanely
  smbios: Improve diagnostics for conflicting entries
  smbios: Convert to QemuOpts
  smbios: Normalize smbios_entry_add()'s error handling to exit(1)
  virtio-net: fix up HMP NIC info string on reset
  pci: remove explicit check to 64K ioport size
  piix4: disable io on reset
  piix: use 64 bit window programmed by guest
  q35: use 64 bit window programmed by guest
  pci: add helper to retrieve the 64-bit range
  range: add min/max operations on ranges
  range: add Range to typedefs
  q35: make pci window address/size match guest cfg

Message-id: 1380437951-21788-1-git-send-email-mst@redhat.com
2013-09-30 17:15:01 -05:00
Anthony Liguori
4235d77349 Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Max Reitz (10) and others
# Via Kevin Wolf
* kwolf/for-anthony: (30 commits)
  qcow2: Remove useless count_contiguous_clusters() parameter
  qcow2: COMPRESSED on count_contiguous_clusters
  qcow2: count_contiguous_clusters and compression
  qcow2: Free only newly allocated clusters on error
  qcow2: Always use error path in l2_allocate
  qcow2: Don't put invalid L2 table into cache
  qemu-iotests: Preallocated zero clusters in 061
  qcow2: Correct bitmap size in zero expansion
  qemu-iotests: Quote $TEST_IMG* and $TEST_DIR usage
  qemu-iotests: Add basic ability to use binary sample images
  qemu-iotests: fix qmp.py search path
  block: use DIV_ROUND_UP in bdrv_co_do_readv
  qcow2: Assert against currently impossible overflow
  block: qed - use QEMU_PACKED for on-disk structures
  block: qcow2 - used QEMU_PACKED for on-disk structures
  block: vpc - use QEMU_PACKED for on-disk structures
  block: vdi - use QEMU_PACKED for on-disk structures
  rbd: avoid qemu_rbd_snap_list() memory leaks
  qdict: Extract qdict_extract_subqdict
  block: Fix compiler warning (-Werror=uninitialized)
  ...

Message-id: 1380296370-14523-1-git-send-email-kwolf@redhat.com
2013-09-30 17:14:49 -05:00
Anthony Liguori
3469a60d9f Merge remote-tracking branch 'sstabellini/xen-2013-09-25' into staging
# By Anthony PERARD (2) and Liu, Jinsong (2)
# Via Stefano Stabellini
* sstabellini/xen-2013-09-25:
  xen: Enable cpu-hotplug on xenfv machine.
  xen: Fix vcpu initialization.
  qemu: Add qemu xen logic for Xen HVM S3 resume
  qemu: Adjust qemu wakeup

Message-id: alpine.DEB.2.02.1309251749180.5498@kaball.uk.xensource.com
2013-09-30 17:14:10 -05:00
Anthony Liguori
28b9d47db6 Merge remote-tracking branch 'rth/tcg-ppc-pull' into staging
# By Richard Henderson (19) and Paolo Bonzini (2)
# Via Richard Henderson
* rth/tcg-ppc-pull: (21 commits)
  tcg-ppc64: Implement CONFIG_QEMU_LDST_OPTIMIZATION
  tcg-ppc64: Add _noaddr functions for emitting forward branches
  tcg-ppc64: Streamline tcg_out_tlb_read
  tcg-ppc64: Implement tcg_register_jit
  tcg-ppc64: Handle long offsets better
  tcg-ppc64: Tidy register allocation order
  tcg-ppc64: Look through a constant function descriptor
  tcg-ppc64: Fold constant call address into descriptor load
  tcg-ppc64: Don't load the static chain from TCG
  tcg-ppc64: Avoid code for nop move
  tcg-ppc64: Use tcg_out64
  tcg-ppc64: Use TCG_REG_Rn constants
  tcg-ppc64: More use of TAI and SAI helper macros
  tcg-ppc64: Reformat tcg-target.c
  tcg-ppc: Fix and cleanup tcg_out_tlb_check
  tcg-ppc: Use conditional branch and link to slow path
  tcg-ppc: Cleanup tcg_out_qemu_ld/st_slow_path
  tcg-ppc: Avoid code for nop move
  tcg-ppc: use new return-argument ld/st helpers
  tcg-ppc: fix qemu_ld/qemu_st for AIX ABI
  ...

Message-id: 1380126458-3247-1-git-send-email-rth@twiddle.net
2013-09-30 17:14:01 -05:00
Anthony Liguori
8429d63b0e Merge remote-tracking branch 'quintela/migration.next' into staging
# By Isaku Yamahata (4) and others
# Via Juan Quintela
* quintela/migration.next:
  migration: ram_handle_compressed
  arch_init: make is_zero_page accept size
  migration: Fix debug print type
  migration: add version supporting macros for struct pointer
  rdma: constify ram_chunk_{index, start, end}
  rdma: clean up of qemu_rdma_cleanup()
  arch_init: right return for ram_save_iterate
  savevm: fix wrong initialization by ram_control_load_hook
  savevm: add comments for qemu_file_get_error()

Message-id: 1380024203-25897-1-git-send-email-quintela@redhat.com
2013-09-30 17:13:43 -05:00
Anthony Liguori
d7f0efcb22 Merge remote-tracking branch 'kraxel/audio.1' into staging
# By Bandan Das (3) and Gerd Hoffmann (1)
# Via Gerd Hoffmann
* kraxel/audio.1:
  audio: remove CONFIG_MIXEMU configure option
  hda-codec: make mixemu selectable at runtime
  hda-codec: refactor common definitions into a header file
  audio maintainers update

Message-id: 1380011943-15083-1-git-send-email-kraxel@redhat.com
2013-09-30 17:13:32 -05:00
Anthony Liguori
1b365b2eb6 Merge remote-tracking branch 'borntraeger/tags/s390-next-20130924' into staging
This is a bunch of fixes/changes for the s390 architecture. It also
contains the fixes from the previous pull request, which did not make
it yet.
Overall it contains
- a fix for kexec without kdump (which uses diag308 subcode 0 instead of 1)
- several sclp related fixes
- some initial sclp migration code
- the sclp line mode console
- A fix for a boot problem with the virtio ccw ipl bios
- zeroed out padding bytes for the notes section of dump-guest-memory
- some cleanups

# gpg: Signature made Tue 24 Sep 2013 02:18:44 AM CDT using RSA key ID B5A61C7C
# gpg: Can't check signature: public key not found

# By Christian Borntraeger (6) and others
# Via Christian Borntraeger
* borntraeger/tags/s390-next-20130924:
  s390/sclplmconsole: Add support for SCLP line-mode console
  s390/ebcdic: Move conversion tables to header file
  s390/eventfacility: allow childs to handle more than 1 event type
  s390/eventfacility: remove unused event_type variable
  s390/eventfacility: Fix receive/send masks
  s390/eventfacility: fix multiple Read Event Data sources
  s390/sclp: add reset() functions
  s390/sclpquiesce: Add code to support live migration
  s390/sclpconsole: Add code to support live migration for sclpconsole
  s390/sclpconsole: modify definition of input buffer
  s390/kexec: Implement diag308 subcode 0
  s390/ioinst: Moved the CC setting to the IO instruction handlers
  s390/cpu: Make setcc() function available to other files
  s390/ipl: Update the s390-ccw.img rom
  s390/ipl: Fix waiting for virtio processing
  s390/dump: zero out padding bytes in notes sections
  s390/kvm: Add check for priviledged SCLP handler

Message-id: 1380007671-18976-1-git-send-email-borntraeger@de.ibm.com
2013-09-30 17:13:18 -05:00
Gerd Hoffmann
1cf9412b3b update seabios from 1.7.2.2 to 1.7.3.2
'git shortlog d4f7d90f..ece025f5' says:

Alex Williamson (4):
      seabios q35: Enable all PIRQn IRQs at startup
      seabios q35: Add new PCI slot to irq routing function
      seabios: Add a dummy PCI slot to irq mapping function
      pciinit: Enable default VGA device

Asias He (2):
      virtio-scsi: Set _DRIVER_OK flag before scsi target scanning
      virtio-scsi: Pack struct virtio_scsi_{req_cmd,resp_cmd}

Avik Sil (1):
      USB-EHCI: Fix null pointer assignment

Christian Gmeiner (5):
      geodevga: fix errors in geode_fp_* functions
      geodevga: move framebuffer setup
      geodevga: move output setup to own function
      geodevga: add debug to msr functions
      geodevga: fix wrong define name

David Woodhouse (26):
      Add macros for pushing and popping struct bregs
      Clean up #if in pirtable.c. CONFIG_PIRTABLE can't be set if CONFIG_COREBOOT is
      post: Export functions which will be used individually by CSM
      Export callrom() for CSM to use
      Export copy_smbios() from biostables.c
      Import LegacyBios.h from OVMF
      Complete and checksum EFI_COMPATIBILITY16_TABLE at build time
      Add pic_save_mask() and pic_restore_mask() functions
      Add CSM support
      Add README.CSM
      Add find_pmtimer() function
      Enable PMTIMER for CSM build
      Fix rom_reserve()/rom_confirm() for CSM oprom dispatch
      Don't calibrate TSC if PMTIMER is already set up
      Move find_pmtimer() to ACPI table setup where it logically belongs
      Use find_pmtimer() after copying Xen ACPI tables
      Use find_pmtimer() after copying coreboot ACPI tables
      Unify return path for CSM to go via csm_return()
      Make CONFIG_OPTIONROMS_DEPLOYED depend on CONFIG_QEMU
      Implement !CONFIG_OPTIONROMS support for CSM
      Implement !CONFIG_BOOT for CSM
      Enable VGA output when settings bochs-specific mode
      Disable CONFIG_THREAD_OPTIONROMS for CSM build
      Fix return type of le64_to_cpu() and be64_to_cpu()
      Rename find_pmtimer() to find_acpi_features()
      Add acpi_reboot() reset method using RESET_REG

Gerd Hoffmann (6):
      config: allow DEBUG_IO for !QEMU
      coreboot: add qemu detection
      tweak coreboot qemu detection
      apm: fix shutdown
      ahci: add missing check for allocation failure
      fix buildversion.sh

Hu Tao (1):
      Add pvpanic device driver

Kevin O'Connor (101):
      pmm: Use 'struct segoff_s' in pmm header.
      Minor: Update README - variable changes are now reset on soft-reboots.
      Normalize POST initialization function name suffixes.
      POST: Reorganize post init functions for better grouping and reusability.
      Fix rebase error in commit 8a0a972f that broke LOWMEM variables.
      Support calling a function other than maininit() from reloc_preinit().
      Ensure exported symbols are visible in the final link
      POST: Move QEMU specific ramsize and BIOS table setup to paravirt.c.
      POST: Reorganize post entry and "preinit" functions.
      POST: Move cpu caching and dma setup to platform_hardware_setup().
      Undo incorrect assumptions about Xen in commit 6ca0460f.
      Determine century during init and store in VARLOW mem during runtime.
      No need to check both CONFIG_THREADS and CONFIG_THREAD_OPTIONROMS.
      Add runningOnQEMU() and runningOnXen() for runtime platform detection.
      Consistently use CONFIG_COREBOOT, CONFIG_QEMU, and runningOnXen().
      Convert kvm_para_available() to runningOnKVM().
      Minor - move definitions to paravirt.c from paravirt.h.
      Only perform SMP setup on QEMU.
      Start device_hardware_setup in maininit even with CONFIG_THREAD_OPTIONROMS.
      The mathcp setup touches the PIC and thus move to the "setup" phase.
      Update tools/acpi_extract.py to handle iasl 20130117 release.
      Support skipping content when reading from QEMU fw_cfg romfile entries.
      Convert fw_cfg ACPI entries into romfile entries.
      Convert fw_cfg SMBIOS entries into romfile entries.
      Convert basic integer fw_cfg entries into romfile entries.
      Convert fw_cfg NUMA entries into a romfile entry.
      Process fw_cfg e820 entries during the fw_cfg setup stage.
      Integrate qemu_cfg_preinit() into qemu_romfile_init().
      Group QEMU platform setup together and move to paravirt.c.
      vgabios: Bochs/QEMU vgabios support should depend on CONFIG_QEMU.
      Warn on unaligned PCI ROM structure in option roms.
      Fix Makefile - don't reference "out/" directly, instead use "$(OUT)".
      build: Don't require $(OUT) to be a sub-directory of the main directory.
      Rename rom_get_top() to rom_get_max().
      Report on f-segment UMB ram also.
      Clarify build generated "zone low" values.
      Verify CC is valid during build tests.
      Disable handle_post() on CSM builds.
      Remove unnecessary "export" declarations from assembler functions.
      Minor assembler enhancements to __csm_return.
      Introduce VARFSEG for variables that will reside in the f-segment.
      Convert VAR16VISIBLE, VAR16EXPORT, and VAR32VISIBLE to VARFSEG.
      Don't relocate "varlow" variable references at runtime.
      Move malloc's ZoneFSeg and ZoneLow setup to malloc_init.
      Calculate "RamSize" needed by 16bit interface dynamically.
      Eliminate separate BiosTableSpace[] space for f-segment allocations.
      Use CONFIG_ prefix for Kconfig variables; use BUILD_ for others.
      Try to detect an unsuccessful hard-reboot to prevent soft-reboot loops.
      Minor - fix confusing final_sec32low_start name in layoutrom.py.
      Minor - introduce numeric defines for the IVT offset of hw irqs.
      Separate out 16bit PCI-BIOS entry point from regular int 0x1a entry point.
      Support using the "extra stack" for all 16bit irq entry points.
      Minor - improve comments and grouping of handle_08().
      floppy: Introduce 'struct floppy_pio_s' for floppy PIO ops.
      floppy: Cleanup floppy irq wait handling.
      floppy: Clean up Check Interrupt Status code.
      floppy: Move recalibration and results parsing to floppy_cmd().
      floppy: Improve floppy_pio() error checking.
      floppy: Implement media format sensing.
      floppy: Actually do controller reset in floppy_reset().
      Minor - note that passing QEMU config via cmos is deprecated.
      Cache boot-fail-wait to avoid romfile access after POST.
      Rename src/ssdt-susp.dsl to src/ssdt-misc.dsl.
      acpi: Eliminate BDAT parameter passing to DSDT code.
      Add additional dependency checks to Makefile.
      Don't use __FILE__ in virtio-ring.c.
      shadow: Don't use PCIDevices list in make_bios_readonly().
      smm: Don't use PCIDevices list in smm_setup().
      Add VARVERIFY32INIT attribute for variables only available during "init".
      Use VARVERIFY32INIT on global variables that point to "tmp" memory.
      vgabios: Fix stdvga_perform_gray_scale_summing().
      vgabios: Fix cirrus memory clear on mode switch.
      Minor - add missing newline to floppy debug statement.
      Fix bug in NUMA node setup - don't create SRAT if NUMA not present.
      Update README - copy *.aml files for QEMU.
      Add dependencies to vgafixup.py and buildversion.sh scripts.
      Set ZF prior to keyboard read call in check_for_keystroke().
      mptable: Don't describe pci-to-pci bridges.
      mptable: Use same PCI irqs as ACPI code.
      Cleanup QEMU_CFG_NUMA fw_cfg processing - split into two romfile entries.
      Use container_of on romfile entries.
      acpi: Move ACPI table definitions from acpi.c to acpi.h.
      acpi: Remove dead code with descriptions of bit flags.
      acpi: Use cpu_to_leXX() consistently.
      Minor - explicitly close files in buildrom.py.
      Minor - move "tracked memory alloc" code in pmm.c.
      Introduce and convert pmm code to use standard list helpers.
      Minor - relocate code in stacks.c to keep low-level thread code together.
      Introduce helper function have_threads() in stacks.c.
      Convert stacks.c to use standard list manipulation code.
      Convert boot.c to use standard list manipulation code.
      Convert pciinit.c to use standard list manipulation code.
      Convert PCIDevices list to use standard list manipulation code.
      Revert "Convert pciinit.c to use standard list manipulation code."
      Fix error in hlist_for_each_entry_safe macro.
      Convert pciinit.c to use standard list manipulation code.
      make qemu_cfg_init depend on QEMU_HARDWARE instead of QEMU
      Another fix for hlist_for_each_entry_safe.
      Minor - remove debugging dprintf added to pciinit.c.
      Fix USB EHCI detection that was broken in hlist conversion of PCIDevices.
      Fix bug in CBFS file walking with compressed files.

Laszlo Ersek (1):
      Enable VGA output when setting Cirrus-specific mode

Michael S. Tsirkin (2):
      acpi: make default DSDT optional
      acpi: sync FADT flags from PIIX4 to Q35

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-30 11:18:02 +02:00
Gerd Hoffmann
774e80ea1d roms: add support for building sgabios
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-30 09:44:36 +02:00
Gerd Hoffmann
95f7c6803c roms: enable parallel seabios / seavgabios builds
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-30 09:44:35 +02:00
Gerd Hoffmann
779fa9d706 roms: enable ipxe cross builds
2013-09-30 09:44:35 +02:00
Gerd Hoffmann
bcf06c15e7 roms: add rules to build slof
Add some logic to detect cross compilers.  Add support for "make slof",
which should JustWork[tm] if you are on a ppc64 machine or have a ppc64
cross compiler installed somewhere in your path.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-30 09:44:35 +02:00
Gerd Hoffmann
93a2b3c470 roms: rewrite scripts/refresh-pxe-roms.sh
Just use the Makefile in roms/

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-30 09:44:35 +02:00
Gerd Hoffmann
46ef7f33a2 roms: parallel ipxe builds
Enable parallel ipxe builds.  Reduce the recursive make calls.  Call
recursive make properly using $(MAKE) $(MAKEFLAGS).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-30 09:44:35 +02:00
Gerd Hoffmann
5a7bd33385 roms: build lgplvgabios isavga variant
Add logic to also build+install the isavga vgabios variant.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-30 09:44:35 +02:00
Gerd Hoffmann
1ede4dd04b roms: enable parallel builds for 'make lgplvgabios'
Recurse into vgabios once, adjust dependencies, call make using
$(MAKE) $(MAKEFLAGS) so jobserver mode works.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-30 09:44:35 +02:00
Gerd Hoffmann
6887581728 roms: add 'make clean'
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-30 09:44:35 +02:00
Markus Armbruster
e26d3e7346 smbios: Factor out smbios_maybe_add_str()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-28 23:49:39 +03:00
Markus Armbruster
fc3b32958a smbios: Make multiple -smbios type= accumulate sanely
Currently, -smbios type=T,NAME=VAL,... adds one field (T,NAME) with
value VAL to fw_cfg for each unique NAME.  If NAME occurs multiple
times, the last one's VAL is used (before the QemuOpts conversion, the
first one was used).

Multiple -smbios can add multiple fields with the same (T, NAME).
SeaBIOS reads all of them from fw_cfg, but uses only the first field
(T, NAME).  The others are ignored.

"First one wins, subsequent ones get ignored silently" isn't nice.  We
commonly let the last option win.  Useful, because it lets you
-readconfig first, then selectively override with command line
options.

Clean up -smbios to work the common way.  Accumulate the settings,
with later ones overwriting earlier ones.  Put the result into fw_cfg
(no more useless duplicates).

Bonus cleanup: qemu_uuid_parse() no longer sets SMBIOS system uuid by
side effect.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-28 23:49:39 +03:00
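A minimal sketch of the "accumulate, later settings overwrite earlier ones" behaviour that fc3b32958a describes. The table layout and the names smbios_set(), smbios_get() and MAX_FIELDS are hypothetical illustrations, not QEMU's actual data structures:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical settings table: one slot per (type, name) pair. */
#define MAX_FIELDS 16

struct smbios_field {
    int type;
    const char *name;
    const char *value;
};

struct smbios_settings {
    struct smbios_field fields[MAX_FIELDS];
    size_t count;
};

/* Store a setting.  A later value for the same (type, name) overwrites
 * the earlier one instead of adding a duplicate entry.
 * Returns 0 on success, -1 if the table is full. */
static int smbios_set(struct smbios_settings *s, int type,
                      const char *name, const char *value)
{
    assert(s && name);
    for (size_t i = 0; i < s->count; i++) {
        if (s->fields[i].type == type &&
            strcmp(s->fields[i].name, name) == 0) {
            s->fields[i].value = value;   /* last one wins */
            return 0;
        }
    }
    if (s->count == MAX_FIELDS) {
        return -1;
    }
    s->fields[s->count++] = (struct smbios_field){ type, name, value };
    return 0;
}

/* Return the effective value, or NULL if it was never set. */
static const char *smbios_get(const struct smbios_settings *s, int type,
                              const char *name)
{
    for (size_t i = 0; i < s->count; i++) {
        if (s->fields[i].type == type &&
            strcmp(s->fields[i].name, name) == 0) {
            return s->fields[i].value;
        }
    }
    return NULL;
}
```

Only the final table would then be written out once, which is what avoids the duplicate entries mentioned in the commit message.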
Markus Armbruster
ec2df8c10a smbios: Improve diagnostics for conflicting entries
We allow either tables or fields for the same type.  Makes sense,
because SeaBIOS uses fields only when no tables are present.

We do this by searching the SMBIOS blob for a previously added table
or field.  Error messages look like this:

    qemu-system-x86_64: -smbios type=1,serial=42: SMBIOS type 1 table already defined, cannot add field

User needs to know that "table" is defined by -smbios file=..., and
"field" by -smbios type=...

Instead of searching the blob, record additions of interest, and check
that.  Simpler, and makes better error messages possible:

    qemu-system-x86_64: -smbios file=smbios_type_1.bin: Can't mix file= and type= for same type
    qemu-system-x86_64: -smbios type=1,serial=42,serial=99: This is the conflicting setting

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-28 23:49:39 +03:00
Markus Armbruster
4f953d2fc8 smbios: Convert to QemuOpts
So that it can be set in config file for -readconfig.

This tightens parsing of -smbios, and makes it more consistent with
other options: unknown parameters are rejected, numbers with trailing
junk are rejected, when a parameter is given multiple times, last
rather than first wins, ...

MST: drop one chunk to fix build errors

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-28 23:49:06 +03:00
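To illustrate what "numbers with trailing junk are rejected" means in 4f953d2fc8, here is a small sketch of strict number parsing built on the standard strtoul(). The helper name parse_number_strict() is hypothetical; QemuOpts has its own parsing code, which this does not reproduce:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse an unsigned decimal number strictly: the whole string must be
 * consumed.  Returns 0 and stores the value on success, -1 otherwise.
 * (Hypothetical helper for illustration only.) */
static int parse_number_strict(const char *str, unsigned long *out)
{
    char *end;
    unsigned long val;

    assert(str && out);

    /* Reject empty strings, signs and leading whitespace up front. */
    if (*str < '0' || *str > '9') {
        return -1;
    }

    errno = 0;
    val = strtoul(str, &end, 10);
    if (errno == ERANGE || *end != '\0') {
        return -1;   /* overflow, or trailing junk such as "1x" */
    }

    *out = val;
    return 0;
}
```

A lenient parser would stop at the first non-digit and silently accept "1junk" as 1; checking that the end pointer reached the terminating NUL is what tightens it.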
Markus Armbruster
351a6a73ca smbios: Normalize smbios_entry_add()'s error handling to exit(1)
It exits on all error conditions but one, where it returns -1.
Normalize, and return void.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-28 22:40:58 +03:00
Kevin Wolf
61653008ad qcow2: Remove useless count_contiguous_clusters() parameter
All callers pass start = 0, and it's doubtful if any other value would
actually do what you expect. Remove the parameter.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2013-09-27 17:22:43 +02:00
Max Reitz
22f0dd29af qcow2: COMPRESSED on count_contiguous_clusters
Compressed clusters can never be contiguous, therefore the corresponding
flag does not need to be given explicitly to count_contiguous_clusters.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 17:22:43 +02:00
Max Reitz
15684a4742 qcow2: count_contiguous_clusters and compression
The function is not intended to be used on compressed clusters and will
not work correctly, if used anyway, since L2E_OFFSET_MASK is not the
right mask for determining the offset of compressed clusters. Therefore,
assert that the first cluster is not compressed and always include the
compression flag in the mask of significant flags, i.e., stop the search
as soon as a compressed cluster occurs.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 17:22:43 +02:00
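The idea in 15684a4742 (always treat the compression flag as significant, and assert that the first entry is not compressed) can be sketched as follows. This is a simplified illustration, not the qcow2 source: entries are host-endian, and the SKETCH_* constants are assumptions modelled on the qcow2 L2 entry layout:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch: offset bits and a "compressed" flag. */
#define SKETCH_OFFSET_MASK      0x00fffffffffffe00ULL
#define SKETCH_FLAG_COMPRESSED  (1ULL << 62)

/* Count how many entries, starting at l2_table[0], map to physically
 * consecutive clusters.  The compression flag is always part of the
 * comparison mask, so the search stops at the first compressed entry. */
static int count_contiguous(int nb_clusters, uint64_t cluster_size,
                            const uint64_t *l2_table, uint64_t stop_flags)
{
    uint64_t mask = stop_flags | SKETCH_OFFSET_MASK | SKETCH_FLAG_COMPRESSED;
    uint64_t first, expected;
    int i;

    assert(nb_clusters > 0);
    first = l2_table[0];
    expected = first & mask;

    if (expected == 0) {
        return 0;   /* unallocated first cluster */
    }

    /* The offset mask is meaningless for compressed clusters. */
    assert(!(first & SKETCH_FLAG_COMPRESSED));

    for (i = 0; i < nb_clusters; i++) {
        if ((l2_table[i] & mask) != expected + (uint64_t)i * cluster_size) {
            break;
        }
    }
    return i;
}
```

Because the flag is in the mask, a compressed entry can never compare equal to the expected plain offset, so callers do not have to pass the flag explicitly.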
Max Reitz
320c706666 qcow2: Free only newly allocated clusters on error
In expand_zero_clusters_in_l1, a new cluster is only allocated if it was
not already preallocated. On error, such preallocated clusters should
not be freed, but only the newly allocated ones.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 17:22:43 +02:00
Max Reitz
be0b742ee3 qcow2: Always use error path in l2_allocate
Just returning -errno in some cases prevents
trace_qcow2_l2_allocate_done from being executed (and, in one case, also
the unused allocated L2 table from being freed). Always going down the
error path fixes this.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 17:22:43 +02:00
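The "always use the error path" pattern from be0b742ee3 can be sketched as below. All names here (allocate_table(), trace_done(), the fail_step parameter) are hypothetical stand-ins; the point is only that every failure jumps to one label, so cleanup and the trace point are never skipped:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int trace_done_calls;   /* stands in for a trace point */

static void trace_done(int ret)
{
    (void)ret;
    trace_done_calls++;
}

/* fail_step: 0 = succeed, 1 = fail before allocating, 2 = fail after.
 * Every error goes through "fail", so the table is freed (if it was
 * allocated) and the trace point always fires. */
static int allocate_table(int fail_step, void **out)
{
    void *table = NULL;
    int ret;

    if (fail_step == 1) {
        ret = -EIO;
        goto fail;
    }

    table = malloc(64);
    if (!table) {
        ret = -ENOMEM;
        goto fail;
    }

    if (fail_step == 2) {
        ret = -ENOSPC;
        goto fail;
    }

    *out = table;
    trace_done(0);
    return 0;

fail:
    free(table);        /* free(NULL) is a no-op */
    trace_done(ret);
    return ret;
}
```

Initialising the resource pointer to NULL before the first possible jump is what keeps the shared cleanup safe when the failure happens before the resource was obtained.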
Max Reitz
8585afd813 qcow2: Don't put invalid L2 table into cache
In l2_allocate, the fail path is executed if qcow2_cache_flush fails.
However, the L2 table has not yet been fetched from the L2 table cache.
The qcow2_cache_put in the fail path therefore basically gives an
undefined argument as the L2 table address (in this case).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 11:31:59 +02:00
Max Reitz
fd9e03e606 qemu-iotests: Preallocated zero clusters in 061
Add a test case for zero cluster expansion on an image completely filled
with preallocated zero clusters to test 061.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 11:16:36 +02:00
Max Reitz
e390cf5a97 qcow2: Correct bitmap size in zero expansion
Since the expanded_clusters bitmap is addressed using host offsets in
the underlying image file, the correct size to use for allocating the
bitmap is not determined by the guest disk image but by the underlying
host image file.

Furthermore, this size may change during the expansion due to cluster
allocations on growable image files. In this case, the bitmap needs to
be resized as well to reflect the growth.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 11:16:35 +02:00
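A rough sketch of the resizing requirement described in e390cf5a97: a bitmap indexed by host cluster number has to grow when the underlying file grows. The struct and function names are hypothetical, and the real code sizes the bitmap from the host image file rather than on demand as done here:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-bit-per-host-cluster bitmap. */
struct cluster_bitmap {
    uint8_t *bits;
    uint64_t nb_clusters;
};

static size_t bitmap_bytes(uint64_t nb_clusters)
{
    return (size_t)((nb_clusters + 7) / 8);
}

/* Make sure `cluster` is addressable; newly added bits start as zero.
 * Returns 0 on success, -1 on allocation failure. */
static int bitmap_ensure(struct cluster_bitmap *bm, uint64_t cluster)
{
    size_t old_bytes, new_bytes;
    uint8_t *p;

    if (cluster < bm->nb_clusters) {
        return 0;
    }
    old_bytes = bitmap_bytes(bm->nb_clusters);
    new_bytes = bitmap_bytes(cluster + 1);
    p = realloc(bm->bits, new_bytes);
    if (!p) {
        return -1;
    }
    memset(p + old_bytes, 0, new_bytes - old_bytes);
    bm->bits = p;
    bm->nb_clusters = cluster + 1;
    return 0;
}

static void bitmap_set(struct cluster_bitmap *bm, uint64_t cluster)
{
    assert(cluster < bm->nb_clusters);
    bm->bits[cluster / 8] |= (uint8_t)(1u << (cluster % 8));
}

static int bitmap_test(const struct cluster_bitmap *bm, uint64_t cluster)
{
    assert(cluster < bm->nb_clusters);
    return (bm->bits[cluster / 8] >> (cluster % 8)) & 1;
}
```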
Jeff Cody
fef9c19139 qemu-iotests: Quote $TEST_IMG* and $TEST_DIR usage
Many image filenames and paths are used unquoted.  Quote these to
make sure that directories / filenames with spaces are not problematic.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 11:10:45 +02:00
Jeff Cody
85edbd375b qemu-iotests: Add basic ability to use binary sample images
For image formats that are not "QEMU native", but supported for
compatibility, it is useful to verify that an image created with
the 'gold standard' native tool can be read / written to successfully
by QEMU.

In addition to testing non-native images, this could also be useful to
test against image files created by older versions of QEMU.

This provides a directory to store small sample images, for use by
scripts in tests/qemu-iotests.

Image files should be compressed with bzip2.

To use a sample image from a bash script, call the _use_sample_img
function, which copies and decompresses the image into $TEST_DIR and
sets $TEST_IMG to the decompressed sample image copy.  To clean up, call
_cleanup_test_img as normal.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-27 10:59:07 +02:00
Fam Zheng
212774c5a5 qemu-iotests: fix qmp.py search path
QMP/qmp.py was renamed to scripts/qmp/qmp.py; fix the search path in iotests.py.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-26 16:04:24 +02:00
Fam Zheng
d055a1fec3 block: use DIV_ROUND_UP in bdrv_co_do_readv
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-26 14:11:06 +02:00
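For readers unfamiliar with the idiom named in d055a1fec3, a round-up division macro is commonly written as below. The bytes_to_sectors() helper and the 512-byte SECTOR_SIZE are assumptions chosen for the example, not a copy of what bdrv_co_do_readv computes:

```c
#include <assert.h>
#include <stdint.h>

/* Integer division that rounds up instead of truncating.
 * Note: (n) + (d) - 1 can overflow for n close to the type's maximum. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

#define SECTOR_SIZE 512   /* assumed unit for this example */

/* Number of whole sectors needed to hold `bytes` bytes. */
static uint64_t bytes_to_sectors(uint64_t bytes)
{
    return DIV_ROUND_UP(bytes, (uint64_t)SECTOR_SIZE);
}
```

The macro replaces open-coded expressions such as (bytes + 511) / 512, which are easy to get subtly wrong.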
Max Reitz
c01dbccbad qcow2: Assert against currently impossible overflow
If qcow2_alloc_cluster_link_l2 is called with a QCowL2Meta describing a
request crossing L2 boundaries, a buffer overflow will occur. This is
impossible right now since such requests are never generated (every
request is shortened to L2 boundaries before) and probably also
completely unintended (considering the name "QCowL2Meta"), however, it
is still worth an assertion.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 21:57:44 +02:00
Stefan Weil
3df2b8fde9 misc: Use new rotate functions
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-09-25 21:23:05 +02:00
Stefan Weil
6aa25b4a7b bitops: Add rotate functions (rol8, ror8, ...)
These functions were copied from include/linux/bitops.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2013-09-25 21:22:33 +02:00
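As an illustration of the kind of helpers 6aa25b4a7b adds, here are rotate functions written from scratch for this note. They are not the bodies from the commit: the shift count is masked so that a rotate by zero (or by the full width) never produces an out-of-range shift:

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by `shift` bits (shift taken modulo 32). */
static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
    shift &= 31;
    return (word << shift) | (word >> ((32 - shift) & 31));
}

/* Rotate a 32-bit word right by `shift` bits (shift taken modulo 32). */
static inline uint32_t ror32(uint32_t word, unsigned int shift)
{
    shift &= 31;
    return (word >> shift) | (word << ((32 - shift) & 31));
}

/* Rotate an 8-bit value left by `shift` bits (shift taken modulo 8). */
static inline uint8_t rol8(uint8_t word, unsigned int shift)
{
    shift &= 7;
    return (uint8_t)((word << shift) | (word >> ((8 - shift) & 7)));
}
```

Compilers such as GCC and Clang typically recognise this shape and emit a single rotate instruction where the target has one.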
Stefan Weil
d285bf784b tci: Add implementation of rotl_i64, rotr_i64
It is used by qemu-ppc64 when running Debian's busybox-static.

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2013-09-25 21:22:00 +02:00
Jeff Cody
687fb89366 block: qed - use QEMU_PACKED for on-disk structures
QEDHeader is read, and written, directly from on-disk images
via bdrv_pread()/write().  To avoid any unintentional padding,
these structs should be packed.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:15 +02:00
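Why packing matters for structs that are read and written directly from disk can be shown with a toy header. The struct below is invented for this example (it is not QEDHeader), and __attribute__((packed)) is the GCC/Clang spelling that a macro such as QEMU_PACKED is assumed to wrap:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy on-disk header, natural layout: the compiler may insert padding
 * after `magic` and after `version` to satisfy alignment. */
struct header_natural {
    uint32_t magic;
    uint64_t image_size;
    uint16_t version;
};

/* Same fields, packed: the in-memory layout matches the byte-for-byte
 * on-disk layout, 4 + 8 + 2 = 14 bytes (GCC/Clang attribute). */
struct header_packed {
    uint32_t magic;
    uint64_t image_size;
    uint16_t version;
} __attribute__((packed));
```

On a typical 64-bit ABI sizeof(struct header_natural) is larger than 14, so reading 14 on-disk bytes straight into it would misplace image_size; the packed variant has no such padding. Packing does not address byte order, which still needs explicit conversion.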
Jeff Cody
c4217f645d block: qcow2 - use QEMU_PACKED for on-disk structures
QCowHeader and QCowExtension are structs that reside in the on-disk
image format, and are read and written directly via bdrv_pread()/write(),
and as such should be packed to avoid any unintentional struct padding.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:13 +02:00
Jeff Cody
e54835c06d block: vpc - use QEMU_PACKED for on-disk structures
The VHD footer and header structs (vhd_footer and vhd_dyndisk_header)
are on-disk structures for the image format, and as such should be
packed.

Go ahead and make these typedefs as well, with the preferred QEMU
naming convention, so that the packed attribute is used consistently
with the struct.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:10 +02:00
Jeff Cody
8368febd81 block: vdi - use QEMU_PACKED for on-disk structures
The header struct VdiHeader is an on-disk structure for the image
format, and as such should be packed.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:05 +02:00
Anthony PERARD
594278d9f2 xen: Enable cpu-hotplug on xenfv machine.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2013-09-25 16:43:12 +00:00
Anthony PERARD
1cd25a8896 xen: Fix vcpu initialization.
Each vcpu needs an evtchn bound in QEMU, even those that are
offline at QEMU initialisation.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2013-09-25 16:41:48 +00:00
Liu, Jinsong
11addd0ab9 qemu: Add qemu xen logic for Xen HVM S3 resume
This patch is qemu patch 2 to fix the Xen HVM S3 bug; it adds the qemu
Xen logic. When qemu wakes up, the qemu Xen logic is notified and
issues a hypercall to the Xen hypervisor to unpause the domain.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
2013-09-25 16:40:23 +00:00
Liu, Jinsong
4bc78a8772 qemu: Adjust qemu wakeup
Currently Xen HVM S3 has a bug that comes from a difference between
qemu-traditional and qemu-xen. For qemu-traditional, the way to
resume from HVM S3 is via the 'xl trigger' command. For qemu-xen,
however, the way to resume from HVM S3 is inherited from standard
qemu, i.e. via QMP, and it doesn't work under Xen.

The root cause is that, for qemu-xen, the 'xl trigger' command doesn't
reset devices, while QMP doesn't unpause the HVM domain even though it
does a qemu system reset.

We have two qemu patches and one xl patch to fix the Xen HVM S3 bug.
This patch is qemu patch 1. It adjusts qemu wakeup so that the
Xen S3 resume logic (which will be implemented in qemu patch 2)
is notified after the qemu system reset.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
2013-09-25 16:38:29 +00:00
Richard Henderson
7f12d6497f tcg-ppc64: Implement CONFIG_QEMU_LDST_OPTIMIZATION
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:33 -07:00
Richard Henderson
c7ca6a2b75 tcg-ppc64: Add _noaddr functions for emitting forward branches
... rather than open-coding this stuff through the file.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
fedee3e7fd tcg-ppc64: Streamline tcg_out_tlb_read
Less conditional compilation.  Merge an add insn with the indexed
memory load insn.  Load the tlb addend earlier.  Avoid the address
update memory form.

Fix a bug in not allowing large enough tlb offsets for some guests.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
fa94c3be7a tcg-ppc64: Implement tcg_register_jit
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
b18d5d2b80 tcg-ppc64: Handle long offsets better
Previously we'd only handle 16-bit offsets from memory operand without falling
back to indexed, but it's easy to use ADDIS to handle full 32-bit offsets.

This also lets us unify code that existed inline in tcg_out_op for handling
addition of large constants.

The new R2 temporary was marked reserved for the AIX calling convention, but
the register really is call-clobbered and since tcg generated code has no use
for a TOC, it's available for use.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
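The arithmetic behind "use ADDIS to handle full 32-bit offsets" in b18d5d2b80 can be sketched as follows. ADDIS adds a 16-bit immediate shifted left by 16, and the memory instruction that follows takes a sign-extended 16-bit displacement, so the low half has to be treated as signed and the high half adjusted to compensate. The names split_offset() and struct offset_parts are hypothetical; this is not the tcg backend code:

```c
#include <assert.h>
#include <stdint.h>

/* hi is the ADDIS immediate, lo the displacement of the following
 * memory instruction; hi * 65536 + lo equals the original offset. */
struct offset_parts {
    int32_t hi;
    int32_t lo;
};

static struct offset_parts split_offset(int32_t offset)
{
    struct offset_parts p;
    int32_t lo = (int32_t)((uint32_t)offset & 0xffff);

    /* The displacement is sign-extended, so 0x8000..0xffff mean
     * negative values; fold that into lo. */
    if (lo >= 0x8000) {
        lo -= 0x10000;
    }
    p.lo = lo;

    /* offset - lo is an exact multiple of 65536. */
    p.hi = (int32_t)(((int64_t)offset - lo) / 65536);
    return p;
}

/* A signed 16-bit immediate cannot hold hi == 0x8000, which happens
 * for offsets in 0x7fff8000..0x7fffffff; callers must check this. */
static int hi_fits_imm16(struct offset_parts p)
{
    return p.hi >= -32768 && p.hi <= 32767;
}
```

For example, 0x12348000 splits into hi = 0x1235 and lo = -32768, not hi = 0x1234 and lo = 0x8000, because the displacement 0x8000 would be read as negative.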
Richard Henderson
5e1702b074 tcg-ppc64: Tidy register allocation order
Remove conditionalization from tcg_target_reg_alloc_order, relying on
reserved_regs to prevent register allocation that shouldn't happen.
So R11 is now present in reg_alloc_order for __APPLE__, but also now
reserved.

Sort reg_alloc_order into call-saved, call-clobbered, and parameters.
This reduces the effect of values getting spilled and reloaded before
function calls.

Whether or not it is reserved, R2 (TOC) is always call-clobbered.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
b0940da012 tcg-ppc64: Look through a constant function descriptor
Especially in the user-only configurations, a direct branch into
the executable may be in range.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
d40f3cb112 tcg-ppc64: Fold constant call address into descriptor load
Eliminates one insn per call:

 :  lis     r2,4165
-:  ori     r2,r2,59616
-:  ld      r0,0(r2)
+:  ld      r0,-5920(r2)
 :  mtctr   r0
-:  ld      r2,8(r2)
+:  ld      r2,-5912(r2)
 :  bctrl

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
ad94e1a9db tcg-ppc64: Don't load the static chain from TCG
There are no helpers that require the static chain.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
f8b8412907 tcg-ppc64: Avoid code for nop move
While these are rare from code that's been through the optimizer,
it's not uncommon within the tcg backend.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
5e0f40cfed tcg-ppc64: Use tcg_out64
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
8327a470df tcg-ppc64: Use TCG_REG_Rn constants
Instead of bare N, for clarity.  The only (intentional) exception made
is for insns that encode R|0, i.e. when R0 encoded into the insn is
interpreted as zero not the contents of the register.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
29b6919869 tcg-ppc64: More use of TAI and SAI helper macros
Finish conversion of all memory operations.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:32 -07:00
Richard Henderson
541dd4ceaa tcg-ppc64: Reformat tcg-target.c
Whitespace and brace changes only.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:31 -07:00
Richard Henderson
8f50c841b3 tcg-ppc: Fix and cleanup tcg_out_tlb_check
The fix is that sparc has so many mmu modes that the last one overflowed
the 16-bit signed offset we assumed would fit.  Handle this, and check
the new assumption at compile time.

Load the tlb addend earlier for the fast path.

Remove the explicit address + addend and make use of index addressing.

Adjust constraints for qemu_ld64 such that we don't clobber the address
register or tlb addend before loading both values.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:31 -07:00
Richard Henderson
5b1c985b7e tcg-ppc: Use conditional branch and link to slow path
Saves one insn per slow path.  Note that we can no longer use
a tail call into the store helper.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:31 -07:00
Richard Henderson
1d10cf9886 tcg-ppc: Cleanup tcg_out_qemu_ld/st_slow_path
Coding style fixes.  Use TCGReg enumeration values instead of raw
numbers.  Don't needlessly pull the whole TCGLabelQemuLdst struct
into local variables.  Less conditional compilation.

No functional changes.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:31 -07:00
Richard Henderson
4b2b114d8c tcg-ppc: Avoid code for nop move
While these are rare from code that's been through the optimizer,
it's not uncommon within the tcg backend.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:46:31 -07:00
Paolo Bonzini
619f90ba62 tcg-ppc: use new return-argument ld/st helpers
These use a 32-bit load-of-immediate to save a mflr+addi+mtlr sequence.
Tested with a Windows 98 guest (pretty much the most recent thing I
could run on my PPC machine) and kvm-unit-tests's sieve.flat.  The
speed up for sieve.flat is as high as 10% for qemu-system-i386, 25%
(no kidding) for qemu-system-x86_64 on my PowerBook G4.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:45:39 -07:00
Paolo Bonzini
6a11557988 tcg-ppc: fix qemu_ld/qemu_st for AIX ABI
For the AIX ABI, the function pointer and small area pointer need
to be loaded in the trampoline.  The trampoline instead is called
with a normal BL instruction.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-25 07:45:30 -07:00
Stefan Hajnoczi
9e6337d081 rbd: avoid qemu_rbd_snap_list() memory leaks
When there are no snapshots qemu_rbd_snap_list() returns 0 and the
snapshot table pointer is NULL.  Don't forget to free the snaps buffer
we allocated for librbd rbd_snap_list().

When the function succeeds don't forget to free the snaps buffer after
calling rbd_snap_list_end().

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:22:00 +02:00
Benoît Canet
5726d872f3 qdict: Extract qdict_extract_subqdict
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:21:28 +02:00
Stefan Weil
c3e4f43a99 block: Fix compiler warning (-Werror=uninitialized)
The patch fixes a warning from gcc (Debian 4.6.3-14+rpi1) 4.6.3:

block/stream.c:141:22: error:
‘copy’ may be used uninitialized in this function [-Werror=uninitialized]

This is not a real bug - a better compiler would not complain.

Now 'copy' always has a defined value, so the check for ret >= 0
can be removed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:21:28 +02:00
Benoît Canet
030be32184 block: introduce BlockDriver.bdrv_needs_filename to enable some drivers.
Some drivers will have driver-specific options but no filename.
This new bool allows the block layer to treat them correctly.

The .bdrv_needs_filename is set in drivers not having .bdrv_parse_filename and
not having .bdrv_open.

The first exception to this rule will be the quorum driver.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:21:28 +02:00
Fam Zheng
2fe2e29071 qemu-iotests: add monolithicFlat creation test to 059
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:21:28 +02:00
Fam Zheng
fc7ce63fb1 qemu-iotests: fix test case 059
Since commit "block: Error parameter for open functions", error output
is more verbose. Update test case output file to follow the change.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:21:28 +02:00
Fam Zheng
301c7d38a0 vmdk: fix cluster size check for flat extents
We use the extent size as the cluster size for flat extents (where no
L1/L2 table is allocated, so it's safe) in order to reuse the sector
calculation code shared with sparse extents.

Don't pass in the cluster size when adding a flat extent; just set it to
the extent's sectors later, so that the cluster size check does not fail.

cluster_sectors is changed to int64_t to allow big flat extents.

Without this, flat extent opening is broken:

    # qemu-img create -f vmdk -o subformat=monolithicFlat /tmp/a.vmdk 100G
    Formatting '/tmp/a.vmdk', fmt=vmdk size=107374182400 compat6=off subformat='monolithicFlat' zeroed_grain=off
    # qemu-img info /tmp/a.vmdk
    image: /tmp/a.vmdk
    file format: raw
    virtual size: 0 (0 bytes)
    disk size: 4.0K

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 16:21:28 +02:00
Peter Lieven
1f9db2243c block/get_block_status: avoid segfault if there is no backing_hd
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 10:08:56 +02:00
Peter Lieven
3e0a233d86 block/get_block_status: set *pnum = 0 on error
If the call is invoked through bdrv_is_allocated, the caller might
expect *pnum = 0 on error. However, a new implementation of
bdrv_get_block_status might only return a negative exit value on
error while leaving *pnum untouched.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 10:08:56 +02:00
Max Reitz
7454d60045 qcow2: Don't shadow return value
When trying to update the refcounts for a snapshot, the return value of
update_refcount on a compressed cluster was pretty much ignored,
cancelling the update on error but returning 0. This is caused by an
inner "ret" variable shadowing the outer one (the latter is used in the
return statement).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 10:08:56 +02:00
Max Reitz
ff42308f30 qemu-iotests: Do not execute 052 with -nocache
Test 052 uses qemu-io -s which will result in bdrv_open trying to create
a temporary snapshot file in /tmp. However, since O_DIRECT and tmpfs
do not work well together, disable this test for -nocache.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 10:08:56 +02:00
Fam Zheng
4db9c98002 qemu-iotests: add test for backing file overriding
Test that backing.file.filename option can be parsed and override the
backing file from image (backing file reflected with "info block").

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 10:08:56 +02:00
Fam Zheng
dbecebddfa block: fix backing file overriding
Providing backing.file.filename doesn't override backing file as expected:

    $ x86_64-softmmu/qemu-system-x86_64 -drive \
        file=/tmp/child.qcow2,backing.file.filename=/tmp/fake.qcow2

    qemu-system-x86_64: -drive \
        file=/tmp/child.qcow2,backing.file.filename=/tmp/fake.qcow2: could not
        open disk image /tmp/child.qcow2: Can't specify 'file' and 'filename'
        options at the same time

With

    $ qemu-img info /tmp/child.qcow2
    image: /tmp/child.qcow2
    file format: qcow2
    virtual size: 1.0G (1073741824 bytes)
    disk size: 196K
    cluster_size: 65536
    backing file: /tmp/fake.qcow2

This fixes it by calling bdrv_get_full_backing_filename only if
backing.file.filename is not provided. Also save the backing file name
to bs->backing_file so the information is correct with HMP "info block".

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 10:08:56 +02:00
Richard Henderson
e3608d66ce configure: Allow command-line configure for ppc32
Similar to manually selecting i386 for an x86_64 host.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-24 18:50:40 -07:00
Eduardo Otubo
c236f4519c seccomp: fine tuning whitelist by adding times()
This was causing the QEMU process to hang when using -sandbox on, as
described in RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1004175

Signed-off-by: Eduardo Otubo <otubo@linux.vnet.ibm.com>
Tested-by: Paul Moore <pmoore@redhat.com>
Acked-by: Paul Moore <pmoore@redhat.com>
2013-09-24 15:15:16 -03:00
Isaku Yamahata
d613a56f84 migration: ram_handle_compressed
ram_handle_compressed() should be aware of size > TARGET_PAGE_SIZE.
migration-rdma can call it with larger size.

Signed-off-by: Isaku Yamahata <yamahata@private.email.ne.jp>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:50 +02:00
Isaku Yamahata
dc3c26a479 arch_init: make is_zero_page accept size
Later, is_zero_page will be used for ranges that are not
TARGET_PAGE_SIZE long.
Rename it to is_zero_range, as it is no longer tied to the page size.
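
As a rough illustration of the rename (a Python sketch, not QEMU's C implementation; the function body and the 4096-byte page size are assumptions made for the example), a range check takes an explicit size instead of assuming exactly one page:

```python
TARGET_PAGE_SIZE = 4096  # assumed page size, for illustration only


def is_zero_range(buf: bytes, size: int) -> bool:
    """Return True if the first `size` bytes of `buf` are all zero."""
    return not any(buf[:size])


# One page and a larger, multi-page chunk can both be checked the same way.
page = bytes(TARGET_PAGE_SIZE)
chunk = bytes(4 * TARGET_PAGE_SIZE)
print(is_zero_range(page, TARGET_PAGE_SIZE), is_zero_range(chunk, len(chunk)))
```

The caller decides how many bytes to inspect, which is what lets a user such as migration-rdma pass sizes larger than one page.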

Signed-off-by: Isaku Yamahata <yamahata@private.email.ne.jp>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:50 +02:00
Christoffer Dall
5016e2df56 migration: Fix debug print type
The printf args are uint64_t, and with -Werror QEMU doesn't compile with
migration debugging turned on unless this is fixed.  Fix it.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:50 +02:00
Alexey Kardashevskiy
7102400d40 migration: add version supporting macros for struct pointer
This adds version supporting macros VMSTATE_STRUCT_POINTER_TEST_V
and VMSTATE_STRUCT_POINTER_V in addition to the already existing
VMSTATE_STRUCT_POINTER and VMSTATE_STRUCT_POINTER_TEST macros.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:50 +02:00
Isaku Yamahata
dd286ed700 rdma: constify ram_chunk_{index, start, end}
Signed-off-by: Isaku Yamahata <yamahata@private.email.ne.jp>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:50 +02:00
Isaku Yamahata
5a91337cdf rdma: clean up of qemu_rdma_cleanup()
- It can't be determined by RDMAContext::cm_id != NULL if the connection
  is established or not.
- RDMAContext::cm_id is leaked and not destroyed because it is set to NULL
  too early.
- RDMAContext::qp is created by rdma_create_qp(), so it should be destroyed
  by rdma_destroy_qp(), not ibv_destroy_qp().

Cc: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Isaku Yamahata <yamahata@private.email.ne.jp>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:50 +02:00
Lei Li
6cd0beda2c arch_init: right return for ram_save_iterate
qemu_file_rate_limit() never returns a negative value since the refactor
in commit 1964a39. This patch removes the negative check for it and
adjusts bytes_transferred and the return value correspondingly in
ram_save_iterate().

Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:08 +02:00
Lei Li
c77a5f2daa savevm: fix wrong initialization by ram_control_load_hook
It should set a negative error value rather than 0 in QEMUFile
if there has been an error.

Reviewed-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:08 +02:00
Lei Li
675fd0a7da savevm: add comments for qemu_file_get_error()
Add comments for qemu_file_get_error(), as its return value
is not very clear.

Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-09-24 13:22:08 +02:00
Bandan Das
19b0dfc19c audio: remove CONFIG_MIXEMU configure option
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-24 10:29:34 +02:00
Bandan Das
2690e61e8e hda-codec: make mixemu selectable at runtime
Define PARAM so that we have two versions of the "desc_codec
and family" structs. Add a property called "mixer" whose default
value depends on whether CONFIG_MIXEMU is defined or not which
will help us call the appropriate instance init functions.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-24 10:29:34 +02:00
Bandan Das
7953793c03 hda-codec: refactor common definitions into a header file
Move common defines and structs to a header file.
The next commit will include it twice, once for a device with a
mixer, and once for device without a mixer.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-24 10:29:34 +02:00
Gerd Hoffmann
9f57584667 audio maintainers update
av1474@comtv.ru bounces, and I haven't seen malc @ qemu-devel for quite a
while (anyone knows what is up?).  Adding myself as audio maintainer, so
audio patches don't fall through the cracks that easily.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-24 10:29:34 +02:00
Edgar E. Iglesias
53d09b761f linux-user: Handle SOCK_CLOEXEC/NONBLOCK if unavailable on host
If the host lacks SOCK_CLOEXEC, bail out with -EINVAL.
If the host lacks SOCK_NONBLOCK, try to emulate it with fcntl()
and O_NONBLOCK.
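
The fallback described above can be sketched as follows. This is a Python illustration for a POSIX host, not the linux-user C code; the helper names `set_nonblocking` and `is_nonblocking` are made up for the example:

```python
import fcntl
import os
import socket


def set_nonblocking(sock: socket.socket) -> None:
    """Emulate SOCK_NONBLOCK after creation using fcntl() and O_NONBLOCK."""
    fd = sock.fileno()
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def is_nonblocking(sock: socket.socket) -> bool:
    """Report whether O_NONBLOCK is set on the socket's file descriptor."""
    return bool(fcntl.fcntl(sock.fileno(), fcntl.F_GETFL) & os.O_NONBLOCK)


# Create a plain (blocking) socket pair, then switch one end to non-blocking.
left, right = socket.socketpair()
set_nonblocking(left)
print(is_nonblocking(left), is_nonblocking(right))
left.close()
right.close()
```

Unlike passing SOCK_NONBLOCK to socket(), this takes two steps and is therefore not atomic, which is acceptable for an emulation fallback.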

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:07 +03:00
Riku Voipio
89aaf1a6ad [v2] linux-user: implement m68k atomic syscalls
With nptl enabled, atomic_cmpxchg_32 and atomic_barrier
system calls are needed. This patch enabled really dummy
versions of the system calls, modeled after the m68k
kernel code.

With this patch I am able to execute m68k binaries
with qemu linux-user (busybox compiled for coldfire).

[v2] queue a segfault instead of returning an EFAULT
to keep in line with kernel code.

Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:07 +03:00
Kwok Cheung Yeung
1308c464a8 linux-user: Check type of microMIPS break instruction
microMIPS instructions that cause breakpoint exceptions come in
16-bit and 32-bit variants.  When handling exceptions caused by
such instructions, the instruction type needs to be taken into
account when extracting the break code.

The code has also been restructured for better clarity.

Signed-off-by: Kwok Cheung Yeung <kcy@codesourcery.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:07 +03:00
Petar Jovanovic
dbf4f7965a linux-user: correct how SOL_SOCKET is converted from target to host and back
Previous implementation does not take into account that SOL_SOCKET constant
can be arch specific. This change fixes some issues with sendmsg/recvmsg.

Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:07 +03:00
Laurent Vivier
03cfd8faa7 linux-user: add support of binfmt_misc 'O' flag
The binfmt_misc module can calculate the credentials and security
token according to the binary instead of to the interpreter if the
'C' flag is enabled.

To be able to execute non-readable binaries, this flag implies the 'O'
flag. When the 'O' flag is enabled, binfmt_misc opens the file for
reading and passes the file descriptor to the interpreter.

References:
linux/Documentation/binfmt_misc.txt          ['O' and 'C' description]
linux/fs/binfmt_misc.c linux/fs/binfmt_elf.c [ AT_EXECFD usage ]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:07 +03:00
Laurent Vivier
0d78b3b5b1 linux-user: add some IPV6 commands in setsockop()
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:07 +03:00
Laurent Vivier
bd00c74c7f linux-user: allow use of TIOCGSID
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:07 +03:00
Laurent Vivier
f57d419241 linux-user: Add setsockopt(SO_ATTACH_FILTER)
This is needed to be able to run dhclient.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:06 +03:00
Laurent Vivier
de6b993377 linux-user: convert /proc/net/route when endianess differs
This patch makes IP addresses appear in the correct order
in the output of "netstat -nr" when the endianness of the
guest differs from that of the host.

For instance, an m68k guest on an x86_64 host:

WITHOUT this patch:

$ netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         1.3.0.10        0.0.0.0         UG        0 0          0 eth0
0.3.0.10        0.0.0.0         0.255.255.255   U         0 0          0 eth0
$ cat /proc/net/route
Iface	Destination	Gateway 	Flags	RefCnt	Use	Metric	Mask	MTU	Window	IRTT

eth0	00000000	0103000A	0003	0	0	0	000000000	0	0
eth0	0003000A	00000000	0001	0	0	0	00FFFFFF0	0	0

WITH this patch:

$ netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.0.3.1        0.0.0.0         UG        0 0          0 eth0
10.0.3.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
$ cat /proc/net/route
Iface	Destination	Gateway 	Flags	RefCnt	Use	Metric	Mask	MTU	Window	IRTT
eth0	00000000	0a000301	0003	0	0	0	000000000	0	0
eth0	0a000300	00000000	0001	0	0	0	ffffff000	0	0
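
The conversion visible in the two listings amounts to byte-swapping each 32-bit hexadecimal address field. A minimal Python sketch (not the linux-user C code; `swap32_hex` and `dotted_quad` are hypothetical helper names) reproduces the values shown above:

```python
import socket


def swap32_hex(field: str) -> str:
    """Byte-swap a 32-bit value given as an 8-digit hex string."""
    value = int(field, 16)
    swapped = int.from_bytes(value.to_bytes(4, "little"), "big")
    return f"{swapped:08x}"


def dotted_quad(field: str, byteorder: str) -> str:
    """Show the address a guest with the given byte order would read."""
    return socket.inet_ntoa(int(field, 16).to_bytes(4, byteorder))


# Gateway field as printed by the little-endian x86_64 host.
host_field = "0103000A"
print(dotted_quad(host_field, "big"))              # big-endian guest, unpatched
print(dotted_quad(swap32_hex(host_field), "big"))  # same guest after the swap
```

With the unswapped field the big-endian guest reads 1.3.0.10. After the swap it reads 10.0.3.1, matching the patched netstat output.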

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:06 +03:00
Richard Henderson
868e34d7bd mips-linux-user: Adjust names in mips_syscall_args
The name field of MIPS_SYS isn't actually used; it's just documentation.
But adjust the umount entries to match mips/syscall_nr.h anyway.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:06 +03:00
Richard Henderson
8070e7be8b alpha-linux-user: Fix umount syscall numbers
It has been pointed out on LKML that the alpha umount syscall numbers
are named wrong, and a patch to rectify that has been posted for 3.11.

Glibc works around this by treating NR_umount as NR_umount2 if
NR_oldumount exists.  That's more complicated than we need in QEMU,
given that we control linux-user/*/syscall_nr.h.

This is the last instance of TARGET_NR_oldumount, so delete that from
the strace.list.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-09-24 10:47:06 +03:00
Anthony Liguori
f828a4c8fa Merge remote-tracking branch 'stefanha/tracing' into staging
# By Alexey Kardashevskiy
# Via Stefan Hajnoczi
* stefanha/tracing:
  kvm: fix traces to use %x instead of %d

Message-id: 1379699931-5837-1-git-send-email-stefanha@redhat.com
2013-09-23 11:53:22 -05:00
Anthony Liguori
feb678c6f7 Merge remote-tracking branch 'stefanha/net' into staging
# By Aurelien Jarno (1) and Vincenzo Maffione (1)
# Via Stefan Hajnoczi
* stefanha/net:
  e1000: NetClientInfo.receive_iov implemented
  pcnet-pci: mark I/O and MMIO as LITTLE_ENDIAN

Message-id: 1379699613-5338-1-git-send-email-stefanha@redhat.com
2013-09-23 11:53:11 -05:00
Anthony Liguori
16121fa39e Merge remote-tracking branch 'stefanha/block' into staging
# By Stefan Hajnoczi (4) and others
# Via Stefan Hajnoczi
* stefanha/block:
  virtio-blk: do not relay a previous driver's WCE configuration to the current
  blockdev: do not default cache.no-flush to true
  block: don't lose data from last incomplete sector
  qcow2: Correct snapshots size for overlap check
  coroutine: fix /perf/nesting coroutine benchmark
  coroutine: add qemu_coroutine_yield benchmark
  qemu-timer: do not take the lock in timer_pending
  qemu-timer: make qemu_timer_mod_ns() and qemu_timer_del() thread-safe
  qemu-timer: drop outdated signal safety comments
  osdep: warn if open(O_DIRECT) on fails with EINVAL
  libcacard: link against qemu-error.o for error_report()

Message-id: 1379698931-946-1-git-send-email-stefanha@redhat.com
2013-09-23 11:53:05 -05:00
Anthony Liguori
2e6ae666c8 Merge remote-tracking branch 'mjt/trivial-patches' into staging
# By Stefan Weil (8) and others
# Via Michael Tokarev
* mjt/trivial-patches:
  tests/.gitignore: ignore test-throttle
  exec: Fix broken build for MinGW (regression)
  kvm: Fix compiler warning (clang)
  tcg-sparc: Fix parenthesis warning
  Makefile: Remove some more files when cleaning
  target-i386: Fix segment cache dump
  iov: avoid "orig_len may be used unitialized" warning
  vscclient: remove unnecessary use of uninitialized variable
  trace-events: Clean up with scripts/cleanup-trace-events.pl again
  tci: Fix qemu-alpha on 32 bit hosts (wrong assertions)
  *-user: Improve documentation for lock_user function
  MAINTAINERS: Add missing entry to filelist for TCI target
  translate-all: Fix formatting of dump output
  *-user: Fix typo in comment (ulocking -> unlocking)
  docs: Fix IO port number for CPU present bitmap.
  q35: Fix typo in constant DEFUALT -> DEFAULT.
  configure: Undefine _FORTIFY_SOURCE prior using it

Message-id: 1379696296-32105-1-git-send-email-mjt@msgid.tls.msk.ru
2013-09-23 11:52:55 -05:00
Anthony Liguori
3e4be9c297 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
# By Alexey Kardashevskiy (3) and others
# Via Paolo Bonzini
* qemu-kvm/uq/master:
  target-i386: add feature kvm_pv_unhalt
  linux-headers: update to 3.12-rc1
  target-i386: forward CPUID cache leaves when -cpu host is used
  linux-headers: update to 3.11
  kvm: fix traces to use %x instead of %d
  kvmvapic: Clear also physical ROM address when entering INACTIVE state
  kvmvapic: Enter inactive state on hardware reset
  kvmvapic: Catch invalid ROM size
  kvm irqfd: support direct msimessage to irq translation
  fix steal time MSR vmsd callback to proper opaque type
  kvm: warn if num cpus is greater than num recommended
  cpu: Move cpu state syncs up into cpu_dump_state()
  exec: always use MADV_DONTFORK

Message-id: 1379694292-1601-1-git-send-email-pbonzini@redhat.com
2013-09-23 11:52:49 -05:00
Anthony Liguori
f3ca508f00 Merge remote-tracking branch 'bonzini/scsi-next' into staging
# By Hervé Poussineau (5) and Stefan Weil (1)
# Via Paolo Bonzini
* bonzini/scsi-next:
  block/iscsi: Drop iscsi_co_get_block_status for older versions of libiscsi
  lsi: add 53C810 variant
  lsi: remove todo
  lsi: ignore write accesses to CTEST0 registers
  lsi: check ssid versus sdid only if ssid is valid
  lsi: use constant name instead of its value
2013-09-23 11:52:32 -05:00
Michael S. Tsirkin
702d66a813 virtio-net: fix up HMP NIC info string on reset
When mac is updated on reset, info string has stale data.
Fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-22 09:30:22 +03:00
Alexey Kardashevskiy
cbf5b96856 kvm: fix traces to use %x instead of %d
KVM request types are normally defined using hex constants but QEMU traces
print decimal values instead, which is not very convenient.

This changes the request type format from %d to %x.
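
To see why hexadecimal is more convenient, compare both renderings of one request value. The sketch below is Python rather than the trace-events format strings, and the value 0xAE80 is only assumed here to be the x86 encoding of KVM_RUN:

```python
# Assumed example value: KVM_RUN as encoded on x86 (type 0xAE, number 0x80).
request_type = 0xAE80

as_decimal = "type %d" % request_type  # old trace format
as_hex = "type 0x%x" % request_type    # new trace format

print(as_decimal)
print(as_hex)
```

The hexadecimal form can be matched against the constants in the kernel headers at a glance, while the decimal form has to be converted first.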

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-20 19:55:01 +02:00
Vincenzo Maffione
97410dde60 e1000: NetClientInfo.receive_iov implemented
This patch implements the NetClientInfo.receive_iov method for the
e1000 device emulation. In this way a network backend that uses
qemu_sendv_packet() can deliver the fragmented packet without
requiring an additional copy in the frontend/backend network code
(nc_sendv_compat() function).

The existing method NetClientInfo.receive has been reimplemented
using the new method.

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-20 19:49:14 +02:00
Aurelien Jarno
a26405b350 pcnet-pci: mark I/O and MMIO as LITTLE_ENDIAN
Now that the memory subsystem is propagating the endianness correctly,
the pcnet-pci device should have its I/O ports and MMIO memory marked
as LITTLE_ENDIAN, as PCI devices are little endian.

This makes the pcnet-pci NIC work again on big endian MIPS Malta
(default NIC).

Cc: qemu-stable@nongnu.org
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-20 19:49:14 +02:00
Paolo Bonzini
ef5bc96268 virtio-blk: do not relay a previous driver's WCE configuration to the current
The following sequence happens:
- the SeaBIOS virtio-blk driver does not support the WCE feature, which
causes QEMU to disable writeback caching

- the Linux virtio-blk driver resets the device, finds WCE is available
but writeback caching is disabled; tells block layer to not send cache
flush commands

- the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
writeback caching to be re-enabled, but the Linux virtio-blk driver does
not know of this side effect and cache flushes remain disabled

The bug is at the third step.  If the guest does know about CONFIG_WCE,
QEMU should ignore the WCE feature's state.  The guest will control the
cache mode solely using configuration space.  This change makes Linux
do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.

Hence, whenever the guest is reset, the cache mode of the disk should
be reset to whatever was specified in the "-drive" option.  With this
change, the Linux virtio-blk driver finds that writeback caching is
enabled, and tells the block layer to send cache flush commands
appropriately.

Reported-by: Rusty Russell <rusty@au1.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-20 19:27:48 +02:00
Paolo Bonzini
1df6fa4bc6 blockdev: do not default cache.no-flush to true
That's why all my VMs were so fast lately. :)

This changed in 1.6.0 by mistake in patch 29c4e2b (blockdev: Split up
'cache' option, 2013-07-18).

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-20 19:27:44 +02:00
Fam Zheng
bcb9d66e85 block: don't lose data from last incomplete sector
When reading a last sector that is not aligned to a sector boundary, the
current code for growable backends (since commit 893a8f6 "block: Produce
zeros when protocols reading beyond end of file") drops the data and
directly returns zeroes. That is incorrect.
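
The intended behaviour can be sketched as follows: data that exists before the end of the file is kept, and only the remainder of the sector is zero-filled. This is a Python model over an in-memory buffer, not the block layer code; `read_sectors` is a hypothetical helper and the 512-byte sector size is an assumption of the example:

```python
SECTOR_SIZE = 512  # assumed sector size for the example


def read_sectors(image: bytes, sector_num: int, nb_sectors: int) -> bytes:
    """Read whole sectors, zero-padding whatever lies beyond end of file."""
    start = sector_num * SECTOR_SIZE
    length = nb_sectors * SECTOR_SIZE
    data = image[start:start + length]      # may be shorter than `length`
    return data + bytes(length - len(data))  # keep the data, pad the rest


# A 700-byte image: sector 1 holds 188 bytes of data and 324 bytes of padding.
image = b"x" * 700
last = read_sectors(image, 1, 1)
print(len(last), last[:188] == b"x" * 188, last[188:] == bytes(324))
```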

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-20 19:27:26 +02:00
Fam Zheng
7a1c0d200f tests/.gitignore: ignore test-throttle
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:15:33 +04:00
Stefan Weil
089f3f761e exec: Fix broken build for MinGW (regression)
Commit 3435f39513 reduced the ifdeffery with
this result for MinGW:

exec.c: In function ‘qemu_ram_free’:
exec.c:1239:17: warning:
 implicit declaration of function ‘munmap’ [-Wimplicit-function-declaration]
exec.c:1239:17: warning:
 nested extern declaration of ‘munmap’ [-Wnested-externs]
exec.c:1239: undefined reference to `munmap'

Add some ifdeffery again to fix this.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:13:09 +04:00
Stefan Weil
e76d05c2b5 kvm: Fix compiler warning (clang)
Report from clang analyzer:

clock.c:42:15: warning:
Value stored to 'cpu' during its initialization is never read

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:11:32 +04:00
Richard Henderson
387e417666 tcg-sparc: Fix parenthesis warning
error: suggest parentheses around comparison in operand of ‘&’ [-Werror=parentheses]

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Stefan Weil
8b6bfc7711 Makefile: Remove some more files when cleaning
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Tobias Markus
469936ae0a target-i386: Fix segment cache dump
When in Long Mode, cpu_x86_seg_cache() logs "DS16" because the Default
operation size bit (D/B bit) is not set for Long Mode Data Segments since
there are only Data Segments in Long Mode and no explicit 16/32/64-bit
Descriptors.
This patch fixes this by checking the Long Mode Active bit of the hidden
flags variable and logging "DS" if it is set. (I.e. in Long Mode all Data
Segments are logged as "DS")

Signed-off-by: Tobias Markus <tobias@markus-regensburg.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Michael Tokarev
2be178a475 iov: avoid "orig_len may be used unitialized" warning
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Michael Tokarev
69fded480e vscclient: remove unnecessary use of uninitialized variable
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Markus Armbruster
ddd0bd480f trace-events: Clean up with scripts/cleanup-trace-events.pl again
Event qxl_render_blit_guest_primary_initialized is unused since commit
c58c7b9, drop it.

Commit 42e5b4c moved hw/ppc/xics.c to hw/intc/xics.c without updating
the comment in trace-events.

"scripts/cleanup-trace-events.pl trace-events | diff trace-events" is
now clean again.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Stefan Weil
07ac4dc5db tci: Fix qemu-alpha on 32 bit hosts (wrong assertions)
Debian busybox-static for alpha has a load address of 0x0000000120000000
which is mapped to 0x0000000020000000 for 32 bit hosts.

qemu-alpha uses the TCG opcodes qemu_ld32, qemu_ld64, qemu_st32 and
qemu_st64 which all raise the assertion (taddr == host_addr).

Remove all assertions of this type because they are either wrong or
unnecessary (when sizeof(tcg_target_ulong) >= sizeof(target_ulong)).
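
The address mismatch can be illustrated numerically. The sketch below models the mapping to a 32 bit host as plain truncation, which is an assumption made for the example rather than a description of how QEMU computes host addresses; `to_host_addr` is a hypothetical name:

```python
def to_host_addr(guest_addr: int, host_bits: int) -> int:
    """Model a host pointer as the guest address truncated to host width."""
    return guest_addr & ((1 << host_bits) - 1)


taddr = 0x0000000120000000  # load address of the alpha busybox binary

host32 = to_host_addr(taddr, 32)
host64 = to_host_addr(taddr, 64)

print(hex(host32), taddr == host32)  # 32 bit host: values differ
print(hex(host64), taddr == host64)  # 64 bit host: values agree
```

On the 32 bit host the equality can never hold for this address, so the assertion is wrong. On the 64 bit host it always holds, so the assertion adds nothing.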

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Stefan Weil
6f20f55bcc *-user: Improve documentation for lock_user function
Add a missing "function" and replace "and" by "any".
BSD and Linux use the same documentation here, so fix both.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Stefan Weil
2b7be8c8f5 MAINTAINERS: Add missing entry to filelist for TCI target
tci.c is also a maintained part of the TCI implementation.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Stefan Weil
227b8175e2 translate-all: Fix formatting of dump output
The page dump writes a table with 3 abi_ulong values in each row.
These values take 8 or 16 characters (depending on sizeof abi_ulong).

Fix the table headings to be aligned with the table columns.

old:
start    end      size     prot
0000000120000000-000000012021e000 000000000021e000 rwx
0000004000000000-0000004000002000 0000000000002000 ---
0000004000002000-0000004000802000 0000000000800000 rw-

new:
start            end              size             prot
0000000120000000-000000012021e000 000000000021e000 rwx
0000004000000000-0000004000002000 0000000000002000 ---
0000004000002000-0000004000802000 0000000000800000 rw-
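
A small Python sketch of the alignment idea (not the translate-all C code; `header` and `row` are hypothetical helpers, and the width of 16 hex digits corresponds to an assumed 8-byte abi_ulong): the headings are padded to the same width as the values they label.

```python
def header(width: int) -> str:
    """Column headings padded to the width of one hex-formatted value."""
    return f"{'start':<{width}} {'end':<{width}} {'size':<{width}} prot"


def row(start: int, end: int, prot: str, width: int) -> str:
    """One table row: start-end, size and protection flags."""
    return f"{start:0{width}x}-{end:0{width}x} {end - start:0{width}x} {prot}"


WIDTH = 16  # hex digits per value, assuming an 8-byte abi_ulong
print(header(WIDTH))
print(row(0x120000000, 0x12021E000, "rwx", WIDTH))
```

Deriving both the headings and the rows from one width keeps them aligned for 8-character and 16-character values alike.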

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Stefan Weil
41d1af4de4 *-user: Fix typo in comment (ulocking -> unlocking)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Anthony PERARD
314b5d4bb6 docs: Fix IO port number for CPU present bitmap.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:24 +04:00
Richard W.M. Jones
451f7846ec q35: Fix typo in constant DEFUALT -> DEFAULT.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:23 +04:00
Michal Privoznik
e600cdf3b4 configure: Undefine _FORTIFY_SOURCE prior using it
Currently, we enforce _FORTIFY_SOURCE=2 without first detecting
whether the macro has already been defined, e.g. by the
environment, or is enabled by the compiler by default.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:09:23 +04:00
Anthony Liguori
2571f8f5fb Merge remote-tracking branch 'spice/spice.v74' into staging
# By Gerd Hoffmann
# Via Gerd Hoffmann
* spice/spice.v74:
  qxl: compile only once
  qxl: simplify page dirtying
  qxl: simplify qxl_rom_size
  qxl: define qxl operating on 4k pages

Message-id: 1379583534-7831-1-git-send-email-kraxel@redhat.com
2013-09-20 08:08:18 -05:00
Anthony Liguori
ce63e9c258 Merge remote-tracking branch 'kraxel/usb.90' into staging
# By Hans de Goede (6) and Gerd Hoffmann (1)
# Via Gerd Hoffmann
* kraxel/usb.90:
  usb: Fix iovec memleak on combined-packet free
  usb: Also reset max_packet_size on ep_reset
  xhci: Fix memory leak on xhci_disable_ep
  xhci: Add xhci_epid_to_usbep helper function
  xhci: Init a transfers xhci, slotid and epid member on epctx alloc
  xhci: Fix number of streams allocated when using streams
  usb: remove old usb-host code

Message-id: 1379583298-7524-1-git-send-email-kraxel@redhat.com
2013-09-20 08:08:09 -05:00
Anthony Liguori
f54c49e218 Merge remote-tracking branch 'luiz/queue/qmp' into staging
# By Luiz Capitulino
# Via Luiz Capitulino
* luiz/queue/qmp:
  QMP: qmp-events.txt: alphabetical order fix and other minor changes
  QMP: Update qmp-spec.txt
  QMP: Update README file
  QMP: QMP/ -> docs/qmp/
  QMP: fix qmp-commands.txt generation path
  QMP: add scripts/qmp

Message-id: 1379509422-29115-1-git-send-email-lcapitulino@redhat.com
2013-09-20 08:06:38 -05:00
Heinz Graalfs
6a444f8507 s390/sclplmconsole: Add support for SCLP line-mode console
Add simple support for SCLP line-mode also known as operating
system messages. This can be added in addition to or instead of
the SCLP full screen console with -device sclplmconsole.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 13:55:30 +02:00
Heinz Graalfs
40fa5264f6 s390/ebcdic: Move conversion tables to header file
Move conversion tables to header file.
   - In SCLP line mode processing EBCDIC/ASCII conversion is needed.
   - An additional EBCDIC to ASCII conversion function is added.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 13:55:30 +02:00
Christian Borntraeger
c3d9f24a39 s390/eventfacility: allow childs to handle more than 1 event type
Currently all handlers (quiesce, console) only handle one event type.
Some drivers will handle multiple (compatible) event types. Rework the
code accordingly.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
2013-09-20 13:55:30 +02:00
Christian Borntraeger
8b8b1138df s390/eventfacility: remove unused event_type variable
The event_type variable is never used. Get rid of it.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
2013-09-20 13:55:30 +02:00
Christian Borntraeger
788be8e9d6 s390/eventfacility: Fix receive/send masks
Currently we announce interchanged receive/send masks. This did not
trigger a bug, since the sclp console has the same masks for
send/receive and the Linux guest does not check the sclp mask for simple
events like quiesce. With other event users like the sclp line mode
console, we will have different send/receive bits. Fix it.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
2013-09-20 13:55:30 +02:00
Ralf Hoppe
a0c8699b23 s390/eventfacility: fix multiple Read Event Data sources
Make the handler for SCLP Read Event Data deal with notifications
for multiple sources correctly.

Signed-off-by: Ralf Hoppe <rhoppe@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[split bigger patch into smaller independent chunks]
Reviewed-by: Alexander Graf <agraf@suse.de>
2013-09-20 13:55:29 +02:00
Heinz Graalfs
3af6de321f s390/sclp: add reset() functions
Add reset() functions for event-facility, sclpconsole, and sclpquiesce.
The reset() functions perform variable initialization
at IPL and e.g. when monitor system_reset is called.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 13:55:29 +02:00
Heinz Graalfs
7e36b7a356 s390/sclpquiesce: Add code to support live migration
This patch adds the necessary live migration pieces to sclpquiesce
by using vmstate_register.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 13:55:29 +02:00
Heinz Graalfs
cb335bebe1 s390/sclpconsole: Add code to support live migration for sclpconsole
This patch adds the necessary live migration pieces to the sclp code
by using vmstate_register.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 13:55:29 +02:00
Heinz Graalfs
ea9ad3e945 s390/sclpconsole: modify definition of input buffer
To use VMState for migration, we need to adapt some sclp code:
   - allocate console buffer as part of the console
   - change semantic of sclpconsole offset fields

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 13:55:29 +02:00
Christian Borntraeger
d8b30c8302 s390/kexec: Implement diag308 subcode 0
This patch implements subcode 0 of diag 308. This is necessary for kexec
(without kdump). The main difference to subcode 1 is that all CPUs get
a full reset, instead of the architected CPU reset (which leaves all
registers untouched).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 13:55:29 +02:00
Max Reitz
0f39ac9a07 qcow2: Correct snapshots size for overlap check
Using s->snapshots_size instead of snapshots_size for the metadata
overlap check in qcow2_write_snapshots leads to the detection of an
overlap with the main qcow2 image header when deleting the last
snapshot, since s->snapshots_size has not yet been updated and is
therefore non-zero. However, the offset returned by qcow2_alloc_clusters
will be zero since snapshots_size is zero. Therefore, an overlap is
detected although none will occur.

This patch fixes this by replacing s->snapshots_size by snapshots_size
when calling qcow2_pre_write_overlap_check.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-20 12:48:03 +02:00
Thomas Huth
5d9bf1c07c s390/ioinst: Moved the CC setting to the IO instruction handlers
The IO instruction handlers now take care of setting the CC value on
their own, so that the confusing return code magic in kvm_handle_css_inst()
is not needed anymore.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 12:46:53 +02:00
Thomas Huth
3d0a615fe9 s390/cpu: Make setcc() function available to other files
Moved the setcc() function to cpu.h so that it can be used by other
files, too. It now also does not modify the kvm state anymore since
this gets updated during kvm_arch_put_registers() anyway.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 12:46:53 +02:00
Christian Borntraeger
1902269c19 s390/ipl: Update the s390-ccw.img rom
Rebuild of the virtio-ccw rom containing these patches:
1. s390/ipl: Fix waiting for virtio processing

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 12:46:53 +02:00
Cornelia Huck
441ea695f9 s390/ipl: Fix waiting for virtio processing
The guest side must not manipulate the index for the used buffers. Instead,
remember the state of the used buffer locally and wait until it has moved.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 12:46:53 +02:00
Christian Borntraeger
abd137a1bc s390/dump: zero out padding bytes in notes sections
The prstatus of an s390x dump contains several padding areas. Zero out
these bytes to make reading the notes section easier with a hexdump.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 12:46:53 +02:00
Thomas Huth
3ac85fb666 s390/kvm: Add check for privileged SCLP handler
The SCLP instruction is privileged, so we should make sure that
we generate an exception when it is called from the problem state.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-09-20 12:46:52 +02:00
Andrew Jones
f010bc643a target-i386: add feature kvm_pv_unhalt
I don't know yet if we want this feature on by default, so for now I'm
just adding support for "-cpu ...,+kvm_pv_unhalt".

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:38:49 +02:00
Andrew Jones
4f2656079f linux-headers: update to 3.12-rc1
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:38:48 +02:00
Benoît Canet
787aaf5703 target-i386: forward CPUID cache leaves when -cpu host is used
Some users running cpu intensive tasks checking the cache CPUID leaves at
startup and making decisions based on the result reported that the guest was
not reflecting the host CPUID leaves when -cpu host is used.

This patch fixes this.

Signed-off-by: Benoît Canet <benoit@irqsave.net>
[Rename new field to cache_info_passthrough - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:38:40 +02:00
Alexey Kardashevskiy
c5daeae1b4 linux-headers: update to 3.11
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:37:52 +02:00
Alexey Kardashevskiy
4fe6e9ecb7 kvm: fix traces to use %x instead of %d
KVM request types are normally defined using hex constants but QEMU traces
print decimal values instead, which is not very convenient.

This changes the request type format from %d to %x.
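
To illustrate why hex output is easier to match against the constants, here
is a small sketch. The request value and the "type" prefix are made up for
the example and are not taken from QEMU's trace definitions.

```python
# Hypothetical request value, chosen only for illustration.
request_type = 0xAE80

# The same value rendered with the old and the new format specifier.
decimal_form = "type %d" % request_type  # old: decimal
hex_form = "type %x" % request_type      # new: hexadecimal

print(decimal_form)
print(hex_form)
```

The hex form can be compared directly against a constant written as 0xAE80,
while the decimal form has to be converted first.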

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:37:52 +02:00
Jan Kiszka
4357930b8a kvmvapic: Clear also physical ROM address when entering INACTIVE state
To avoid misinterpreting INACTIVE after migration as old qemu-kvm's
STANDBY, also clear rom_state_paddr when going back to this state.

CC: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:37:52 +02:00
Jan Kiszka
c056bc3f34 kvmvapic: Enter inactive state on hardware reset
ROM layout may change after reset if devices are hotplugged, so we have
to pick up the physical address again when the ROM is initialized. This
is best achieved by resetting the state to INACTIVE.

CC: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:37:52 +02:00
Jan Kiszka
18e5eec4db kvmvapic: Catch invalid ROM size
If not caught early, a zero-length ROM will cause a NULL-pointer access
later on in patch_hypercalls when allocating a zero-length ROM copy and
trying to read from it.

CC: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:37:52 +02:00
Alexey Kardashevskiy
76fe21deda kvm irqfd: support direct msimessage to irq translation
On PPC64 systems MSI Messages are translated to system IRQ in a PCI
host bridge. This is already supported for emulated MSI/MSIX but
not for irqfd where the current QEMU allocates IRQ numbers from
irqchip and maps MSIMessages to IRQ in the host kernel.

This adds a new direct mapping flag which tells
the kvm_irqchip_add_msi_route() function that a new VIRQ
should not be allocated, instead the value from MSIMessage::data
should be used. It is up to the platform code to make sure that
this contains a valid IRQ number as sPAPR does in spapr_pci.c.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:37:52 +02:00
Marcelo Tosatti
0e5035776d fix steal time MSR vmsd callback to proper opaque type
Convert steal time MSR vmsd callback pointer to proper X86CPU type.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-20 12:37:52 +02:00
Andrew Jones
670436ced0 kvm: warn if num cpus is greater than num recommended
The comment in kvm_max_vcpus() states that it's using the recommended
procedure from the kernel API documentation to get the max number
of vcpus that kvm supports. It is, but by always returning the
maximum number supported. The maximum number should only be used
for development purposes. qemu should check KVM_CAP_NR_VCPUS for
the recommended number of vcpus. This patch adds a warning if a user
specifies a number of cpus between the recommended and max.
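
The check described above can be sketched roughly as follows. This is a toy
model, not the code from the patch: the function name, return shape and
message texts are assumptions, and the two limits stand in for the
recommended and maximum values reported by the kernel.

```python
def check_vcpu_count(requested, recommended, maximum):
    """Return (allowed, warning) for a requested vCPU count.

    Hypothetical helper: 'recommended' and 'maximum' model the two
    limits the commit message distinguishes.
    """
    if requested > maximum:
        # Above the hard limit: refuse.
        return False, "requested %d vcpus exceeds maximum %d" % (requested, maximum)
    if requested > recommended:
        # Between recommended and maximum: allow, but warn.
        return True, "requested %d vcpus exceeds recommended %d" % (requested, recommended)
    return True, None
```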

Signed-off-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-09-20 12:37:52 +02:00
James Hogan
97577fd4c3 cpu: Move cpu state syncs up into cpu_dump_state()
The x86 and ppc targets call cpu_synchronize_state() from their
*_cpu_dump_state() callbacks to ensure that up to date state is dumped
when KVM is enabled (for example when a KVM internal error occurs).

Move this call up into the generic cpu_dump_state() function so that
other KVM targets (namely MIPS) can take advantage of it.

This requires kvm_cpu_synchronize_state() and cpu_synchronize_state() to
be moved out of the #ifdef NEED_CPU_H in <sysemu/kvm.h> so that they're
accessible to qom/cpu.c.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Alexander Graf <agraf@suse.de>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: qemu-ppc@nongnu.org
Cc: kvm@vger.kernel.org
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-09-20 12:37:52 +02:00
Andrea Arcangeli
3e469dbfe4 exec: always use MADV_DONTFORK
MADV_DONTFORK prevents fork from failing with -ENOMEM if the default
overcommit heuristic decides there's too much anonymous virtual
memory allocated. Whether or not the KVM secondary MMU is synchronized
with MMU notifiers makes no difference in that regard.

Secondly it's always more efficient to avoid copying the guest
physical address space in the fork child (so we avoid marking all the
guest memory readonly in the parent and so we skip the establishment
and teardown of lots of pagetables in the child).

In the common case we can ignore the error if MADV_DONTFORK is not
available. Leave a second invocation that errors out in the KVM path
if MMU notifiers are missing and KVM is enabled, to abort in such
case.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-09-20 12:37:52 +02:00
Gabriel Kerneis
a9031675b9 coroutine: fix /perf/nesting coroutine benchmark
The /perf/nesting benchmark is broken because the counters are
not reset after each iteration. Therefore, nesting is done only
on the first iteration, and skipped on every other.

This patch fixes the issue, and reduces the number of iterations
to make it possible to run the benchmark in a reasonable amount of
time.

Signed-off-by: Gabriel Kerneis <gabriel@kerneis.info>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-19 13:21:41 +02:00
Gabriel Kerneis
2fcd15eac3 coroutine: add qemu_coroutine_yield benchmark
Current coroutine performance benchmarks test only coroutine creation,
either directly or in a nested way. This patch adds a benchmark to
evaluate the performance of qemu_coroutine_yield.

Signed-off-by: Gabriel Kerneis <gabriel@kerneis.info>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-19 13:21:32 +02:00
Hans de Goede
0ca6db4f3b usb: Fix iovec memleak on combined-packet free
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-19 11:28:40 +02:00
Hans de Goede
9adbaad318 usb: Also reset max_packet_size on ep_reset
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-19 11:28:40 +02:00
Hans de Goede
b21da4e504 xhci: Fix memory leak on xhci_disable_ep
The USBPacket-s in the transfers need to be cleaned up so that the memory
allocated by the iovec in there gets freed.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-19 11:28:40 +02:00
Hans de Goede
518ad5f2a0 xhci: Add xhci_epid_to_usbep helper function
And use it instead of prying the USBEndpoint out of the packet struct
in various places.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-19 11:28:40 +02:00
Hans de Goede
4c5d82ecf1 xhci: Init a transfer's xhci, slotid and epid member on epctx alloc
Transfers are part of an epctx, which is part of a slot, which is part of
a xhci. Transfers cannot dynamically be moved from one epctx to another,
so once created their xhci, slotid and epid are constant, so let's set these
up at creation time, rather than re-initializing them with the same
value each time a transfer gets submitted.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-19 11:28:40 +02:00
Hans de Goede
d063c3112c xhci: Fix number of streams allocated when using streams
According to the xhci spec the total number of streams is
2 ^ (MaxPStreams + 1), and this is also how the Linux xhci driver
uses this field.
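
As a rough illustration of the formula above (not QEMU's code; the function
name is hypothetical, and it assumes streams are actually in use for the
endpoint):

```python
def xhci_total_streams(max_pstreams: int) -> int:
    """Stream count per the formula 2 ^ (MaxPStreams + 1)."""
    if max_pstreams < 0:
        raise ValueError("MaxPStreams must be non-negative")
    # A left shift by (MaxPStreams + 1) is the same as raising 2 to that power.
    return 1 << (max_pstreams + 1)
```

With this reading, a MaxPStreams value of 3 corresponds to 16 streams rather
than 8.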

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-19 11:28:40 +02:00
Gerd Hoffmann
b5613fdcb0 usb: remove old usb-host code
The usb-host code has been rewritten for qemu 1.5 to use libusb,
the old code has been left in as temporary fallback.  Now we are
two releases further out, targeting the 1.7 release.  No major
issues with the new code popped up until now.  Time to remove it
from the tree.  Should we ever need it again for some reason --
git has a copy for us in the history.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-19 11:28:40 +02:00
Paolo Bonzini
3db1ee7c2a qemu-timer: do not take the lock in timer_pending
We can deduce the result from expire_time, by making it always -1 if
the timer is not in the active_timers list.  We need to check against
negative times passed to timer_mod_ns; clamping them to zero is not
a problem because the only clock that has a zero value at VM startup
is QEMU_CLOCK_VIRTUAL, and it is monotonic so it cannot be non-zero.
QEMU_CLOCK_HOST, instead, is not monotonic but it cannot go to negative
values unless the host time is seriously screwed up and points to
the 1960s.
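
The idea can be modelled in a few lines. This is a simplified sketch, not the
QEMU implementation: the Timer class is invented for the example, locking and
the active_timers list itself are left out, and only the -1 sentinel and the
clamping of negative deadlines are shown.

```python
class Timer:
    """Toy timer: expire_time is -1 whenever the timer is not active."""

    def __init__(self):
        self.expire_time = -1


def timer_mod_ns(timer, expire_time_ns):
    # Clamp negative deadlines to zero so that -1 stays reserved
    # for "not on the active list".
    timer.expire_time = max(expire_time_ns, 0)


def timer_del(timer):
    # Removing the timer restores the sentinel.
    timer.expire_time = -1


def timer_pending(timer):
    # The answer follows from a single field, so no list walk is needed.
    return timer.expire_time >= 0
```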

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-18 15:48:33 +02:00
Stefan Hajnoczi
978f2205c7 qemu-timer: make qemu_timer_mod_ns() and qemu_timer_del() thread-safe
Introduce QEMUTimerList->active_timers_lock to protect the linked list
of active timers.  This allows qemu_timer_mod_ns() to be called from any
thread.

Note that vm_clock is not thread-safe and its use of
qemu_clock_has_timers() works fine today but is also not thread-safe.

The purpose of this patch is to eventually let device models set or
cancel timers from a vcpu thread without holding the global mutex.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-18 15:48:33 +02:00
Stefan Hajnoczi
da718ceb17 qemu-timer: drop outdated signal safety comments
host_alarm_handler() is invoked from the signal processing thread
(currently the iothread).  Previously we did processing in a real signal
handler with signalfd and therefore needed signal-safe timer code.

Today host_alarm_handler() just marks the alarm timer as expired/pending
and notifies the main loop using qemu_notify_event().

Therefore these outdated comments about signal safety can be dropped.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-18 15:48:33 +02:00
Stefan Hajnoczi
a5813077aa osdep: warn if open(O_DIRECT) fails with EINVAL
Print a warning when opening a file with O_DIRECT fails with EINVAL.  This
saves users a lot of time trying to figure out the EINVAL error, which
is typical when attempting to open a file O_DIRECT on Linux tmpfs.

Reported-by: Deepak C Shetty <deepakcs@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-09-18 15:34:51 +02:00
Stefan Hajnoczi
975a0015ee libcacard: link against qemu-error.o for error_report()
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-18 15:34:51 +02:00
Luiz Capitulino
7b5ce8db60 QMP: qmp-events.txt: alphabetical order fix and other minor changes
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-09-18 08:57:02 -04:00
Luiz Capitulino
715c18600c QMP: Update qmp-spec.txt
Simplify the text, fix some of the examples.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-09-18 08:57:02 -04:00
Luiz Capitulino
52bbff77c4 QMP: Update README file
Drop unneeded info, fix some of the examples and rename QEMU Monitor
Protocol to QEMU Machine Protocol.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-09-18 08:57:02 -04:00
Luiz Capitulino
7537fe0487 QMP: QMP/ -> docs/qmp/
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-09-18 08:57:02 -04:00
Luiz Capitulino
d076a2addd QMP: fix qmp-commands.txt generation path
This file should be generated in the BUILD_DIR, as all other docs.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-09-18 08:57:02 -04:00
Luiz Capitulino
22f3946bc5 QMP: add scripts/qmp
Populate it with all scripts stored in QMP/. Also fixes trailing
whitespaces in qmp-shell and qmp.py.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-09-18 08:57:02 -04:00
Gerd Hoffmann
521e759cf1 qxl: compile only once
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-18 11:13:29 +02:00
Gerd Hoffmann
b0297b4a82 qxl: simplify page dirtying
No need to do target page size calculations here,
memory_region_set_dirty will take care of that for us.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-18 11:13:29 +02:00
Gerd Hoffmann
60b3b2a55f qxl: simplify qxl_rom_size
Nowadays rom size is fixed at 8192 for live migration compat reasons.
So we can ditch the pointless math trying to calculate the size needed.
Also make the size sanity check fail at compile time not runtime.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-18 11:13:29 +02:00
Gerd Hoffmann
9efc2d8d81 qxl: define qxl operating on 4k pages
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-18 11:13:29 +02:00
Stefan Weil
f35c934a5a block/iscsi: Drop iscsi_co_get_block_status for older versions of libiscsi
Debian wheezy includes libiscsi-dev 1.4.0 which does not provide
SCSI_PROVISIONING_TYPE_DEALLOCATED. Drop iscsi_co_get_block_status
in this case to allow compilation without errors.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-18 01:28:50 +02:00
Eduardo Otubo
92bfedb0b6 MAINTAINERS: Add myself to MAINTAINERS file
Add myself to the MAINTAINERS file. I'll be looking at qemu-seccomp.c
and include/sysemu/seccomp.h.

Signed-off-by: Eduardo Otubo <otubo@linux.vnet.ibm.com>
Acked-by: Paul Moore <pmoore@redhat.com>
Message-id: 1378746255-2089-1-git-send-email-otubo@linux.vnet.ibm.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-17 11:22:16 -05:00
Anthony Liguori
46663e5eff hmp: block-stream: fix typo
Found this by enabling C++ errors.  The bool and enum arguments
are mistakenly flipped.

Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-17 11:10:47 -05:00
Anthony Liguori
6c2679fc19 Merge remote-tracking branch 'kiszka/queues/slirp' into staging
# By Liu Ping Fan (3) and Jan Kiszka (1)
# Via Jan Kiszka
* kiszka/queues/slirp:
  slirp: clean up slirp_update_timeout
  slirp: set mainloop timeout with more precise value
  slirp: define timeout as macro
  slirp: make timeout local

Message-id: cover.1379415024.git.jan.kiszka@siemens.com
2013-09-17 10:01:24 -05:00
Anthony Liguori
5dc11192b2 Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Max Reitz (16) and others
# Via Kevin Wolf
* kwolf/for-anthony: (33 commits)
  qemu-iotests: Fix test 038
  block: Assert validity of BdrvActionOps
  qemu-iotests: Cleanup test image in test number 007
  qemu-img: fix invalid JSON
  coroutine: add ./configure --disable-coroutine-pool
  qemu-iotests: Adjustments due to error propagation
  qcow2: Use Error parameter
  qemu-img create: Emit filename on error
  block: Error parameter for create functions
  block: Error parameter for open functions
  bdrv: Use "Error" for creating images
  bdrv: Use "Error" for opening images
  qemu-iotests: add 057 internal snapshot for block device test case
  hmp: add interface hmp_snapshot_delete_blkdev_internal
  hmp: add interface hmp_snapshot_blkdev_internal
  qmp: add interface blockdev-snapshot-delete-internal-sync
  qmp: add interface blockdev-snapshot-internal-sync
  qmp: add internal snapshot support in qmp_transaction
  snapshot: distinguish id and name in snapshot delete
  snapshot: new function bdrv_snapshot_find_by_id_and_name()
  ...

Message-id: 1379073063-14963-1-git-send-email-kwolf@redhat.com
2013-09-17 09:51:40 -05:00
Anthony Liguori
ab9cec42bf Merge remote-tracking branch 'rth/tgt-i386' into staging
# By Paolo Bonzini (1) and Peter Maydell (1)
# Via Richard Henderson
* rth/tgt-i386:
  target-i386: Only provide CMOV and friends if feature bit set
  target-i386: fix disassembly with PAE=1, PG=0

Message-id: 1379010496-5875-1-git-send-email-rth@twiddle.net
2013-09-17 09:51:23 -05:00
Anthony Liguori
7d41364e71 Merge remote-tracking branch 'bonzini/scsi-next' into staging
# By Peter Lieven (3) and others
# Via Paolo Bonzini
* bonzini/scsi-next:
  spapr-vscsi: Report error on unsupported MAD requests
  spapr-vscsi: Adding VSCSI capabilities
  iscsi: split discard requests in multiple parts
  iscsi: add .bdrv_get_block_status
  iscsi: add logical block provisioning information to iscsilun
  hw/scsi/lsi53c895a: Use deposit32 rather than handcoded shift/mask
  hw/scsi/lsi53c895a: Use sextract32 for sign-extension
  scsi: Fix scsi_bus_legacy_add_drive() scsi-generic with serial
  virtio-scsi: Make type virtio-scsi-common abstract
  spapr-vscsi: add task management
  scsi: prefer UUID to VM name for the initiator name

Message-id: 1378984634-765-1-git-send-email-pbonzini@redhat.com
2013-09-17 09:50:23 -05:00
Anthony Liguori
25afd6eb15 Merge remote-tracking branch 'kraxel/chardev.7' into staging
# By Gerd Hoffmann
# Via Gerd Hoffmann
* kraxel/chardev.7:
  chardev: fix pty_chr_timer

Message-id: 1378972894-11185-1-git-send-email-kraxel@redhat.com
2013-09-17 09:49:44 -05:00
Jan Kiszka
426e3e6ce1 slirp: clean up slirp_update_timeout
No need to write out the timeout early, keep it local until we are done.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2013-09-17 12:26:05 +02:00
Liu Ping Fan
a42e9c4188 slirp: set mainloop timeout with more precise value
If slirp needs to emulate tcp timeout, then the timeout value
for mainloop should be more precise, which is determined by
slirp's fasttimo or slowtimo. Achieve this by swapping the order
of slirp_pollfds_fill and slirp_update_timeout.

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2013-09-17 12:26:05 +02:00
Liu Ping Fan
9b0ca6cc64 slirp: define timeout as macro
Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2013-09-17 12:26:04 +02:00
Liu Ping Fan
fe0ff43c9d slirp: make timeout local
Each slirp has its own time to calculate timeout.

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2013-09-17 12:26:04 +02:00
Hervé Poussineau
ceae18bd74 lsi: add 53C810 variant
Currently, treat it exactly as a 53C895A.
53C895A is a 53C810 with more capabilities, so this should work.

However, this lets us test different code paths on Linux, which
doesn't use the latest features if it detects an 810, or on some OSes
which only support 810 and not 895A (like very old Windows NT
versions).

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-16 12:42:40 +02:00
Hervé Poussineau
689f5ff437 lsi: remove todo
LSI emulation has been tested with Linux on PPC platform.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-16 12:42:35 +02:00
Hervé Poussineau
0903c35dde lsi: ignore write accesses to CTEST0 registers
53C895A datasheet says that this register is read/write, and that the value
returned on read access is dependent on DMA FIFO state. However, nothing is
said for written value.

53C810A datasheet gives more insight about this register:
"This was a general purpose read/write register in previous SYM53C8XX
family chips. Although it is still a read/write register, Symbios reserves
the right to use these bits for future 53C8XX family enhancements."

This prevents going to the default case, which prints an error message.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-16 12:42:25 +02:00
Hervé Poussineau
c7ac9f403a lsi: check ssid versus sdid only if ssid is valid
This prevents some (invalid) error messages on console.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-16 12:42:18 +02:00
Hervé Poussineau
16b8ed1d09 lsi: use constant name instead of its value
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-16 12:41:35 +02:00
Hervé Poussineau
9f1a029abf pci: remove explicit check to 64K ioport size
This check is useless, as bigger addresses will be ignored when
added to 'io' MemoryRegion, which has a size of 64K.

However, some architectures don't use the 'io' MemoryRegion, like
the alpha and versatile platforms. They create a PCI I/O region
bigger than 64K, so let them handle PCI I/O BARs in the higher range.

MST: reinstated work-around for BAR sizing.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-15 11:49:51 +03:00
Michael S. Tsirkin
c046e8c4a2 piix4: disable io on reset
io base register at 0x40 is cleared on reset,
but io is not disabled until some other event
happens to call pm_io_space_update.

Invoke pm_io_space_update directly to make this
consistent.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-15 11:49:50 +03:00
Michael S. Tsirkin
2028fdf379 piix: use 64 bit window programmed by guest
Detect the 64 bit window programmed by firmware
and configure properties accordingly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-15 11:49:50 +03:00
Michael S. Tsirkin
8b42d730e3 q35: use 64 bit window programmed by guest
Detect the 64 bit window programmed by firmware
and configure properties accordingly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-15 11:49:50 +03:00
Michael S. Tsirkin
4386406957 pci: add helper to retrieve the 64-bit range
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-15 11:49:50 +03:00
Michael S. Tsirkin
c5a22c4344 range: add min/max operations on ranges
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-15 11:49:50 +03:00
Michael S. Tsirkin
cfe25e2bca range: add Range to typedefs
This will help simplify header dependencies.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-15 09:36:57 +03:00
Michael S. Tsirkin
636228a887 q35: make pci window address/size match guest cfg
For Q35, MMCFG address and size are guest configurable.
Update w32 property to make it behave accordingly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-15 09:36:57 +03:00
Max Reitz
c21bddf27f qemu-iotests: Fix test 038
Test 038 uses asynchronous I/O, resulting (potentially) in a different
output for every run (regarding the order of the I/O accesses). This can
be fixed by simply sorting the I/O access messages, since their order is
irrelevant anyway (for this asynchronous I/O).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-13 12:02:33 +02:00
Peter Maydell
bff93281a7 target-i386: Only provide CMOV and friends if feature bit set
The instructions CMOVcc, FCMOVcc and F[U]COMI[P] should only be
present if the CMOV feature bit is set. Add missing feature bit
checks so we correctly fault if emulating a 486 or 586.
This fixes bug LP:1201446.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-12 11:24:48 -07:00
Paolo Bonzini
f2f8560c7a target-i386: fix disassembly with PAE=1, PG=0
CR4.PAE=1 will not enable paging if CR0.PG=0, but the "if" chain
in x86_cpu_get_phys_page_debug says otherwise.  Check CR0.PG
before everything else.
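
The ordering matters because the PAE bit only has meaning once paging is on.
A minimal sketch of the corrected decision order follows; it is not the body
of x86_cpu_get_phys_page_debug, the function name and return strings are
invented, and long mode is ignored for brevity.

```python
CR0_PG = 1 << 31   # paging enable bit in CR0
CR4_PAE = 1 << 5   # physical address extension bit in CR4


def translation_mode(cr0: int, cr4: int) -> str:
    """Pick the address translation mode from CR0 and CR4 (simplified)."""
    # Check CR0.PG first: without it, CR4.PAE has no effect.
    if not cr0 & CR0_PG:
        return "none"
    if cr4 & CR4_PAE:
        return "pae"
    return "legacy"
```

With PAE=1 and PG=0 this returns "none", which is the case the old "if" chain
got wrong.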

Fixes "-d in_asm" for a code section at the beginning of OVMF.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
2013-09-12 11:20:42 -07:00
Markus Armbruster
7f87af39dc pc_sysfw: Fix ISA BIOS init for ridiculously big flash
pc_isa_bios_init() suffers integer overflow for flash larger than
INT_MAX.
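
A short sketch of the kind of overflow involved, assuming a 32-bit int with
two's complement wrap-around. The helper name and the 3 GiB flash size are
made up for the example; the actual variables in pc_isa_bios_init() are not
shown here.

```python
INT_MAX = 2**31 - 1


def as_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT_MAX else value


flash_size = 3 * 1024**3          # hypothetical 3 GiB flash, larger than INT_MAX
truncated = as_int32(flash_size)  # what a 32-bit int would end up holding

print(flash_size, truncated)
```

A size that no longer fits ends up negative, so any later arithmetic or
comparison based on it goes wrong.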

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1375276272-15988-9-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:32 -05:00
Markus Armbruster
39228250ce exec: Don't abort when we can't allocate guest memory
We abort() on memory allocation failure.  abort() is appropriate for
programming errors.  Maybe most memory allocation failures are
programming errors, maybe not.  But guest memory allocation failure
isn't, and aborting when the user asks for more memory than we can
provide is not nice.  exit(1) instead, and do it in just one place, so
the error message is consistent.

Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1375276272-15988-8-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:32 -05:00
Markus Armbruster
e1e84ba050 exec: Clean up unnecessary S390 ifdeffery
Another issue missed in commit fdec991 is -mem-path: it needs to be
rejected only for old S390 KVM, not for any S390.  Not that I
personally care, but the ifdeffery in qemu_ram_alloc_from_ptr() annoys
me.

Note that this doesn't actually make -mem-path work, as the kernel
doesn't (yet?)  support large pages in the host for KVM guests.  Clean
it up anyway.

Thanks to Christian Borntraeger for pointing out the S390 kernel
limitations.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1375276272-15988-7-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:32 -05:00
Markus Armbruster
2eb9fbaab5 exec: Drop incorrect & dead S390 code in qemu_ram_remap()
Old S390 KVM wants guest RAM mapped in a peculiar way.  Commit 6b02494
implemented that.

When qemu_ram_remap() got added in commit cd19cfa, its code carefully
mimicked the allocation code: peculiar way if defined(TARGET_S390X) &&
defined(CONFIG_KVM), else normal way.

For new S390 KVM, we actually want the normal way.  Commit fdec991
changed qemu_ram_alloc_from_ptr() accordingly, but forgot to update
qemu_ram_remap().  If qemu_ram_alloc_from_ptr() maps RAM the normal
way, but qemu_ram_remap() remaps it the peculiar way, remapping
changes protection and flags, which it shouldn't.

Fortunately, this can't happen, as we never remap on S390.

Replace the incorrect code with an assertion.

Thanks to Christian Borntraeger for help with assessing the bug's
(non-)impact.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-6-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
91138037cb exec: Simplify the guest physical memory allocation hook
Make it a generic hook rather than a KVM hook.  Less code and
ifdeffery.

Since the only user of the hook is old S390 KVM, there's hope we can
get rid of it some day.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
3435f39513 exec: Reduce ifdeffery around -mem-path
Instead of spreading its ifdeffery everywhere, confine it to
qemu_ram_alloc_from_ptr().  Everywhere else, simply test block->fd,
which is non-negative exactly when block uses -mem-path.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-4-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
0628c18267 exec: Clean up fall back when -mem-path allocation fails
With -mem-path, qemu_ram_alloc_from_ptr() first tries to allocate
accordingly, but when it fails, it falls back to normal allocation.

The fall back allocation code used to be effectively identical to the
"-mem-path not given" code, until it started to diverge in commit
432d268.  I believe the code still works, but clean it up anyway: drop
the special fall back allocation code, and fall back to the ordinary
"-mem-path not given" code instead.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-3-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
dfeaf2abc7 exec: Fix Xen RAM allocation with unusual options
Issues:

* We try to obey -mem-path even though it can't work with Xen.

* To implement -machine mem-merge, we call
  memory_try_enable_merging(new_block->host, size).  But with Xen,
  new_block->host remains null.  Oops.

Fix by separating Xen allocation from normal allocation.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1375276272-15988-2-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Max Reitz
aa3fe714f7 block: Assert validity of BdrvActionOps
In qmp_transaction, assert that the BdrvActionOps to be used is actually
valid.

This assertion failing is very improbable, however, it might happen, if
a new TransactionActionKind is introduced "out of order" and the
actions[] array is not updated.
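
The failure mode can be sketched as follows. All names here (ActionKind, ActionOps, lookup_ops) are hypothetical stand-ins rather than the real qmp_transaction types; the point is only that a table entry left out of a designated-initializer array is zero-filled, which an assertion can catch.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { KIND_A, KIND_B, KIND_C, KIND_MAX } ActionKind;

typedef struct {
    size_t instance_size;  /* zero means "entry was never filled in" */
} ActionOps;

/* KIND_B was added to the enum, but the table was not updated. */
const ActionOps actions[KIND_MAX] = {
    [KIND_A] = { .instance_size = 16 },
    [KIND_C] = { .instance_size = 32 },
};

/* Returns nonzero when the table has a usable entry for this kind. */
int ops_is_valid(ActionKind kind)
{
    return kind < KIND_MAX && actions[kind].instance_size > 0;
}

/* Look up the ops for a kind, asserting that the entry is valid. */
const ActionOps *lookup_ops(ActionKind kind)
{
    assert(ops_is_valid(kind));
    return &actions[kind];
}
```

Calling lookup_ops(KIND_B) in this sketch would trip the assertion instead of silently using a zeroed entry.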

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 16:28:36 +02:00
Bharata B Rao
4aa846f25e qemu-iotests: Cleanup test image in test number 007
qemu-iotests number 007 doesn't do test image cleanup. This will affect
those protocols that expect a clean state before every test. Hence
ensure that test image is cleaned up in this test.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 13:54:43 +02:00
Paolo Bonzini
c745bfb430 qemu-img: fix invalid JSON
Single quotes for JSON are a QMP-ism, use real JSON in
qemu-img output.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 13:49:50 +02:00
Alexey Kardashevskiy
f4ff3b7ba1 spapr-vscsi: Report error on unsupported MAD requests
The existing driver just dropped unsupported requests. This adds error
responses to those unhandled requests.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 13:15:54 +02:00
Nikunj A. Dadhania
26573a0c1f spapr-vscsi: Adding VSCSI capabilities
This implements capabilities exchange between vscsi host and client.  As
at the moment no capability is supported, put zero flags everywhere and
return.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
2013-09-12 13:15:54 +02:00
Peter Lieven
65f3e33964 iscsi: split discard requests in multiple parts
Replace .bdrv_aio_discard with .bdrv_co_discard so that discard
requests can be split in multiple parts, each for a small amount
of sectors.

This is useful because we expose a generic API with no limit
on the amount of sectors that can be unmapped in one request.
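
The splitting idea, sketched outside of any coroutine context. MAX_DISCARD_SECTORS, discard_in_parts() and the callback type are hypothetical; the real limit would come from the target, and the real code issues the requests from .bdrv_co_discard.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DISCARD_SECTORS 65535  /* hypothetical per-request limit */

typedef int (*discard_fn)(int64_t sector_num, int nb_sectors, void *opaque);

/* Issue one large discard as a series of requests of bounded size. */
int discard_in_parts(int64_t sector_num, int64_t nb_sectors,
                     discard_fn issue, void *opaque)
{
    assert(nb_sectors >= 0);
    while (nb_sectors > 0) {
        int chunk = nb_sectors > MAX_DISCARD_SECTORS
                    ? MAX_DISCARD_SECTORS : (int)nb_sectors;
        int ret = issue(sector_num, chunk, opaque);

        if (ret < 0) {
            return ret;  /* stop at the first failing part */
        }
        sector_num += chunk;
        nb_sectors -= chunk;
    }
    return 0;
}

/* Example callback: only counts how many parts were issued. */
int count_chunk(int64_t sector_num, int nb_sectors, void *opaque)
{
    (void)sector_num;
    (void)nb_sectors;
    (*(int *)opaque)++;
    return 0;
}
```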

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 13:14:19 +02:00
Stefan Hajnoczi
70c60c089f coroutine: add ./configure --disable-coroutine-pool
The 'gthread' coroutine backend was written before the freelist (aka
pool) existed in qemu-coroutine.c.

This means that every thread is expected to exit when its coroutine
terminates.  It is not possible to reuse threads from a pool.

This patch automatically disables the pool when 'gthread' is used.  This
allows the 'gthread' backend to work again (for example,
tests/test-coroutine completes successfully instead of hanging).

I considered implementing thread reuse but I don't want quirks like CPU
affinity differences due to coroutine threads being recycled.  The
'gthread' backend is a reference backend and it's therefore okay to skip
the pool optimization.

Note this patch also makes it easy to toggle the pool for benchmarking
purposes:

  ./configure --with-coroutine-backend=ucontext \
              --disable-coroutine-pool

Reported-by: Gabriel Kerneis <gabriel@kerneis.info>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Gabriel Kerneis <gabriel@kerneis.info>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
2c78857bf6 qemu-iotests: Adjustments due to error propagation
When opening/creating images, propagating errors instead of immediately
emitting them on occurrence results in errors generally being printed on
a single line rather than being split up into multiple ones. This in
turn requires adjustments to some test results.

Also, test 060 used a sed to filter out the test image directory and
format by removing everything from the affected line after a certain
keyword; this now also removes the error message itself, which can be
fixed by using _filter_testdir and _filter_imgfmt.

Finally, _make_test_img in common.rc did not filter out the test image
directory etc. from stderr. This has been fixed through a redirection of
stderr to stdout (which is already done in _check_test_img and
_img_info).

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
3ef6c40ad0 qcow2: Use Error parameter
Use the new Error ** parameter in qcow2_open, qcow2_create
and associated functions.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
b70d8c237a qemu-img create: Emit filename on error
bdrv_img_create generally does not emit the target filename, although
this is pretty important information. Therefore, prepend its error
message with the output filename (if an error occurs).

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
cc84d90ff5 block: Error parameter for create functions
Add an Error ** parameter to bdrv_create and its associated functions to
allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
34b5d2c68e block: Error parameter for open functions
Add an Error ** parameter to bdrv_open, bdrv_file_open and associated
functions to allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
d5124c00d8 bdrv: Use "Error" for creating images
Add an Error ** parameter to BlockDriver.bdrv_create to allow more
specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz
015a1036a7 bdrv: Use "Error" for opening images
Add an Error ** parameter to BlockDriver.bdrv_open and
BlockDriver.bdrv_file_open to allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
8023090be5 qemu-iotests: add 057 internal snapshot for block device test case
Creation in a transaction and deletion in a single command will be tested.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
7a4ed2ee42 hmp: add interface hmp_snapshot_delete_blkdev_internal
It is hard to make both id and name optional in the HMP console as in the
QMP interface, so this interface requires the user to specify name.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
775ca88e82 hmp: add interface hmp_snapshot_blkdev_internal
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
44e3e053af qmp: add interface blockdev-snapshot-delete-internal-sync
This interface uses id and name as optional parameters, to handle the
case where one image contains multiple snapshots with the same name (which
may be '') but different ids.

The id parameter is added for historical compatibility reasons; that
case is not possible in qemu's new interface for internal
snapshots at block device level, but is still possible in qemu-img.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
f323bc9e8b qmp: add interface blockdev-snapshot-internal-sync
Snapshot ID can't be specified in this interface.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
bbe860104f qmp: add internal snapshot support in qmp_transaction
Unlike savevm, the qmp_transaction interface will not generate a
snapshot name automatically, which saves the trouble of returning
information about the newly created snapshot.

Although qcow2 supports storing multiple snapshots with the same name
but different IDs, here the operation will fail when a snapshot with that
name already exists. Formats such as rbd do not support IDs at all, and in
most cases multiple snapshots with the same name only mean trouble for the
user, so ban that case. Requests with an empty name will be rejected.

Snapshot ID can't be specified in this interface.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
a89d89d3e6 snapshot: distinguish id and name in snapshot delete
Snapshot creation already distinguishes id and name since it takes
a structured parameter *sn, but delete can't. An accurate delete will be
needed later in qmp_transaction abort and blockdev-snapshot-delete-sync,
so change its prototype. Also, *errp is added to report errors, but the
return value is kept to let the caller check what kind of error happened.
The existing callers are savevm, delvm and qemu-img; they are not impacted,
thanks to the new function bdrv_snapshot_delete_by_id_or_name(), which
checks the return value and retries the operation.

Before this patch:
  For qcow2, it searches by id first, then by name, to find the one to delete.
  For rbd, it searches by name.
  For sheepdog, it does nothing.

After this patch:
  For qcow2, the logic is the same, achieved by calling it twice in the caller.
  For rbd, delete by id always fails, but the second try still searches by
name, so there is no change for the user.

Some code for *errp is based on Pavel's patch.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
2ea1dd758c snapshot: new function bdrv_snapshot_find_by_id_and_name()
To make the distinction between id and name clear when searching, add
this API. The caller can choose to search by id or name;
*errp will be set only on exceptions.

Some code is modified based on Pavel's patch.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Max Reitz
d982919d38 qemu-iotests: New test case in 061
Add one test case for zero cluster expansion on qcow2 version downgrade
in shared L2 tables (i.e., L2 tables with a refcount > 1) and one for
zero expansion on backed clusters in shared L2 tables.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
fd9c577b24 qemu-iotests: add tests for runtime fd passing via SCM rights
This case tests whether the monitor can receive an fd at runtime.
For better verification, an additional monitor is created to see if qemu
can handle two monitor instances correctly.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:47 +02:00
Wenchao Xia
30b005d9d7 qemu-iotests: add infrastructure of fd passing via SCM
This patch makes use of the compiled scm helper program to transfer
an fd via unix socket at runtime.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Wenchao Xia
f93296eaff qemu-iotests: add unix socket help program
This program can do a sendmsg call to transfer an fd over a unix
socket, which is not supported in python2.

The built binary will not be deleted by clean, but that is an
existing issue in ./tests, which should be solved in another
patch.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz
a8110c3d32 qemu-iotest: qcow2 image option amendment
Add tests for qemu-img amend on qcow2 image files.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz
9296b3ed70 qcow2: Implement bdrv_amend_options
Implement bdrv_amend_options for compat, size, backing_file, backing_fmt
and lazy_refcounts.

Downgrading images from compat=1.1 to compat=0.10 is achieved through
handling all incompatible flags accordingly, clearing all compatible and
autoclear flags and expanding all zero clusters.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz
b6481f376b qcow2: Save refcount order in BDRVQcowState
Save the image refcount order in BDRVQcowState. This will be relevant
for future code supporting different refcount orders than four and also
for code that needs to verify a certain refcount order for an opened
image.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz
32b6444d23 qcow2-cluster: Expand zero clusters
Add functionality for expanding zero clusters. This is necessary for
downgrading the image version to one without zero cluster support.

For non-backed images, this function may also just discard zero clusters
instead of truly expanding them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz
e7108feaac qcow2-cache: Empty cache
Add a function for emptying a cache, i.e., flushing it and marking all
elements invalid.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz
6f176b48f9 block: Image file option amendment
This patch adds the "amend" option to qemu-img which allows changing
image options on existing image files. It also adds the generic bdrv
implementation which is basically just a wrapper for the image format
specific function.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Tal Kain
56e023af80 raw-win32.c: Fix incorrect handling behaviour of small block files
It is valid for the size of the data read to be smaller than the
requested size, since a file can be smaller than the minimum block
size (for example, a VMDK disk descriptor file).

Signed-off-by: Tal Kain <tal.kain@ravellosystems.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Kevin Wolf
1ebf561c11 qcow2: Discard VM state in active L1 after creating snapshot
During savevm, the VM state is written to the active L1 of the image and
then a snapshot is taken. After that, the VM state isn't needed any more
in the active L1 and should be discarded. This is implemented by this
patch.

The impact of not discarding the VM state is that a snapshot can never
become smaller than any previous snapshot (because it would be padded
with old VM state), and more importantly that future savevm operations
cause unnecessary COWs (with associated flushes), which makes subsequent
snapshots much slower.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:46 +02:00
Kevin Wolf
670df5e3b4 qcow2: Pass discard type to qcow2_discard_clusters()
The function will be used internally instead of only being called for
guest discard requests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:46 +02:00
Gerd Hoffmann
b0d768c35e chardev: fix pty_chr_timer
pty_chr_timer first calls pty_chr_update_read_handler(), then clears
timer_tag (because it is a one-shot timer).   This is the wrong order
though.  pty_chr_update_read_handler might re-arm the timer, and the
new timer_tag gets overwritten in that case.

This leads to crashes when unplugging a pty chardev:  pty_chr_close
thinks no timer is running -> timer isn't canceled -> pty_chr_timer gets
called with stale CharDevState -> BOOM.

This patch fixes the ordering.
Kill the pointless goto while being at it.
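
The ordering problem described above can be sketched with a toy state struct. PtyState and the three functions are hypothetical stand-ins, not the chardev code; update_read_handler() here always re-arms, to model the problematic case.

```c
#include <assert.h>

typedef struct {
    unsigned timer_tag;  /* 0 means "no timer pending" */
    unsigned next_tag;   /* source of fresh tags for re-armed timers */
} PtyState;

/* Stand-in for the read handler update: re-arms the one-shot timer. */
void update_read_handler(PtyState *s)
{
    s->timer_tag = ++s->next_tag;
}

/* Old order: the tag of the freshly re-armed timer is overwritten. */
void timer_fired_buggy(PtyState *s)
{
    update_read_handler(s);
    s->timer_tag = 0;
}

/* Fixed order: forget the expired timer first, then allow re-arming. */
void timer_fired_fixed(PtyState *s)
{
    s->timer_tag = 0;
    update_read_handler(s);
}
```

After timer_fired_buggy() the state claims no timer is pending even though one was re-armed, which is how a close path could skip cancelling it.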

https://bugzilla.redhat.com/show_bug.cgi?id=994414

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-12 09:58:18 +02:00
Peter Lieven
54a5c1d5db iscsi: add .bdrv_get_block_status
This patch adds a coroutine for .bdrv_co_block_status as well as
a generic framework that can be used to build coroutines in block/iscsi.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Peter Lieven
f18a7cbb09 iscsi: add logical block provisioning information to iscsilun
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Peter Maydell
57ffcc4c83 hw/scsi/lsi53c895a: Use deposit32 rather than handcoded shift/mask
Use deposit32() rather than handcoded shifts/masks to update the
scratch registers. This is cleaner and incidentally avoids a clang
sanitizer complaint ("runtime error: left shift of 255 by 24 places
cannot be represented in type 'int'").
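
For illustration, a local stand-in modelled on the usual mask-and-merge definition of a deposit helper. deposit32_sketch() is a hypothetical name, not QEMU's own function; the point is that all shifts happen on unsigned 32-bit values, so 0xff << 24 is well defined.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Replace the 'length'-bit field starting at bit 'start' of 'value'
 * with 'fieldval'. Every shift operates on an unsigned 32-bit operand.
 */
uint32_t deposit32_sketch(uint32_t value, int start, int length,
                          uint32_t fieldval)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t mask = (~0U >> (32 - length)) << start;

    return (value & ~mask) | ((fieldval << start) & mask);
}
```

By contrast, a handcoded `0xff << 24` with an int-typed 0xff shifts a signed value into the sign bit, which is what the sanitizer reports.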

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Peter Maydell
927941059b hw/scsi/lsi53c895a: Use sextract32 for sign-extension
Use sextract32() for doing sign-extension rather than rolling
our own implementation.
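
A portable sketch of what a sign-extracting helper computes. sextract32_sketch() is a hypothetical local function, restricted to fields narrower than 32 bits to keep the arithmetic simple; it is not the QEMU implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract the 'length'-bit field starting at bit 'start' of 'value'
 * and sign-extend it. Sketch only: requires length < 32.
 */
int32_t sextract32_sketch(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length < 32 && length <= 32 - start);
    uint32_t mask = (1u << length) - 1;
    uint32_t field = (value >> start) & mask;
    uint32_t sign = 1u << (length - 1);

    /* Flipping the sign bit and subtracting it back sign-extends the field. */
    return (int32_t)(field ^ sign) - (int32_t)sign;
}
```

The 24-bit width used in the checks for this sketch is an illustrative choice.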

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Markus Armbruster
c24e7517ee scsi: Fix scsi_bus_legacy_add_drive() scsi-generic with serial
scsi_bus_legacy_add_drive() creates either a scsi-disk or a
scsi-generic device.  It sets property "serial" to argument serial
unless null.  Crashes with scsi-generic, because it doesn't have such
a property.

Only usb_msd_initfn_storage() passes non-null serial.  Reproducer:

    $ qemu-system-x86_64 -nodefaults -display none -S -usb \
    -drive if=none,file=/dev/sg1,id=usb-drv0 \
    -device usb-storage,id=usb-msd0,drive=usb-drv0,serial=123
    qemu-system-x86_64: -device usb-storage,id=usb-msd0,drive=usb-drv0,serial=123: Property '.serial' not found
    Aborted (core dumped)

Fix by handling exactly like "removable": set the property only when
it exists.

Cc: qemu-stable@nongnu.org
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Markus Armbruster
a27292b5d7 virtio-scsi: Make type virtio-scsi-common abstract
It's the abstract base of virtio-scsi-device and vhost-scsi.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Alexey Kardashevskiy
eb37f14658 spapr-vscsi: add task management
At the moment the guest kernel issues two types of task management
requests to the hypervisor: task abort and LUN reset. This adds
handling for these tasks. As spapr-vscsi starts calling scsi_req_cancel(),
the free_request callback is implemented.

Like virtio-scsi, spapr-vscsi does not handle CLEAR_ACA either: the CDB
control byte does not seem to be used at all, so the NACA bit is never
set for the guest and the guest has no good reason to issue a CLEAR_ACA task.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[Fix choice of UCSOLCNT vs. SCSOLCNT. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Paolo Bonzini
5accc8408f scsi: prefer UUID to VM name for the initiator name
The UUID is unique even across multiple hosts, thus it is
better than a VM name even if it is less user-friendly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-12 08:46:21 +02:00
Anthony Liguori
2d1fe1873a Merge remote-tracking branch 'pmaydell/tags/pull-target-arm-20130910' into staging
ARM queue:
 * aarch64 preparation patchset (excluding the defconfigs, so this
   doesn't actually enable the new targets yet)
 * minor bugfixes and cleanups
 * disable "-cpu any" in system emulation mode
 * fix ARMv7M stack alignment on reset

# gpg: Signature made Tue 10 Sep 2013 01:46:11 PM CDT using RSA key ID 14360CDE
# gpg: Can't check signature: public key not found

# By Alexander Graf (13) and others
# Via Peter Maydell
* pmaydell/tags/pull-target-arm-20130910: (28 commits)
  configure: Add handling code for AArch64 targets
  linux-user: Add AArch64 support
  linux-user: Allow targets to specify a minimum uname release
  linux-user: Add AArch64 termbits.h definitions
  linux-user: Implement cpu_set_tls() and cpu_clone_regs() for AArch64
  linux-user: Make sure NWFPE code is 32 bit ARM only
  linux-user: Add signal handling for AArch64
  linux-user: Fix up AArch64 syscall handlers
  linux-user: Add syscall number definitions for AArch64
  linux-user: Add cpu loop for AArch64
  linux-user: Don't treat AArch64 cpu names specially
  target-arm: Add AArch64 gdbstub support
  target-arm: Add AArch64 translation stub
  target-arm: Prepare translation for AArch64 code
  target-arm: Disable 32 bit CPUs in 64 bit linux-user builds
  target-arm: Add new AArch64CPUInfo base class and subclasses
  target-arm: Pass DisasContext* to gen_set_pc_im()
  target-arm: Fix target_ulong/uint32_t confusions
  target-arm: Export cpu_env
  target-arm: Extract the disas struct to a header file
  ...

Message-id: 1378839142-7726-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-11 14:46:52 -05:00
Anthony Liguori
6f52e51bb7 Merge remote-tracking branch 'luiz/queue/qmp' into staging
# By Cole Robinson
# Via Luiz Capitulino
* luiz/queue/qmp:
  qapi-types.py: Fix enum struct sizes on i686

Message-id: 1378822364-13887-1-git-send-email-lcapitulino@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-11 14:46:44 -05:00
Anthony Liguori
d985bd4d55 Merge remote-tracking branch 'spice/spice.v73' into staging
# By Gerd Hoffmann (2) and Christophe Fergeau (1)
# Via Gerd Hoffmann
* spice/spice.v73:
  qxl: fix local renderer
  qxl: trace io port name
  spice-core: Use g_strdup_printf instead of snprintf

Message-id: 1378807572-27902-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-11 14:46:26 -05:00
Anthony Liguori
a640f07c0d Merge remote-tracking branch 'kraxel/usb.89' into staging
# By Gerd Hoffmann (2) and Miroslav Rezanina (2)
# Via Gerd Hoffmann
* kraxel/usb.89:
  ehci: save device pointer in EHCIState
  Remove dev-bluetooth.c dependency from vl.c
  Preparation for usb-bt-dongle conditional build
  usb: sanity check setup_index+setup_len in post_load

Message-id: 1378806073-25197-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-11 14:46:21 -05:00
Anthony Liguori
f69f0bcac9 Merge remote-tracking branch 'mdroth/qga-pull-2013-9-9' into staging
# By Tomoki Sekiyama (10) and Paul Burton (1)
# Via Michael Roth
* mdroth/qga-pull-2013-9-9:
  QMP/qemu-ga-client: Make timeout longer for guest-fsfreeze-freeze command
  qemu-ga: Install Windows VSS provider on `qemu-ga -s install'
  qemu-ga: Call Windows VSS requester in fsfreeze command handler
  qemu-ga: Add Windows VSS provider and requester as DLL
  error: Add error_set_win32 and error_setg_win32
  qemu-ga: Add configure options to specify path to Windows/VSS SDK
  Add a script to extract VSS SDK headers on POSIX system
  checkpatch.pl: Check .cpp files
  Add c++ keywords to QAPI helper script
  configure: Support configuring C++ compiler
  mips_malta: support up to 2GiB RAM

Message-id: 1378755701-2051-1-git-send-email-mdroth@linux.vnet.ibm.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-11 14:46:08 -05:00
Anthony Liguori
97fdb9410b Merge remote-tracking branch 'sstabellini/xen-2013-09-09' into staging
# By Anthony PERARD
# Via Stefano Stabellini
* sstabellini/xen-2013-09-09:
  pc_q35: Initialize Xen.
  pc: Initializing ram_memory under Xen.

Message-id: alpine.DEB.2.02.1309091718030.6397@kaball.uk.xensource.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-11 14:45:57 -05:00
Anthony Liguori
964737ea19 Merge remote-tracking branch 'stefanha/block' into staging
# By Paolo Bonzini (21) and others
# Via Stefan Hajnoczi
* stefanha/block: (42 commits)
  qemu-iotests: Fixed test case 026
  qemu-iotests: Whitespace cleanup
  dataplane: Fix startup race.
  block: look for zero blocks in bs->file
  block: add default get_block_status implementation for protocols
  raw-posix: report unwritten extents as zero
  raw-posix: return get_block_status data and flags
  docs, qapi: document qemu-img map
  qemu-img: add a "map" subcommand
  block: return BDRV_BLOCK_ZERO past end of backing file
  block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO
  block: return get_block_status data and flags for formats
  block: define get_block_status return value
  block: introduce bdrv_get_block_status API
  block: make bdrv_has_zero_init return false for copy-on-write-images
  qemu-img: always probe the input image for allocated sectors
  block: expect errors from bdrv_co_is_allocated
  block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above distinction
  block: do not use ->total_sectors in bdrv_co_is_allocated
  block: make bdrv_co_is_allocated static
  ...

Message-id: 1378481953-23099-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-11 14:45:37 -05:00
Anthony Liguori
ce2b69417c Merge remote-tracking branch 'stefanha/net' into staging
# By Brad Smith (2) and others
# Via Stefan Hajnoczi
* stefanha/net:
  ne2000: mark I/O as LITTLE_ENDIAN
  vmxnet3: Eliminate __packed redefined warning
  e1000: add interrupt mitigation support
  net: Rename send_queue to incoming_queue
  tap: Use numbered tap/tun devices on all *BSD OS's

Message-id: 1378481624-20964-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-11 14:45:18 -05:00
Alexander Graf
6a49fa95c9 configure: Add handling code for AArch64 targets
Add the necessary code to configure to handle AArch64 as a target
CPU (we already have some code for supporting it as host). Note
that this doesn't enable the AArch64 targets yet.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-23-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-12-git-send-email-john.rigby@linaro.org
[PMM:
 * don't need to set TARGET_ABI_DIR to aarch64 as that is the default
 * don't build nwfpe -- this is 32 bit legacy only
 * rewrite commit message
 * add aarch64 to the list of "fdt required" targets
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:29 +01:00
Alexander Graf
99033caee6 linux-user: Add AArch64 support
This patch adds support for AArch64 in all the small corners of
linux-user (primarily in image loading and startup code).

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-22-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-11-git-send-email-john.rigby@linaro.org
[PMM:
 * removed some unnecessary #defines from syscall.h
 * catch attempts to use a 32 bit only cpu with aarch64-linux-user
 * termios stuff moved into its own patch
 * we specify our minimum uname version here now
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:29 +01:00
Peter Maydell
4a24a75810 linux-user: Allow targets to specify a minimum uname release
For newer target architectures, glibc can be picky about the kernel
version: for example, it will not run on an aarch64 system unless
the kernel reports itself as at least 3.8.0. Accommodate this by
enhancing the existing support for faking the kernel version so
that each target can optionally specify a minimum version: if
the user doesn't force a specific fake version then we will override
with the minimum required version only if the real host kernel
version is insufficient.

Use this facility to let aarch64 report a minimum of 3.8.0.
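
The override rule described above can be sketched roughly as follows. This is an illustrative model only, not the linux-user code: pick_uname_release(), parse_release() and their parameters are hypothetical names, and the version parsing is deliberately simplistic.

```c
#include <stddef.h>
#include <stdio.h>

/* Parse "major.minor.patch" into one comparable number.
 * Missing components count as 0. Hypothetical helper. */
static unsigned long parse_release(const char *s)
{
    unsigned major = 0, minor = 0, patch = 0;

    if (sscanf(s, "%u.%u.%u", &major, &minor, &patch) < 1) {
        return 0;
    }
    return major * 1000000UL + minor * 1000UL + patch;
}

/* Choose the release string reported to the guest:
 *  - a user-forced value always wins;
 *  - otherwise the host value is kept unless it is older than the
 *    target's minimum, in which case the minimum is reported. */
static const char *pick_uname_release(const char *forced,
                                      const char *host,
                                      const char *target_min)
{
    if (forced != NULL) {
        return forced;
    }
    if (target_min != NULL &&
        parse_release(host) < parse_release(target_min)) {
        return target_min;
    }
    return host;
}
```

With a target minimum of "3.8.0", a host reporting "3.2.0" would be presented as "3.8.0", while a host reporting "3.10.1" would be passed through unchanged.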

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-21-git-send-email-peter.maydell@linaro.org
2013-09-10 19:11:29 +01:00
Alexander Graf
af89c7dba5 linux-user: Add AArch64 termbits.h definitions
Add the AArch64 termbits.h with all the target's termios related
constants and structures.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-20-git-send-email-peter.maydell@linaro.org
[PMM: split out from another patch]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:29 +01:00
Alexander Graf
e2cea499cc linux-user: Implement cpu_set_tls() and cpu_clone_regs() for AArch64
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-19-git-send-email-peter.maydell@linaro.org
[PMM: pulled out from another patch; don't use is_a64() here;
 moved to linux-user from target-arm]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:29 +01:00
Peter Maydell
848d72cdd8 linux-user: Make sure NWFPE code is 32 bit ARM only
On ARM, linux-user emulation includes NWFPE support for emulating the
ancient FPA floating point coprocessor. This has long since been
superseded by VFP and is only required for legacy binaries. The
AArch64 linux-user target doesn't compile in NWFPE support, so make
sure the relevant code is protected by suitable ifdefs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-18-git-send-email-peter.maydell@linaro.org
2013-09-10 19:11:28 +01:00
Andreas Schwab
1744aea182 linux-user: Add signal handling for AArch64
This patch adds signal handling for AArch64. The code is based on the
respective source in the Linux kernel.

Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-17-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-10-git-send-email-john.rigby@linaro.org
[PMM: fixed style nits: tabs, long lines;
 pulled target_signal.h in from a later patch; it fits better here]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:28 +01:00
Alexander Graf
09701199f6 linux-user: Fix up AArch64 syscall handlers
Some syscall handlers have special code for ARM enabled that we don't
need on AArch64. Exclude AArch64 in those cases. In other places we
can share struct definitions with other targets or have to provide our
own.

With this patch applied, most syscall definitions in linux-user should
be sound for AArch64.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-16-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-9-git-send-email-john.rigby@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:28 +01:00
Alexander Graf
c7907301e7 linux-user: Add syscall number definitions for AArch64
The AArch64 syscall definitions are all publicly available in the Linux
kernel. Let's add them to our linux-user emulation target, so that we
can easily handle AArch64 syscalls.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-15-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-8-git-send-email-john.rigby@linaro.org
[PMM: changes relating to cpu_loop() removed as they are superseded
 by an earlier patch]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:28 +01:00
Peter Maydell
1861c4543f linux-user: Add cpu loop for AArch64
Add the main linux-user cpu loop for AArch64. Since AArch64
has a different system call interface, doesn't need to worry
about FPA emulation and may in the future keep the prefetch/data
abort information in different system registers, it's simplest
just to use a completely separate loop from the 32 bit ARM
target, rather than peppering it with ifdefs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-14-git-send-email-peter.maydell@linaro.org
2013-09-10 19:11:28 +01:00
Alexander Graf
067d983127 linux-user: Don't treat AArch64 cpu names specially
32-bit ARM has a lot of different names for different types of CPUs it supports.
On AArch64, we don't have this, so we really don't want to execute the 32-bit
logic. Stub it out for AArch64 linux-user guests.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-13-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-7-git-send-email-john.rigby@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:28 +01:00
Alexander Graf
96c04212ba target-arm: Add AArch64 gdbstub support
We want to be able to debug AArch64 guests. So let's add the respective gdb
stub functions and xml descriptions that allow us to do so.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-12-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-6-git-send-email-john.rigby@linaro.org
[PMM: dropped unused fp regs XML for now; moved 64 bit only functions
 to new gdbstub64.c; these are hooked up in AArch64CPU, not via
 ifdefs in ARMCPU]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:28 +01:00
Alexander Graf
14ade10f84 target-arm: Add AArch64 translation stub
We should translate AArch64 mode separately from AArch32 mode. In AArch64 mode,
registers look vastly different, instruction encoding is completely different,
basically the system turns into a different machine.

So let's do a simple if() in translate.c to decide whether we can handle the
current code in the legacy AArch32 code or in the new AArch64 code.

So far, the translation always complains about unallocated instructions. There
is no emulator functionality in this patch!

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-11-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-5-git-send-email-john.rigby@linaro.org
[PMM:
 * provide no-op versions of a64 functions ifndef TARGET_AARCH64;
   this lets us avoid #ifdefs in translate.c
 * insert the missing call to disas_a64_insn()
 * stash the insn in the DisasContext rather than reloading it in
   real_unallocated_encoding()
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:28 +01:00
Alexander Graf
3926cc8433 target-arm: Prepare translation for AArch64 code
This patch adds all the prerequisites for AArch64 support that didn't
fit into split up patches. It extends important bits in the core cpu
headers to also take AArch64 mode into account.

Add new ARM_TBFLAG_AARCH64_STATE translation buffer flag to
indicate an ARMv8 cpu running in aarch64 mode vs aarch32 mode.
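
As a rough sketch of how such a state flag can be tested, assuming the layout mentioned in the [PMM] note of this commit (bit 31 marks AArch64, bits 30..0 are mode-specific). The EX_ macros and ex_ helpers are hypothetical names for illustration, not the definitions in the target-arm headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout only: bit 31 marks AArch64 state, bits 30..0 are
 * left for whatever the current mode needs. Names are hypothetical. */
#define EX_TBFLAG_AARCH64_SHIFT 31
#define EX_TBFLAG_AARCH64_MASK  (1U << EX_TBFLAG_AARCH64_SHIFT)

/* True when the flags word describes code running in AArch64 state. */
static inline bool ex_tbflags_is_aarch64(uint32_t flags)
{
    return (flags & EX_TBFLAG_AARCH64_MASK) != 0;
}

/* The remaining, mode-specific bits with the state bit stripped. */
static inline uint32_t ex_tbflags_mode_bits(uint32_t flags)
{
    return flags & ~EX_TBFLAG_AARCH64_MASK;
}
```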

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-10-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-4-git-send-email-john.rigby@linaro.org
[PMM:
 * rearranged tbflags so AArch64? is bit 31 and if it is set then
  30..0 are freely available for whatever makes most sense for that mode
 * added version bump since we change VFP migration state
 * added a comment about how VFP/Neon register state works
 * physical address space is 48 bits, not 64
 * added ARM_FEATURE_AARCH64 flag to identify 64-bit capable CPUs
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:28 +01:00
Peter Maydell
15ee776bf2 target-arm: Disable 32 bit CPUs in 64 bit linux-user builds
If we're building aarch64-linux-user then the 32 bit CPUs are
all unwanted, because they can't possibly execute the 64 bit
binaries we will be running; disable them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-9-git-send-email-peter.maydell@linaro.org
2013-09-10 19:11:28 +01:00
Peter Maydell
d14d42f19b target-arm: Add new AArch64CPUInfo base class and subclasses
Create a new AArch64CPU class; all 64-bit capable ARM
CPUs are subclasses of this. (Currently we only support
one, the "any" CPU used by linux-user.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-8-git-send-email-peter.maydell@linaro.org
2013-09-10 19:11:28 +01:00
Peter Maydell
eaed129dea target-arm: Pass DisasContext* to gen_set_pc_im()
We want gen_set_pc_im() to work for both AArch64 and AArch32, but
to do this we'll need the DisasContext* so we can tell which mode
we're in, so pass it in as a parameter.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-7-git-send-email-peter.maydell@linaro.org
2013-09-10 19:11:28 +01:00
Alexander Graf
0a2461fa49 target-arm: Fix target_ulong/uint32_t confusions
Correct a few places that were using uint32_t or a 32 bit
only format string to handle something that should be a target_ulong.
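
A small illustration of the kind of confusion being fixed, assuming a 64-bit target word. The ex_target_ulong typedef and ex_format_pc() are hypothetical stand-ins, not the QEMU definitions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for a 64-bit target_ulong; the typedef name is hypothetical. */
typedef uint64_t ex_target_ulong;

/* Format a guest address. PRIx64 keeps all 64 bits; a plain "%x" would
 * be the wrong conversion for a 64-bit argument. */
static int ex_format_pc(char *buf, size_t len, ex_target_ulong pc)
{
    return snprintf(buf, len, "0x%016" PRIx64, pc);
}
```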

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-6-git-send-email-peter.maydell@linaro.org
[PMM: split out to separate patch; added gen_goto_tb() and
gen_set_pc_im() dest params to list of things to change.]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:28 +01:00
Alexander Graf
3407ad0e7a target-arm: Export cpu_env
The cpu_env tcg variable will be used by both the AArch32 and AArch64
handling code. Unstaticify it, so that both sides can make use of it.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-5-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-3-git-send-email-john.rigby@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:28 +01:00
Alexander Graf
f570c61e69 target-arm: Extract the disas struct to a header file
We will need to share the disassembly status struct between AArch32 and
AArch64 modes. So put it into a header file that both sides can use.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-4-git-send-email-peter.maydell@linaro.org
Message-id: 1368505980-17151-2-git-send-email-john.rigby@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:11:27 +01:00
Peter Maydell
08307563ff target-arm: Abstract out load/store from a vaddr in AArch32
AArch32 code (ie traditional 32 bit world) expects to be
able to pass a vaddr in a TCGv_i32. However when QEMU is
compiled with TARGET_LONG_BITS=64 the TCG load/store
functions take a TCGv_i64. Abstract out load/store with
a 32 bit vaddr so we have a place to put the zero extension
of the vaddr and the extension/truncation of the data value.

Apart from the function definitions most of this patch is
a simple s/tcg_gen_qemu_/gen_aa32_/.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378235544-22290-3-git-send-email-peter.maydell@linaro.org
2013-09-10 19:11:27 +01:00
Peter Maydell
4d017979aa abitypes.h: Remove incorrect ARM ABI_LLONG_ALIGNMENT
The ARM EABI specifies that 64 bit integers should be
8-byte aligned; remove our incorrect setting of 4-byte alignment.
This has no actual effect since it only set the alignment
for the 'abi_ullong' and 'abi_llong' types, which are used
only inside code which is MIPS-specific, but it will
avoid problems later if we use the types elsewhere.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:09:33 +01:00
Peter Maydell
031c44e4de pl110: Clarify comment about PL110 ID on VersatilePB
Clarify a comment about the ID register value presented by
the PL110 variant present on the VersatilePB board (based
on testing what the actual hardware does), to indicate that
this is not an error in our emulation, and to remove an #if-0.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:09:33 +01:00
Cole Robinson
78027bb6d9 target-arm: Implement qmp query-cpu-definitions
Libvirt uses this to introspect available CPU models.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-id: c0bdcd6c7ea6a085a6902ccaa73180fd771c8267.1378303555.git.crobinso@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:09:33 +01:00
Sebastian Ottlik
f62cafd4c8 target-arm: fix ARMv7M stack alignment on reset
When the initial SP is loaded from the vector table on ARMv7M systems the two
least significant bits are ignored as the stack is always aligned at a four byte
boundary (see ARM DDI 0403C, B1.4.1 and B1.5.5). So far QEMU did not ignore
these bits leading to a stack alignment inconsistent with real hardware for
binaries that rely on this behaviour. This patch fixes this issue by masking the
two least significant bits when loading the SP.
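
The masking described above amounts to something like the following sketch; ex_v7m_initial_sp() is a hypothetical helper name, not the function changed by this patch.

```c
#include <stdint.h>

/* Clear the two least significant bits of the initial SP word read from
 * the vector table, so the stack pointer is always 4-byte aligned. */
static inline uint32_t ex_v7m_initial_sp(uint32_t vector_table_word)
{
    return vector_table_word & ~(uint32_t)3;
}
```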

Signed-off-by: Sebastian Ottlik <ottlik@fzi.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1378286595-27072-1-git-send-email-ottlik@fzi.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-09-10 19:09:32 +01:00
Peter Maydell
78dbbbe4df target-arm: Avoid "1 << 31" undefined behaviour
Avoid the undefined behaviour of "1 << 31" by using 1U to make
the shift be of an unsigned value rather than shifting into the
sign bit of a signed integer. For consistency, we make all the
CPSR_* constants unsigned, though the only one which triggers
undefined behaviour is CPSR_N.
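
For illustration, assuming a 32-bit int: the EX_ names below are hypothetical stand-ins for the CPSR_* constants, showing the unsigned form of the shift.

```c
#include <stdint.h>

/* (1 << 31) shifts a 1 into the sign bit of a signed int, which is
 * undefined behaviour. (1U << 31) is well defined: 0x80000000. */
#define EX_CPSR_N (1U << 31)
#define EX_CPSR_Z (1U << 30)

/* Test the N flag in a CPSR value using the unsigned constant. */
static inline int ex_flag_n_set(uint32_t cpsr)
{
    return (cpsr & EX_CPSR_N) != 0;
}
```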

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1378391908-22137-3-git-send-email-peter.maydell@linaro.org
2013-09-10 19:09:32 +01:00
Peter Maydell
534df15609 target-arm: Use sextract32() in branch decode
In the decode of ARM B and BL insns, swap the order of the
"append 2 implicit zeros to imm24" and the sign extend, and
use the new sextract32() utility function to do the latter.
This avoids a direct dependency on the undefined C behaviour
of shifting into the sign bit of an integer.
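
A minimal sketch of the decode order described above. ex_sextract32() is a local reimplementation written for this example (it only mirrors the idea of the sextract32() utility), and ex_branch_offset() is a hypothetical helper. The final unsigned-to-signed conversion is implementation-defined rather than undefined, and wraps on the usual two's complement compilers.

```c
#include <stdint.h>

/* Sign-extend the `length`-bit field starting at bit `start` of `value`,
 * without shifting into the sign bit of a signed integer.
 * Requires 0 < length <= 32 and start + length <= 32. */
static inline int32_t ex_sextract32(uint32_t value, int start, int length)
{
    uint32_t field = value >> start;
    uint32_t sign = 1U << (length - 1);

    if (length < 32) {
        field &= (1U << length) - 1;
    }
    /* (field ^ sign) - sign propagates the field's top bit upwards. */
    return (int32_t)((field ^ sign) - sign);
}

/* ARM B/BL: append two implicit zero bits to imm24 first, then
 * sign-extend the resulting 26-bit value. */
static inline int32_t ex_branch_offset(uint32_t insn)
{
    return ex_sextract32(insn << 2, 0, 26);
}
```

For the encoding 0xEAFFFFFE (imm24 = 0xFFFFFE) this gives an offset of -8.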

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1378391908-22137-2-git-send-email-peter.maydell@linaro.org
2013-09-10 19:09:32 +01:00
Peter Maydell
f5f6d38b74 target-arm: Make '-cpu any' available in linux-user mode only
Make the 'any' CPU for target-arm available only in linux-user mode.
The ARM target provides a CPU named "any", which turns on support for
all user-level instruction set extensions we know about. This is
intended for linux-user emulation mode, where it is the default CPU type.
It makes no sense to try to use this for system emulation, since we don't
initialize it with any system-level information like feature register
values or implementation specific cp15 registers. (Unsurprisingly, some
boards won't boot at all, though you might get lucky in some cases where
the guest doesn't happen to prod things that aren't there.)

Prevent users from making this command line error by removing the
CPU definition from the softmmu build.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-id: 1378213995-12945-1-git-send-email-peter.maydell@linaro.org
2013-09-10 19:09:32 +01:00
Cole Robinson
02dc4bf568 qapi-types.py: Fix enum struct sizes on i686
Unlike other list types, enum wasn't adding any padding, which caused
a mismatch between the generated struct size and GenericList struct
size. More details in a678e26cbe
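
The size mismatch can be pictured with the sketch below. The struct layouts are an assumed simplification of the generated list types, and all Ex* names are hypothetical. On a 32-bit host an unpadded enum payload would be 4 bytes, putting `next` at a different offset than in the generic node.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { EX_TYPE_A, EX_TYPE_B } ExEnum;   /* hypothetical enum */

/* Generic list node: the payload slot is padded to 64 bits. */
typedef struct ExGenericList {
    union { void *value; uint64_t padding; };
    struct ExGenericList *next;
} ExGenericList;

/* Enum list node with the same padding, so `next` sits at the same
 * offset and code walking the list generically stays correct. */
typedef struct ExEnumList {
    union { ExEnum value; uint64_t padding; };
    struct ExEnumList *next;
} ExEnumList;

_Static_assert(offsetof(ExEnumList, next) == offsetof(ExGenericList, next),
               "next pointer must be at the same offset");
_Static_assert(sizeof(ExEnumList) == sizeof(ExGenericList),
               "list node sizes must match");
```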

This crashed qemu if calling qmp query-tpm-types for example, which
upsets libvirt capabilities probing. Reproducer on i686:

(sleep 5; printf '{"execute":"qmp_capabilities"}\n{"execute":"query-tpm-types"}\n') | ./i386-softmmu/qemu-system-i386 -S -nodefaults -nographic -M none -qmp stdio

https://bugs.launchpad.net/qemu/+bug/1219207

Cc: qemu-stable@nongnu.org
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-09-10 10:09:04 -04:00
Gerd Hoffmann
adbecc8973 ehci: save device pointer in EHCIState
We'll need a pointer to the actual pci/sysbus device,
stick a pointer to it into the EHCIState struct.

https://bugzilla.redhat.com/show_bug.cgi?id=1005495

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-10 11:14:42 +02:00
Miroslav Rezanina
615fe4de4b Remove dev-bluetooth.c dependency from vl.c
Use usb_legacy_register handling to create bt-dongle device and remove code
dependency from vl.c so CONFIG_USB_BLUETOOTH can be disabled.

Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-10 11:14:42 +02:00
Miroslav Rezanina
644e1a8a34 Preparation for usb-bt-dongle conditional build
To allow disabling the usb-bt-dongle device using the CONFIG_BLUETOOTH option,
some functions in vl.c have to be made accessible in dev-bluetooth.c. This is
pure code movement.

Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-10 11:14:41 +02:00
Gerd Hoffmann
c60174e847 usb: sanity check setup_index+setup_len in post_load
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-10 11:14:41 +02:00
Gerd Hoffmann
c58c7b959b qxl: fix local renderer
The local spice renderer assumes the primary surface is located at the
start of the "ram" bar.  This used to be a requirement in qxl hardware
revision 1.  In revision 2+ this is relaxed.  Nevertheless guest drivers
continued to use the traditional location, for historical and backward
compatibility reasons.  The qxl kms driver doesn't though as it depends
on qxl revision 4+ anyway.

Result is that local rendering is hosed for recent linux guests, you'll
get pixel garbage with non-spice ui (gtk, sdl, vnc) and when doing
screendumps.  Fix that by doing a proper mapping of the guest-specified
memory location.

https://bugzilla.redhat.com/show_bug.cgi?id=948717

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-10 11:14:08 +02:00
Gerd Hoffmann
18b203850a qxl: trace io port name
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-10 11:14:08 +02:00
Christophe Fergeau
6735aa99a4 spice-core: Use g_strdup_printf instead of snprintf
Several places in spice-core.c were using either g_malloc+snprintf
or snprintf+g_strdup to achieve the same result as g_strdup_printf.

Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-10 11:14:08 +02:00
Tomoki Sekiyama
e2682db06a QMP/qemu-ga-client: Make timeout longer for guest-fsfreeze-freeze command
guest-fsfreeze-freeze command can take longer than 3 seconds when heavy
disk I/O is running. To avoid unexpected timeout, this changes the timeout
to 60 seconds (timeout of pre-commit phase of VSS).

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:57 -05:00
Tomoki Sekiyama
f311f2c20a qemu-ga: Install Windows VSS provider on `qemu-ga -s install'
Register QGA VSS provider library into Windows when qemu-ga is installed as
Windows service ('-s install' option). It is deregistered when the service
is uninstalled ('-s uninstall' option).

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:57 -05:00
Tomoki Sekiyama
64c0031740 qemu-ga: Call Windows VSS requester in fsfreeze command handler
Support guest-fsfreeze-freeze and guest-fsfreeze-thaw commands for Windows
guests. When fsfreeze command is issued, it calls the VSS requester to
freeze filesystems and applications. On thaw command, it again tells the VSS
requester to thaw them.

This also adds calling of initialize functions for the VSS requester.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:57 -05:00
Tomoki Sekiyama
b39297aedf qemu-ga: Add Windows VSS provider and requester as DLL
Adds VSS provider and requester as a qga-vss.dll, which is loaded by
Windows VSS service as well as by qemu-ga.

"provider.cpp" implements a basic stub of a software VSS provider.
Currently, this module only relays a frozen event from the VSS service to the
agent, and a thaw event from the agent to the VSS service, blocking the VSS
process so that the system stays frozen while snapshots are taken on the host.

To register the provider to the guest system as COM+ application, the type
library (.tlb) for qga-vss.dll is required. To build it from COM IDL (.idl),
VisualC++, MIDL and stdole2.tlb in Windows SDK are required. This patch also
adds pre-compiled .tlb file in the repository in order to enable
cross-compile qemu-ga.exe for Windows with VSS support.

"requester.cpp" provides the VSS requester to kick the VSS snapshot process.
Qemu-ga.exe works without the DLL, although fsfreeze features are disabled.

These functions are only supported in Windows 2003 or later. In older
systems, fsfreeze features are disabled.

In several versions of Windows which don't support attribute
VSS_VOLSNAP_ATTR_NO_AUTORECOVERY, DoSnapshotSet fails with error
VSS_E_OBJECT_NOT_FOUND. In this patch, we just ignore this error.
To solve this fundamentally, we need a framework to handle mounting writable
snapshots on guests, which is required by the VSS auto-recovery feature
(cleanup phase after a snapshot is taken).

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:57 -05:00
Tomoki Sekiyama
20840d4cfe error: Add error_set_win32 and error_setg_win32
These functions help maintain homogeneous formatting of error messages
with Windows error code and description (generated by
g_win32_error_message()).

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:57 -05:00
Tomoki Sekiyama
d9840e2592 qemu-ga: Add configure options to specify path to Windows/VSS SDK
To enable VSS support in qemu-ga for Windows, header files included in
VSS SDK are required.
The VSS support is enabled by the configure option like below:
  ./configure --with-vss-sdk="/path/to/VSS SDK"

If the path is omitted, it tries to search the headers from default paths
and VSS support is enabled only if the SDK is found.
VSS support is disabled if --without-vss-sdk or --with-vss-sdk=no is
specified.

VSS SDK is available from:
  http://www.microsoft.com/en-us/download/details.aspx?id=23490

To cross-compile using mingw, you need to set up the SDK on Windows
environments to extract headers. You can also extract the SDK headers on
POSIX environments using scripts/extract-vss-headers and msitools.

In addition, --with-win-sdk="/path/to/Windows SDK" option is also added to
specify path to Windows SDK, which may be used for native-compile of .tlb
file of qemu-ga VSS provider. However, this is usually unnecessary because
pre-compiled .tlb file is included.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:56 -05:00
Tomoki Sekiyama
24482749c7 Add a script to extract VSS SDK headers on POSIX system
VSS SDK(*) setup.exe is only runnable on Windows. This adds a script
to extract VSS SDK headers on POSIX-systems using msitools.

  * http://www.microsoft.com/en-us/download/details.aspx?id=23490

From: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:56 -05:00
Tomoki Sekiyama
69d5d21f90 checkpatch.pl: Check .cpp files
Enable checkpatch.pl to apply the same checks as C source files for
C++ files with .cpp extensions. It also adds some exceptions for C++
sources to suppress errors for:
  - <> used in C++ template arguments (e.g. template <class T>)
  - :: used to represent namespaces   (e.g. SomeClass::method())
  - : used in class declaration       (e.g. class T : public Super)
  - ~ used in destructor method name  (e.g. T::~T())
  - spacing around 'catch'            (e.g. catch (...))

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:56 -05:00
Tomoki Sekiyama
6f88009ee5 Add c++ keywords to QAPI helper script
Add c++ keywords to avoid errors in compiling with c++ compiler.
This also renames class member of PciDeviceInfo to q_class.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:56 -05:00
Tomoki Sekiyama
83f73fce4c configure: Support configuring C++ compiler
Add configuration for C++ compiler in configure and Makefiles.
The C++ compiler is chosen as follows:
 - ${CXX}, if it is specified.
 - ${cross_prefix}g++, if ${cross_prefix} is specified.
 - Otherwise, c++ is used.

Currently, usage of C++ language is only for access to Windows VSS
using COM+ services in qemu-guest-agent for Windows.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-09-09 14:17:56 -05:00
Paul Burton
94c2b6aff4 mips_malta: support up to 2GiB RAM
A Malta board can support up to 2GiB of RAM. Since the unmapped kseg0/1
regions are only 512MiB large & the latter 256MiB of those are taken up
by the IO region, access to RAM beyond 256MiB must be done through a
mapped region. In the case of a Linux guest this means we need to use
highmem.

The mainline Linux kernel does not support highmem for Malta at this
time, however this can be tested using the linux-mti-3.8 kernel branch
available from:

  git://git.linux-mips.org/pub/scm/linux-mti.git

You should be able to boot a Linux kernel built from the linux-mti-3.8
branch, with CONFIG_HIGHMEM enabled, using 2GiB RAM by passing "-m 2G"
to QEMU and appending the following kernel parameters:

  mem=256m@0x0 mem=256m@0x90000000 mem=1536m@0x20000000

Note that the upper half of the physical address space of a Malta
mirrors the lower half (hence the 2GiB limit) except that the IO region
(0x10000000-0x1fffffff in the lower half) is not mirrored in the upper
half. That is, physical addresses 0x90000000-0x9fffffff access RAM
rather than the IO region, resulting in a physical address space
resembling the following:

  0x00000000 -> 0x0fffffff  RAM
  0x10000000 -> 0x1fffffff  I/O
  0x20000000 -> 0x7fffffff  RAM
  0x80000000 -> 0x8fffffff  RAM (mirror of 0x00000000 -> 0x0fffffff)
  0x90000000 -> 0x9fffffff  RAM
  0xa0000000 -> 0xffffffff  RAM (mirror of 0x20000000 -> 0x7fffffff)

The second mem parameter provided to the kernel above accesses the
second 256MiB of RAM through the upper half of the physical address
space, making use of the aliasing described above in order to avoid
the IO region and use the whole 2GiB RAM.
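
The address map listed above can be modelled with a short sketch. ex_malta_ram_offset() and the EX_ constants are hypothetical names for illustration, not the board code.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_IO_BASE 0x10000000u
#define EX_IO_END  0x1fffffffu

/* Translate a Malta physical address to an offset into the 2GiB of RAM,
 * following the address map listed above. Returns false for the I/O
 * window, which exists only in the lower half. */
static bool ex_malta_ram_offset(uint32_t paddr, uint32_t *offset)
{
    if (paddr >= EX_IO_BASE && paddr <= EX_IO_END) {
        return false;
    }
    /* Upper and lower halves alias each other: drop bit 31. */
    *offset = paddr & 0x7fffffffu;
    return true;
}
```

In this model 0x90000000 resolves to RAM offset 0x10000000, i.e. the second 256MiB that is hidden behind the I/O window in the lower half.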

The memory setup may be seen as 'backwards' in this commit since the
'real' memory is mapped in the upper half of the physical address space
and the lower half contains the aliases. On real hardware it would be
typical to see the upper half of the physical address space as the alias
since the bus addresses generated match the lower half of the physical
address space. However since the memory accessible in the upper half of
the physical address space is uninterrupted by the IO region it is
easiest to map the RAM as a whole there, and functionally it makes no
difference to the target code.

Due to the requirements of accessing the second 256MiB of RAM through
a mapping to the upper half of the physical address space it is usual
for the bootloader to indicate a maximum of 256MiB memory to a kernel.
This allows kernels which do not support such access to boot on systems
with more than 256MiB of RAM. It is also the behaviour assumed by Linux.
QEMU's small generated bootloader is modified to provide this behaviour.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-09-09 18:42:22 +02:00
Anthony PERARD
254c12825f pc_q35: Initialize Xen.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-09 16:24:33 +00:00
Anthony PERARD
04d7bad8a4 pc: Initializing ram_memory under Xen.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
CC: qemu-stable@nongnu.org
2013-09-09 16:22:19 +00:00
Aurelien Jarno
45d883dcf2 ne2000: mark I/O as LITTLE_ENDIAN
Now that the memory subsystem is propagating the endianness correctly,
the ne2000 device should have its I/O ports marked as LITTLE_ENDIAN, as
PCI devices are little endian.

This makes the ne2000 NIC work again on PowerPC.
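
As a reminder of what "little endian" means for a device register, here is a host-independent sketch. The ex_ helpers are hypothetical and unrelated to the memory API used by the patch.

```c
#include <stdint.h>

/* Read a 16-bit little-endian value byte by byte; the result is the
 * same on big- and little-endian hosts. */
static inline uint16_t ex_ld16_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Store a 16-bit value in little-endian byte order. */
static inline void ex_st16_le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}
```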

Cc: qemu-stable@nongnu.org
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 17:27:40 +02:00
Brad Smith
3dbb9786e9 vmxnet3: Eliminate __packed redefined warning
This eliminates a warning about __packed being redefined as exposed by the
vmxnet3 code. __packed is not used anywhere in the vmxnet3 code.

  CC    hw/net/vmxnet3.o
In file included from hw/net/vmxnet3.c:29:
hw/net/vmxnet3.h:37:1: warning: "__packed" redefined
In file included from /usr/include/stdlib.h:38,
                 from /buildbot-qemu/default_openbsd_current/build/include/qemu-common.h:26,
                 from /buildbot-qemu/default_openbsd_current/build/include/hw/hw.h:5,
                 from hw/net/vmxnet3.c:18:
/usr/include/sys/cdefs.h:209:1: warning: this is the location of the previous definition

Signed-off-by: Brad Smith <brad@comstyle.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 17:25:55 +02:00
Vincenzo Maffione
e9845f0985 e1000: add interrupt mitigation support
This patch partially implements the e1000 interrupt mitigation mechanisms.
Using a single QEMUTimer, it emulates the ITR register (which is the newer
mitigation register, recommended by Intel) and approximately emulates
RADV and TADV registers. TIDV and RDTR register functionalities are not
emulated (RDTR is only used to validate RADV, according to the e1000 specs).

RADV, TADV, TIDV and RDTR registers make up the older e1000 mitigation
mechanism and would need a timer each to be completely emulated. However,
a single timer has been used in order to reach a good compromise between
emulation accuracy and simplicity/efficiency.

The implemented mechanism can be enabled or disabled by specifying the
e1000-specific boolean command-line parameter "mitigation", e.g.

    qemu-system-x86_64 -device e1000,mitigation=on,... ...

For more information, see the Software developer's manual at
http://download.intel.com/design/network/manuals/8254x_GBe_SDM.pdf.

Interrupt mitigation boosts performance when the guest suffers from
a high interrupt rate (e.g. when receiving short UDP packets at a high
packet rate). For some numerical results, see
http://info.iet.unipi.it/~luigi/papers/20130520-rizzo-vm.pdf

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Reviewed-by: Andreas Färber <afaerber@suse.de> (for pc-* machines)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 17:25:52 +02:00
Jan Kiszka
067404be62 net: Rename send_queue to incoming_queue
Each networking client has a queue for packets that could not yet be
delivered to that client. Calling this queue "send_queue" is highly
confusing as it has nothing to do with packets sent from this client but
to it. Avoid this confusion by renaming it to "incoming_queue".

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 17:01:26 +02:00
Brad Smith
aa4f082f75 tap: Use numbered tap/tun devices on all *BSD OS's
The following patch simplifies the *BSD tap/tun code and makes use of numbered
tap/tun interfaces on all *BSD OS's. NetBSD has a patch in their pkgsrc tree
to make use of this feature and DragonFly supports this as well.

Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 17:01:26 +02:00
Kevin Wolf
8f94b07787 qemu-iotests: Fixed test case 026
The reference output for test case 026 hasn't been updated in a long
time and it's one of the "known failing" cases. This patch updates the
reference output so that unintentional changes can be reliably detected
again.

The problem with this test case is that it produces different output
depending on whether -nocache is used or not. The solution of this patch
is to actually have two different reference outputs. If nnn.out.nocache
exists, it is used as the reference output for -nocache; otherwise,
nnn.out stays valid for both cases.
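
The lookup rule described above can be sketched as follows. This is only an
illustration in Python, not the actual qemu-iotests code; the function name
pick_reference_output and its parameters are hypothetical.

```python
import os


def pick_reference_output(test_dir, seq, nocache):
    """Return the path of the reference output for test `seq`.

    Hypothetical helper: when running with -nocache, prefer
    `<seq>.out.nocache` if that file exists; in every other case
    fall back to the shared `<seq>.out`.
    """
    default = os.path.join(test_dir, seq + ".out")
    if nocache:
        candidate = default + ".nocache"
        if os.path.exists(candidate):
            return candidate
    return default
```

With this rule, only tests whose output really differs under -nocache need a
second reference file.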

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:10 +02:00
Kevin Wolf
79e40ab10e qemu-iotests: Whitespace cleanup
These scripts used to have a four-character indentation, with eight
consecutive spaces converted into a tab. Convert everything into spaces.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Cornelia Huck
8caf907f07 dataplane: Fix startup race.
Avoid trying to set up dataplane again if dataplane setup is already in
progress. This may happen if an eventfd is triggered during setup.

I saw this occasionally with an experimental s390 irqfd implementation:

virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> virtio_ccw_set_host_notifier
...
-> virtio_queue_set_host_notifier_fd_handler
-> virtio_queue_host_notifier_read
-> virtio_queue_notify_vq
-> virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> vring_setup
-> hostmem_init
-> memory_listener_register
-> BOOM

As virtio-ccw tries to follow what virtio-pci does, it might be triggerable
for other platforms as well.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
5daa74a6eb block: look for zero blocks in bs->file
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
918e92d71b block: add default get_block_status implementation for protocols
Protocols return raw data, so you can assume the offsets to pass
through unchanged.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
f5f7abcfd5 raw-posix: report unwritten extents as zero
These are created for example with XFS_IOC_ZERO_RANGE.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
63390a8d14 raw-posix: return get_block_status data and flags
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
facd6e2b5c docs, qapi: document qemu-img map
Eric Blake also requested including the output in qapi-schema.json,
so that it is published through the introspection mechanism.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
4c93a13b5d qemu-img: add a "map" subcommand
This command dumps the metadata of an entire chain, in either tabular or JSON
format.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
f0ad5712d5 block: return BDRV_BLOCK_ZERO past end of backing file
If the sectors are unallocated and we are past the end of the
backing file, they will read as zero.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
415b5b013c block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO
Alternatively, this could use a "discard zeroes data" flag returned
by bdrv_get_info.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
4bc74be997 block: return get_block_status data and flags for formats
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
4333bb7140 block: define get_block_status return value
Define the return value of get_block_status.  Bits 0, 1, 2 and 9-62
are valid; bit 63 (the sign bit) is reserved for errors.  Bits 3-8
are left for future extensions.

The return code is compatible with the old is_allocated API: if a driver
only returns 0 or 1 (aka BDRV_BLOCK_DATA) like is_allocated used to,
clients of is_allocated will not have any change in behavior.  Still,
we will return more precise information in the next patches and the
new definition of bdrv_is_allocated is already prepared for this.
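
As a rough illustration of this layout, the Python sketch below splits such a
return value into its parts. Only BDRV_BLOCK_DATA = 1 is taken from the text
above; the names and meanings given to bits 1 and 2, the treatment of bits
9-62 as an offset-like payload, and the function name are assumptions made
for this example.

```python
# Bit 0 is BDRV_BLOCK_DATA according to the message above. The names for
# bits 1 and 2 are assumptions for this illustration.
BDRV_BLOCK_DATA = 1 << 0
BDRV_BLOCK_ZERO = 1 << 1          # assumed name for bit 1
BDRV_BLOCK_OFFSET_VALID = 1 << 2  # assumed name for bit 2

FLAG_MASK = 0x7
# Bits 9..62 inclusive; bit 63 is the sign bit and bits 3..8 are reserved.
PAYLOAD_MASK = ((1 << 63) - 1) & ~((1 << 9) - 1)


def decode_block_status(ret):
    """Split a signed 64-bit status value into (flags, payload).

    A negative value means the sign bit is set, which is reserved for
    errors, so it is reported as an exception carrying the errno.
    """
    if ret < 0:
        raise OSError(-ret, "get_block_status failed")
    return ret & FLAG_MASK, ret & PAYLOAD_MASK
```

A driver that only ever returns 0 or 1 decodes to flags 0 or BDRV_BLOCK_DATA
with an empty payload, which is the compatibility case described above.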

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
b6b8a33354 block: introduce bdrv_get_block_status API
For now, bdrv_get_block_status is just another name for bdrv_is_allocated.
The next patches will add more flags.

This also touches all block drivers with a mostly mechanical rename.  The
sole exception is cow; because it calls cow_co_is_allocated from the read
code, we keep that function and make cow_co_get_block_status a wrapper.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
11212d8fa0 block: make bdrv_has_zero_init return false for copy-on-write-images
This helps implementing is_allocated on top of get_block_status.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
e4a86f88cc qemu-img: always probe the input image for allocated sectors
qemu-img convert can assume "that sectors which are unallocated in the
input image are present in both the output's and input's base images".

However it is only doing this if the output image returns true for
bdrv_has_zero_init().  Testing bdrv_has_zero_init() does not make much
sense if the output image is copy-on-write, because a copy-on-write
image is never initialized to zero (it is initialized to the content
of the backing file).

There is nothing here that makes has_zero_init images special.  The
input and output must be equal for the operation to make sense, and
that's it.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
d663640c04 block: expect errors from bdrv_co_is_allocated
Some bdrv_is_allocated callers do not expect errors, but the fallback
in qcow2.c might make other callers trip on assertion failures or
infinite loops.

Fix the callers to always look for errors.

Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
4f5786376e block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above distinction
Now that bdrv_is_allocated detects coroutine context, the two can
use the same code.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini
617ccb466e block: do not use ->total_sectors in bdrv_co_is_allocated
This is more robust when the device has removable media.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Paolo Bonzini
bdad13b9de block: make bdrv_co_is_allocated static
bdrv_is_allocated can detect coroutine context and go through a fast
path, similar to other block layer functions.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Paolo Bonzini
df2a6f29a5 block: keep bs->total_sectors up to date even for growable block devices
If a BlockDriverState is growable, after every write we need to
check if bs->total_sectors might have changed.  With this change,
bdrv_getlength no longer needs a system call.
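
The idea can be sketched in Python as below. This is a simplified model under
stated assumptions, not the block layer code: the class name, the method
names and the 512-byte sector size are all hypothetical.

```python
class GrowableDevice:
    """Toy model of a growable device that caches its own length."""

    def __init__(self, total_sectors=0):
        self.total_sectors = total_sectors

    def write(self, sector_num, nb_sectors):
        # The actual data transfer is omitted. After the write, extend the
        # cached size if the write reached past the previous end.
        self.total_sectors = max(self.total_sectors, sector_num + nb_sectors)

    def getlength(self, sector_size=512):
        # Answered from the cached value; nothing is asked of the host.
        return self.total_sectors * sector_size
```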

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Paolo Bonzini
e641c1e81e cow: do not call bdrv_co_is_allocated
As we change bdrv_is_allocated to gather more information from bs and
bs->file, it will become a bit slower.  It is still appropriate for online
jobs, but not for reads/writes.  Call the internal function instead.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Paolo Bonzini
26ae980492 cow: make writes go at a less indecent speed
Only sync once per write, rather than once per sector.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Paolo Bonzini
276cbc7f2f cow: make reads go at a decent speed
Do not do two reads for each sector; load each sector of the bitmap
and use bitmap operations to process it.

Writes are still dog slow!

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Max Reitz
0ca0b0d5f8 qmp: Documentation for BLOCK_IMAGE_CORRUPTED
Add an appropriate entry describing this event and its parameters into
qmp-events.txt.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
fa510ebffa block: use BDS ref for block jobs
Block jobs used drive_get_ref(drive_get_by_blockdev(bs)) to avoid BDS
being deleted. Now that we have a BDS reference count, and block jobs don't
care about dinfo, use it instead to get cleaner code. It is also the
safe way when a BDS has no drive info.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
38b54b6dc1 nbd: use BlockDriverState refcnt
Previously, nbd called drive_get_ref() on the drive of bs. A BDS doesn't
always have an associated dinfo, which nbd doesn't care about either. We
already have a BDS ref count, so use it to make it safe for a BDS w/o blockdev.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
c0777fe18b xen_disk: simplify blk_disconnect with refcnt
We call bdrv_attach_dev when initializing whether or not bs is created
locally, so call bdrv_detach_dev and let the refcnt handle the
lifecycle.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
8442cfd034 migration: omit drive ref as we have bdrv_ref now
block-migration.c does not actually use DriveInfo anywhere.  Hence it's
safe to drop the drive ref code; we really only care about referencing the BDS.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
4f6fd3491c block: make bdrv_delete() static
Manage BlockDriverState lifecycle with refcnt, so bdrv_delete() is no
longer public and should be called by bdrv_unref() if refcnt is
decreased to 0.

This change does not alter behavior because, effectively, no BDS has
multiple references yet: there is no caller of bdrv_ref() so far and only
bdrv_new() sets bs->refcnt to 1, so every bdrv_unref() currently deletes the BDS.
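
The lifecycle described here can be modelled with a few lines of Python. The
class and method names are hypothetical and only mirror the roles of
bdrv_new(), bdrv_ref(), bdrv_unref() and the now-private bdrv_delete().

```python
class RefCounted:
    """Toy object whose lifetime is managed by a reference count."""

    def __init__(self):
        # Creation hands out the first reference, like bdrv_new().
        self.refcnt = 1
        self.deleted = False

    def ref(self):
        self.refcnt += 1

    def unref(self):
        assert self.refcnt > 0, "unref on an object with no references"
        self.refcnt -= 1
        if self.refcnt == 0:
            self._delete()

    def _delete(self):
        # Private: only reachable through unref(), never called directly.
        self.deleted = True
```

As long as nobody calls ref(), the first unref() deletes the object, which is
why the change is behavior-preserving at this point in the series.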

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
9fcb025146 block: implement reference count for BlockDriverState
Introduce bdrv_ref/bdrv_unref to manage the lifecycle of
BlockDriverState. They are unused for now but will be used to replace
bdrv_delete() later.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
13c91cb7e2 iscsi: use bdrv_new() instead of stack structure
The BlockDriverState structure needs bdrv_new() to initialize refcnt. Don't
allocate a local structure variable and memset it to 0, because with the
coming refcnt implementation, bdrv_unref will crash if bs->refcnt is not
initialized to 1.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Fam Zheng
3d34c6cd99 vvfat: use bdrv_new() to allocate BlockDriverState
We need bdrv_new() to properly initialize the BDS; don't allocate memory
manually.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Alex Bligh
a94a3fac19 aio / timers: fix build of test/test-aio.c on non-linux platforms
tests/test-aio.c used pipe2 which is Linux only. Use qemu_pipe
and qemu_set_nonblock for portability. Addition of O_CLOEXEC
is a harmless bonus.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Stefan Weil
68dc036488 w32: Fix access to host devices (regression)
QEMU has failed to open host devices like \\.\PhysicalDrive0 (first hard disk)
for some time (commit 8a79380b8ef1b02d2abd705dd026a18863b09020?).

Those devices use hdev_open which did not use the latest API for options.
This resulted in a fatal runtime error:

  Block protocol 'host_device' doesn't support the option 'filename'

Duplicate code from raw_open to fix this.

Cc: qemu-stable@nongnu.org
Reported-by: David Brenner <david.brenner3@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Alexandre Derumier
b2e10493c7 add qemu-img convert -n option (skip target volume creation)
Add a -n option to skip volume creation on qemu-img convert.
This is useful for targets such as rbd / ceph, where the
target volume may already exist; we cannot always rely on
qemu-img convert to create the image because, depending on the
output format, there may be parameters which cannot be
specified through the qemu-img convert command line.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Max Reitz
b3f3a30f38 qemu-iotests: Adjust test result 039
The moved OFLAG_COPIED check in qcow2_check_refcounts results in a
different output from test 039 (mismatches are now found after the
general refcount check (as far as any remain)). This patch adjusts the
expected test result accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Benoît Canet
2024c1df43 block: Add iops_size to do the iops accounting for a given io size.
This feature can be used in cases where users are avoiding the iops limit by
doing jumbo I/Os hammering the storage backend.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Benoît Canet
3e9fab690d block: Add support for throttling burst max in QMP and the command line.
The max parameter of the leaky bucket throttling algorithm can be used to
allow the guest to do bursts.
The max value is a pool of I/O that the guest can use without being throttled
at all. Throttling is triggered once this pool is empty.
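
The burst behaviour can be illustrated with the Python sketch below. It is
not the throttling code added by this series: it models the "pool" directly
as a budget that refills at the average rate, and the class name, method
names and units are assumptions chosen for the example.

```python
class BurstBucket:
    """Toy throttle: a pool of `burst_max` units refilled at `avg` units/s."""

    def __init__(self, avg, burst_max):
        self.avg = float(avg)         # sustained rate, units per second
        self.max = float(burst_max)   # size of the burst pool
        self.pool = float(burst_max)  # units currently available

    def advance(self, seconds):
        # Time passing refills the pool, but never beyond its maximum.
        self.pool = min(self.max, self.pool + self.avg * seconds)

    def account(self, units):
        """Charge `units` of I/O and return the delay to apply in seconds.

        While the pool is not overdrawn the request is not throttled (0.0).
        Once it is overdrawn, the delay is the time needed to refill the
        deficit at the average rate.
        """
        self.pool -= units
        if self.pool >= 0:
            return 0.0
        return -self.pool / self.avg
```

For example, with avg=100 and burst_max=50, the first 50 units pass without
delay and a further 100 units are delayed by one second.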

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Benoît Canet
cc0681c454 block: Enable the new throttling code in the block layer.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Benoît Canet
f17cfe813c throttle: Add units tests
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Benoît Canet
5ddfffbdc5 throttle: Add a new throttling API implementing continuous leaky bucket.
Implement the continuous leaky bucket algorithm devised on IRC as a separate
module.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:07 +02:00
Anthony Liguori
df7131623d Merge remote-tracking branch 'bonzini/iommu-for-anthony' into staging
# By Jan Kiszka (2) and others
# Via Paolo Bonzini
* bonzini/iommu-for-anthony:
  exec: do tcg_commit only when tcg_enabled
  Revert "memory: Return -1 again on reads from unsigned regions"
  memory: Provide separate handling of unassigned io ports accesses
  exec: check offset_within_address_space for register subpage
  exec: fix writing to MMIO area with non-power-of-two length

Message-id: 1378401455-583-1-git-send-email-pbonzini@redhat.com
2013-09-05 13:38:53 -05:00
liguang
2641689a37 exec: do tcg_commit only when tcg_enabled
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:52 +02:00
Jan Kiszka
68a7439a15 Revert "memory: Return -1 again on reads from unsigned regions"
This reverts commit 9b8c692435.

The commit was wrong: We only return -1 on invalid accesses, not on
valid but unbacked ones. This broke various corner cases.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:48 +02:00
Jan Kiszka
3bb28b7208 memory: Provide separate handling of unassigned io ports accesses
Accesses to unassigned io ports shall return -1 on read and be ignored
on write. Ensure these properties via dedicated ops, decoupling us from
the memory core's handling of unassigned accesses.
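
A minimal Python model of such dedicated ops is shown below. The function
names and the (addr, size) signature are hypothetical; "-1" is represented
as an all-ones value of the access width in bytes.

```python
def unassigned_io_read(addr, size):
    """Read from an unassigned I/O port: every bit reads as 1 (i.e. -1)."""
    return (1 << (8 * size)) - 1


def unassigned_io_write(addr, value, size):
    """Write to an unassigned I/O port: silently ignored."""
    return None
```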

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:43 +02:00
Hu Tao
8826624970 exec: check offset_within_address_space for register subpage
If offset_within_address_space falls in a page, then we register a
subpage. So check offset_within_address_space rather than
offset_within_region.

Cc: qemu-stable@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Andreas Färber" <afaerber@suse.de>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:37 +02:00
Paolo Bonzini
098178f274 exec: fix writing to MMIO area with non-power-of-two length
The problem is introduced by commit 2332616 (exec: Support 64-bit
operations in address_space_rw, 2013-07-08).  Before that commit,
memory_access_size would only return 1/2/4.

Since alignment is already handled above, reduce l to the largest
power of two that is smaller than l.
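
The rounding step can be sketched in Python as follows; the function name is
hypothetical and the snippet only illustrates the arithmetic, not the code in
exec.c. A length that is already a power of two is returned unchanged.

```python
def round_down_to_power_of_two(l):
    """Return the largest power of two that does not exceed `l`."""
    if l <= 0:
        raise ValueError("length must be positive")
    # bit_length() is the position of the highest set bit, counted from 1.
    return 1 << (l.bit_length() - 1)
```

A 3-byte access is thus reduced to 2 bytes and a 7-byte access to 4 bytes;
the remainder is left for subsequent iterations.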

Cc: qemu-stable@nongnu.org
Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Tested-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:28 +02:00
Anthony Liguori
863a834157 Update mailmap
This makes get_maintainers.pl behave a little better.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-05 09:40:31 -05:00
Amit Shah
386a5a1e00 char: remove watch callback on chardev detach from frontend
If a frontend device releases the chardev (via unplug), the chr handlers
are set to NULL via qdev's exit callbacks invoking
qemu_chr_add_handlers().  If the chardev had a pending operation, a
callback will be invoked, which will try to access data in the
just-released frontend, causing a segfault.

Ensure the callbacks are disabled when frontends release chardevs.

This was seen when a virtio-serial port was unplugged when heavy
guest->host IO was in progress (causing a callback to be registered).
In the window in which the throttling was active, unplugging ports
caused a qemu segfault.

https://bugzilla.redhat.com/show_bug.cgi?id=985205

CC: <qemu-stable@nongnu.org>
Reported-by: Sibiao Luo <sluo@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2013-09-05 18:30:36 +05:30
Amit Shah
26da70c725 char: use common function to disable callbacks on chardev close
This deduplicates code used a lot of times.

CC: <qemu-stable@nongnu.org>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2013-09-05 18:30:31 +05:30
Amit Shah
7ba9addc16 char: move backends' io watch tag to CharDriverState
All the backends implement an io watcher tag for callbacks.  Move it to
CharDriverState from each backend's struct to make accessing the tag from
backend-neutral functions easier.

This will be used later to cancel a callback on chardev detach from a
frontend.

CC: <qemu-stable@nongnu.org>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2013-09-05 18:30:31 +05:30
Anthony Liguori
aaa6a40194 Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging
QOM CPUState refactorings / X86CPU

* Conversion of global CPU list to QTAILQ - preparing for CPU hot-unplug
* Document X86CPU magic numbers for CPUID cache info

# gpg: Signature made Tue 03 Sep 2013 10:59:22 AM CDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Andreas Färber (3) and Eduardo Habkost (1)
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony:
  target-i386: Use #defines instead of magic numbers for CPUID cache info
  cpu: Replace qemu_for_each_cpu()
  cpu: Use QTAILQ for CPU list
  a15mpcore: Use qemu_get_cpu() for generic timers
2013-09-03 12:33:32 -05:00
Anthony Liguori
bb7d4d82b6 Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Max Reitz (11) and others
# Via Kevin Wolf
* kwolf/for-anthony: (26 commits)
  qemu-iotests: Overlapping cluster allocations
  qcow2_check: Mark image consistent
  qcow2-refcount: Repair shared refcount blocks
  qcow2-refcount: Repair OFLAG_COPIED errors
  qcow2-refcount: Move OFLAG_COPIED checks
  qcow2: Employ metadata overlap checks
  qcow2: Metadata overlap checks
  qcow2: Add corrupt bit
  qemu-iotests: Snapshotting zero clusters
  qcow2-refcount: Snapshot update for zero clusters
  option: Add assigned flag to QEMUOptionParameter
  gluster: Abort on AIO completion failure
  block: Remove old raw driver
  switch raw block driver from "raw.o" to "raw_bsd.o"
  raw_bsd: register bdrv_raw
  raw_bsd: add raw_create_options
  raw_bsd: introduce "special members"
  raw_bsd: add raw_create()
  raw_bsd: emit debug events in bdrv_co_readv() and bdrv_co_writev()
  add skeleton for BSD licensed "raw" BlockDriver
  ...

Message-id: 1378111792-20436-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-03 12:32:46 -05:00
Anthony Liguori
5a93d5c2ab Merge remote-tracking branch 'mjt/trivial-patches' into staging
# By Stefan Weil (6) and others
# Via Michael Tokarev
* mjt/trivial-patches:
  aio / timers: use g_usleep() not sleep()
  adlib: sort offsets in portio registration
  qmp: fix integer usage in examples
  tci: Remove function tcg_out64 (fix broken build)
  target-arm: Report unimplemented opcodes (LOG_UNIMP)
  pflash_cfi02.c: fix debug macro
  configure: Remove unneeded redirections of stderr (pkg-config --exists)
  configure: Remove unneeded redirections of stderr (pkg-config --cflags, --libs)
  configure: Don't write .pyc files by default (python -B)
  curl: qemu_bh_new() can never return NULL
  slirp/arp_table.c: Avoid shifting into sign bit of signed integers
  configure: disable clang -Wstring-plus-int warning
  rdma: silly ipv6 bugfix
  misc: Fix some typos in names and comments
  slirp: Port redirection option behave differently on Linux and Windows

Message-id: 1378119695-14568-1-git-send-email-mjt@msgid.tls.msk.ru
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-03 12:31:44 -05:00
Anthony Liguori
9ea0f58fc7 Merge remote-tracking branch 'kraxel/usb.88' into staging
# By Gerd Hoffmann (10) and Marcel Apfelbaum (1)
# Via Gerd Hoffmann
* kraxel/usb.88:
  usb/dev-hid: Modified usb-tablet category from Misc to Input
  Revert "usb-hub: report status changes only once"
  usb-hub: add tracepoint for status reports
  usb: parallelize usb3 streams
  uas: add property for request logging
  xhci: reset port when disabling slot
  xhci: emulate intr endpoint intervals correctly
  xhci: fix endpoint interval calculation
  xhci: add port to slot_address tracepoint
  xhci: add tracepoint for endpoint state changes
  xhci: remove leftover debug printf

Message-id: 1378117055-29620-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-03 12:31:30 -05:00
Anthony Liguori
9889e04ac1 Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pc,pci,virtio fixes and cleanups

This includes pc and pci cleanups and enhancements,
and a virtio bugfix for level interrupts.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 01 Sep 2013 03:15:36 AM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found

# By Michael S. Tsirkin (3) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
  virtio_pci: fix level interrupts with irqfd
  pc: reduce duplication, fix PIIX descriptions
  hw: Clean up bogus default boot order
  pci: add config space access traces
  pc: fix regression for 64 bit PCI memory
  pci: Introduce helper to retrieve a PCI device's DMA address space

Message-id: 1378023590-11109-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-03 12:31:07 -05:00
Anthony Liguori
5cff81f098 Merge remote-tracking branch 'afaerber/tags/qom-devices-for-anthony' into staging
QOM device refactorings

* Fix QOM and ISA documentation errors
* Extend object_initialize() et al. to check the instance size

# gpg: Signature made Fri 30 Aug 2013 02:19:48 PM CDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Andreas Färber (14) and others
# Via Andreas Färber
* afaerber/tags/qom-devices-for-anthony:
  isa: Fix documentation of isa_register_portio_list()
  qom: Assert instance size in object_initialize_with_type()
  qom: Pass available size to object_initialize()
  qdev: Pass size to qbus_create_inplace()
  virtio-mmio: Pass size to virtio_mmio_bus_new()
  virtio-ccw: Pass size to virtio_ccw_bus_new()
  s390-virtio-bus: Pass size to virtio_s390_bus_new()
  virtio-pci: Pass size to virtio_pci_bus_new()
  usb: Pass size to usb_bus_new()
  scsi: Pass size to scsi_bus_new()
  pci: Pass size to pci_bus_new_inplace()
  ide: Pass size to ide_bus_new()
  ipack: Pass size to ipack_bus_new_inplace()
  intel-hda: Pass size to hda_codec_bus_init()
  qom: Fix object_initialize_with_type() argument name in documentation
  virtio: Remove unnecessary OBJECT() casts
  object: Fix typo in qom/object.h
2013-09-03 12:30:51 -05:00
Eduardo Habkost
5e891bf8fd target-i386: Use #defines instead of magic numbers for CPUID cache info
This is an attempt to make the CPUID cache topology code clearer, by
replacing the magic numbers in the code with #defines, and moving all
the cache information to the same place in the file.

I took care of comparing the assembly output of compiling
target-i386/cpu.c before and after applying this change, to make sure
not a single bit was changed on cpu_x86_cpuid() before and after
applying this patch (unfortunately I had to manually check existing
differences, because of __LINE__ expansions on
object_class_dynamic_cast_assert() calls).

This even keeps the code bug-compatible with the previous version: today
the cache information returned on AMD cache information leaves (CPUID
0x80000005 & 0x80000006) do not match the information returned on CPUID
leaves 2 and 4. The L2 cache information on CPUID leaf 2 also doesn't
match the information on CPUID leaf 4. The new constants should make it
easier to eventually fix those inconsistencies. All inconsistencies I
have found are documented in code comments.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 12:25:56 +02:00
Andreas Färber
38fcbd3f08 cpu: Replace qemu_for_each_cpu()
It was introduced to loop over CPUs from target-independent code, but
since commit 182735efaf target-independent
CPUState is used.

A loop can be considered more efficient than function calls in a loop,
and CPU_FOREACH() hides implementation details just as well, so use that
instead.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 12:25:55 +02:00
Andreas Färber
bdc44640cb cpu: Use QTAILQ for CPU list
Introduce CPU_FOREACH(), CPU_FOREACH_SAFE() and CPU_NEXT() shorthand
macros.
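
A minimal sketch of what such shorthand macros might look like, using the BSD
TAILQ macros from <sys/queue.h> as a stand-in for QTAILQ. The DemoCPU type,
the list name and the DEMO_* macros are hypothetical and are not the
definitions added by this commit.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for CPUState, with an embedded list link. */
typedef struct DemoCPU {
    int cpu_index;
    TAILQ_ENTRY(DemoCPU) node;
} DemoCPU;

TAILQ_HEAD(DemoCPUList, DemoCPU);
static struct DemoCPUList demo_cpus = TAILQ_HEAD_INITIALIZER(demo_cpus);

/* Shorthand macros that hide the list head and link field from callers. */
#define DEMO_CPU_FOREACH(cpu) TAILQ_FOREACH(cpu, &demo_cpus, node)
#define DEMO_CPU_NEXT(cpu)    TAILQ_NEXT(cpu, node)

static void demo_cpu_add(DemoCPU *cpu)
{
    TAILQ_INSERT_TAIL(&demo_cpus, cpu, node);
}

/* Plain loop over the list; no per-element callback is needed. */
static DemoCPU *demo_get_cpu(int index)
{
    DemoCPU *cpu;

    DEMO_CPU_FOREACH(cpu) {
        if (cpu->cpu_index == index) {
            return cpu;
        }
    }
    return NULL;
}
```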

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 12:25:55 +02:00
Andreas Färber
27013bf20d a15mpcore: Use qemu_get_cpu() for generic timers
This simplifies the loop and aids with refactoring of CPU list.

Requested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 11:30:04 +02:00
Aurelien Jarno
545825d4cd Merge branch 'tcg-next' of git://github.com/rth7680/qemu
* 'tcg-next' of git://github.com/rth7680/qemu: (29 commits)
  tcg-i386: Make use of zero-extended memory helper routines
  tcg: Introduce zero and sign-extended versions of load helpers
  exec: Split softmmu_defs.h
  target: Include softmmu_exec.h where forgotten
  exec: Rename USUFFIX to LSUFFIX
  tcg-i386: Don't perform GETPC adjustment in TCG code
  exec: Reorganize the GETRA/GETPC macros
  configure: Allow x32 as a host
  tcg-i386: Adjust tcg_out_tlb_load for x32
  tcg-i386: Use intptr_t appropriately
  tcg: Fix jit debug for x32
  tcg: Use appropriate types in tcg_reg_alloc_call
  tcg: Change tcg_out_ld/st offset to intptr_t
  tcg: Change tcg_gen_exit_tb argument to uintptr_t
  tcg: Use uintptr_t in TCGHelperInfo
  tcg: Change relocation offsets to intptr_t
  tcg: Change memory offsets to intptr_t
  tcg: Change frame pointer offsets to intptr_t
  tcg: Define TCG_ptr properly
  tcg: Define TCG_TYPE_PTR properly
  ...
2013-09-03 01:35:43 +02:00
Aurelien Jarno
32f3bd6d4d Merge branch 'ppc-for-upstream' of git://github.com/agraf/qemu
* 'ppc-for-upstream' of git://github.com/agraf/qemu:
  PPC: spapr: iommu: rework traces
  spapr: add "stop-self" RTAS call required to support hot CPU unplug
  PPC: KVM: Compile fix for qemu_notify_event
  pseries: Add H_SET_MODE hcall to change guest exception endianness
  xics: move registration of global state to realize()
  spapr-pci: rework MSI/MSIX
  target-ppc: Use #define instead of opencoding SLB valid bit
  spapr-pci: fix config space access to support bridges
  target-ppc: fix bit extraction for FPBF and FPL
  ppc405_boards: Don't enforce presence of firmware for qtest
  ppc405_uc: Disable debug output
  ppc405_boards: Disable debug output
  ppc: virtex_ml507: QEMU_OPTION_dtb support for this machine.
  disas/ppc.c: Fix little endian disassembly
  target-ppc: POWER7 supports the MSR_LE bit
  target-ppc: USE LPCR_ILE to control exception endian on POWER7
  pseries: Fix stalls on hypervisor virtual console
  PPC: E500: Generate device tree on reset
2013-09-03 01:35:25 +02:00
Aurelien Jarno
3207bf2549 tcg/mips: only enable ext8s/ext16s ops on MIPS32R2
On MIPS ext8s and ext16s ops are implemented with a dedicated
instruction only on MIPS32R2, otherwise the same kind of implementation
as at TCG level (shift left followed by shift right) is used.

Change that by only implementing the ext8s and ext16s ops on MIPS32R2 so
that optimizations can be done by the optimizer. Use an inline version to
avoid having to test again for MIPS32R2 instructions. Keep the shift
implementation for the ld/st routines.
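
For reference, the "shift left followed by shift right" form of sign extension
can be sketched in C as below. The function names are hypothetical. The sketch
assumes that converting to int32_t wraps and that right-shifting a negative
value is an arithmetic shift, which holds on mainstream compilers but is
implementation-defined in ISO C.

```c
#include <stdint.h>

/* Sign-extend the low 8 bits: move them to the top, then shift back. */
static int32_t demo_ext8s_shift(uint32_t x)
{
    return (int32_t)(x << 24) >> 24;
}

/* Sign-extend the low 16 bits the same way. */
static int32_t demo_ext16s_shift(uint32_t x)
{
    return (int32_t)(x << 16) >> 16;
}

/* Equivalent form that a compiler can map to a dedicated instruction. */
static int32_t demo_ext8s_cast(uint32_t x)
{
    return (int32_t)(int8_t)x;
}
```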

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-09-03 01:34:46 +02:00
Aurelien Jarno
df81ff51d5 tcg/mips: inline bswap16/bswap32 ops
Use an inline version for the bswap16 and bswap32 ops to avoid
testing for MIPS32R2 instructions availability, as these ops are
only available in that case.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-09-03 01:34:46 +02:00
Aurelien Jarno
988902fc3b tcg/mips: detect available host instructions at runtime
Now that TCG supports enabling and disabling ops at runtime, it's
possible to detect the available host instructions at runtime, and
enable the corresponding ops accordingly.

Unfortunately it's not easy to probe for available instructions on
MIPS: the information is only partially available in /proc/cpuinfo, and
not available in AUXV. This patch therefore probes for the instructions
by trying to execute them and by catching a possible SIGILL signal.
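
The general technique can be sketched as follows. This is not the code from
the patch: the function names are hypothetical, and the "unsupported
instruction" is simulated with raise(SIGILL) so that the sketch behaves the
same on any POSIX host.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf demo_probe_env;

/* On SIGILL, jump back to the probe instead of terminating the process. */
static void demo_probe_sigill(int sig)
{
    (void)sig;
    siglongjmp(demo_probe_env, 1);
}

/* Returns true if try_insn() completed, false if it raised SIGILL. */
static bool demo_probe_insn(void (*try_insn)(void))
{
    struct sigaction sa, old;
    volatile bool ok = false;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = demo_probe_sigill;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, &old);

    /* Save the signal mask so it is restored after the handler jumps. */
    if (sigsetjmp(demo_probe_env, 1) == 0) {
        try_insn();
        ok = true;
    }

    sigaction(SIGILL, &old, NULL);   /* restore the previous handler */
    return ok;
}

/* Stand-ins: a real probe would execute the instruction via inline asm. */
static void demo_insn_supported(void) { }
static void demo_insn_unsupported(void) { raise(SIGILL); }
```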

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-09-03 01:34:46 +02:00
Richard Henderson
6fb5874590 tcg-i386: Make use of zero-extended memory helper routines
For 8 and 16-bit unsigned loads, rely on the zero-extension
from the helper and use a smaller 32-bit move insn.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:31 -07:00
Richard Henderson
c8f94df593 tcg: Introduce zero and sign-extended versions of load helpers
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:31 -07:00
Richard Henderson
e58eb53413 exec: Split softmmu_defs.h
The _cmmu helpers can be moved to exec-all.h.  The helpers that are
used from TCG will shortly need access to tcg_target_long so move
their declarations into tcg.h.

This requires minor include adjustments to all TCG backends.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
b1669e5e32 target: Include softmmu_exec.h where forgotten
Several targets forgot to include softmmu_exec.h, which would
break them with a header cleanup to follow.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
701e3a5cc0 exec: Rename USUFFIX to LSUFFIX
In a following patch, there will be confusion between multiple "unsigned"
suffixes; rename this one so as to imply "load".

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
5bcebc253c tcg-i386: Don't perform GETPC adjustment in TCG code
Since we now perform it inside the helper, no need to do it here.
This also lets us perform a tail-call from the store slow path to
the helper.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
0f842f8a24 exec: Reorganize the GETRA/GETPC macros
Always define GETRA; use __builtin_extract_return_addr, rather than
having a special case for s390.  Split GETPC_ADJ out of GETPC; use 2
universally, rather than having a special case for arm.

Rename GETPC_LDST to GETRA_LDST to indicate that it does not
contain the GETPC_ADJ value.  Likewise with GETPC_EXT to GETRA_EXT.

Perform the GETPC_ADJ adjustment inside helper_ret_ld/st.  This will
allow backends to pass along the "true" return address rather than
the massaged GETPC value.  In the meantime, double application of
GETPC_ADJ does not hurt, since the call insn in all ISAs is at least
4 bytes long.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
c72b26ec92 configure: Allow x32 as a host
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
d5dad3be31 tcg-i386: Adjust tcg_out_tlb_load for x32
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
357e3d8a29 tcg-i386: Use intptr_t appropriately
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
edee2579ae tcg: Fix jit debug for x32
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
d3452f1f40 tcg: Use appropriate types in tcg_reg_alloc_call
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
a05b5b9be0 tcg: Change tcg_out_ld/st offset to intptr_t
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
8cfd04959a tcg: Change tcg_gen_exit_tb argument to uintptr_t
And update all users.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:30 -07:00
Richard Henderson
48bc6bab47 tcg: Use uintptr_t in TCGHelperInfo
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
2ba7fae29e tcg: Change relocation offsets to intptr_t
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
2f2f244d02 tcg: Change memory offsets to intptr_t
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
e2c6d1b42d tcg: Change frame pointer offsets to intptr_t
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
8b73d49f53 tcg: Define TCG_ptr properly
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
d289837eef tcg: Define TCG_TYPE_PTR properly
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
78cd7b835e tcg: Allow TCG_TARGET_REG_BITS to be specified independently
There are several hosts for which it would be useful to use the
available 64-bit registers in a 32-bit pointer environment.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
3e9bd63acf tcg: Fix next_tb type in cpu_exec
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
04d5a1da70 tcg: Change tcg_qemu_tb_exec return to uintptr_t
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
b93949ef6a tcg: Change flush_icache_range arguments to uintptr_t
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
35aa3fb387 qtest: Fix FMT_timeval vs time_t
Since FMT_timeval unconditionally uses %ld for both tv_sec and tv_usec,
and already casts tv_usec to long, also cast tv_sec to long.
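
A small sketch of the pattern, assuming a "%ld.%06ld" style format (the exact
FMT_timeval string is not shown in this log, so DEMO_FMT_TIMEVAL is a
hypothetical stand-in). On an ABI where time_t is wider than long, passing
tv_sec uncast to %ld would mismatch the format.

```c
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical stand-in for the format macro discussed above. */
#define DEMO_FMT_TIMEVAL "%ld.%06ld"

/* Both fields are cast to long so they always match the %ld conversions. */
static int demo_format_timeval(char *buf, size_t len, const struct timeval *tv)
{
    return snprintf(buf, len, DEMO_FMT_TIMEVAL,
                    (long)tv->tv_sec, (long)tv->tv_usec);
}
```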

Cc: Andreas Färber <afaerber@suse.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
01547f7f92 tcg: Constant fold div, rem
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
32f5717f07 tcg-ppc64: Implement muluh, mulsh
Using these instead of mulu2 and muls2 lets us avoid having to do argument
overlap analysis in the backend.  Normal register allocation will DTRT.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
3c9a8f1756 tcg-mips: Implement mulsh, muluh
With the optimization in tcg_liveness_analysis,
we can avoid the MFLO when it is unused.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Richard Henderson
03271524b6 tcg: Add muluh and mulsh opcodes
Use them in places where mulu2 and muls2 are used.
Optimize mulx2 with dead low part to mulxh.
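
In C terms, a "high part" multiply returns only the upper half of the
double-width product, which is what remains of a mulu2/muls2 once the low
half is dead. The 32-bit helpers below are a hypothetical sketch of those
semantics, not TCG code; the signed variant assumes an arithmetic right shift
of negative values.

```c
#include <stdint.h>

/* Upper 32 bits of the unsigned 32x32 -> 64 bit product. */
static uint32_t demo_muluh32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Upper 32 bits of the signed 32x32 -> 64 bit product. */
static int32_t demo_mulsh32(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 32);
}
```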

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-09-02 09:08:29 -07:00
Marcel Apfelbaum
31efd2e883 usb/dev-hid: Modified usb-tablet category from Misc to Input
usb-tablet device was wrongly assigned to Misc category

Reported-by: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:20 +02:00
Gerd Hoffmann
bdebd6ee81 Revert "usb-hub: report status changes only once"
This reverts commit a309ee6e0a.

This isn't in line with the usb specification and adds regressions,
win7 fails to drive the usb hub for example.

Was added because it "solved" the issue of hubs interacting badly
with the xhci host controller.  Now with the root cause being fixed
in xhci (commit <FIXME>) we can revert this one.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:20 +02:00
Gerd Hoffmann
b8cbc1374a usb-hub: add tracepoint for status reports
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:20 +02:00
Gerd Hoffmann
c96c41ed0d usb: parallelize usb3 streams
usb3 bulk endpoints with streams are implicitly pipelined now,
so the requests will actually be processed in parallel.  Also
allow them to complete out-of-order.

Fixes stalls in the uas driver.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:20 +02:00
Gerd Hoffmann
1556a8fc38 uas: add property for request logging
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:19 +02:00
Gerd Hoffmann
5c67dd7b48 xhci: reset port when disabling slot
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:19 +02:00
Gerd Hoffmann
4d7a81c06f xhci: emulate intr endpoint intervals correctly
Respect the interval for interrupt endpoints, so we don't finish
transfers as fast as possible but at the rate configured by the guest.

Fixes guest deadlocks triggered by interrupt storms.

Cc:
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:19 +02:00
Gerd Hoffmann
ca7162782a xhci: fix endpoint interval calculation
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:19 +02:00
Gerd Hoffmann
65d81ed402 xhci: add port to slot_address tracepoint
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:19 +02:00
Gerd Hoffmann
1c82392a15 xhci: add tracepoint for endpoint state changes
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:19 +02:00
Gerd Hoffmann
5219042274 xhci: remove leftover debug printf
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-09-02 11:06:19 +02:00
Max Reitz
ca0eca91b6 qemu-iotests: Overlapping cluster allocations
A new test on corrupted images with overlapping cluster allocations.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-02 10:15:15 +02:00
Max Reitz
24530f3e06 qcow2_check: Mark image consistent
If no corruptions remain after an image repair (and no errors have been
encountered), clear the corrupt flag in qcow2_check.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-02 10:15:15 +02:00
Max Reitz
afa50193cd qcow2-refcount: Repair shared refcount blocks
If the refcount of a refcount block is greater than one, we can at least
try to repair that problem by duplicating the affected block.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-02 10:06:59 +02:00
Alexey Kardashevskiy
7e472264e9 PPC: spapr: iommu: rework traces
This converts old style fprintf to traces.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: change patch subject]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:43 +02:00
Alexey Kardashevskiy
59760f2dba spapr: add "stop-self" RTAS call required to support hot CPU unplug
PAPR+ requires two RTAS calls to be supported by the hypervisor in
order to allow hotplugging VCPUs from the guest. The "start-cpu" RTAS
call was already there but "stop-self" was not.

This adds the "stop-self" RTAS call.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:42 +02:00
Alexander Graf
7bb438b6a1 PPC: KVM: Compile fix for qemu_notify_event
The function qemu_notify_event is defined by a header that we don't
include in the PPC KVM code. Include it to get the code building
again.

  target-ppc/kvm_ppc.c: In function 'kvmppc_timer_hack':
  target-ppc/kvm_ppc.c:26:5: error: implicit declaration of function 'qemu_notify_event' [-Werror=implicit-function-declaration]
  target-ppc/kvm_ppc.c:26:5: error: nested extern declaration of 'qemu_notify_event' [-Werror=nested-externs]

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:42 +02:00
Anton Blanchard
42561bf2e4 pseries: Add H_SET_MODE hcall to change guest exception endianness
H_SET_MODE is used for controlling various partition settings. One
of these settings is the endianness a guest takes its exceptions in.

Signed-off-by: Anton Blanchard <anton@samba.org>
[agraf: fix whitespace]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:42 +02:00
Alexey Kardashevskiy
33a0e5d8c5 xics: move registration of global state to realize()
Registration of global state belongs in realize, so move it there.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:42 +02:00
Alexey Kardashevskiy
f1c2dc7c86 spapr-pci: rework MSI/MSIX
On the sPAPR platform a guest allocates MSI/MSIX vectors via RTAS
hypercalls which return global IRQ numbers to a guest so it only
operates with those and never touches MSIMessage.

Therefore MSIMessage handling is completely hidden in QEMU.

Previously every sPAPR PCI host bridge implemented its own MSI window
to catch msi_notify()/msix_notify() calls from QEMU devices (virtio-pci
or vfio) and route them to the guest via qemu_pulse_irq().
MSIMessage used to be encoded as:
	.addr - address within the PHB MSI window;
	.data - the device index on PHB plus vector number.
The MSI MR write function translated this MSIMessage to a global IRQ
number and called qemu_pulse_irq().

However the total number of IRQs is not really big (at the moment it is
1024 IRQs starting from 4096) and even 16bit data field of MSIMessage
seems to be enough to store an IRQ number there.

This simplifies MSI handling in sPAPR PHB. Specifically, this does:
1. remove a MSI window from a PHB;
2. add a single memory region for all MSIs to sPAPREnvironment
and spapr_pci_msi_init() to initialize it;
3. encode MSIMessage as:
    * .addr - a fixed address of SPAPR_PCI_MSI_WINDOW==0x40000000000ULL;
    * .data as an IRQ number.
4. change IRQ allocator to align first IRQ number in a block for MSI.
MSI uses lower bits to specify the vector number so the first IRQ has to
be aligned. MSIX does not need any special allocator though.
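
The alignment requirement in point 4 can be illustrated with the sketch below.
It assumes the block is aligned to the vector count rounded up to a power of
two, so that a vector number can be OR-ed into the low bits of the first IRQ;
the function names are hypothetical and this is not the allocator from the
patch.

```c
/* Smallest power of two that is >= n (n is expected to be small). */
static unsigned demo_pow2ceil(unsigned n)
{
    unsigned p = 1;

    while (p < n) {
        p <<= 1;
    }
    return p;
}

/* First IRQ of an MSI block: round next_free up to the block alignment. */
static unsigned demo_msi_first_irq(unsigned next_free, unsigned num_vectors)
{
    unsigned align = demo_pow2ceil(num_vectors);

    return (next_free + align - 1) & ~(align - 1);
}

/* With an aligned base, the vector number occupies the low bits. */
static unsigned demo_msi_vector_irq(unsigned first_irq, unsigned vector)
{
    return first_irq | vector;
}
```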

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:42 +02:00
Aneesh Kumar K.V
a3cedb541c target-ppc: Use #define instead of opencoding SLB valid bit
Use SLB_ESID_V instead of (1 << 27) in the code

Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:42 +02:00
Alexey Kardashevskiy
5dac82ce0d spapr-pci: fix config space access to support bridges
spapr-pci config space accessors use find_dev() to find a PCI device.
However find_dev() only searched on a primary bus and did not do
recursive search through secondary buses so config space access was not
possible for devices other than those on a primary bus.

This fixed find_dev() by using the PCI API pci_find_device() function.
This effectively enabled pci bridges on spapr.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:42 +02:00
Aurelien Jarno
779f659021 target-ppc: fix bit extraction for FPBF and FPL
Bit extraction for the FP BF and L field of the MTFSFI and MTFSF
instructions is wrong and doesn't match the reference manual (which
explains the bit numbers in big endian format). It has been broken in
commit 7d08d85645.
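
The numbering pitfall can be sketched as follows: when bit 0 denotes the most
significant bit of a 32-bit instruction word, a field that starts at bit
`start` and is `len` bits wide has to be shifted down by 32 - start - len.
The helper name is hypothetical and the field positions used with it are
example values only.

```c
#include <assert.h>
#include <stdint.h>

/* Extract a field given in MSB-0 (big-endian) bit numbering. */
static uint32_t demo_extract_msb0(uint32_t insn, unsigned start, unsigned len)
{
    assert(len >= 1 && len < 32 && start + len <= 32);
    return (insn >> (32 - start - len)) & ((1u << len) - 1);
}
```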

This patch fixes this, which in turn fixes the problem reported by
Khem Raj about the floor() function of libm.

Reported-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
CC: qemu-stable@nongnu.org (1.6)
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:42 +02:00
Andreas Färber
ad9990acc5 ppc405_boards: Don't enforce presence of firmware for qtest
Adopt error_report() while at it.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:41 +02:00
Andreas Färber
0d84382ed9 ppc405_uc: Disable debug output
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:41 +02:00
Andreas Färber
bf2ed917d7 ppc405_boards: Disable debug output
Also move one stray debug output into an #ifdef.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:41 +02:00
Efimov Vasily
daf285b606 ppc: virtex_ml507: QEMU_OPTION_dtb support for this machine.
QEMU has a 'dtb' option for specifying the device tree file for the kernel.
The patch adds support for this option to the 'virtex_ml507' machine
implementation.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:41 +02:00
Anton Blanchard
95f5b6e3af disas/ppc.c: Fix little endian disassembly
Use info->endian to select the endian of the instruction to
be disassembled.
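
A minimal sketch of endian-aware instruction fetch, assuming a 4-byte
instruction buffer. The DemoEndian type and function name are hypothetical
and only mirror the idea of consulting an endianness field before decoding.

```c
#include <stdint.h>

typedef enum { DEMO_ENDIAN_BIG, DEMO_ENDIAN_LITTLE } DemoEndian;

/* Assemble a 32-bit instruction word from memory in the requested order. */
static uint32_t demo_read_insn(const uint8_t b[4], DemoEndian endian)
{
    if (endian == DEMO_ENDIAN_BIG) {
        return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
               ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    }
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}
```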

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:41 +02:00
Anton Blanchard
bb429d2247 target-ppc: POWER7 supports the MSR_LE bit
Add MSR_LE to the msr_mask for POWER7.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:41 +02:00
Anton Blanchard
1e0c7e554e target-ppc: USE LPCR_ILE to control exception endian on POWER7
On POWER7, LPCR_ILE is used to control what endian guests take
their exceptions in so use it instead of MSR_ILE.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:41 +02:00
Anton Blanchard
7770b6f78a pseries: Fix stalls on hypervisor virtual console
A number of users are reporting stalls when using the pseries
hypervisor virtual console.

A simple test case is to paste 15 or 17 characters at a time
into the console. Pasting 15 characters at a time works fine
but pasting 17 characters hangs for a random amount of time.
Other activity (network, qemu monitor etc) unblocks it.

If qemu-char tries to send more than 16 characters at once,
vty_can_receive returns false. At this point we have to
wait for the guest to consume that output. Everything is good
so far.

The problem occurs when the guest does consume the output.
We need to signal back to the qemu-char layer that we are
ready for more input. Without this we block until something
else kicks us (eg network activity).

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:41 +02:00
Alexander Graf
28290f37e2 PPC: E500: Generate device tree on reset
Today we generate the device tree once on machine initialization and then
store the finalized blob in memory to reload it on reset.

This is bad for 2 reasons. First we potentially waste a bunch of RAM for no
good reason, as we have all information required to regenerate the device
tree available anyways.

The second reason is even more important. On machine init when we generate
the device tree for the first time, we don't have all of the devices fully
initialized yet. But the device tree needs to potentially walk devices to
put information about them into the device tree.

Move the generation into a reset function. That way we just generate it new
every time we reset, solving both of the above issues.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-09-02 10:06:40 +02:00
Alex Bligh
fcdda211f9 aio / timers: use g_usleep() not sleep()
sleep() apparently doesn't exist under mingw. Use g_usleep for
portability.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 20:02:45 +04:00
Hervé Poussineau
2b21fb57af adlib: sort offsets in portio registration
This fixes the following assert when -device adlib is used:
ioport.c:240: portio_list_add: Assertion `pio->offset >= off_last' failed.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:59:30 +04:00
Eric Blake
586b546657 qmp: fix integer usage in examples
Per the qapi schema, block_set_io_throttle takes most arguments
as ints, not strings.

* qmp-commands.hx (block_set_io_throttle): Use correct type.  Fix
whitespace and a copy-paste bug in the process.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:46:58 +04:00
Stefan Weil
a32b12741b tci: Remove function tcg_out64 (fix broken build)
Commit ac26eb69a3 added tcg_out64 to tcg/tcg.c.
tcg/tci/tcg-target.c already had a nearly identical implementation which is
now removed to fix a compiler error.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:36:16 +04:00
Stefan Weil
e0c270d946 target-arm: Report unimplemented opcodes (LOG_UNIMP)
These unimplemented opcodes are handled like illegal opcodes, but
they are used in existing code. We should at least report when they
are executed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:34:32 +04:00
Antony Pavlov
56f99ea19b pflash_cfi02.c: fix debug macro
If PFLASH_DEBUG is enabled then we have some build errors:

hw/block/pflash_cfi02.c: In function ‘pflash_timer’:
hw/block/pflash_cfi02.c:128:5: error: expected ‘)’ before string constant
hw/block/pflash_cfi02.c:128:5: error: too few arguments to function ‘fprintf’

This patch fixes the problem.

Signed-off-by: Antony Pavlov <antonynpavlov@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:32:42 +04:00
Stefan Weil
65d5d3f922 configure: Remove unneeded redirections of stderr (pkg-config --exists)
Predicate options (--exists, --atleast-version, ...) of pkg-config don't
print error messages to stderr, so redirecting stderr is not necessary.

Combining a predicate option with --modversion is not necessary for tests.
Instead of testing with --modversion, --exists can be used.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:26:08 +04:00
Stefan Weil
ca871ec861 configure: Remove unneeded redirections of stderr (pkg-config --cflags, --libs)
For existing libraries, pkg-config --cflags and pkg-config --libs won't
print error messages to stderr, so redirecting stderr is not necessary.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:26:00 +04:00
Stefan Weil
1d984a67a9 configure: Don't write .pyc files by default (python -B)
When a Python script is run, Python normally writes bytecode into a .pyc file.
QEMU's build process uses several Python scripts which are called from
configure or make.

The generated .pyc files take disk space without being of much use, because
those scripts are short, not time critical and only called a few times.

Python's option -B disables writing of .pyc files. QEMU now uses "python -B"
as default, but it is still possible to choose a different call by passing
--python=PYTHON to configure.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:14:49 +04:00
Stefan Hajnoczi
5b21a2ae4d curl: qemu_bh_new() can never return NULL
Drop error code path which cannot be taken since qemu_bh_new() does not
return NULL.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:11:56 +04:00
Peter Maydell
ed6bc28e8a slirp/arp_table.c: Avoid shifting into sign bit of signed integers
"0xf << 28" shifts right into the sign bit, since 0xf is a signed
integer. Use the 'U' suffix to force an unsigned shift to avoid
this undefined behaviour and a clang sanitizer warning.
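
To illustrate: with a 32-bit int, 0xf << 28 is not representable in the signed
type, whereas 0xfU << 28 is an ordinary unsigned shift. The helper names
below are hypothetical and are not taken from slirp.

```c
#include <stdint.h>

/* Unsigned constant: the shift is well defined and yields 0xf0000000. */
static uint32_t demo_top_nibble_mask(void)
{
    return 0xfU << 28;
}

/* Example use: test whether the top four bits of a value are all clear. */
static int demo_top_nibble_clear(uint32_t value)
{
    return (value & demo_top_nibble_mask()) == 0;
}
```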

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:09:09 +04:00
Peter Maydell
714290979a configure: disable clang -Wstring-plus-int warning
Some versions of clang will warn about adding integers to strings:

disas/i386.c:4753:23: error: adding 'char' to a string does not append
      to the string [-Werror,-Wstring-plus-int]
      oappend ("%es:" + intel_syntax);
               ~~~~~~~^~~~~~~~~~~~~~
disas/i386.c:4753:23: note: use array indexing to silence this warning
      oappend ("%es:" + intel_syntax);
                      ^
               &      [             ]

disas/i386.c uses this idiom to skip a "%" prefix if using intel
rather than AT&T syntax. This seems like a reasonable thing to do,
and I don't think anybody contributing to QEMU is likely to believe
that '+' is a string concatenation operator in C, so just disable
-Wstring-plus-int.
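
For readers unfamiliar with the idiom, the sketch below shows what the pointer
arithmetic does, written in the array-indexing form that clang suggests. The
function name is hypothetical, and intel_syntax is assumed to be 0 or 1.

```c
/*
 * Equivalent to "%es:" + intel_syntax: returns a pointer into the
 * literal, skipping the leading '%' when intel_syntax is 1.
 */
static const char *demo_seg_prefix(int intel_syntax)
{
    return &"%es:"[intel_syntax];
}
```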

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:06:26 +04:00
Michael R. Hines
c89aa2f185 rdma: silly ipv6 bugfix
My bad - but it's very important for us to warn the user that
IPv6 is broken on RoCE in linux right now, until linux releases
a fixed version.

Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 19:03:43 +04:00
Stefan Weil
4c293dc6e4 misc: Fix some typos in names and comments
Most typos were found using a modified version of codespell:

accross -> across
issueing -> issuing
TICNT_THRESHHOLD -> TICNT_THRESHOLD
bandwith -> bandwidth
VCARD_7816_PROPIETARY -> VCARD_7816_PROPRIETARY
occured -> occurred
gaurantee -> guarantee
sofware -> software

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 18:59:24 +04:00
Taimoor Mirza
efcb7e4529 slirp: Port redirection option behaves differently on Linux and Windows
The port redirection code uses the SO_REUSEADDR socket option before binding to
host port. Behavior of SO_REUSEADDR is different on Windows and Linux.
Relaunching QEMU with same host and guest port redirection values on Linux
throws error but on Windows it does not throw any error.
Problem is discussed in http://lists.gnu.org/archive/html/qemu-devel/2013-04/msg03089.html

Signed-off-by: Taimoor Mirza <tmirza@codesourcery.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 18:52:30 +04:00
Michael S. Tsirkin
23fe2b3f9e virtio_pci: fix level interrupts with irqfd
commit 62c96360ae
    virtio-pci: fix level interrupts
only helps systems without irqfd: on systems with irqfd support we
passed in flag requesting irqfd even when msix is disabled.

As a result, for level interrupts we didn't install an fd handler so
unmasking an fd had no effect.

Fix this up.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-01 11:15:03 +03:00
Michael S. Tsirkin
a0dba644c1 pc: reduce duplication, fix PIIX descriptions
We have a lot of code duplication between machine types,
this increases with each new machine type
and each new field.

This has already introduced a minor bug: description
for pc-1.3 says "Standard PC" while description for
pc-1.4 is "Standard PC (i440FX + PIIX, 1996)"
which makes you think 1.3 is somehow more standard,
or newer, while in fact it's a revision of the same PC.

This patch addresses this issue by using macros, along
the lines used by PC_COMPAT_X_X - only for
non-property options.

The approach can extend to non-PC machine types.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-01 10:43:25 +03:00
Hervé Poussineau
520902a656 isa: Fix documentation of isa_register_portio_list()
Commit b40acf9 (ioport: Switch dispatching to memory core layer,
2013-06-24) removed all instances of old_portio.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 21:15:44 +02:00
Andreas Färber
5b9237f67c qom: Assert instance size in object_initialize_with_type()
This catches objects initializing beyond allocated memory, e.g.,
when subtypes get extended with instance state of their own.
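
A minimal sketch of the kind of check described here, under stated
assumptions: ExampleType, example_size_ok() and example_initialize() are
made-up names for illustration and are not the QOM API. The caller passes
the size of the memory it actually owns, and initialization refuses to run
past it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type descriptor: only the instance size matters here. */
typedef struct ExampleType {
    const char *name;
    size_t instance_size;
} ExampleType;

/* True if a buffer of data_size bytes can hold an instance of type. */
int example_size_ok(size_t data_size, const ExampleType *type)
{
    return data_size >= type->instance_size;
}

/*
 * In-place initialization: the caller passes the size of the storage it
 * provides, so an undersized buffer is caught by the assertion instead of
 * being silently overrun.
 */
void example_initialize(void *data, size_t data_size, const ExampleType *type)
{
    assert(example_size_ok(data_size, type));
    memset(data, 0, type->instance_size);
}
```

A typical call site would pass sizeof() of the embedded field, which is
what the "Pass size to ..." commits in this series thread through the bus
constructors.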

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 21:15:44 +02:00
Andreas Färber
213f0c4f61 qom: Pass available size to object_initialize()
To be passed on to object_initialize_with_type().

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> (virtio-ccw)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 21:15:44 +02:00
Andreas Färber
fb17dfe057 qdev: Pass size to qbus_create_inplace()
To be passed to object_initialize().

Since commit 39355c3826 the argument is
void*, so drop some superfluous (BusState *) casts or direct parent
field usages.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 21:15:35 +02:00
Andreas Färber
e5f720391e virtio-mmio: Pass size to virtio_mmio_bus_new()
To be passed to qbus_create_inplace().

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:40 +02:00
Andreas Färber
1bf4d7aad6 virtio-ccw: Pass size to virtio_ccw_bus_new()
To be passed to qbus_create_inplace().

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:39 +02:00
Andreas Färber
5d6c0c4913 s390-virtio-bus: Pass size to virtio_s390_bus_new()
To be passed to qbus_create_inplace().

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:39 +02:00
Andreas Färber
ac7af1120f virtio-pci: Pass size to virtio_pci_bus_new()
To be passed to qbus_create_inplace().

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:39 +02:00
Andreas Färber
c889b3a55d usb: Pass size to usb_bus_new()
To be passed to qbus_create_inplace().

Use DEVICE() cast to avoid a direct parent field access.

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:39 +02:00
Andreas Färber
b1187b51ff scsi: Pass size to scsi_bus_new()
To be passed to qbus_create_inplace().

Use DEVICE() casts instead of direct parent field access.

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:39 +02:00
Andreas Färber
dd301ca607 pci: Pass size to pci_bus_new_inplace()
To be passed to qbus_create_inplace().

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:39 +02:00
Andreas Färber
c6baf942e0 ide: Pass size to ide_bus_new()
To be passed to qbus_create_inplace().

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:39 +02:00
Andreas Färber
77cbb28a5b ipack: Pass size to ipack_bus_new_inplace()
To be passed to qbus_create_inplace().

Simplify DEVICE() cast to avoid parent field access.

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:38 +02:00
Andreas Färber
ab809e84a7 intel-hda: Pass size to hda_codec_bus_init()
To be passed to qbus_create_inplace().

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:38 +02:00
Andreas Färber
53caad9a31 qom: Fix object_initialize_with_type() argument name in documentation
@obj -> @data.

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:38 +02:00
Peter Maydell
e65177a87f virtio: Remove unnecessary OBJECT() casts
There's no need to cast the first argument of object_initialize()
to Object. Remove these unnecessary casts.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:38 +02:00
Peter Chubb
70392912ed object: Fix typo in qom/object.h
There's been a cut-and-paste error, it looks like, in the documentation
in qom/object.h.

Signed-off-by: Peter Chubb <peter.chubb@nicta.com.au>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-30 20:14:37 +02:00
Anthony Liguori
4ff78e0dbc Merge remote-tracking branch 'luiz/queue/qmp' into staging
# By Wenchao Xia (15) and Stefan Weil (1)
# Via Luiz Capitulino
* luiz/queue/qmp:
  monitor: improve auto complete of "help" for single command in sub group
  monitor: allow "help" show message for single command in sub group
  monitor: support sub command in auto completion
  monitor: refine monitor_find_completion()
  monitor: support sub command in help
  monitor: refine parse_cmdline()
  monitor: code move for parse_cmdline()
  monitor: avoid direct use of global variable *mon_cmds
  monitor: split off monitor_data_init()
  monitor: call sortcmdlist() only one time
  monitor: avoid use of global *cur_mon in readline_completion()
  monitor: avoid use of global *cur_mon in monitor_find_completion()
  monitor: avoid use of global *cur_mon in block_completion_it()
  monitor: avoid use of global *cur_mon in file_completion()
  monitor: avoid use of global *cur_mon in cmd_completion()
  monitor: Add missing attributes to local function

Message-id: 1377865357-6742-1-git-send-email-lcapitulino@redhat.com
2013-08-30 12:26:04 -05:00
Anthony Liguori
b95fdc0e99 Merge remote-tracking branch 'borntraeger/tags/kdump' into staging
This is a set of patches dealing with kdump support for s390x/kvm.
kdump on s390x uses subcode 1 of diagnose 0x308 to put the hardware
in a defined state. This is different from a full reset, since it
does not touch all CPU registers.
These patches define the cpu resets, the subsystem reset and a load
function, and also wire up the "nmi" command to issue a RESTART
interrupt as defined in the z/Architecture principles of operation.

This allows recent guest kernels with properly setup userspace
to trigger kdump:
- via guest crash
- via nmi from the host

# gpg: Signature made Fri 30 Aug 2013 07:19:18 AM CDT using RSA key ID B5A61C7C
# gpg: Can't check signature: public key not found

# By Christian Borntraeger (5) and Eugene (jno) Dvurechenski (2)
# Via Christian Borntraeger
* borntraeger/tags/kdump:
  s390: wire up nmi command to raise a RESTART interrupt on S390
  s390: Implement load normal reset
  s390/cpu: split CPU reset into architectured functions
  s390: provide a cpu load normal function
  s390: provide I/O subsystem reset
  s390/kvm: basic implementation of diagnose 308 subcode 6
  s390x/kvm: Fix switch/case indentation for handle_diag

Message-id: 1377810649-47484-1-git-send-email-borntraeger@de.ibm.com
2013-08-30 12:25:56 -05:00
Max Reitz
e23e400ec6 qcow2-refcount: Repair OFLAG_COPIED errors
Since the OFLAG_COPIED checks are now executed after the refcounts have
been repaired (if repairing), it is safe to assume that they are correct
but the OFLAG_COPIED flag may not be. Therefore, if its value differs
from what it should be (considering the corresponding refcount), that
discrepancy can be repaired by correctly setting (or clearing) that flag.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:44 +02:00
Max Reitz
4f6ed88c03 qcow2-refcount: Move OFLAG_COPIED checks
Move the OFLAG_COPIED checks out of check_refcounts_l1 and
check_refcounts_l2 and after the actual refcount checks/fixes (since the
refcounts might actually change there).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:44 +02:00
Max Reitz
cf93980e77 qcow2: Employ metadata overlap checks
The pre-write overlap check function is now called before most of the
qcow2 writes (aborting it on collision or other error).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Max Reitz
a40f1c2add qcow2: Metadata overlap checks
Two new functions are added; the first one checks a given range in the
image file for overlaps with metadata (main header, L1 tables, L2
tables, refcount table and blocks).

The second one should be used immediately before writing to the image
file as it calls the first function and, upon collision, marks the
image as corrupt and makes the BDS unusable, thereby preventing
further access.

Both functions take a bitmask argument specifying the structures which
should be checked for overlaps, making it possible to also check
metadata writes against colliding with other structures.
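
The two functions can be pictured roughly as follows. This is a sketch
under stated assumptions, not the qcow2 code: the EX_OL_* bits, the
ExampleRegion and ExampleImage types and both function names are
hypothetical, and the metadata is reduced to a flat list of regions.

```c
#include <stdint.h>

/* Hypothetical bitmask values, one bit per kind of metadata structure. */
enum {
    EX_OL_MAIN_HEADER    = 1 << 0,
    EX_OL_L1_TABLE       = 1 << 1,
    EX_OL_REFCOUNT_TABLE = 1 << 2
};

typedef struct ExampleRegion {
    int kind;          /* one of the EX_OL_* bits */
    uint64_t offset;   /* start of the structure in the image file */
    uint64_t size;     /* length in bytes */
} ExampleRegion;

typedef struct ExampleImage {
    const ExampleRegion *regions;
    int nb_regions;
    int corrupt;       /* set once a colliding write has been detected */
} ExampleImage;

/* Assumes offset + length does not overflow for either range. */
static int ranges_overlap(uint64_t a, uint64_t a_len,
                          uint64_t b, uint64_t b_len)
{
    return a < b + b_len && b < a + a_len;
}

/* First function: return the EX_OL_* bits of all structures that are
 * selected in chk and overlap [offset, offset + size). 0 means no hit. */
int example_check_overlap(const ExampleRegion *regions, int nb_regions,
                          int chk, uint64_t offset, uint64_t size)
{
    int hit = 0;
    int i;

    if (size == 0) {
        return 0;
    }
    for (i = 0; i < nb_regions; i++) {
        if ((chk & regions[i].kind) &&
            ranges_overlap(offset, size,
                           regions[i].offset, regions[i].size)) {
            hit |= regions[i].kind;
        }
    }
    return hit;
}

/* Second function: call the check right before a write and, on a
 * collision, mark the image as corrupt and refuse the write. */
int example_pre_write_overlap_check(ExampleImage *img, int chk,
                                    uint64_t offset, uint64_t size)
{
    if (example_check_overlap(img->regions, img->nb_regions,
                              chk, offset, size)) {
        img->corrupt = 1;
        return -1;
    }
    return 0;
}
```

A metadata write would clear its own bit from chk before calling the
check, so that, for example, an L1 table update is only tested against the
other structures.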

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Max Reitz
69c9872653 qcow2: Add corrupt bit
This adds an incompatible bit indicating corruption to qcow2. Any image
with this bit set may not be written to unless for repairing (and
subsequently clearing the bit if the repair has been successful).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Max Reitz
449df70638 qemu-iotests: Snapshotting zero clusters
This test creates an image with unallocated zero clusters, then creates
a snapshot. Afterwards, there should be neither any errors nor leaks.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Max Reitz
8b81a7b6ba qcow2-refcount: Snapshot update for zero clusters
Account for all cluster types in qcow2_update_snapshot_refcounts;
this prevents this function from updating the refcount of unallocated
zero clusters which effectively led to wrong adjustments of the refcount
of cluster 0 (the main qcow2 header). This in turn resulted in images
with (unallocated) zero clusters having a cluster 0 refcount greater
than one after creating a snapshot.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Max Reitz
d4ca092a42 option: Add assigned flag to QEMUOptionParameter
Adds an "assigned" flag to QEMUOptionParameter which is cleared at the
beginning of parse_option_parameters and set on (successful)
set_option_parameter and set_option_parameter_int.
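
As a rough sketch of the idea (ExampleOption and both functions are
hypothetical stand-ins, not the QEMUOptionParameter code), the flag lets a
caller tell a value the user set explicitly from a default that was merely
left in place:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical option entry; a NULL name terminates the list. */
typedef struct ExampleOption {
    const char *name;
    uint64_t value;
    bool assigned;   /* true only if the value was set explicitly */
} ExampleOption;

/* Done once at the beginning of parsing: forget earlier assignments. */
void example_clear_assigned(ExampleOption *list)
{
    for (; list->name; list++) {
        list->assigned = false;
    }
}

/* Set an integer option; the flag is raised only on success. */
int example_set_option_int(ExampleOption *list, const char *name,
                           uint64_t value)
{
    for (; list->name; list++) {
        if (strcmp(list->name, name) == 0) {
            list->value = value;
            list->assigned = true;
            return 0;
        }
    }
    return -1;   /* unknown option: nothing is marked as assigned */
}
```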

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Bharata B Rao
9faa574f7d gluster: Abort on AIO completion failure
Currently, if the gluster AIO callback thread fails to notify the QEMU thread
about AIO completion, we try graceful recovery by marking the disk drive as
inaccessible. This error recovery code is race-prone, as found by Asias and
Stefan. However, as Paolo found out, this kind of error is impossible, so
simplify the code that handles this error recovery.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Kevin Wolf
e5b1d99f55 block: Remove old raw driver
This is unused code now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
7a6d3fc594 switch raw block driver from "raw.o" to "raw_bsd.o"
"Incoming" function prototypes and "outgoing" function calls must match
reality. Implemented using the "struct BlockDriver" definition in
"include/block/block_int.h", and gcc errors & warnings.

v1->v2:

On 08/20/13 09:51, Kevin Wolf wrote:
> Am 18.08.2013 um 16:29 hat Paolo Bonzini geschrieben:
>> Il 16/08/2013 16:15, Laszlo Ersek ha scritto:
>>> +static int raw_reopen_prepare(BDRVReopenState *reopen_state,
>>> +                              BlockReopenQueue *queue, Error **errp)
>>>  {
>>> -    return bdrv_reopen_prepare(bs->file);
>>> +    BDRVReopenState tmp = *reopen_state;
>>> +
>>> +    tmp.bs = tmp.bs->file;
>>> +    return bdrv_reopen_prepare(&tmp, queue, errp);
>>>  }
>>
>> This should just return zero, my fault.
>
> Which is because bdrv_reopen_queue() already queues bs->file for reopen.
> The simple return 0; implementation is shared by all other format drivers
> that support reopening images.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
775d6afd5c raw_bsd: register bdrv_raw
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 5) Formats are registered with bdrv_register (takes a BlockDriver*). You
> also need to pass the caller of bdrv_register to block_init.

Fill in the BlockDriver structure with the raw_*() functions that have
been added to "block/raw_bsd.c", in the order the fields are defined in
"include/block/block_int.h".

I needed more explanation / naming examples for registering the driver
than what Paolo gave me, so I copied / adapted from "block/qcow2.c". The
parts I took as basis for modification are blamed on

    commit 5efa9d5a8b
    Author: Anthony Liguori <aliguori@us.ibm.com>
    Date:   Sat May 9 17:03:42 2009 -0500

        Convert block infrastructure to use new module init functionality

    commit 20d97356c9
    Author: Blue Swirl <blauwirbel@gmail.com>
    Date:   Fri Apr 23 20:19:47 2010 +0000

        Fix OpenBSD build

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
ff369a483d raw_bsd: add raw_create_options
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 4) There is another member, .create_options, which is an array of
> QEMUOptionParameter structs, terminated by an all-zero item.  The only
> option you need is for the virtual disk size.  You will find something
> to copy from in other block drivers, for example block/qcow2.c.

Code taken and adapted from "block/qcow2.c", as suggested. The code being
copied/modified is blamed on

    commit 20d97356c9
    Author: Blue Swirl <blauwirbel@gmail.com>
    Date:   Fri Apr 23 20:19:47 2010 +0000

        Fix OpenBSD build

and

    commit 7c80ab3f21
    Author: Jes Sorensen <Jes.Sorensen@redhat.com>
    Date:   Fri Dec 17 16:02:39 2010 +0100

        block/qcow2.c: rename qcow_ functions to qcow2_

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
01dd96d8f4 raw_bsd: introduce "special members"
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 3) These members are special
>
>     .format_name   is the string "raw"
>     .bdrv_open     raw_open should set bs->sg to bs->file->sg and return 0
>     .bdrv_close    raw_close should do nothing
>     .bdrv_probe    raw_probe should just return 1.

v1->v2:

On 08/20/13 10:11, Kevin Wolf wrote:
> Am 16.08.2013 um 16:15 hat Laszlo Ersek geschrieben:

>> +static int raw_probe(void)
>> +{
>> +    return 1;
>> +}
>
> Maybe add a comment here like "smallest possible positive score so that
> raw is used if and only if no other block driver works".
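
To illustrate why the lowest positive score makes raw the fallback, here is
a small sketch. The ExampleDriver type, the probe signatures and
example_pick_driver() are assumptions made for this example and do not
mirror the block layer; only the "QFI\xfb" magic is the real qcow2
signature.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical driver entry with a probe callback returning a score. */
typedef struct ExampleDriver {
    const char *name;
    int (*probe)(const unsigned char *buf, size_t len);
} ExampleDriver;

/* Smallest possible positive score: raw accepts anything, but loses
 * against every driver that actually recognizes the data. */
int example_raw_probe(const unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 1;
}

/* A format driver that scores high when it sees its magic bytes. */
int example_magic_probe(const unsigned char *buf, size_t len)
{
    return (len >= 4 && memcmp(buf, "QFI\xfb", 4) == 0) ? 100 : 0;
}

/* Pick the driver with the highest score; NULL if all scores are 0. */
const ExampleDriver *example_pick_driver(const ExampleDriver *drivers,
                                         size_t nb_drivers,
                                         const unsigned char *buf,
                                         size_t len)
{
    const ExampleDriver *best = NULL;
    int best_score = 0;
    size_t i;

    for (i = 0; i < nb_drivers; i++) {
        int score = drivers[i].probe(buf, len);
        if (score > best_score) {
            best_score = score;
            best = &drivers[i];
        }
    }
    return best;
}
```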

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
1565262c37 raw_bsd: add raw_create()
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 2) This is also a simple forwarder function:
>
>     .bdrv_create
>
> but there is no BlockDriverState argument so the forwarded-to function
> does not have a bs->file argument either.  The forwarded-to function is
> bdrv_create_file.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
9eaafd90d1 raw_bsd: emit debug events in bdrv_co_readv() and bdrv_co_writev()
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 1) BlockDriver is a struct in which these function members are
> interesting:
>
>     .bdrv_reopen_prepare
>     .bdrv_co_readv
>     .bdrv_co_writev
>     .bdrv_co_is_allocated
>     .bdrv_co_write_zeroes
>     .bdrv_co_discard
>     .bdrv_getlength
>     .bdrv_get_info
>     .bdrv_truncate
>     .bdrv_is_inserted
>     .bdrv_media_changed
>     .bdrv_eject
>     .bdrv_lock_medium
>     .bdrv_ioctl
>     .bdrv_aio_ioctl
>     .bdrv_has_zero_init
>
> They should be implemented as simple forwarders (see above). There are
> 16 functions listed here, you can easily see how this already accounts
> for 100+ SLOC roughly...
>
> The implementations of bdrv_co_readv and bdrv_co_writev should also call
> BLKDBG_EVENT on bs->file too, before forwarding to bs->file.  The events
> to be generated are BLKDBG_READ_AIO and BLKDBG_WRITE_AIO.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Laszlo Ersek
e1c66c6d82 add skeleton for BSD licensed "raw" BlockDriver
On 08/05/13 15:03, Paolo Bonzini wrote:
>
>
> ----- Original Message -----
>> From: "Laszlo Ersek" <lersek@redhat.com>
>> To: "Paolo Bonzini" <pbonzini@redhat.com>
>> Sent: Monday, August 5, 2013 2:43:46 PM
>> Subject: Re: [PATCH 1/2] raw: add license header
>>
>> On 08/02/13 00:27, Paolo Bonzini wrote:
>>> On 08/01/2013 10:13 AM, Christoph Hellwig wrote:
>>>> On Wed, Jul 31, 2013 at 08:19:51AM +0200, Paolo Bonzini wrote:
>>>>> Most of the block layer is under the BSD license, thus it is
>>>>> reasonable to license block/raw.c the same way.  CCed people should
>>>>> ACK by replying with a Signed-off-by line.
>>>>
>>>> The code was intended to be GPLv2.
>>>
>>> Laszlo, would you be willing to do clean-room reverse engineering?
>>>
>>> (No rants, please. :))
>>
>> What's the scope exactly?
>
> It's quite small, it's a file full of forwarders like
>
> static void raw_foo(BlockDriverState *bs)
> {
>     return bdrv_foo(bs->file);
> }
>
> It's 170 lines of code, all as boring as this.  I only picked you
> because I'm quite certain you have never seen the file (and the answer
> confirmed it).
>
> Basically:
>
> 1) BlockDriver is a struct in which these function members are
> interesting:
>
>     .bdrv_reopen_prepare
>     .bdrv_co_readv
>     .bdrv_co_writev
>     .bdrv_co_is_allocated
>     .bdrv_co_write_zeroes
>     .bdrv_co_discard
>     .bdrv_getlength
>     .bdrv_get_info
>     .bdrv_truncate
>     .bdrv_is_inserted
>     .bdrv_media_changed
>     .bdrv_eject
>     .bdrv_lock_medium
>     .bdrv_ioctl
>     .bdrv_aio_ioctl
>     .bdrv_has_zero_init
>
> They should be implemented as simple forwarders (see above).
> There are 16 functions listed here, you can easily see how this
> already accounts for 100+ SLOC roughly...
>
> The implementations of bdrv_co_readv and bdrv_co_writev should also
> call BLKDBG_EVENT on bs->file too, before forwarding to bs->file.  The
> events to be generated are BLKDBG_READ_AIO and BLKDBG_WRITE_AIO.
>
> 2) This is also a simple forwarder function:
>
>     .bdrv_create
>
> but there is no BlockDriverState argument so the forwarded-to function
> does not have a bs->file argument either.  The forwarded-to function
> is bdrv_create_file.
>
> 3) These members are special
>
>     .format_name   is the string "raw"
>     .bdrv_open     raw_open should set bs->sg to bs->file->sg and return 0
>     .bdrv_close    raw_close should do nothing
>     .bdrv_probe    raw_probe should just return 1.
>
> 4) There is another member, .create_options, which is an array of
> QEMUOptionParameter structs, terminated by an all-zero item.  The only
> option you need is for the virtual disk size.  You will find something
> to copy from in other block drivers, for example block/qcow2.c.
>
> 5) Formats are registered with bdrv_register (takes a BlockDriver*).
> You also need to pass the caller of bdrv_register to block_init.
>
> 6) I'm not sure how to organize the patch series, so I'll leave this to
> your creativity.  I guess in this case move/copy detection of git should
> be disabled.  I would definitely include this spec in the commit
> message as a proof of clean-room reverse engineering.
>
> 7) Remember a BSD header like the one in block.c.
>
> Paolo

This patch implements the email up to the paragraph ending with "100+ SLOC
roughly". The skeleton is generated from the list there, with a simple
shell loop using "sed" and the raw_foo() template.

The BSD license block is copied (and reflowed) from
"util/qemu-progress.c".

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Peter Maydell
127c84e1a5 block/qcow2.h: Avoid "1LL << 63" (shifts into sign bit)
The expression "1LL << 63" tries to shift the 1 into the sign bit of a
'long long', which provokes a clang sanitizer warning:

runtime error: left shift of 1 by 63 places cannot be represented in type 'long long'

Use "1ULL << 63" as the definition of QCOW_OFLAG_COPIED instead
to avoid this. For consistency, we also update the other QCOW_OFLAG
definitions to use the ULL suffix rather than LL, though only the
shift by 63 is undefined behaviour.
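
A short sketch of the unsigned form in use. EXAMPLE_OFLAG_COPIED and the
two helpers are illustrative names, not the definitions from
block/qcow2.h; the point is only that the constant is built from an
unsigned 64-bit value, so bit 63 is an ordinary value bit.

```c
#include <stdint.h>

/* Illustrative flag in the style of QCOW_OFLAG_COPIED: with the ULL
 * suffix the shift is performed on an unsigned long long. */
#define EXAMPLE_OFLAG_COPIED (1ULL << 63)

/* Test the flag in a 64-bit table entry. */
int example_entry_is_copied(uint64_t entry)
{
    return (entry & EXAMPLE_OFLAG_COPIED) != 0;
}

/* Strip the flag to obtain the remaining bits of the entry. */
uint64_t example_entry_without_flag(uint64_t entry)
{
    return entry & ~EXAMPLE_OFLAG_COPIED;
}
```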

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Kevin Wolf
cccc30b4ad qemu-iotests: Update reference output for 051
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Kevin Wolf
c0447d870b Revert "block: Disable driver-specific options for 1.6"
This reverts commit 8afaefb891.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Kevin Wolf
015370301f qapi-types.py: Split off generate_struct_fields()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-08-30 15:28:52 +02:00
Kevin Wolf
09da4a7292 block: Remove redundant assertion
The failing condition is checked immediately before the assertion, so
keeping the assertion is kind of redundant.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:28:52 +02:00
Kevin Wolf
9117b47717 qcow2: Change default for new images to compat=1.1
By the time that qemu 1.7 will be released, enough time will have passed
since qemu 1.1, which is the first version to understand version 3
images, that changing the default shouldn't hurt many people any more
and the benefits of using the new format outweigh the pain.

qemu-iotests already runs with compat=1.1 by default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-08-30 15:28:51 +02:00
Eugene (jno) Dvurechenski
7f7f975295 s390: wire up nmi command to raise a RESTART interrupt on S390
The 'nmi' command is used to trigger a guest dump via the kdump feature on x86.
s390 uses the RESTART interrupt to trigger kdump.
So, this patch provides a means to use the 'nmi' command on s390 to raise a
RESTART interrupt.

The CPU to receive the RESTART interrupt is the "default" one.

There is an infrastructure to select the "default" CPU using 'cpu' command.
The 'info cpus' command can be used to see which one is the "default".

In order to wire up the RESTART to 'nmi' command we had to:
1. implement the kvm_s390_cpu_restart function by exporting the existing code
2. implement s390_cpu_restart function as kvm-aware wrapper
3. modify the qmp_inject_nmi function to enable (for s390) the scan for
   "default" CPU and call s390_cpu_restart for it;
4. fix some messages.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
2013-08-30 14:16:48 +02:00
Christian Borntraeger
f077847572 s390: Implement load normal reset
kdump on s390 uses a load normal reset to bring the system into a defined
state by doing a subsystem reset. The issuing CPUs will have an initial
CPU reset, all other CPUs will have a CPU reset as defined in POP (no
register content will change).

Implement this as architectured.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-08-30 14:16:48 +02:00
Christian Borntraeger
f5ae2a4fd8 s390/cpu: split CPU reset into architectured functions
s390 provides several CPU resets:
- CPU reset: clears interrupts, stops processing, clears TLB, but does
  not touch registers
- initial CPU reset: like CPU reset, but also clears PSW, prefix, FPC,
  timer and control registers. It does not touch gprs, fprs and acrs (!)
- Power on reset: the full monty

Wire up CPUClass reset to the full monty, but provide the lesser resets
as part of S390CPUClass.
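
The layering of the three resets can be sketched as below. ExampleCPU and
the three functions are hypothetical and heavily simplified: the sketch
just zeroes fields, whereas the architecture may prescribe specific initial
values, and it omits the timer, fprs and acrs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced CPU state for illustration. */
typedef struct ExampleCPU {
    uint64_t gprs[16];
    uint64_t cregs[16];
    uint64_t psw_addr;
    uint32_t prefix;
    uint32_t fpc;
    int pending_interrupts;
} ExampleCPU;

/* CPU reset: drop pending interrupts, leave all registers alone. */
void example_cpu_reset(ExampleCPU *cpu)
{
    cpu->pending_interrupts = 0;
}

/* Initial CPU reset: CPU reset plus PSW, prefix, FPC and control
 * registers. General purpose registers are deliberately untouched. */
void example_initial_cpu_reset(ExampleCPU *cpu)
{
    example_cpu_reset(cpu);
    cpu->psw_addr = 0;
    cpu->prefix = 0;
    cpu->fpc = 0;
    memset(cpu->cregs, 0, sizeof(cpu->cregs));
}

/* Power on reset: everything, including the general purpose registers. */
void example_power_on_reset(ExampleCPU *cpu)
{
    example_initial_cpu_reset(cpu);
    memset(cpu->gprs, 0, sizeof(cpu->gprs));
}
```

Each level calls the next weaker one, so the generic reset hook can point
at the strongest variant while the weaker ones stay individually callable.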

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-08-30 14:16:43 +02:00
Wenchao Xia
7ca0e06104 monitor: improve auto complete of "help" for single command in sub group
The special case "help *" in auto completion now works with sub commands,
such as "help info u*".

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:15 -04:00
Wenchao Xia
129be006d6 monitor: allow "help" show message for single command in sub group
A new parameter type 'S' is introduced to allow the user to input any string.
"help info block" now works normally.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:15 -04:00
Wenchao Xia
d903a779cf monitor: support sub command in auto completion
This patch makes auto completion work for the sub command case:
"info block [DEVICE]" can now be auto completed by re-entering the completion
function. The original code treated "info" as a special case; now it
is treated as a sub command group, and the global variable info_cmds is not
used any more.

The "help" command is still treated as a special case, since it is not a sub
command group but needs to auto complete commands in the root command table.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:15 -04:00
Wenchao Xia
c35b640033 monitor: refine monitor_find_completion()
In order to support sub commands in auto completion, a reentrant function
is needed, so monitor_find_completion() is split into two parts. The
first part parses the user input, which needs to be done only once;
the second part does the auto completion job according to the parsing
result. It contains the code necessary to support sub commands and
works as the reentrant function. The global "info_cmds" is still used
in the second part, and will be replaced by sub command code later.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:15 -04:00
Wenchao Xia
66855495fb monitor: support sub command in help
The old code in help_cmd() uses global 'info_cmds' and treats it as a
special case. Actually 'info_cmds' is a sub command group of 'mon_cmds',
in order to avoid direct use of it, help_cmd() needs to change its work
mechanism to support sub command and not treat it as a special case
any more.

To support sub commands, help_cmd() first parses the input and then calls
help_cmd_dump(), which works as a reentrant function. When it meets a sub
command, it simply enters the function again. Since help dumping needs to
know the whole input to print the full help message including the prefix (for
example, "help info block" needs to print the prefix "info"), help_cmd_dump()
takes all args from the input plus an extra parameter arg_index to track the
progress. Another function, help_cmd_dump_one(), is introduced to print the
prefix and the command's help message.
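
The re-entry on sub command groups can be sketched like this. ExampleCmd
and example_find_cmd() are hypothetical and only resolve a command through
nested tables; they do not reproduce help_cmd_dump() or its output format.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical command entry; a NULL name terminates a table. */
typedef struct ExampleCmd {
    const char *name;
    const char *help;
    const struct ExampleCmd *sub_table;   /* non-NULL for a command group */
} ExampleCmd;

/*
 * Resolve args[arg_index..nb_args-1] against table. All args are passed
 * down together with arg_index, so a caller could still print the prefix
 * args[0..arg_index-1]. Returns NULL if no command matches.
 */
const ExampleCmd *example_find_cmd(const ExampleCmd *table,
                                   const char *const *args,
                                   int nb_args, int arg_index)
{
    const ExampleCmd *cmd;

    if (arg_index >= nb_args) {
        return NULL;
    }
    for (cmd = table; cmd->name; cmd++) {
        if (strcmp(cmd->name, args[arg_index]) != 0) {
            continue;
        }
        if (arg_index + 1 < nb_args) {
            /* More input left: only a group can consume it, by
             * entering the same function again one level deeper. */
            return cmd->sub_table
                ? example_find_cmd(cmd->sub_table, args, nb_args,
                                   arg_index + 1)
                : NULL;
        }
        return cmd;
    }
    return NULL;
}
```

Because the function only follows sub_table pointers, a group added at any
depth is handled without further special cases.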

Now help supports sub commands, so if another sub command group is added
later at any depth, help will automatically work for it. "help info
block" will still show an error since the command parser rejects the
additional parameter, which can be improved later. "log" is still treated
as a special case.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:15 -04:00
Wenchao Xia
dcc70cdf09 monitor: refine parse_cmdline()
Since this function will be used by help_cmd() later, improve
it to make it more generic and easier to use. free_cmdline_args()
is added as the paired function to free the result.

One change to this function is that it now fails when the number of valid
args in the input exceeds MAX_ARGS, instead of returning the first MAX_ARGS
parsed args as the old code did. This should not have much impact,
since users rarely enter that many args in the monitor's "help" and
auto completion scenarios.
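
A simplified sketch of the "fail instead of truncate" behaviour together
with a paired free function. The names and the limit of 16 are assumptions
for this example, and the tokenizer only splits on whitespace; it does not
handle quoting the way the monitor's get_str() does.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_MAX_ARGS 16   /* assumed limit for this sketch */

/* Paired function: release what example_parse_cmdline() allocated. */
void example_free_args(int nb_args, char **args)
{
    int i;

    for (i = 0; i < nb_args; i++) {
        free(args[i]);
    }
}

/*
 * Split line on whitespace into at most EXAMPLE_MAX_ARGS copies.
 * Returns 0 on success and -1 on failure; on failure nothing is left
 * allocated, so the caller never sees a silently truncated result.
 */
int example_parse_cmdline(const char *line, int *pnb_args, char **args)
{
    const char *p = line;
    int nb_args = 0;

    for (;;) {
        const char *start;
        size_t len;
        char *copy;

        while (isspace((unsigned char)*p)) {
            p++;
        }
        if (*p == '\0') {
            break;
        }
        if (nb_args >= EXAMPLE_MAX_ARGS) {
            goto fail;   /* too many args: fail rather than truncate */
        }
        start = p;
        while (*p != '\0' && !isspace((unsigned char)*p)) {
            p++;
        }
        len = (size_t)(p - start);
        copy = malloc(len + 1);
        if (copy == NULL) {
            goto fail;
        }
        memcpy(copy, start, len);
        copy[len] = '\0';
        args[nb_args++] = copy;
    }
    *pnb_args = nb_args;
    return 0;

fail:
    example_free_args(nb_args, args);
    return -1;
}
```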

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:15 -04:00
Wenchao Xia
f5438c0500 monitor: code move for parse_cmdline()
help_cmd() needs this function later, so move it. get_str() is called by
parse_cmdline(), so it is moved as well. Some code style errors reported by
the check script are also fixed.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:15 -04:00
Wenchao Xia
7717239dc1 monitor: avoid direct use of global variable *mon_cmds
New member *cmd_table is added to structure Monitor to avoid direct use of
*mon_cmds. Now each monitor has an associated command table; once the global
variable *info_cmds is also discarded, structure Monitor will have full
control over how to deal with user input.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:15 -04:00
Wenchao Xia
b01fe89e91 monitor: split off monitor_data_init()
In qmp_human_monitor_command(), the monitor needs to be initialized for
basic functionality, and more init code will be added later, so
split off this function. Note that this differs from a QMP mode
monitor, which accepts JSON strings from the monitor's input:
qmp_human_monitor_command() retrieves the human style command from
QMP input, then sends the command to a normal mode monitor.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:15 -04:00
Wenchao Xia
d038317c35 monitor: call sortcmdlist() only one time
It doesn't need to be done for every monitor, so change it.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:14 -04:00
Wenchao Xia
d1a9756ab8 monitor: avoid use of global *cur_mon in readline_completion()
The completion functions no longer use *cur_mon; instead
they use rs->mon. In short, structure ReadLineState now decides where
the completion takes place.

Tested with qemu running two telnet monitors; auto
completion works normally.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:14 -04:00
Wenchao Xia
d2674b2cf7 monitor: avoid use of global *cur_mon in monitor_find_completion()
Parameter *mon is added, and the local variable *mon added in the previous patch
is removed. The caller, readline_completion(), passes rs->mon as the value, which
should be initialized in readline_init(), called by monitor_init().

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:14 -04:00
Wenchao Xia
599a926abc monitor: avoid use of global *cur_mon in block_completion_it()
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:14 -04:00
Wenchao Xia
cb8f68b104 monitor: avoid use of global *cur_mon in file_completion()
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:14 -04:00
Wenchao Xia
cd5c6bba1b monitor: avoid use of global *cur_mon in cmd_completion()
A new local variable *mon is added in monitor_find_completion()
to make the code compile; it will be removed later in the
conversion patch for monitor_find_completion().

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:14 -04:00
Stefan Weil
9c3175cc15 monitor: Add missing attributes to local function
Function expr_error gets a format string and variable arguments like printf.
It also never returns. Add the necessary attributes.
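
As an illustration of the two attributes mentioned above, here is a sketch
using the raw GCC attribute syntax. The function demo_expr_error and its
helpers are hypothetical; the actual patch may well use QEMU's own wrapper
macros, which are not shown in this message.

```c
#include <setjmp.h>
#include <stdarg.h>
#include <stdio.h>

static jmp_buf demo_env;            /* hypothetical recovery point */
static char demo_last_error[128];   /* last formatted error message */

/*
 * format(printf, 1, 2): argument 1 is the format string and the variable
 * arguments start at position 2, so the compiler can type-check callers.
 * noreturn: control never comes back to the caller.
 */
static void __attribute__((format(printf, 1, 2), noreturn))
demo_expr_error(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(demo_last_error, sizeof(demo_last_error), fmt, ap);
    va_end(ap);
    longjmp(demo_env, 1);
}

static int demo_parse_digit(char c)
{
    if (c < '0' || c > '9') {
        /* No return statement needed here thanks to noreturn. */
        demo_expr_error("unexpected char '%c'", c);
    }
    return c - '0';
}

/* Returns 0 and stores the digit on success, -1 on a parse error. */
static int demo_try_parse_digit(char c, int *out)
{
    if (setjmp(demo_env)) {
        return -1;
    }
    *out = demo_parse_digit(c);
    return 0;
}
```

With the format attribute in place, a call such as
demo_expr_error("%s", 42) would be flagged by -Wformat at compile time.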

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-30 07:41:14 -04:00
Christian Borntraeger
29c6157ca7 s390: provide a cpu load normal function
Some code needs to perform an IPL-like bootup that mimics the
ESA (31bit) restart. Provide a cpu class method that does so.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
2013-08-30 12:49:30 +02:00
Christian Borntraeger
4e872a3fb0 s390: provide I/O subsystem reset
Provide a function that resets the I/O subsystem.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
2013-08-30 12:49:30 +02:00
Eugene (jno) Dvurechenski
268846ba93 s390/kvm: basic implementation of diagnose 308 subcode 6
Linux uses a check for subcode 6 to decide if other subcodes are
available. Provide a minimal implementation for subcode 6, as well
as for subcode 5.

Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[Move code from kvm.c into misc_helper.c]
2013-08-30 12:48:25 +02:00
Christian Borntraeger
39fbc5c62c s390x/kvm: Fix switch/case indentation for handle_diag
This aligns case statements with switch statements in the handle_diag
function as mandated by coding style.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-08-30 11:09:13 +02:00
Anthony Liguori
b5d54bd421 Merge remote-tracking branch 'qemu-kvm/uq/master' into stable-1.5
* qemu-kvm/uq/master:
  kvm-stub: fix compilation
  kvm: shorten the parameter list for get_real_device()
  kvm: i386: fix LAPIC TSC deadline timer save/restore
  kvm-all.c: max_cpus should not exceed KVM vcpu limit
  kvm: Simplify kvm_handle_io
  kvm: x86: fix setting IA32_FEATURE_CONTROL with nested VMX disabled
  kvm: add KVM_IRQFD_FLAG_RESAMPLE support
  kvm: migrate vPMU state
  target-i386: remove tabs from target-i386/cpu.h
  Initialize IA32_FEATURE_CONTROL MSR in reset and migration

Conflicts:
	target-i386/cpu.h
	target-i386/kvm.c

aliguori: fixup trivial conflicts due to whitespace and added cpu
          argument

Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-29 17:21:51 -05:00
Anthony Liguori
e560992f21 Merge remote-tracking branch 'sweil/mingw' into stable-1.5
# By Stefan Weil
# Via Stefan Weil
* sweil/mingw:
  gtk: Remove unused include statements which are not portable
  w32: Add an icon resource
  w32: Fix broken out-of-tree builds (missing version.o)

Message-id: 1377607132-21336-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-29 17:20:17 -05:00
Anthony Liguori
3e998a7788 Merge remote-tracking branch 'mst/tags/for_anthony' into stable-1.5
pc,pci,virtio fixes and cleanups

This includes pc and pci cleanups, future-proofing of ROM files,
and a virtio bugfix correcting splice on virtio console.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 26 Aug 2013 01:34:20 AM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found

# By Markus Armbruster (5) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
  virtio: virtqueue_get_avail_bytes: fix desc_pa when loop over the indirect descriptor table
  pc_piix: Kill pc_init1() memory region args
  pc: pc_compat_1_4() now can call pc_compat_1_5()
  pc: Create pc_compat_*() functions
  pc: Kill pc_init_pci_1_0()
  pc: Don't explode QEMUMachineInitArgs into local variables needlessly
  pc: Don't prematurely explode QEMUMachineInitArgs
  ppc: Don't duplicate QEMUMachineInitArgs in PPCE500Params
  ppc: Don't explode QEMUMachineInitArgs into local variables needlessly
  sun4: Don't prematurely explode QEMUMachineInitArgs
  q35: Add PCIe switch to example q35 configuration
  loader: store FW CFG ROM files in RAM
  arch_init: align MR size to target page size
  pc: cleanup 1.4 compat support

Message-id: 1377535318-30491-1-git-send-email-mst@redhat.com
2013-08-29 17:19:19 -05:00
Richard Henderson
584950fd4e tcg-i386: Remove abort from GETPC_LDST
Indeed, remove it entirely and remove the is_tcg_gen_code check
from GETPC_EXT.

Fixes https://bugs.launchpad.net/qemu/+bug/1218098 wherein a call
to a "normal" helper function performed a sequence of tail calls
all the way into the memory helper functions, leading to a stack
frame in which the memory helper function appeared to be called
directly from tcg.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-29 20:20:39 +02:00
James Hogan
951fab990d target-mips: fix get_physical_address() #if 0 build error
There is a qemu_log() call inside an #if 0 block in get_physical_address().
When enabled the following build error is hit:

target-mips/helper.c In function ‘get_physical_address’:
target-mips/helper.c:220:13: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 5 has type ‘hwaddr’ [-Werror=format]

Fix the *physical (hwaddr) formatting by using "%"HWADDR_PRIx instead of
TARGET_FMT_lx.
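
The general pattern behind this fix can be sketched with the standard
<inttypes.h> macros. demo_hwaddr and DEMO_HWADDR_PRIx below are hypothetical
stand-ins, assuming a 64-bit address type; they are not QEMU's definitions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t demo_hwaddr;            /* assumed 64-bit address type */
#define DEMO_HWADDR_PRIx PRIx64          /* matching printf conversion */

/* Format a physical address with a conversion that matches its type. */
static int demo_format_phys(char *buf, size_t len, demo_hwaddr physical)
{
    /* A plain "%x" here would expect unsigned int and trip -Wformat. */
    return snprintf(buf, len, "physical=0x%" DEMO_HWADDR_PRIx, physical);
}
```

Tying the format macro to the typedef keeps the two in step if the width of
the address type ever changes.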

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-08-28 19:28:02 +02:00
Paolo Bonzini
821c808bd1 kvm-stub: fix compilation
Non-KVM targets fail compilation on the uq/master branch.
Fix the prototype of kvm_irqchip_add_irqfd_notifier to match
the one in kvm-all.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-08-28 17:07:02 +03:00
Markus Armbruster
c165473269 hw: Clean up bogus default boot order
We set default boot order "cad" in every single machine definition
except "pseries" and "moxiesim", even though very few boards actually
care for boot order, and "cad" makes sense for even fewer.

Machines that care:

* pc and its variants

  Accept up to three letters 'a', 'b' (undocumented alias for 'a'),
  'c', 'd' and 'n'.  Reject all others (fatal with -boot).

* nseries (n800, n810)

  Check whether order starts with 'n'.  Silently ignored otherwise.

* prep, g3beige, mac99

  Extract the first character the machine understands (subset of
  'a'..'f').  Silently ignored otherwise.

* spapr

  Accept an arbitrary string (vl.c restricts it to contain only
  'a'..'p', no duplicates).

* sun4[mdc]

  Use the first character.  Silently ignored otherwise.

Strip characters these machines ignore from their default boot order.
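
The restriction noted for spapr above (only 'a'..'p', no duplicates) can be
sketched as a small validator. demo_boot_order_valid is a hypothetical name
and this is not the code in vl.c, only an illustration of the rule as stated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Accept only letters 'a'..'p', each at most once. */
static bool demo_boot_order_valid(const char *order)
{
    uint32_t seen = 0;   /* bit n is set once letter 'a' + n has appeared */
    const char *p;

    for (p = order; *p != '\0'; p++) {
        uint32_t bit;

        if (*p < 'a' || *p > 'p') {
            return false;            /* letter outside the allowed range */
        }
        bit = UINT32_C(1) << (*p - 'a');
        if (seen & bit) {
            return false;            /* duplicate letter */
        }
        seen |= bit;
    }
    return true;
}
```

Under this rule "cad" is accepted, while "cc" (duplicate) and "cz" (out of
range) are rejected.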

For all other machines, remove the unused default boot order
altogether.

Note that my rename of QEMUMachine member boot_order to
default_boot_order and QEMUMachineInitArgs member boot_device to
boot_order has a welcome side effect: it makes every use of boot
orders visible in this patch, for easy review.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-28 10:16:47 +03:00
Alexey Kardashevskiy
3bf4dfdd11 pci: add config space access traces
This adds pci_cfg_read and pci_cfg_write traces for config spaces
accesses.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-28 10:11:23 +03:00
Stefan Weil
92f1623663 gtk: Remove unused include statements which are not portable
These include files don't exist for MinGW and are not needed for Linux
(and hopefully for other hosts as well), so remove them.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-08-27 14:21:16 +02:00
Stefan Weil
487cddb2bf w32: Add an icon resource
The QEMU mascot which was already used for the NSIS installer
is now used for all QEMU executables.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2013-08-27 14:21:16 +02:00
Stefan Weil
7e75e33e78 w32: Fix broken out-of-tree builds (missing version.o)
Commit 0b516ef0df added version.o to all
executables, but broke out-of-tree builds: for those builds the pattern
rule %.o: %.rc from rules.mak does not match, so version.o was no longer
built.

Adding explicit build rules fixes this.

Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-08-27 14:21:16 +02:00
Michael S. Tsirkin
1466cef32d pc: fix regression for 64 bit PCI memory
commit 3984890181
    pc: limit 64 bit hole to 2G by default
introduced a way for management to control
the window allocated to the 64 bit PCI hole.

This is useful, but existing management tools do not know how to set
this property.  As a result, e.g. specifying a large ivshmem device with
size > 4G is broken by default.  For example this configuration no
longer works:

-device ivshmem,size=4294967296,chardev=cfoo
-chardev socket,path=/tmp/sock,id=cfoo,server,nowait

Fix this by detecting that hole size was not specified
and defaulting to the backwards-compatible value of 1 << 62.
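
A sketch of the defaulting logic described above. It assumes, purely for
illustration, that a requested size of 0 means "not specified"; how the patch
actually detects an unset property is not stated in this message, and the
names below are hypothetical.

```c
#include <stdint.h>

/* Backwards-compatible default named in the message: 1 << 62. */
#define DEMO_HOLE64_SIZE_DEFAULT (UINT64_C(1) << 62)

/* Assumption: 0 stands for "the user did not set the property". */
static uint64_t demo_effective_hole64_size(uint64_t requested)
{
    return requested != 0 ? requested : DEMO_HOLE64_SIZE_DEFAULT;
}
```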

Cc: qemu-stable@nongnu.org
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-27 10:13:41 +03:00
Alexey Kardashevskiy
9eda7d373e pci: Introduce helper to retrieve a PCI device's DMA address space
A PCI device's DMA address space (possibly an IOMMU) is returned by a
method on the PCIBus.  At the moment that only has one caller, so the
method is simply open coded.  We'll need another caller for VFIO, so
this patch introduces a helper/wrapper function.

If IOMMU is not set, the pci_device_iommu_address_space() function
returns the parent's IOMMU skipping the "bus master" address space as
otherwise proper emulation would require more effort for no benefit.
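
The inheritance rule above can be sketched as a walk up the bus hierarchy.
DemoBus and demo_dma_address_space are hypothetical, address spaces are
reduced to strings, and the "system-memory" fallback is an assumption made
for this example rather than something taken from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PCI bus with an optional IOMMU. */
typedef struct DemoBus {
    struct DemoBus *parent;   /* NULL for the root bus */
    const char *iommu_as;     /* NULL when this bus has no IOMMU set */
} DemoBus;

/* Return the nearest IOMMU address space, looking at parents if needed. */
static const char *demo_dma_address_space(const DemoBus *bus)
{
    while (bus != NULL && bus->iommu_as == NULL) {
        bus = bus->parent;
    }
    /* Assumed fallback when no bus in the chain has an IOMMU. */
    return bus != NULL ? bus->iommu_as : "system-memory";
}
```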

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[aik: added inheritance from parent if iommu is not set for the current bus]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-27 08:30:48 +03:00
Richard Henderson
401c227b0a tcg-i386: Use new return-argument ld/st helpers
Discontinue the jump-around-jump-to-jump scheme, trading it for a single
immediate move instruction.  The two extra jumps always consume 7 bytes,
whereas the immediate move is either 5 or 7 bytes depending on where the
code_gen_buffer gets located.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-26 13:31:54 -07:00
Richard Henderson
aac1fb0576 tcg: Tidy softmmu_template.h
Avoid a loop in the tlb_fill path; the fill will either succeed or
generate an exception.

Inline the slow_ld/st function; it was a complete copy of the main
helper except for the actual cross-page unaligned code, and the
compiler was inlining it anyway.

Add unlikely markers optimizing for the most common case of simple
tlb miss.

Make sure the compiler can optimize away the unaligned paths for a
1 byte access.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-26 13:31:54 -07:00
Richard Henderson
e25c3887e6 tcg: Add mmu helpers that take a return address argument
Allow the code that tcg generates to be less obtuse, passing in
the return address directly instead of computing it in the helper.

Maintain the old entrance point unchanged as an alternate entry point.

Delete the helper_st*_cmmu prototypes; the implementations did not exist.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-26 13:31:53 -07:00
Richard Henderson
c6f29ff096 tcg-i386: Tidy qemu_ld/st slow path
Use existing stack space for arguments; don't push/pop.
Use less ifdefs and more C ifs.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-26 13:31:53 -07:00
Richard Henderson
8023ccda07 tcg-i386: Try pc-relative lea for constant formation
Use a 7 byte lea before the ultimate 10 byte movq.
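
Whether the shorter lea can be used comes down to a range check on the
pc-relative displacement. The sketch below assumes the displacement is
measured from the end of the 7 byte instruction and must fit in a signed
32-bit field; demo_const_fits_lea is a hypothetical helper, not the code in
the tcg backend.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LEA_LEN 7   /* length of the pc-relative lea, per the message */

/* True if 'value' is reachable from an lea placed at 'insn_addr'. */
static bool demo_const_fits_lea(uint64_t insn_addr, uint64_t value)
{
    /* Displacement relative to the address of the next instruction. */
    int64_t disp = (int64_t)(value - (insn_addr + DEMO_LEA_LEN));

    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

When the check fails, the generator would fall back to the 10 byte movq.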

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-26 13:31:53 -07:00
Richard Henderson
ac26eb69a3 tcg-i386: Add and use tcg_out64
No point in splitting the write into 32-bit pieces.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-26 13:31:53 -07:00
Richard Henderson
2bb8656dad tcg: Tidy generated code for tcg_outN
Aliasing was forcing s->code_ptr to be re-read after the store.
Keep the pointer in a local variable to help the compiler.
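
A sketch of the local-pointer pattern described above. Stores through a byte
pointer may alias the context structure itself, so the compiler may have to
reload s->code_ptr afterwards; copying it into a local first avoids that.
DemoContext and the demo_out* helpers are hypothetical and only mirror the
idea, not the tcg source.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the code generation context. */
typedef struct DemoContext {
    uint8_t *code_ptr;   /* next byte to be written */
} DemoContext;

static void demo_out8(DemoContext *s, uint8_t v)
{
    uint8_t *p = s->code_ptr;   /* read the pointer once */

    *p = v;
    s->code_ptr = p + 1;        /* write it back once */
}

static void demo_out32(DemoContext *s, uint32_t v)
{
    uint8_t *p = s->code_ptr;

    memcpy(p, &v, sizeof(v));   /* host byte order, alignment-safe */
    s->code_ptr = p + sizeof(v);
}
```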

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-26 13:31:53 -07:00
Anthony Liguori
f7ad538e1e Merge remote-tracking branch 'stefanha/block' into staging
# By Alex Bligh (32) and others
# Via Stefan Hajnoczi
* stefanha/block: (42 commits)
  win32-aio: drop win32_aio_flush_cb()
  aio-win32: replace incorrect AioHandler->opaque usage with ->e
  aio / timers: remove dummy_io_handler_flush from tests/test-aio.c
  aio / timers: Remove legacy interface
  aio / timers: Switch entire codebase to the new timer API
  aio / timers: Add scripts/switch-timer-api
  aio / timers: Add test harness for AioContext timers
  aio / timers: convert block_job_sleep_ns and co_sleep_ns to new API
  aio / timers: Convert rtc_clock to be a QEMUClockType
  aio / timers: Remove main_loop_timerlist
  aio / timers: Rearrange timer.h & make legacy functions call non-legacy
  aio / timers: Add qemu_clock_get_ms and qemu_clock_get_us
  aio / timers: Remove legacy qemu_clock_deadline & qemu_timerlist_deadline
  aio / timers: Remove alarm timers
  aio / timers: Add documentation and new format calls
  aio / timers: Use all timerlists in icount warp calculations
  aio / timers: Introduce new API timer_new and friends
  aio / timers: On timer modification, qemu_notify or aio_notify
  aio / timers: Convert mainloop to use timeout
  aio / timers: Convert aio_poll to use AioContext timers' deadline
  ...

Message-id: 1377202298-22896-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-26 09:19:50 -05:00
Anthony Liguori
e3f024aec2 Merge remote-tracking branch 'afaerber/tags/0.15-maintainer-for-anthony' into staging
MAINTAINERS update for stable-0.15

# gpg: Signature made Thu 22 Aug 2013 10:59:31 AM CDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Andreas Färber
# Via Andreas Färber
* afaerber/tags/0.15-maintainer-for-anthony:
  MAINTAINERS: Take over 0.15 maintenance
2013-08-26 09:19:36 -05:00
yinyin
1ae2757c6c virtio: virtqueue_get_avail_bytes: fix desc_pa when loop over the indirect descriptor table
virtqueue_get_avail_bytes: when an indirect descriptor is found, we need to loop over it:
           /* loop over the indirect descriptor table */
           indirect = 1;
           max = vring_desc_len(desc_pa, i) / sizeof(VRingDesc);
           num_bufs = i = 0;
           desc_pa = vring_desc_addr(desc_pa, i);
However, i is set to 0 before it is used to update desc_pa, so we always get:
desc_pa = vring_desc_addr(desc_pa, 0);
The last two lines should be swapped.
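
The ordering problem can be reproduced in isolation. In this sketch DemoDesc
and the demo_* functions are hypothetical; demo_desc_addr only imitates the
role of vring_desc_addr(desc_pa, i) on an in-memory table.

```c
#include <stdint.h>

/* Hypothetical, simplified descriptor. */
typedef struct DemoDesc {
    uint64_t addr;
    uint32_t len;
} DemoDesc;

/* Stand-in for vring_desc_addr(): address stored in descriptor i. */
static uint64_t demo_desc_addr(const DemoDesc *table, unsigned int i)
{
    return table[i].addr;
}

/* Buggy order: i is cleared first, so descriptor 0 is always read. */
static uint64_t demo_enter_indirect_buggy(const DemoDesc *table,
                                          unsigned int *i)
{
    *i = 0;
    return demo_desc_addr(table, *i);
}

/* Fixed order: read descriptor i, then reset i for the indirect table. */
static uint64_t demo_enter_indirect_fixed(const DemoDesc *table,
                                          unsigned int *i)
{
    uint64_t indirect_pa = demo_desc_addr(table, *i);

    *i = 0;
    return indirect_pa;
}
```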

Cc: qemu-stable@nongnu.org
Signed-off-by: Yin Yin <yin.yin@cs2c.com.cn>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-25 12:52:33 +03:00
Richard Henderson
42eed424e1 disas-objdump: Pass --adjust-vma to objdump
This gives the dumped blob its correct address during disassembly,
which makes pc-relative insns much easier to interpret.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-08-24 07:26:45 +02:00
Richard Henderson
8dc6d24091 disas: Add disas-objdump.pl
The script massages the output produced for architectures that are
not supported internally by qemu through an external objdump program
for disassembly.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-08-24 07:26:45 +02:00
Richard Henderson
c46ffd57a3 disas: Implement fallback to dump object code as hex
The OBJD-[HT] tags will be used by a script to run the hex blob
through objdump --disassemble.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
2013-08-24 07:26:45 +02:00
Wei Yang
867c47cbba kvm: shorten the parameter list for get_real_device()
get_real_device() has 5 parameters, and the last 4 are contained in the
structure passed as the first.

This patch removes the last 4 parameters and uses them directly from the first
parameter.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-23 11:37:35 +02:00
Stefan Hajnoczi
b10577df13 win32-aio: drop win32_aio_flush_cb()
The io_flush argument to qemu_aio_set_event_notifier() has been removed
since the block layer learnt to drain requests by itself.  Fix the
Windows build for win32-aio.o by updating the
qemu_aio_set_event_notifier() call and dropping win32_aio_flush_cb().

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 22:05:04 +02:00
Stefan Hajnoczi
8b2d42d273 aio-win32: replace incorrect AioHandler->opaque usage with ->e
The AioHandler->opaque field does not exist in aio-win32.c.  The code
that uses it was incorrectly copied from aio-posix.c.  For Windows we
can use AioHandler->e to match against AioContext->notifier.

This patch fixes the Windows build for aio-win32.o.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 22:04:54 +02:00
Alex Bligh
91c68f143d aio / timers: remove dummy_io_handler_flush from tests/test-aio.c
Remove dummy_io_handler_flush from tests/test-aio.c as it does
nothing now.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 22:03:47 +02:00
Alex Bligh
b4049b74b9 aio / timers: Remove legacy interface
Remove the legacy interface from include/qemu/timer.h.

Ensure struct QEMUClock is not exposed at all.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:14:24 +02:00
Alex Bligh
bc72ad6754 aio / timers: Switch entire codebase to the new timer API
This is an autogenerated patch using scripts/switch-timer-api.

Switch the entire code base to using the new timer API.

Note this patch may introduce some line length issues.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:14:24 +02:00
Alex Bligh
fe10ab540b aio / timers: Add scripts/switch-timer-api
Add scripts/switch-timer-api to programmatically rewrite source
files to use the new timer system.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:14:24 +02:00
Alex Bligh
b53edf971f aio / timers: Add test harness for AioContext timers
Add a test harness for AioContext timers. The g_source equivalent is
unsatisfactory as it suffers from false wakeups.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:14:24 +02:00
Alex Bligh
7483d1e547 aio / timers: convert block_job_sleep_ns and co_sleep_ns to new API
Convert block_job_sleep_ns and co_sleep_ns to use the new timer
API.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:14:24 +02:00
Alex Bligh
884f17c235 aio / timers: Convert rtc_clock to be a QEMUClockType
Convert rtc_clock to be a QEMUClockType

Move rtc_clock users to use the new API

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:14:24 +02:00
Alex Bligh
7bf8fbde44 aio / timers: Remove main_loop_timerlist
Now that we have timerlistgroups implemented and main_loop_tlg, we
no longer need the concept of a default timer list associated
with each clock. Remove it and simplify initialisation of
clocks and timer lists.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:14:24 +02:00
Alex Bligh
40daca54cd aio / timers: Rearrange timer.h & make legacy functions call non-legacy
Rearrange timer.h so it is in order by function type.

Make legacy functions call non-legacy functions rather than vice-versa.

Convert cpus.c to use new API.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:14:24 +02:00
Alex Bligh
55a197dab4 aio / timers: Add qemu_clock_get_ms and qemu_clock_get_us
Add utility functions qemu_clock_get_ms and qemu_clock_get_us

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:55 +02:00
Alex Bligh
63111b69cc aio / timers: Remove legacy qemu_clock_deadline & qemu_timerlist_deadline
Remove qemu_clock_deadline and qemu_timerlist_deadline now we are using
the ns functions throughout.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:55 +02:00
Alex Bligh
6d32717155 aio / timers: Remove alarm timers
Remove alarm timers from qemu-timers.c now we use g_poll / ppoll
instead.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:55 +02:00
Alex Bligh
54904d2a91 aio / timers: Add documentation and new format calls
Add documentation for existing qemu timer calls. Add new format
calls of the format timer_XXX rather than qemu_XXX_timer
for consistency.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:53 +02:00
Alex Bligh
ac70aafc28 aio / timers: Use all timerlists in icount warp calculations
Notify all timerlists derived from vm_clock in icount warp
calculations.

When calculating timer delay based on vm_clock deadline, use
all timerlists.

For compatibility, maintain an apparent bug where when using
icount, if no vm_clock timer was set, qemu_clock_deadline
would return INT32_MAX and always set an icount clock expiry
about 2 seconds ahead.

NB: thread safety - when different timerlists sit on different
threads, this will need some locking.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:28 +02:00
Alex Bligh
a3a726ae09 aio / timers: Introduce new API timer_new and friends
Introduce new API for creating timers - timer_new and
_ns, _ms, _us derivatives.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:28 +02:00
Alex Bligh
b1bbfe72ec aio / timers: On timer modification, qemu_notify or aio_notify
On qemu_mod_timer_ns, ensure qemu_notify or aio_notify is called to
end the appropriate poll(), irrespective of use_icount value.

On qemu_clock_enable, ensure qemu_notify or aio_notify is called for
all QEMUTimerLists attached to the QEMUClock.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:28 +02:00
Alex Bligh
7b595f35d8 aio / timers: Convert mainloop to use timeout
Convert mainloop to use timeout from default timerlist group
(i.e. the current 3 static timers)

main-loop.c produces a (possibly spurious) warning about
multiple iterations. Adapt the way this works for a signed
timeout and make the warning a bit safer.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:28 +02:00
Alex Bligh
438e1f47e7 aio / timers: Convert aio_poll to use AioContext timers' deadline
Convert aio_poll to use deadline based on AioContext's timers.

aio_poll has been changed to return accurately whether progress
has occurred. Prior to this commit, aio_poll always returned
true if g_poll was entered, whether or not any progress was
made. This required a change to tests/test-aio.c where an
assert was backwards.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:28 +02:00
Alex Bligh
4e29e8311a aio / timers: Add aio_timer_init & aio_timer_new wrappers
Add aio_timer_init and aio_timer_new wrapper functions.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:28 +02:00
Alex Bligh
533a8cf350 aio / timers: aio_ctx_prepare sets timeout from AioContext timers
Calculate the timeout in aio_ctx_prepare taking into account
the timers attached to the AioContext.

Alter aio_ctx_check similarly.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:28 +02:00
Alex Bligh
d5541d8680 aio / timers: Add a notify callback to QEMUTimerList
Add a notify pointer to QEMUTimerList so it knows what to notify
on a timer change.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:28 +02:00
Alex Bligh
dae21b98b9 aio / timers: Add QEMUTimerListGroup to AioContext
Add a QEMUTimerListGroup to each AioContext (meaning a QEMUTimerList
associated with each clock is added) and delete it when the
AioContext is freed.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:27 +02:00
Alex Bligh
754d6a544d aio / timers: Add QEMUTimerListGroup and helper functions
Add QEMUTimerListGroup and helper functions, to represent
a QEMUTimerList associated with each clock. Add a default
QEMUTimerListGroup representing the default timer lists
which are not associated with any other object (e.g.
an AioContext as added by future patches).

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:27 +02:00
Alex Bligh
6a1751b7aa aio / timers: Untangle include files
include/qemu/timer.h has no need to include main-loop.h and
doing so causes an issue for the next patch. Unfortunately
various files assume including timer.h will pull in main-loop.h.
Untangle this mess.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:27 +02:00
Alex Bligh
ff83c66ecc aio / timers: Split QEMUClock into QEMUClock and QEMUTimerList
Split QEMUClock into QEMUClock and QEMUTimerList so that we can
have more than one QEMUTimerList associated with the same clock.

Introduce a main_loop_timerlist concept and make existing
qemu_clock_* calls that actually should operate on a QEMUTimerList
call the relevant QEMUTimerList implementations, using the clock's
default timerlist. This vastly reduces the invasiveness of this
change and means the API stays constant for existing users.

Introduce a list of QEMUTimerLists associated with each clock
so that reenabling the clock can cause all the notifiers
to be called. Note the code to do the notifications is added
in a later patch.

Switch QEMUClockType to an enum. Remove global variables vm_clock,
host_clock and rt_clock and add compatibility defines. Do not
fix qemu_next_alarm_deadline as it's going to be deleted.

Add qemu_clock_use_for_deadline to indicate whether a particular
clock should be used for deadline calculations. When use_icount
is true, vm_clock should not be used for deadline calculations
as it does not contain a nanosecond count. Instead, icount
timeouts come from the execution thread doing aio_notify or
qemu_notify as appropriate. This function is used in the next
patch.
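
The rule in the last paragraph reduces to a one-line predicate. The enum and
function below are hypothetical stand-ins that only mirror the description;
the enumerator names are assumptions, not the identifiers used in the patch.

```c
#include <stdbool.h>

/* Hypothetical clock types standing in for rt_clock, vm_clock, host_clock. */
typedef enum {
    DEMO_CLOCK_REALTIME,
    DEMO_CLOCK_VIRTUAL,
    DEMO_CLOCK_HOST
} DemoClockType;

/* With icount enabled, the virtual clock is excluded from deadlines. */
static bool demo_clock_use_for_deadline(DemoClockType type, bool use_icount)
{
    return !(use_icount && type == DEMO_CLOCK_VIRTUAL);
}
```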

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:27 +02:00
Alex Bligh
f9a976b740 aio / timers: Make qemu_run_timers and qemu_run_all_timers return progress
Make qemu_run_timers and qemu_run_all_timers return progress
so that aio_poll etc. can determine whether a timer has been
run.
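
A sketch of what "returning progress" can look like for a sorted timer list.
DemoTimer, demo_count_cb and demo_run_timers are hypothetical and much
simpler than the real timer code; the list is assumed to be ordered by
expiry time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timer node in a list sorted by expire_time. */
typedef struct DemoTimer {
    int64_t expire_time;
    void (*cb)(void *opaque);
    void *opaque;
    struct DemoTimer *next;
} DemoTimer;

/* Example callback: counts how many times it has fired. */
static void demo_count_cb(void *opaque)
{
    ++*(int *)opaque;
}

/* Run every expired timer; report whether at least one callback ran. */
static bool demo_run_timers(DemoTimer **head, int64_t now)
{
    bool progress = false;

    while (*head != NULL && (*head)->expire_time <= now) {
        DemoTimer *t = *head;

        *head = t->next;      /* unlink before calling back */
        t->next = NULL;
        t->cb(t->opaque);
        progress = true;
    }
    return progress;
}
```

A poll loop can then use the return value to tell a wakeup that ran a timer
from one that did nothing.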

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:27 +02:00
Alex Bligh
cd758dd0ac aio / timers: Add prctl(PR_SET_TIMERSLACK, 1, ...) to reduce timer slack
Where supported, call prctl(PR_SET_TIMERSLACK, 1, ...) to
set one nanosecond timer slack to increase precision of timer
calls.
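
A guarded sketch of such a call on Linux. The wrapper name
demo_reduce_timer_slack and the choice to report success on platforms
without the option are assumptions made for this example.

```c
#ifdef __linux__
#include <sys/prctl.h>
#endif

/* Ask for 1 ns timer slack where the option exists; no-op elsewhere. */
static int demo_reduce_timer_slack(void)
{
#if defined(__linux__) && defined(PR_SET_TIMERSLACK)
    /* prctl() returns 0 on success and -1 on error. */
    return prctl(PR_SET_TIMERSLACK, 1, 0, 0, 0);
#else
    return 0;
#endif
}
```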

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:27 +02:00
Alex Bligh
4e0c6529fc aio / timers: add ppoll support with qemu_poll_ns
Add qemu_poll_ns which works like g_poll but takes a nanosecond
timeout.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:26 +02:00
Andreas Färber
73c30df69c MAINTAINERS: Take over 0.15 maintenance
SUSE is shipping qemu-kvm 0.15.1 with SLES 11 SP2 so we will be actively
tracking all KVM-related issues. Therefore upgrade to Supported.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-22 17:27:43 +02:00
Anthony Liguori
5211333bf7 Merge remote-tracking branch 'luiz/queue/qmp' into staging
# By Laszlo Ersek (8) and others
# Via Luiz Capitulino
* luiz/queue/qmp:
  scripts/qapi.py: Avoid syntax not supported by Python 2.4
  monitor: print the invalid char in error message
  OptsVisitor: introduce unit tests, with test cases for range flattening
  add "test-int128" and "test-bitops" to .gitignore
  OptsVisitor: don't try to flatten overlong integer ranges
  OptsVisitor: opts_type_uint64(): recognize intervals when LM_IN_PROGRESS
  OptsVisitor: rebase opts_type_uint64() to parse_uint_full()
  OptsVisitor: opts_type_int(): recognize intervals when LM_IN_PROGRESS
  OptsVisitor: introduce list modes for interval flattening
  OptsVisitor: introduce basic list modes
  Convert stderr message calling error_get_pretty() to error_report()

Message-id: 1377015041-6567-1-git-send-email-lcapitulino@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-22 09:29:25 -05:00
Anthony Liguori
9fe480695a Merge remote-tracking branch 'jliu/or32' into staging
# By Jia Liu
# Via Jia Liu
* jliu/or32:
  hw/openrisc: Avoid undefined shift in openrisc_pic_cpu_handler()
  hw/openrisc: Fix masking in openrisc_pic_cpu_handler()
  hw/openrisc: Avoid using uninitialised variable 'entry'

Message-id: 1377050811-11116-1-git-send-email-proljc@gmail.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-22 09:29:13 -05:00
Alex Bligh
043a7e1f8f aio / timers: Consistent treatment of disabled clocks for deadlines
Make treatment of disabled clocks consistent in deadline calculation

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 15:58:05 +02:00
Alex Bligh
02a03a9f12 aio / timers: add qemu-timer.c utility functions
Add utility functions to qemu-timer.c for nanosecond timing.

Add qemu_clock_deadline_ns to calculate deadlines to
nanosecond accuracy.

Add utility function qemu_soonest_timeout to calculate soonest deadline.

Add qemu_timeout_ns_to_ms to convert a timeout in nanoseconds back to
milliseconds for when ppoll is not used.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 15:58:05 +02:00
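As an illustration of the helpers described in commit 02a03a9f12 above, here is a minimal sketch of picking the soonest of two nanosecond timeouts and converting a nanosecond timeout back to milliseconds. The names soonest_timeout_ns, timeout_ns_to_ms and NS_PER_MS are hypothetical, and the conventions (-1 means "no deadline", round up, clamp to INT_MAX) are assumptions; this is not the code from qemu-timer.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define NS_PER_MS 1000000LL /* assumed scale: nanoseconds per millisecond */

/* Hypothetical helper: return the sooner of two timeouts in ns.
 * Assumption: -1 means "no deadline". Cast to unsigned, -1 becomes the
 * largest possible value, so any real deadline compares as sooner. */
int64_t soonest_timeout_ns(int64_t a, int64_t b)
{
    assert(a >= -1 && b >= -1); /* other negative values are not expected */
    return ((uint64_t)a < (uint64_t)b) ? a : b;
}

/* Hypothetical helper: convert a ns timeout to ms for a poll()-style call.
 * Negative input stays -1 (block indefinitely). Positive values round up,
 * so the caller waits slightly too long rather than busy-waiting. */
int timeout_ns_to_ms(int64_t ns)
{
    int64_t ms;

    if (ns < 0) {
        return -1;
    }
    /* Divide first, then add the remainder flag: avoids overflow near
     * INT64_MAX that (ns + NS_PER_MS - 1) could cause. */
    ms = ns / NS_PER_MS + (ns % NS_PER_MS != 0);
    if (ms > INT_MAX) {
        ms = INT_MAX; /* clamp to what an int timeout can express */
    }
    return (int)ms;
}
```

Rounding up is a design choice in this sketch: a 1 ns timeout becomes 1 ms instead of 0 ms, which would otherwise turn a short wait into a non-blocking poll.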
Alex Bligh
58ac56b9ad aio / timers: Rename qemu_new_clock and expose clock types
Rename qemu_new_clock to qemu_clock_new.

Expose clock types.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 15:58:05 +02:00
Alex Bligh
e93379b039 aio / timers: Rename qemu_timer_* functions
Rename four functions in preparation for new API.

Rename qemu_timer_expired to timer_expired
Rename qemu_timer_expire_time_ns to timer_expire_time_ns
Rename qemu_timer_pending to timer_pending
Rename qemu_timer_expired_ns to timer_expired_ns

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 15:58:05 +02:00
Paolo Bonzini
04d542c8b8 vmdk: support vmfs files
VMware ESX hosts also use different create and extent types for flat
files, respectively "vmfs" and "VMFS".  This is not documented, but it
can be found at http://kb.vmware.com/kb/10002511 (Recreating a missing
virtual machine disk (VMDK) descriptor file).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 15:35:58 +02:00
Fam Zheng
daac8fdc68 vmdk: support vmfsSparse files
VMware ESX hosts use a variant of the VMDK3 format, identified by the
vmfsSparse create type and the VMFSSPARSE extent type.

It has 16 KB grain tables (L2) and a variable-size grain directory (L1).
In addition, the grain size is always 512, but that is not a problem
because it is included in the header.

The format of the extents is documented in the VMDK spec.  The format
of the descriptor file is not documented precisely, but it can be
found at http://kb.vmware.com/kb/10026353 (Recreating a missing virtual
machine disk (VMDK) descriptor file for delta disks).

With these patches, vmfsSparse files only work if opened through the
descriptor file.  Data files without descriptor files, as far as I
could understand, are not supported by ESX.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

--
v2: Rebase to patch 01.
    Change le64_to_cpu to le32_to_cpu.
    Rename vmdk_open_vmdk3 to vmdk_open_vmfs_sparse, which represents the
    current usage of this format.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 15:35:58 +02:00
Fam Zheng
f6b61e54bd vmdk: fix L1 and L2 table size in vmdk3 open
VMDK3 header has the field l1dir_size, but vmdk_open_vmdk3 hardcoded the
value. This patch honors the header field.

And the L2 table size is 4096 according to VMDK spec[1], instead of
1 << 9 (512).

[1]:
http://www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf?src=vmdk

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 15:35:58 +02:00
Fam Zheng
b0651b8c24 vmdk: Move l1_size check into vmdk_add_extent()
This header check is common to VMDK3 and VMDK4, so move it into
vmdk_add_extent().

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 15:35:58 +02:00
Fam Zheng
7780d47211 block: better error message for read only format name
When the user tries to use a read-only whitelisted format in the command
line option, the failure message was "'foo' invalid format". The format
might be invalid only for writable use but valid for read-only use, so
the message is confusing. Give the user an easier-to-understand message.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 14:30:03 +02:00
MORITA Kazutaka
893a8f6220 block: Produce zeros when protocols reading beyond end of file
Asias was debugging an issue creating qcow2 images on top of
non-file protocols.  It boils down to this example using NBD:

$ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'

Notice the open -g option to set bs->growable.  This means you can
read/write beyond end of file.  Reading beyond end of file is supposed
to produce zeroes.

We rely on this behavior in qcow2_create2() during qcow2 image
creation.  We create a new file and then write the qcow2 header
structure using bdrv_pwrite().  Since QCowHeader is not a multiple of
sector size, block.c first uses bdrv_read() on the empty file to fetch
the first sector (should be all zeroes).

Here is the output from the qemu-io NBD example above:

$ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
00000000:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  ................
00000010:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  ................
00000020:  ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab  ................
...

We are not zeroing the buffer!  As a result qcow2 image creation on top
of protocols is not guaranteed to work even when file creation is
supported by the protocol.

[Adapted this patch to use bs->zero_beyond_eof.
-- Stefan]

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 14:14:56 +02:00
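The behaviour that commit 893a8f6220 above asks for (reads past end of file yield zeroes instead of stale buffer contents) can be sketched with an in-memory stand-in for the image. The struct image_sketch type and read_zero_beyond_eof function are hypothetical names for illustration only; the real fix lives in the block layer and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory "image": data holds len valid bytes. */
struct image_sketch {
    const unsigned char *data;
    size_t len;
};

/* Read count bytes starting at offset into buf.
 * Bytes that lie past the end of the image are zero-filled, so the caller
 * never sees whatever happened to be in buf beforehand. */
void read_zero_beyond_eof(const struct image_sketch *img, size_t offset,
                          unsigned char *buf, size_t count)
{
    size_t avail = 0;

    assert(img != NULL && buf != NULL);
    if (offset < img->len) {
        avail = img->len - offset;      /* bytes that really exist */
        if (avail > count) {
            avail = count;
        }
        memcpy(buf, img->data + offset, avail);
    }
    memset(buf + avail, 0, count - avail); /* zero the part beyond EOF */
}
```

With a buffer pre-filled with 0xab, as in the qemu-io output quoted in the commit message, a read that starts beyond the end of the image leaves the buffer all zeroes in this sketch.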
Asias He
0d51b4debe block: Introduce bs->zero_beyond_eof
In 4146b46c42e0989cb5842e04d88ab6ccb1713a48 (block: Produce zeros when
protocols reading beyond end of file), we break qemu-iotests ./check
-qcow2 022. This happens because qcow2 temporarily sets ->growable = 1
for vmstate accesses (which are stored beyond the end of regular image
data).

We introduce the bs->zero_beyond_eof to allow qcow2_load_vmstate() to
disable ->zero_beyond_eof temporarily in addition to enable ->growable.

[Since the broken patch "block: Produce zeros when protocols reading
beyond end of file" has not been merged yet, I have applied this fix
*first* and will then apply the next patch to keep the tree bisectable.
-- Stefan]

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 14:10:21 +02:00
Eduardo Habkost
1e09955619 pc_piix: Kill pc_init1() memory region args
All callers always use the same values (get_system_memory(),
get_system_io()), so the parameters are pointless.

If one day we decide to eliminate get_system_memory() and
get_system_io(), we will be able to do that more easily by adding the
values to struct QEMUMachineInitArgs.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 23:30:22 +03:00
Eduardo Habkost
396f79f45e pc: pc_compat_1_4() now can call pc_compat_1_5()
It just needs to set has_pvpanic=false after calling it. This way, it
won't be a special case anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 23:30:22 +03:00
Eduardo Habkost
89b439f313 pc: Create pc_compat_*() functions
Making the older compat functions call the newer compat functions at the
beginning allows the older functions to undo what's done by newer compat
functions. e.g.: pc_compat_1_4() will be able to call pc_compat_1_5()
and then set has_pvpanic=false.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 23:30:22 +03:00
Eduardo Habkost
43a52ce657 pc: Kill pc_init_pci_1_0()
The pc_init_pci_1_2()/pc_init_pci_1_0() split was made on commit
6fd028f64f, in preparation for commit
9953f8822c. The latter was reverted, so there's no longer any
reason to keep two separate functions that do exactly the same thing.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 23:30:21 +03:00
Markus Armbruster
3b6fb9cab2 pc: Don't explode QEMUMachineInitArgs into local variables needlessly
Don't explode when the variable is used just a few times, and never
changed.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 23:30:21 +03:00
Markus Armbruster
5650f5f48b pc: Don't prematurely explode QEMUMachineInitArgs
Don't explode QEMUMachineInitArgs before passing it to pc_init1().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 23:30:21 +03:00
Markus Armbruster
9223836745 ppc: Don't duplicate QEMUMachineInitArgs in PPCE500Params
Pass on the generic arguments unadulterated, and the machine-specific
ones as separate argument.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 23:22:22 +03:00
Markus Armbruster
ee87e32f83 ppc: Don't explode QEMUMachineInitArgs into local variables needlessly
Don't explode when the variable is used just once, and never changed.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 23:20:35 +03:00
Markus Armbruster
6b63ef4d0f sun4: Don't prematurely explode QEMUMachineInitArgs
Don't explode QEMUMachineInitArgs before passing it to
sun4m_hw_init(), sun4uv_init().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 23:19:27 +03:00
Kevin Wolf
8ad1898cf1 qcow2: Change default for new images to compat=1.1
By the time that qemu 1.7 will be released, enough time will have passed
since qemu 1.1, which is the first version to understand version 3
images, that changing the default shouldn't hurt many people any more
and the benefits of using the new format outweigh the pain.

qemu-iotests already runs with compat=1.1 by default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-21 14:41:09 +02:00
Alex Williamson
4b38e989b4 q35: Add PCIe switch to example q35 configuration
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-08-21 12:39:15 +03:00
Jia Liu
7717f248ee hw/openrisc: Avoid undefined shift in openrisc_pic_cpu_handler()
In C99 signed shift (1 << 31) is undefined behavior, since the result
exceeds INT_MAX.  Use 1U instead and move the shift after the check.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Acked-by: Jia Liu <proljc@gmail.com>
2013-08-21 09:31:42 +08:00
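A minimal sketch of the fix described in commit 7717f248ee above: shift an unsigned constant and perform the range check before the shift. The names set_pending and NR_IRQS_SKETCH are hypothetical, and the 32-line limit is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_IRQS_SKETCH 32 /* assumed number of interrupt lines */

/* Hypothetical helper: mark interrupt line irq as pending in *picsr.
 * Returns false (and leaves *picsr untouched) if irq is out of range. */
bool set_pending(uint32_t *picsr, int irq)
{
    assert(picsr != NULL);
    if (irq < 0 || irq >= NR_IRQS_SKETCH) {
        return false;      /* check first, so no out-of-range shift */
    }
    *picsr |= 1U << irq;   /* 1U: bit 31 is representable, unlike 1 << 31 */
    return true;
}
```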
Jia Liu
ed396e2b2d hw/openrisc: Fix masking in openrisc_pic_cpu_handler()
Consider the masking of PICSR and PICMR:

    ((cpu->env.picsr && (1 << i)) && (cpu->env.picmr && (1 << i)))

To correctly mask bits, we should use the bitwise AND "&" rather than
the logical AND "&&".  Also, the loop is not necessary for masking.
Simply use (cpu->env.picsr & cpu->env.picmr).

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Acked-by: Jia Liu <proljc@gmail.com>
2013-08-21 09:23:10 +08:00
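The difference between "&" and "&&" that commit ed396e2b2d above fixes can be shown with a small sketch. The function names are hypothetical; picsr and picmr stand in for the status and mask registers named in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise AND: returns the set of lines that are both pending (picsr)
 * and enabled (picmr). No per-bit loop is needed. */
uint32_t pending_unmasked(uint32_t picsr, uint32_t picmr)
{
    return picsr & picmr;
}

/* The buggy form, kept for contrast: logical AND only asks whether each
 * operand is non-zero, so the bit index i has no effect on the result. */
int buggy_pending_check(uint32_t picsr, uint32_t picmr, int i)
{
    assert(i >= 0 && i < 32);
    return (picsr && (1U << i)) && (picmr && (1U << i));
}
```

For example, with picsr = 0x1 and picmr = 0x2 no line is both pending and enabled, so pending_unmasked() returns 0, while the buggy check reports 1 for any bit index.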
Jia Liu
b6d9766ddf hw/openrisc: Avoid using uninitialised variable 'entry'
clang warns that cpu_openrisc_load_kernel() can use 'entry' uninitialized:

hw/openrisc/openrisc_sim.c:69:9: error: variable 'entry' is used uninitialized
whenever '&&' condition is false [-Werror,-Wsometimes-uninitialized]

    if (kernel_filename && !qtest_enabled()) {
        ^~~~~~~~~~~~~~~
hw/openrisc/openrisc_sim.c:91:19: note: uninitialized use occurs here
    cpu->env.pc = entry;
                  ^~~~~

Fix this by not attempting to change the CPU's starting PC unless
we actually loaded a kernel.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Jia Liu <proljc@gmail.com>
2013-08-21 09:15:36 +08:00
Michael S. Tsirkin
04920fc0fa loader: store FW CFG ROM files in RAM
ROM files that are put in FW CFG are copied to guest RAM by the BIOS, but
they are not backed by RAM so they don't get migrated.

Each time we change two bytes in such a ROM, this breaks cross-version
migration: we can migrate after the BIOS has read the first byte but
before it has read the second one, leaving it with an inconsistent state.

Future-proof this by creating, for each such ROM,
an MR serving as the backing store.
This MR is never mapped into guest memory, but it's registered
as RAM so it's migrated with the guest.

Naturally, this only helps for -M 1.7 and up; older machine types
will still have the cross-version migration bug.
Luckily the race window for the problem to trigger is very small,
which is also likely why we didn't notice the cross-version
migration bug in testing yet.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2013-08-21 00:18:39 +03:00
Michael S. Tsirkin
0851c9f75c arch_init: align MR size to target page size
Migration code assumes that each MR is a multiple of TARGET_PAGE_SIZE:
MR size is divided by TARGET_PAGE_SIZE, so if it isn't, migration
never completes.
But this isn't really required for regions set up with
memory_region_init_ram, since that calls qemu_ram_alloc
which aligns size up using TARGET_PAGE_ALIGN.

Align MR size up to a full target page size; this way
migration completes even if we create a RAM MR
which is not a multiple of the target page size.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2013-08-21 00:18:39 +03:00
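The align-up arithmetic that commit 0851c9f75c above relies on can be sketched like this. PAGE_SIZE_SKETCH (4096) and both function names are assumptions for the example; the real TARGET_PAGE_SIZE depends on the target and the real macro is TARGET_PAGE_ALIGN.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096ULL /* assumed page size; must be a power of two */

/* Round size up to the next multiple of the page size. */
uint64_t page_align_up(uint64_t size)
{
    /* Precondition: adding (page size - 1) must not wrap around. */
    assert(size <= UINT64_MAX - (PAGE_SIZE_SKETCH - 1));
    return (size + PAGE_SIZE_SKETCH - 1) & ~(PAGE_SIZE_SKETCH - 1);
}

/* Number of whole pages needed to cover size bytes. Dividing the aligned
 * size never drops a partial trailing page, unlike size / page size. */
uint64_t pages_needed(uint64_t size)
{
    return page_align_up(size) / PAGE_SIZE_SKETCH;
}
```

A 4097-byte region needs two pages in this sketch, whereas a plain division of the unaligned size would count only one and leave the last byte unaccounted for.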
Michael S. Tsirkin
c0b4cc1f9f pc: cleanup 1.4 compat support
Make 1.4 compat code call the 1.6 one, reducing
code duplication. Add comment explaining why we can't
make 1.4 call 1.5 as usual.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2013-08-21 00:18:39 +03:00
Marcelo Tosatti
7477cd3897 kvm: i386: fix LAPIC TSC deadline timer save/restore
The configuration of the timer represented by MSR_IA32_TSCDEADLINE depends on:

- APIC LVT Timer register.
- TSC value.

Change the order to respect the dependency.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-20 18:38:44 +02:00
Marcelo Tosatti
7dc5252685 kvm-all.c: max_cpus should not exceed KVM vcpu limit
maxcpus, which specifies the maximum number of hotpluggable CPUs,
should not exceed KVM's vcpu limit.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
[Reword message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-20 18:38:35 +02:00
Jan Kiszka
354678c5ce kvm: Simplify kvm_handle_io
Now that cpu_in/out is just a wrapper around address_space_rw, we can
also call the latter directly. As host endianness == guest endianness,
there is no need for the memory access helpers st*_p/ld*_p as well.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-20 18:37:17 +02:00
Liu Jinsong
df67696e97 kvm: x86: fix setting IA32_FEATURE_CONTROL with nested VMX disabled
This patch is to fix the bug https://bugs.launchpad.net/qemu-kvm/+bug/1207623

IA32_FEATURE_CONTROL is pointless if the VMX or SMX bits are not exposed
in cpuid.1.ecx of the vcpu. Current qemu-kvm returns an error from
kvm_put_msrs or kvm_get_msrs in that case.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-20 18:37:17 +02:00
Anthony Liguori
ecfe10c9a6 Merge remote-tracking branch 'pmaydell/tags/pull-target-arm-20130820' into staging
target-arm queue

# gpg: Signature made Tue 20 Aug 2013 08:56:28 AM CDT using RSA key ID 14360CDE
# gpg: Can't check signature: public key not found

# By Peter Maydell (20) and Peter Chubb (1)
# Via Peter Maydell
* pmaydell/tags/pull-target-arm-20130820: (21 commits)
  hw/timer/imx_epit: Simplify and fix imx_epit implementation
  default-configs: Fix A9MP and A15MP config names
  hw/cpu/a15mpcore: Wire generic timer outputs to GIC inputs
  target-arm: Implement the generic timer
  target-arm: Support coprocessor registers which do I/O
  target-arm: Allow raw_read() and raw_write() to handle 64 bit regs
  hw/arm/pic_cpu: Remove the now-unneeded arm_pic_init_cpu()
  hw/arm/xilinx_zynq: Don't use arm_pic_init_cpu()
  hw/arm/vexpress: Don't use arm_pic_init_cpu()
  hw/arm/versatilepb: Don't use arm_pic_init_cpu()
  hw/arm/strongarm: Don't use arm_pic_init_cpu()
  hw/arm/realview: Don't use arm_pic_init_cpu()
  hw/arm/omap*: Don't use arm_pic_init_cpu()
  hw/arm/musicpal: Don't use arm_pic_init_cpu()
  hw/arm/kzm: Don't use arm_pic_init_cpu()
  hw/arm/integratorcp: Don't use arm_pic_init_cpu()
  hw/arm/highbank: Don't use arm_pic_init_cpu()
  hw/arm/exynos4210: Don't use arm_pic_init_cpu()
  hw/arm/armv7m: Don't use arm_pic_init_cpu()
  target-arm: Make IRQ and FIQ gpio lines on the CPU object
  ...

Message-id: 1377007680-4934-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-20 11:23:52 -05:00
Peter Maydell
21e0043bad scripts/qapi.py: Avoid syntax not supported by Python 2.4
The Python "except Foo as x" syntax was only introduced in
Python 2.6, but we aim to support Python 2.4 and later.
Use the old-style "except Foo, x" syntax instead, thus
fixing configure/compile on systems with older Python.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:52:00 -04:00
Fam Zheng
277acfe8b3 monitor: print the invalid char in error message
It's friendlier to tell the user which char is invalid, especially
when the user inputs a float value and expects the monitor to round
it to int. Since we don't round float numbers when we look for an integer,
telling which char is invalid is less confusing.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:52:00 -04:00
Laszlo Ersek
3953e3a5d3 OptsVisitor: introduce unit tests, with test cases for range flattening
According to commit 4f193e34
("tests: Use qapi-schema-test.json as schema parser test")
the "tests/qapi-schema/qapi-schema-test.out" file must be updated as well.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:52:00 -04:00
Laszlo Ersek
99351c8472 add "test-int128" and "test-bitops" to .gitignore
"test-int128" was probably missed in commit 6046c620
("int128: optimize and add test cases").

"test-bitops" was probably missed in commit 3464700f
("tests: Add test-bitops.c with some sextract tests").

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:52:00 -04:00
Laszlo Ersek
15a849be10 OptsVisitor: don't try to flatten overlong integer ranges
Prevent mistyped command line options from incurring high memory and CPU
usage at startup. 64K elements in a range should be enough for everyone
(TM).

The OPTS_VISITOR_RANGE_MAX macro is public so that unit tests can
construct corner cases with it.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:52:00 -04:00
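A minimal sketch of range flattening with an upper bound, in the spirit of commit 15a849be10 above. The name flatten_range and the constant RANGE_MAX_SKETCH are hypothetical; the value 65536 mirrors the "64K elements" mentioned in the commit message, and the return convention is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RANGE_MAX_SKETCH 65536 /* assumed cap on elements per range */

/* Expand the inclusive range [lo, hi] into out, which has room for cap
 * elements. Returns the number of elements written, or -1 if the range is
 * reversed, longer than RANGE_MAX_SKETCH, or does not fit into out. */
long flatten_range(int64_t lo, int64_t hi, int64_t *out, size_t cap)
{
    uint64_t span;

    assert(out != NULL);
    if (lo > hi) {
        return -1;
    }
    /* Element count minus one, computed unsigned to avoid signed overflow
     * for ranges such as [INT64_MIN, INT64_MAX]. */
    span = (uint64_t)hi - (uint64_t)lo;
    if (span >= RANGE_MAX_SKETCH || span >= cap) {
        return -1; /* reject before allocating or looping */
    }
    for (uint64_t i = 0; i <= span; i++) {
        out[i] = lo + (int64_t)i;
    }
    return (long)(span + 1);
}
```

Checking the span before doing any work is what keeps a mistyped range from consuming memory and CPU time at startup.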
Laszlo Ersek
581a8a8000 OptsVisitor: opts_type_uint64(): recognize intervals when LM_IN_PROGRESS
When a well-formed range value, bounded by unsigned integers, is
encountered while processing a repeated option, enter LM_UNSIGNED_INTERVAL
and return the low bound.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:51:59 -04:00
Laszlo Ersek
62d090e23f OptsVisitor: rebase opts_type_uint64() to parse_uint_full()
Simplify the code in preparation for the next patch.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:51:59 -04:00
Laszlo Ersek
1e1c555a49 OptsVisitor: opts_type_int(): recognize intervals when LM_IN_PROGRESS
When a well-formed range value, bounded by signed integers, is encountered
while processing a repeated option, enter LM_SIGNED_INTERVAL and return
the low bound.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:51:59 -04:00
Laszlo Ersek
d8754f40ac OptsVisitor: introduce list modes for interval flattening
The new modes are equal-rank, exclusive alternatives of LM_IN_PROGRESS.
Teach opts_next_list(), opts_type_int() and opts_type_uint64() to handle
them.

Also enumerate explicitly what functions are valid to call in what modes:
- opts_next_list() is valid to call while flattening a range,
- opts_end_list(): ditto,
- lookup_scalar() is invalid to call during flattening; generated qapi
  traversal code must continue asking for the same kind of signed/unsigned
  list element until the interval is fully flattened,
- processed(): ditto.

List mode restrictions are always formulated in positive / inclusive
sense. The restrictions for lookup_scalar() and processed() are
automatically satisfied by current qapi traversals if the schema to build
is compatible with OptsVisitor.

The new list modes are not entered yet.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:51:59 -04:00
Laszlo Ersek
d957043412 OptsVisitor: introduce basic list modes
We're going to need more state while processing a list of repeated
options. This change eliminates "repeated_opts_first" and adds a new state
variable:

  list_mode       repeated_opts  repeated_opts_first
  --------------  -------------  -------------------
  LM_NONE         NULL           false
  LM_STARTED      non-NULL       true
  LM_IN_PROGRESS  non-NULL       false

Additionally, it is documented that lookup_scalar() and processed(), both
called by opts_type_XXX(), are invalid in LM_STARTED -- generated qapi
code calls opts_next_list() to allocate the very first link before trying
to parse a scalar into it. List mode restrictions are expressed in
positive / inclusive form.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:51:59 -04:00
Seiji Aguchi
4a44d85e28 Convert stderr message calling error_get_pretty() to error_report()
Convert stderr messages calling error_get_pretty()
to error_report().

With this change, the -msg timestamp option prepends a timestamp to
these messages.

Per Markus's comment below, a conversion from fprintf() to
error_report() is always an improvement, regardless of
error_get_pretty().

http://marc.info/?l=qemu-devel&m=137513283408601&w=2

But it is not reasonable to convert them all at once,
because fprintf() is used everywhere in qemu.

So it should be done step by step, avoiding regressions.

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-08-20 11:51:59 -04:00
Anthony Liguori
9176e8fb8f Merge remote-tracking branch 'stefanha/block-next' into staging
# By Stefan Hajnoczi
# Via Stefan Hajnoczi
* stefanha/block-next:
  aio: drop io_flush argument
  tests: drop event_active_cb()
  thread-pool: drop thread_pool_active()
  dataplane/virtio-blk: drop flush_true() and flush_io()
  block/ssh: drop return_true()
  block/sheepdog: drop have_co_req() and aio_flush_request()
  block/rbd: drop qemu_rbd_aio_flush_cb()
  block/nbd: drop nbd_have_request()
  block/linux-aio: drop qemu_laio_completion_cb()
  block/iscsi: drop iscsi_process_flush()
  block/gluster: drop qemu_gluster_aio_flush_cb()
  block/curl: drop curl_aio_flush()
  aio: stop using .io_flush()
  tests: adjust test-thread-pool to new aio_poll() semantics
  tests: adjust test-aio to new aio_poll() semantics
  dataplane/virtio-blk: check exit conditions before aio_poll()
  block: stop relying on io_flush() in bdrv_drain_all()
  block: ensure bdrv_drain_all() works during bdrv_delete()

Message-id: 1376921877-9576-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-20 09:52:18 -05:00
Anthony Liguori
72420ce9f0 Merge remote-tracking branch 'rth/axp-next' into staging
# By Richard Henderson
# Via Richard Henderson
* rth/axp-next:
  target-alpha: Implement the typhoon iommu
  target-alpha: Consider the superpage when threading and ending TBs
  target-alpha: Use goto_tb in call_pal
  target-alpha: Implement call_pal without an exception

Message-id: 1376720412-2165-1-git-send-email-rth@twiddle.net
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-20 09:52:07 -05:00
Anthony Liguori
237e4f92a8 Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging
QOM CPUState refactorings / X86CPU

* gdbstub coprocessor register count bugfix
* QOM instance_post_init infrastructure to override dynamic properties
* X86CPU HyperV preparations for CPU subclasses

# gpg: Signature made Fri 16 Aug 2013 11:49:02 AM CDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Eduardo Habkost (3) and others
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony:
  cpus: Use cpu_is_stopped() efficiently
  target-i386: Move hyperv_* static globals to X86CPU
  qdev: Set globals in instance_post_init function
  qom: Introduce instance_post_init hook
  tests: Unit tests for qdev global properties handling
  gdbstub: Fix gdb_register_coprocessor() register counting
2013-08-20 09:51:53 -05:00
Peter Chubb
230058106a hw/timer/imx_epit: Simplify and fix imx_epit implementation
When imx_epit.c was last refactored, a common usecase (comparison
register zero) broke.  This patch fixes that, and simplifies the code
yet more.  It also fixes a major thinko in the reset path --- the
wrong bits in the control register were being cleared.

Signed-off-by: Peter Chubb <peter.chubb@nicta.com.au>
Reviewed-by: Jean-Christophe DUBOIS <jcd@tribudubois.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2013-08-20 14:54:32 +01:00
Peter Maydell
66aae5e1ec default-configs: Fix A9MP and A15MP config names
When individual CONFIG_ switches for the A9MPcore and A15MPcore
devices were created, they were inadvertently given incorrect names
(CONFIG_ARM9MPCORE and CONFIG_ARM15MPCORE). These CPUs are
"Cortex-A9MP" and "Cortex-A15MP", and in particular the ARM9 is
a different (rather older) CPU than the Cortex-A9. Rename the
CONFIG_ switches to bring them into line with the source file
names and CPU names.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1376056215-26391-1-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:32 +01:00
Peter Maydell
6033e840c7 hw/cpu/a15mpcore: Wire generic timer outputs to GIC inputs
Now that our A15 CPU implements the generic timers, we can wire them
up to the appropriate inputs on the GIC.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 1376065080-26661-5-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:32 +01:00
Peter Maydell
55d284af8e target-arm: Implement the generic timer
The ARMv7 architecture specifies a 'generic timer' which is implemented
via cp15 registers. Newer kernels will prefer to use this rather than
a devboard-level timer. Implement the generic timer for TCG; for KVM
we will already use the hardware's virtualized timer for this.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 1376065080-26661-4-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:31 +01:00
Peter Maydell
2452731c88 target-arm: Support coprocessor registers which do I/O
Add an ARM_CP_IO flag which an ARMCPRegInfo definition can use to
indicate that the register's implementation does I/O and thus
its accesses need to be surrounded by gen_io_start()/gen_io_end()
in order for icount to work. Most notably, cp registers which
implement clocks or timers need this.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Message-id: 1376065080-26661-3-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:31 +01:00
Peter Maydell
22d9e1a986 target-arm: Allow raw_read() and raw_write() to handle 64 bit regs
Extend the raw_read() and raw_write() helper accessors so that
they can be used for 64 bit registers as well as 32 bit registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Message-id: 1376065080-26661-2-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:31 +01:00
Peter Maydell
b643e4b90b hw/arm/pic_cpu: Remove the now-unneeded arm_pic_init_cpu()
Now that all the boards have been converted, arm_pic_init_cpu()
is unused and can just be deleted.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-15-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:31 +01:00
Peter Maydell
e4a6540ded hw/arm/xilinx_zynq: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-14-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:30 +01:00
Peter Maydell
fe9120a5d1 hw/arm/vexpress: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-13-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:30 +01:00
Peter Maydell
bace999f8a hw/arm/versatilepb: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-12-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:30 +01:00
Peter Maydell
4f071cf9b5 hw/arm/strongarm: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-11-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:30 +01:00
Peter Maydell
033ee5a5ac hw/arm/realview: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-10-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:30 +01:00
Peter Maydell
437f0f10a4 hw/arm/omap*: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-9-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:29 +01:00
Peter Maydell
fcef61ec6b hw/arm/musicpal: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-8-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:29 +01:00
Peter Maydell
2f69ba1736 hw/arm/kzm: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-7-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:29 +01:00
Peter Maydell
99d228d6e9 hw/arm/integratorcp: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-6-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:29 +01:00
Peter Maydell
9188dbf71a hw/arm/highbank: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-5-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:29 +01:00
Peter Maydell
ad666d91f4 hw/arm/exynos4210: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-4-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:28 +01:00
Peter Maydell
de3a658f5b hw/arm/armv7m: Don't use arm_pic_init_cpu()
Drop the now-deprecated arm_pic_init_cpu() in favour of directly
getting the IRQ line from the ARMCPU object.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-3-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:28 +01:00
Peter Maydell
7c1840b686 target-arm: Make IRQ and FIQ gpio lines on the CPU object
Now that ARMCPU is a subclass of DeviceState, we can make the
CPU's inbound IRQ and FIQ lines be simply gpio lines, which
means we can remove the odd arm_pic shim.

We retain the arm_pic_init_cpu() function as a backwards
compatibility shim layer so we can convert the board models
to get the IRQ and FIQ lines directly from the ARMCPU
object one at a time.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1375977856-25046-2-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:28 +01:00
Peter Maydell
3f1beaca88 target-arm: Implement 'int' loglevel
The 'int' loglevel for recording interrupts and exceptions
requires support in the target-specific code. Implement
it for ARM. This improves debug logging in some situations
that were otherwise pretty opaque, such as when we fault
trying to execute at an exception vector address, which
would otherwise cause an infinite loop of taking exceptions
without any indication in the debug log of what was going on.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1375700771-21665-1-git-send-email-peter.maydell@linaro.org
2013-08-20 14:54:28 +01:00
Stefan Hajnoczi
f2e5dca46b aio: drop io_flush argument
The .io_flush() handler no longer exists and has no users.  Drop the
io_flush argument to aio_set_fd_handler() and related functions.

The AioFlushEventNotifierHandler and AioFlushHandler typedefs are no
longer used and are dropped too.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
1b9ecdb164 tests: drop event_active_cb()
Drop the io_flush argument to aio_set_event_notifier().

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
bb52b14be1 thread-pool: drop thread_pool_active()
.io_flush() is no longer called so drop thread_pool_active().  The block
layer is the only thread-pool.c user and it already tracks in-flight
requests, therefore we do not need thread_pool_active().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
ce689368bb dataplane/virtio-blk: drop flush_true() and flush_io()
.io_flush() is no longer called so drop flush_true() and flush_io().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
f0d3576599 block/ssh: drop return_true()
.io_flush() is no longer called so drop return_true().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
d6d94c6785 block/sheepdog: drop have_co_req() and aio_flush_request()
.io_flush() is no longer called so drop have_co_req() and
aio_flush_request().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
5d289cc724 block/rbd: drop qemu_rbd_aio_flush_cb()
.io_flush() is no longer called so drop qemu_rbd_aio_flush_cb().
qemu_aio_count is unused now so drop it too.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
bed2e759eb block/nbd: drop nbd_have_request()
.io_flush() is no longer called so drop nbd_have_request().  We cannot
drop in_flight since it is still used by other block/nbd.c code.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
94473d0c06 block/linux-aio: drop qemu_laio_completion_cb()
.io_flush() is no longer called so drop qemu_laio_completion_cb().  It
turns out that count is now unused so drop that too.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
70ecdc6e4e block/iscsi: drop iscsi_process_flush()
.io_flush() is no longer called so drop iscsi_process_flush().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi
372835fbc3 block/gluster: drop qemu_gluster_aio_flush_cb()
Since .io_flush() is no longer called we do not need
qemu_gluster_aio_flush_cb() anymore.  It turns out that qemu_aio_count
is unused now and can be dropped.

Thanks to Bharata B Rao <bharata@linux.vnet.ibm.com> for catching a
build failure with CONFIG_GLUSTERFS_DISCARD, which has been fixed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:51:09 +02:00
Anthony Liguori
bc02fb304c Change email address
My IBM email address will be inaccessible after August 23rd, 2013.

Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-08-19 08:49:37 -05:00
Stefan Hajnoczi
0d1460226f block/curl: drop curl_aio_flush()
.io_flush() is no longer called so drop curl_aio_flush().  The acb[]
array that the function checks is still used in other parts of
block/curl.c.  Therefore we cannot remove acb[]; it is still needed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:45:35 +02:00
Stefan Hajnoczi
164a101f28 aio: stop using .io_flush()
Now that aio_poll() users check their termination condition themselves,
it is no longer necessary to call .io_flush() handlers.

The behavior of aio_poll() changes as follows:

1. .io_flush() is no longer invoked and file descriptors are *always*
monitored.  Previously returning 0 from .io_flush() would skip this file
descriptor.

Due to this change it is essential to check that requests are pending
before calling qemu_aio_wait().  Failure to do so means we block, for
example, waiting for an idle iSCSI socket to become readable when there
are no requests.  Currently all qemu_aio_wait()/aio_poll() callers check
before calling.

2. aio_poll() now returns true if progress was made (BH or fd handlers
executed) and false otherwise.  Previously it would return true whenever
'busy', which means that .io_flush() returned true.  The 'busy' concept
no longer exists so just progress is returned.

Due to this change we need to update tests/test-aio.c which asserts
aio_poll() return values.  Note that QEMU doesn't actually rely on these
return values so only tests/test-aio.c cares.

Note that ctx->notifier, the EventNotifier fd used for aio_notify(), is
now handled as a special case.  This is a little ugly but maintains
aio_poll() semantics, i.e. aio_notify() does not count as 'progress' and
aio_poll() avoids blocking when the user has not set any fd handlers yet.

Patches after this remove .io_flush() handler code until we can finally
drop the io_flush arguments to aio_set_fd_handler() and friends.
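
The first behaviour change above (callers must check that requests are
pending before a blocking poll) can be illustrated with a toy model. This
is only a sketch under stated assumptions: ToyDriver, toy_poll() and
toy_drain() are hypothetical names, not QEMU APIs, and toy_poll() simply
returns false at the point where the real aio_poll(ctx, true) would block.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver's in-flight request bookkeeping. */
typedef struct {
    int in_flight;   /* requests submitted but not yet completed */
    int polls;       /* number of times toy_poll() was entered */
} ToyDriver;

/* Toy replacement for a blocking poll: completes one request per call and
 * reports whether progress was made.  With nothing in flight the real
 * function would block; the toy just returns false. */
static bool toy_poll(ToyDriver *d)
{
    d->polls++;
    if (d->in_flight == 0) {
        return false;
    }
    d->in_flight--;
    return true;
}

/* Drain pattern: test the termination condition *before* every poll, so
 * the poll is never entered when there is nothing to wait for. */
static int toy_drain(ToyDriver *d)
{
    int completed = 0;

    while (d->in_flight > 0) {
        if (toy_poll(d)) {
            completed++;
        }
    }
    assert(d->in_flight == 0);
    return completed;
}
```

With an idle driver the loop body never runs, so the poll is never entered;
that is the property the commit message asks callers to guarantee.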

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:45:35 +02:00
Stefan Hajnoczi
35ecde2601 tests: adjust test-thread-pool to new aio_poll() semantics
aio_poll(ctx, true) will soon block when fd handlers have been set.
Previously aio_poll() would return early if all .io_flush() returned
false.  This means we need to check the equivalent of the .io_flush()
condition *before* calling aio_poll(ctx, true) to avoid deadlock.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:45:35 +02:00
Stefan Hajnoczi
24d1a6d9d5 tests: adjust test-aio to new aio_poll() semantics
aio_poll(ctx, true) will soon block if any fd handlers have been set.
Previously it would only block when .io_flush() returned true.

This means that callers must check their wait condition *before*
aio_poll() to avoid deadlock.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:45:34 +02:00
Stefan Hajnoczi
bf0da4df83 dataplane/virtio-blk: check exit conditions before aio_poll()
Check exit conditions before entering blocking aio_poll().  This is
mainly for consistency since it's unlikely that we are stopping in the
first event loop iteration.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:45:34 +02:00
Stefan Hajnoczi
88266f5aa7 block: stop relying on io_flush() in bdrv_drain_all()
If a block driver has no file descriptors to monitor but there are still
active requests, it can return 1 from .io_flush().  This is used to spin
during synchronous I/O.

Stop relying on .io_flush() and instead check
QLIST_EMPTY(&bs->tracked_requests) to decide whether there are active
requests.

This is the first step in removing .io_flush() so that event loops no
longer need to have the concept of synchronous I/O.  Eventually we may
be able to kill synchronous I/O completely by running everything in a
coroutine, but that is future work.

Note this patch moves bs->throttled_reqs initialization to bdrv_new() so
that bdrv_requests_pending(bs) can safely access it.  In practice bs is
g_malloc0() so the memory is already zeroed but it's safer to initialize
the queue properly.

We also need to fix up block/stream.c:close_unused_images() to prevent
traversing a dangling pointer while it rearranges the backing file
chain.  This is necessary since the new bdrv_drain_all() traverses the
backing file chain.
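
As a rough illustration of the check described above (active requests are
detected from the tracked-request list, and the walk covers the backing
file chain), here is a simplified sketch. ToyBlockState and
toy_requests_pending() are hypothetical names; the real code uses
QLIST_EMPTY(&bs->tracked_requests) rather than the plain counter assumed
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a block device state. */
typedef struct ToyBlockState {
    int tracked_requests;            /* a counter instead of a QLIST */
    struct ToyBlockState *backing;   /* next image in the backing chain */
} ToyBlockState;

/* True if this image or any image below it still has active requests. */
static bool toy_requests_pending(const ToyBlockState *bs)
{
    for (; bs != NULL; bs = bs->backing) {
        assert(bs->tracked_requests >= 0);
        if (bs->tracked_requests > 0) {
            return true;
        }
    }
    return false;
}
```

Because the walk follows every backing pointer, a dangling pointer in the
chain would be dereferenced here, which is why the message mentions fixing
close_unused_images().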

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:45:34 +02:00
Stefan Hajnoczi
e1b5c52e04 block: ensure bdrv_drain_all() works during bdrv_delete()
In bdrv_delete() make sure to call bdrv_make_anon() *after* bdrv_close()
so that the device is still seen by bdrv_drain_all() when iterating
bdrv_states.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:45:34 +02:00
Richard Henderson
b83c4db895 target-alpha: Implement the typhoon iommu
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-16 11:17:23 -07:00
Richard Henderson
b114b68adf target-alpha: Consider the superpage when threading and ending TBs
This allows significantly more threading, and occasionally larger TBs,
when processing code for the kernel and PALcode.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-16 11:17:23 -07:00
Richard Henderson
a9ead83261 target-alpha: Use goto_tb in call_pal
With appropriate flushing when the PALBR changes, the target of
a CALL_PAL is so predictable we can chain to it.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-16 11:17:23 -07:00
Richard Henderson
ba96394e20 target-alpha: Implement call_pal without an exception
The destination of the call_pal, and the cpu state, is very predictable;
there's no need for exiting the cpu loop.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-08-16 11:17:23 -07:00
Tiejun Chen
321bc0b2b2 cpus: Use cpu_is_stopped() efficiently
It makes more sense and will make things simpler later.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-16 18:44:33 +02:00
Igor Mammedov
92067bf4bf target-i386: Move hyperv_* static globals to X86CPU
- since hyperv_* helper functions are used only in target-i386/kvm.c
  move them there as static helpers

Requested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-16 18:44:33 +02:00
Eduardo Habkost
99a0b03650 qdev: Set globals in instance_post_init function
This way, properties registered in the instance_init function of
child classes will be handled properly by qdev_prop_set_globals(), too.

Includes a unit test for the new functionality.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-16 18:44:33 +02:00
Eduardo Habkost
8231c2dd22 qom: Introduce instance_post_init hook
This will allow classes to specify a function to be called after all
instance_init functions were called.

This will be used by DeviceState to call qdev_prop_set_globals() at the
right moment.
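
The ordering this hook provides can be sketched with a toy type system.
All names below (ToyType, toy_object_init() and the callbacks) are
hypothetical and only model the idea: every instance_init in the class
hierarchy runs first, and only then does the post-init hook run, so a
step such as applying global properties sees properties registered by
child classes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct ToyType {
    const struct ToyType *parent;
    void (*instance_init)(char *log);
    void (*instance_post_init)(char *log);
} ToyType;

/* Run instance_init from the root type down to the leaf type. */
static void toy_init_chain(const ToyType *t, char *log)
{
    if (t->parent) {
        toy_init_chain(t->parent, log);
    }
    if (t->instance_init) {
        t->instance_init(log);
    }
}

/* Run post-init hooks only after every instance_init has finished. */
static void toy_post_init_chain(const ToyType *t, char *log)
{
    if (t->instance_post_init) {
        t->instance_post_init(log);
    }
    if (t->parent) {
        toy_post_init_chain(t->parent, log);
    }
}

static void toy_object_init(const ToyType *t, char *log)
{
    assert(t != NULL && log != NULL);
    toy_init_chain(t, log);
    toy_post_init_chain(t, log);
}

/* 'D' = device init, 'C' = child registers a property, 'G' = globals set */
static void device_init(char *log)      { strcat(log, "D"); }
static void child_init(char *log)       { strcat(log, "C"); }
static void device_post_init(char *log) { strcat(log, "G"); }

static const ToyType toy_device = { NULL, device_init, device_post_init };
static const ToyType toy_child  = { &toy_device, child_init, NULL };
```

In this model the parent's post-init step ("G") runs after the child's
instance_init ("C"), which is the moment the commit message refers to.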

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-16 18:44:33 +02:00
Eduardo Habkost
747b0cb4b5 tests: Unit tests for qdev global properties handling
This tests the qdev global-properties handling code.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-16 18:44:33 +02:00
Andreas Färber
35143f0164 gdbstub: Fix gdb_register_coprocessor() register counting
Commit a0e372f0c4 reorganized the register
counting for GDB. While it seems correct not to let the total number of
registers skyrocket in an SMP scenario through a static variable, the
distinction between total register count and 'g' packet register count
(last_reg vs. num_g_regs) got lost along the way.

Fix this by introducing CPUState::gdb_num_g_regs and using that in
gdb_handle_packet().
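
The distinction between the two counts can be sketched as follows. This
is a simplified model, not the actual gdbstub code: ToyGdbState and
toy_register_coprocessor() are hypothetical names, and the sketch assumes
that register blocks belonging to the 'g' packet are registered before
any block that is outside it.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int num_regs;    /* every register GDB can address */
    int num_g_regs;  /* registers transferred by the 'g' packet */
} ToyGdbState;

/* Append a coprocessor register block and return its base register
 * number.  Only blocks that are part of the 'g' packet extend
 * num_g_regs; the total count always grows. */
static int toy_register_coprocessor(ToyGdbState *s, int count,
                                    bool in_g_packet)
{
    int base = s->num_regs;

    assert(count > 0);
    s->num_regs += count;
    if (in_g_packet) {
        s->num_g_regs = s->num_regs;
    }
    return base;
}
```

Keeping both counters per CPU avoids the growth of a shared static total
with each additional CPU while preserving the 'g' packet size.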

Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org (stable-1.6)
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-08-16 18:44:33 +02:00
Anthony Liguori
f202039811 Open up 1.7 development branch
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-08-15 15:41:13 -05:00
Vincenzo Maffione
ca916d3729 kvm: add KVM_IRQFD_FLAG_RESAMPLE support
Added an EventNotifier* parameter to
kvm-all.c:kvm_irqchip_add_irqfd_notifier(), in order to give KVM
another eventfd to be used as "resamplefd". See the documentation
in the linux kernel sources in Documentation/virtual/kvm/api.txt
(section 4.75) for more details.
When the added parameter is passed NULL, the behaviour of the
function is unchanged with respect to the previous versions.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-09 21:19:54 +02:00
Paolo Bonzini
0d89436786 kvm: migrate vPMU state
Reviewed-by: Gleb Natapov <gnatapov@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-09 21:19:52 +02:00
Paolo Bonzini
e4a09c9637 target-i386: remove tabs from target-i386/cpu.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-08-09 21:18:35 +02:00
Arthur Chunqi Li
0779caeb1a Initialize IA32_FEATURE_CONTROL MSR in reset and migration
A recent KVM patch adds IA32_FEATURE_CONTROL support. QEMU needs
to clear this MSR when resetting a vCPU and to preserve its value across
migration. This patch adds that support.

Signed-off-by: Arthur Chunqi Li <yzt356@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-07-25 13:09:08 +03:00
917 changed files with 50175 additions and 19975 deletions

.gitignore

@@ -44,8 +44,11 @@ qemu-ga
qemu-bridge-helper
qemu-monitor.texi
vscclient
QMP/qmp-commands.txt
qmp-commands.txt
test-bitops
test-coroutine
test-int128
test-opts-visitor
test-qmp-input-visitor
test-qmp-output-visitor
test-string-input-visitor
@@ -79,6 +82,7 @@ fsdev/virtfs-proxy-helper.pod
*.la
*.pc
.libs
.sdk
*.swp
*.orig
.pc

.gitmodules

@@ -1,27 +1,27 @@
[submodule "roms/vgabios"]
path = roms/vgabios
url = git://git.qemu.org/vgabios.git/
url = git://git.qemu-project.org/vgabios.git/
[submodule "roms/seabios"]
path = roms/seabios
url = git://git.qemu.org/seabios.git/
url = git://git.qemu-project.org/seabios.git/
[submodule "roms/SLOF"]
path = roms/SLOF
url = git://git.qemu.org/SLOF.git
url = git://git.qemu-project.org/SLOF.git
[submodule "roms/ipxe"]
path = roms/ipxe
url = git://git.qemu.org/ipxe.git
url = git://git.qemu-project.org/ipxe.git
[submodule "roms/openbios"]
path = roms/openbios
url = git://git.qemu.org/openbios.git
url = git://git.qemu-project.org/openbios.git
[submodule "roms/qemu-palcode"]
path = roms/qemu-palcode
url = git://github.com/rth7680/qemu-palcode.git
[submodule "roms/sgabios"]
path = roms/sgabios
url = git://git.qemu.org/sgabios.git
url = git://git.qemu-project.org/sgabios.git
[submodule "pixman"]
path = pixman
url = git://anongit.freedesktop.org/pixman
[submodule "dtc"]
path = dtc
url = git://git.qemu.org/dtc.git
url = git://git.qemu-project.org/dtc.git

View File

@@ -2,7 +2,8 @@
# into proper addresses so that they are counted properly in git shortlog output.
#
Andrzej Zaborowski <balrogg@gmail.com> balrog <balrog@c046a42c-6fe2-441c-8c8c-71466251a162>
Anthony Liguori <aliguori@us.ibm.com> aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
Anthony Liguori <anthony@codemonkey.ws> aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
Anthony Liguori <anthony@codemonkey.ws> Anthony Liguori <aliguori@us.ibm.com>
Aurelien Jarno <aurelien@aurel32.net> aurel32 <aurel32@c046a42c-6fe2-441c-8c8c-71466251a162>
Blue Swirl <blauwirbel@gmail.com> blueswir1 <blueswir1@c046a42c-6fe2-441c-8c8c-71466251a162>
Edgar E. Iglesias <edgar.iglesias@gmail.com> edgar_igl <edgar_igl@c046a42c-6fe2-441c-8c8c-71466251a162>

View File

@@ -1,6 +1,6 @@
This file documents changes for QEMU releases 0.12 and earlier.
For changelog information for later releases, see
http://wiki.qemu.org/ChangeLog or look at the git history for
http://wiki.qemu-project.org/ChangeLog or look at the git history for
more detailed information.

View File

@@ -50,8 +50,7 @@ Descriptions of section entries:
General Project Administration
------------------------------
M: Anthony Liguori <aliguori@us.ibm.com>
M: Paul Brook <paul@codesourcery.com>
M: Anthony Liguori <aliguori@amazon.com>
Guest CPU cores (TCG):
----------------------
@@ -62,7 +61,6 @@ F: target-alpha/
F: hw/alpha/
ARM
M: Paul Brook <paul@codesourcery.com>
M: Peter Maydell <peter.maydell@linaro.org>
S: Maintained
F: target-arm/
@@ -83,8 +81,7 @@ F: hw/lm32/
F: hw/char/lm32_*
M68K
M: Paul Brook <paul@codesourcery.com>
S: Odd Fixes
S: Orphan
F: target-m68k/
F: hw/m68k/
@@ -248,7 +245,6 @@ F: hw/*/imx*
F: hw/arm/kzm.c
Integrator CP
M: Paul Brook <paul@codesourcery.com>
M: Peter Maydell <peter.maydell@linaro.org>
S: Maintained
F: hw/arm/integratorcp.c
@@ -274,7 +270,6 @@ S: Maintained
F: hw/arm/palm.c
Real View
M: Paul Brook <paul@codesourcery.com>
M: Peter Maydell <peter.maydell@linaro.org>
S: Maintained
F: hw/arm/realview*
@@ -285,13 +280,11 @@ S: Maintained
F: hw/arm/spitz.c
Stellaris
M: Paul Brook <paul@codesourcery.com>
M: Peter Maydell <peter.maydell@linaro.org>
S: Maintained
F: hw/*/stellaris*
Versatile PB
M: Paul Brook <paul@codesourcery.com>
M: Peter Maydell <peter.maydell@linaro.org>
S: Maintained
F: hw/*/versatile*
@@ -327,18 +320,15 @@ F: hw/lm32/milkymist.c
M68K Machines
-------------
an5206
M: Paul Brook <paul@codesourcery.com>
S: Maintained
S: Orphan
F: hw/m68k/an5206.c
dummy_m68k
M: Paul Brook <paul@codesourcery.com>
S: Maintained
S: Orphan
F: hw/m68k/dummy_m68k.c
mcf5208
M: Paul Brook <paul@codesourcery.com>
S: Maintained
S: Orphan
F: hw/m68k/mcf5208.c
MicroBlaze Machines
@@ -509,7 +499,7 @@ F: hw/unicore32/
X86 Machines
------------
PC
M: Anthony Liguori <aliguori@us.ibm.com>
M: Anthony Liguori <aliguori@amazon.com>
S: Supported
F: hw/i386/pc.[ch]
F: hw/i386/pc_piix.c
@@ -567,8 +557,7 @@ F: hw/scsi/*
T: git git://github.com/bonzini/qemu.git scsi-next
LSI53C895A
M: Paul Brook <paul@codesourcery.com>
S: Odd Fixes
S: Orphan
F: hw/scsi/lsi53c895a.c
SSI
@@ -593,7 +582,7 @@ S: Supported
F: hw/*/*vhost*
virtio
M: Anthony Liguori <aliguori@us.ibm.com>
M: Anthony Liguori <aliguori@amazon.com>
S: Supported
F: hw/*/virtio*
@@ -638,6 +627,7 @@ Subsystems
----------
Audio
M: Vassili Karpov (malc) <av1474@comtv.ru>
M: Gerd Hoffmann <kraxel@redhat.com>
S: Maintained
F: audio/
F: hw/audio/
@@ -651,7 +641,7 @@ F: block/
F: hw/block/
Character Devices
M: Anthony Liguori <aliguori@us.ibm.com>
M: Anthony Liguori <aliguori@amazon.com>
S: Maintained
F: qemu-char.c
@@ -689,7 +679,7 @@ F: audio/spiceaudio.c
F: hw/display/qxl*
Graphics
M: Anthony Liguori <aliguori@us.ibm.com>
M: Anthony Liguori <aliguori@amazon.com>
S: Maintained
F: ui/
@@ -699,7 +689,7 @@ S: Odd Fixes
F: ui/cocoa.m
Main loop
M: Anthony Liguori <aliguori@us.ibm.com>
M: Anthony Liguori <aliguori@amazon.com>
S: Supported
F: vl.c
@@ -711,7 +701,7 @@ F: hmp.c
F: hmp-commands.hx
Network device layer
M: Anthony Liguori <aliguori@us.ibm.com>
M: Anthony Liguori <aliguori@amazon.com>
M: Stefan Hajnoczi <stefanha@redhat.com>
S: Maintained
F: net/
@@ -766,6 +756,12 @@ M: Blue Swirl <blauwirbel@gmail.com>
S: Odd Fixes
F: scripts/checkpatch.pl
Seccomp
M: Eduardo Otubo <otubo@linux.vnet.ibm.com>
S: Supported
F: qemu-seccomp.c
F: include/sysemu/seccomp.h
Usermode Emulation
------------------
BSD user
@@ -797,11 +793,6 @@ M: Andrzej Zaborowski <balrogg@gmail.com>
S: Maintained
F: tcg/arm/
HPPA target
M: Richard Henderson <rth@twiddle.net>
S: Maintained
F: tcg/hppa/
i386 target
M: qemu-devel@nongnu.org
S: Maintained
@@ -842,25 +833,27 @@ TCI target
M: Stefan Weil <sw@weilnetz.de>
S: Maintained
F: tcg/tci/
F: tci.c
Stable branches
---------------
Stable 1.0
L: qemu-stable@nongnu.org
T: git git://git.qemu.org/qemu-stable-1.0.git
T: git git://git.qemu-project.org/qemu-stable-1.0.git
S: Orphan
Stable 0.15
L: qemu-stable@nongnu.org
T: git git://git.qemu.org/qemu-stable-0.15.git
S: Orphan
M: Andreas Färber <afaerber@suse.de>
T: git git://git.qemu-project.org/qemu-stable-0.15.git
S: Supported
Stable 0.14
L: qemu-stable@nongnu.org
T: git git://git.qemu.org/qemu-stable-0.14.git
T: git git://git.qemu-project.org/qemu-stable-0.14.git
S: Orphan
Stable 0.10
L: qemu-stable@nongnu.org
T: git git://git.qemu.org/qemu-stable-0.10.git
T: git git://git.qemu-project.org/qemu-stable-0.10.git
S: Orphan

View File

@@ -65,7 +65,7 @@ LIBS+=-lz $(LIBS_TOOLS)
HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
ifdef BUILD_DOCS
DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 QMP/qmp-commands.txt
DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 qmp-commands.txt
ifdef CONFIG_VIRTFS
DOCS+=fsdev/virtfs-proxy-helper.1
endif
@@ -168,7 +168,9 @@ recurse-all: $(SUBDIR_RULES) $(ROMSUBDIR_RULES)
bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
$(BUILD_DIR)/version.o: $(SRC_PATH)/version.rc $(BUILD_DIR)/config-host.h | $(BUILD_DIR)/version.lo
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.o")
$(BUILD_DIR)/version.lo: $(SRC_PATH)/version.rc $(BUILD_DIR)/config-host.h
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.lo")
Makefile: $(version-obj-y) $(version-lobj-y)
@@ -233,8 +235,9 @@ clean:
rm -f qemu-options.def
find . -name '*.[oda]' -type f -exec rm -f {} +
find . -name '*.l[oa]' -type f -exec rm -f {} +
rm -f $(TOOLS) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -Rf .libs
rm -f $(filter-out %.tlb,$(TOOLS)) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -f fsdev/*.pod
rm -rf .libs */.libs
rm -f qemu-img-cmds.h
@# May not be present in GENERATED_HEADERS
rm -f trace/generated-tracers-dtrace.dtrace*
@@ -243,7 +246,6 @@ clean:
rm -f $(foreach f,$(GENERATED_SOURCES),$(f) $(f)-timestamp)
rm -rf qapi-generated
rm -rf qga/qapi-generated
$(MAKE) -C tests/tcg clean
for d in $(ALL_SUBDIRS); do \
if test -d $$d; then $(MAKE) -C $$d $@ || exit 1; fi; \
rm -f $$d/qemu-options.def; \
@@ -259,6 +261,7 @@ qemu-%.tar.bz2:
distclean: clean
rm -f config-host.mak config-host.h* config-host.ld $(DOCS) qemu-options.texi qemu-img-cmds.texi qemu-monitor.texi
rm -f config-all-devices.mak config-all-disas.mak
rm -f po/*.mo
rm -f roms/seabios/config.mak roms/vgabios/config.mak
rm -f qemu-doc.info qemu-doc.aux qemu-doc.cp qemu-doc.cps qemu-doc.dvi
rm -f qemu-doc.fn qemu-doc.fns qemu-doc.info qemu-doc.ky qemu-doc.kys
@@ -270,6 +273,7 @@ distclean: clean
for d in $(TARGET_DIRS); do \
rm -rf $$d || exit 1 ; \
done
rm -Rf .sdk
if test -f pixman/config.log; then make -C pixman distclean; fi
if test -f dtc/version_gen.h; then make $(DTC_MAKE_ARGS) clean; fi
@@ -301,7 +305,7 @@ endif
install-doc: $(DOCS)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.html qemu-tech.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) QMP/qmp-commands.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qmp-commands.txt "$(DESTDIR)$(qemu_docdir)"
ifdef CONFIG_POSIX
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
@@ -395,7 +399,7 @@ qemu-options.texi: $(SRC_PATH)/qemu-options.hx
qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
QMP/qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -q < $< > $@," GEN $@")
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx

View File

@@ -109,6 +109,7 @@ version-lobj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.lo
# FIXME: a few definitions from qapi-types.o/qapi-visit.o are needed
# by libqemuutil.a. These should be moved to a separate .json schema.
qga-obj-y = qga/ qapi-types.o qapi-visit.o
qga-vss-dll-obj-y = qga/
vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
@@ -120,6 +121,7 @@ nested-vars += \
stub-obj-y \
util-obj-y \
qga-obj-y \
qga-vss-dll-obj-y \
block-obj-y \
common-obj-y
dummy := $(call unnest-vars)

View File

@@ -70,10 +70,6 @@ all: $(PROGS) stap
# Dummy command so that make thinks it has done something
@true
CONFIG_NO_PCI = $(if $(subst n,,$(CONFIG_PCI)),n,y)
CONFIG_NO_KVM = $(if $(subst n,,$(CONFIG_KVM)),n,y)
CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y)
#########################################################
# cpu emulator library
obj-y = exec.o translate-all.o cpu-exec.o
@@ -83,8 +79,8 @@ obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
obj-y += fpu/softfloat.o
obj-y += target-$(TARGET_BASE_ARCH)/
obj-y += disas.o
obj-$(CONFIG_GDBSTUB_XML) += gdbstub-xml.o
obj-$(CONFIG_NO_KVM) += kvm-stub.o
obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
#########################################################
# Linux user emulator target
@@ -125,7 +121,7 @@ LIBS+=$(libs_softmmu)
# xen support
obj-$(CONFIG_XEN) += xen-all.o xen-mapcache.o
obj-$(CONFIG_NO_XEN) += xen-stub.o
obj-$(call lnot,$(CONFIG_XEN)) += xen-stub.o
# Hardware support
ifeq ($(TARGET_NAME), sparc64)

View File

@@ -1,88 +0,0 @@
QEMU Monitor Protocol
=====================
Introduction
-------------
The QEMU Monitor Protocol (QMP) allows applications to communicate with
QEMU's Monitor.
QMP is JSON[1] based and currently has the following features:
- Lightweight, text-based, easy to parse data format
- Asynchronous messages support (ie. events)
- Capabilities Negotiation
For detailed information on QMP's usage, please, refer to the following files:
o qmp-spec.txt QEMU Monitor Protocol current specification
o qmp-commands.txt QMP supported commands (auto-generated at build-time)
o qmp-events.txt List of available asynchronous events
There is also a simple Python script called 'qmp-shell' available.
IMPORTANT: It's strongly recommended to read the 'Stability Considerations'
section in the qmp-commands.txt file before making any serious use of QMP.
[1] http://www.json.org
Usage
-----
To enable QMP, you need a QEMU monitor instance in "control mode". There are
two ways of doing this.
The simplest one is using the '-qmp' command-line option. The following
example makes QMP available on localhost port 4444:
$ qemu [...] -qmp tcp:localhost:4444,server
However, in order to have more complex combinations, like multiple monitors,
the '-mon' command-line option should be used along with the '-chardev' one.
For instance, the following example creates one user monitor on stdio and one
QMP monitor on localhost port 4444.
$ qemu [...] -chardev stdio,id=mon0 -mon chardev=mon0,mode=readline \
-chardev socket,id=mon1,host=localhost,port=4444,server \
-mon chardev=mon1,mode=control
Please, refer to QEMU's manpage for more information.
Simple Testing
--------------
To manually test QMP one can connect with telnet and issue commands by hand:
$ telnet localhost 4444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 50, "minor": 13, "major": 0}, "package": ""}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{ "execute": "query-version" }
{"return": {"qemu": {"micro": 50, "minor": 13, "major": 0}, "package": ""}}
Development Process
-------------------
When changing QMP's interface (by adding new commands, events or modifying
existing ones) it's mandatory to update the relevant documentation, which is
one (or more) of the files listed in the 'Introduction' section*.
Also, it's strongly recommended to send the documentation patch first, before
doing any code change. This is so because:
1. Avoids the code dictating the interface
2. Review can improve your interface. Letting that happen before
you implement it can save you work.
* The qmp-commands.txt file is generated from the qmp-commands.hx one, which
is the file that should be edited.
Homepage
--------
http://wiki.qemu.org/QMP

README

@@ -1,3 +1,3 @@
Read the documentation in qemu-doc.html or on http://wiki.qemu.org
Read the documentation in qemu-doc.html or on http://wiki.qemu-project.org
- QEMU team

View File

@@ -1 +1 @@
1.6.2
1.6.90

View File

@@ -23,7 +23,6 @@ struct AioHandler
GPollFD pfd;
IOHandler *io_read;
IOHandler *io_write;
AioFlushHandler *io_flush;
int deleted;
int pollfds_idx;
void *opaque;
@@ -47,7 +46,6 @@ void aio_set_fd_handler(AioContext *ctx,
int fd,
IOHandler *io_read,
IOHandler *io_write,
AioFlushHandler *io_flush,
void *opaque)
{
AioHandler *node;
@@ -84,7 +82,6 @@ void aio_set_fd_handler(AioContext *ctx,
/* Update handler with latest information */
node->io_read = io_read;
node->io_write = io_write;
node->io_flush = io_flush;
node->opaque = opaque;
node->pollfds_idx = -1;
@@ -97,12 +94,10 @@ void aio_set_fd_handler(AioContext *ctx,
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *notifier,
EventNotifierHandler *io_read,
AioFlushEventNotifierHandler *io_flush)
EventNotifierHandler *io_read)
{
aio_set_fd_handler(ctx, event_notifier_get_fd(notifier),
(IOHandler *)io_read, NULL,
(AioFlushHandler *)io_flush, notifier);
(IOHandler *)io_read, NULL, notifier);
}
bool aio_pending(AioContext *ctx)
@@ -147,7 +142,11 @@ static bool aio_dispatch(AioContext *ctx)
(revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) &&
node->io_read) {
node->io_read(node->opaque);
progress = true;
/* aio_notify() does not count as progress */
if (node->opaque != &ctx->notifier) {
progress = true;
}
}
if (!node->deleted &&
(revents & (G_IO_OUT | G_IO_ERR)) &&
@@ -166,6 +165,10 @@ static bool aio_dispatch(AioContext *ctx)
g_free(tmp);
}
}
/* Run our timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
@@ -173,7 +176,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
int ret;
bool busy, progress;
bool progress;
progress = false;
@@ -200,20 +203,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
g_array_set_size(ctx->pollfds, 0);
/* fill pollfds */
busy = false;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
node->pollfds_idx = -1;
/* If there aren't pending AIO operations, don't invoke callbacks.
* Otherwise, if there are no AIO requests, qemu_aio_wait() would
* wait indefinitely.
*/
if (!node->deleted && node->io_flush) {
if (node->io_flush(node->opaque) == 0) {
continue;
}
busy = true;
}
if (!node->deleted && node->pfd.events) {
GPollFD pfd = {
.fd = node->pfd.fd,
@@ -226,15 +217,15 @@ bool aio_poll(AioContext *ctx, bool blocking)
ctx->walking_handlers--;
/* No AIO operations? Get us out of here */
if (!busy) {
/* early return if we only have the aio_notify() fd */
if (ctx->pollfds->len == 1) {
return progress;
}
/* wait until next event */
ret = g_poll((GPollFD *)ctx->pollfds->data,
ctx->pollfds->len,
blocking ? -1 : 0);
ret = qemu_poll_ns((GPollFD *)ctx->pollfds->data,
ctx->pollfds->len,
blocking ? timerlistgroup_deadline_ns(&ctx->tlg) : 0);
/* if we have any readable fds, dispatch event */
if (ret > 0) {
@@ -245,11 +236,12 @@ bool aio_poll(AioContext *ctx, bool blocking)
node->pfd.revents = pfd->revents;
}
}
if (aio_dispatch(ctx)) {
progress = true;
}
}
assert(progress || busy);
return true;
/* Run dispatch even if there were no readable fds to run timers */
if (aio_dispatch(ctx)) {
progress = true;
}
return progress;
}

@@ -23,7 +23,6 @@
struct AioHandler {
EventNotifier *e;
EventNotifierHandler *io_notify;
AioFlushEventNotifierHandler *io_flush;
GPollFD pfd;
int deleted;
QLIST_ENTRY(AioHandler) node;
@@ -31,8 +30,7 @@ struct AioHandler {
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *e,
EventNotifierHandler *io_notify,
AioFlushEventNotifierHandler *io_flush)
EventNotifierHandler *io_notify)
{
AioHandler *node;
@@ -73,7 +71,6 @@ void aio_set_event_notifier(AioContext *ctx,
}
/* Update handler with latest information */
node->io_notify = io_notify;
node->io_flush = io_flush;
}
aio_notify(ctx);
@@ -96,8 +93,9 @@ bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool busy, progress;
bool progress;
int count;
int timeout;
progress = false;
@@ -111,6 +109,9 @@ bool aio_poll(AioContext *ctx, bool blocking)
progress = true;
}
/* Run timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
/*
* Then dispatch any pending callbacks from the GSource.
*
@@ -126,7 +127,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
if (node->pfd.revents && node->io_notify) {
node->pfd.revents = 0;
node->io_notify(node->e);
progress = true;
/* aio_notify() does not count as progress */
if (node->e != &ctx->notifier) {
progress = true;
}
}
tmp = node;
@@ -147,19 +152,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
ctx->walking_handlers++;
/* fill fd sets */
busy = false;
count = 0;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
/* If there aren't pending AIO operations, don't invoke callbacks.
* Otherwise, if there are no AIO requests, qemu_aio_wait() would
* wait indefinitely.
*/
if (!node->deleted && node->io_flush) {
if (node->io_flush(node->e) == 0) {
continue;
}
busy = true;
}
if (!node->deleted && node->io_notify) {
events[count++] = event_notifier_get_handle(node->e);
}
@@ -167,15 +161,18 @@ bool aio_poll(AioContext *ctx, bool blocking)
ctx->walking_handlers--;
/* No AIO operations? Get us out of here */
if (!busy) {
/* early return if we only have the aio_notify() fd */
if (count == 1) {
return progress;
}
/* wait until next event */
while (count > 0) {
int timeout = blocking ? INFINITE : 0;
int ret = WaitForMultipleObjects(count, events, FALSE, timeout);
int ret;
timeout = blocking ?
qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg)) : 0;
ret = WaitForMultipleObjects(count, events, FALSE, timeout);
/* if we have any signaled events, dispatch event */
if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
@@ -196,7 +193,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
event_notifier_get_handle(node->e) == events[ret - WAIT_OBJECT_0] &&
node->io_notify) {
node->io_notify(node->e);
progress = true;
/* aio_notify() does not count as progress */
if (node->e != &ctx->notifier) {
progress = true;
}
}
tmp = node;
@@ -214,6 +215,14 @@ bool aio_poll(AioContext *ctx, bool blocking)
events[ret - WAIT_OBJECT_0] = events[--count];
}
assert(progress || busy);
return true;
if (blocking) {
/* Run the timers a second time. We do this because otherwise aio_wait
* will not note progress - and will stop a drain early - if we have
* a timer that was not ready to run entering g_poll but is ready
* after g_poll. This will only do anything if a timer has expired.
*/
progress |= timerlistgroup_run_timers(&ctx->tlg);
}
return progress;
}

@@ -150,10 +150,9 @@ int qemu_read_default_config_files(bool userconfig)
return 0;
}
static inline bool is_zero_page(uint8_t *p)
static inline bool is_zero_range(uint8_t *p, uint64_t size)
{
return buffer_find_nonzero_offset(p, TARGET_PAGE_SIZE) ==
TARGET_PAGE_SIZE;
return buffer_find_nonzero_offset(p, size) == size;
}
/* struct contains XBZRLE cache and a static page
@@ -342,7 +341,8 @@ ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
{
unsigned long base = mr->ram_addr >> TARGET_PAGE_BITS;
unsigned long nr = base + (start >> TARGET_PAGE_BITS);
unsigned long size = base + (int128_get64(mr->size) >> TARGET_PAGE_BITS);
uint64_t mr_size = TARGET_PAGE_ALIGN(memory_region_size(mr));
unsigned long size = base + (mr_size >> TARGET_PAGE_BITS);
unsigned long next;
@@ -392,7 +392,7 @@ static void migration_bitmap_sync(void)
}
if (!start_time) {
start_time = qemu_get_clock_ms(rt_clock);
start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
}
trace_migration_bitmap_sync_start();
@@ -410,7 +410,7 @@ static void migration_bitmap_sync(void)
trace_migration_bitmap_sync_end(migration_dirty_pages
- num_dirty_pages_init);
num_dirty_pages_period += migration_dirty_pages - num_dirty_pages_init;
end_time = qemu_get_clock_ms(rt_clock);
end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
/* more than 1 second = 1000 millisecons */
if (end_time > start_time + 1000) {
@@ -496,7 +496,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
acct_info.dup_pages++;
}
}
} else if (is_zero_page(p)) {
} else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
acct_info.dup_pages++;
bytes_sent = save_block_hdr(f, block, offset, cont,
RAM_SAVE_FLAG_COMPRESS);
@@ -672,7 +672,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
ram_control_before_iterate(f, RAM_CONTROL_ROUND);
t0 = qemu_get_clock_ns(rt_clock);
t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
i = 0;
while ((ret = qemu_file_rate_limit(f)) == 0) {
int bytes_sent;
@@ -691,7 +691,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
iterations
*/
if ((i & 63) == 0) {
uint64_t t1 = (qemu_get_clock_ns(rt_clock) - t0) / 1000000;
uint64_t t1 = (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - t0) / 1000000;
if (t1 > MAX_WAIT) {
DPRINTF("big wait: %" PRIu64 " milliseconds, %d iterations\n",
t1, i);
@@ -709,15 +709,20 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
*/
ram_control_after_iterate(f, RAM_CONTROL_ROUND);
bytes_transferred += total_sent;
/*
* Do not count these 8 bytes into total_sent, so that we can
* return 0 if no page had been dirtied.
*/
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
bytes_transferred += 8;
ret = qemu_file_get_error(f);
if (ret < 0) {
bytes_transferred += total_sent;
return ret;
}
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
total_sent += 8;
bytes_transferred += total_sent;
return total_sent;
}
@@ -843,13 +848,14 @@ static inline void *host_from_stream_offset(QEMUFile *f,
*/
void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
{
if (ch != 0 || !is_zero_page(host)) {
if (ch != 0 || !is_zero_range(host, size)) {
memset(host, ch, size);
#ifndef _WIN32
if (ch == 0 &&
(!kvm_enabled() || kvm_has_sync_mmu()) &&
getpagesize() <= TARGET_PAGE_SIZE) {
qemu_madvise(host, TARGET_PAGE_SIZE, QEMU_MADV_DONTNEED);
if (ch == 0 && (!kvm_enabled() || kvm_has_sync_mmu())) {
size = size & ~(getpagesize() - 1);
if (size > 0) {
qemu_madvise(host, size, QEMU_MADV_DONTNEED);
}
}
#endif
}
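The `ram_handle_compressed` hunk above widens the zero check from a single target page to an arbitrary `size`, and rounds `size` down to a multiple of the host page size before calling `qemu_madvise`. The following is a minimal sketch of those two pieces in isolation. `sketch_is_zero_range` and `sketch_align_down` are hypothetical names; the byte loop only stands in for `buffer_find_nonzero_offset`, whose implementation is not part of this diff, and the mask assumes the page size is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for is_zero_range(): true if every byte is zero.
 * A plain loop replaces buffer_find_nonzero_offset(), which is not shown. */
static bool sketch_is_zero_range(const uint8_t *p, uint64_t size)
{
    for (uint64_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Round size down to a multiple of page_size, mirroring
 * "size & ~(getpagesize() - 1)". Assumes page_size is a power of two. */
static uint64_t sketch_align_down(uint64_t size, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return size & ~(page_size - 1);
}
```

With a 4096-byte host page, a range shorter than one page rounds down to 0, which is why the hunk guards the `qemu_madvise` call with `size > 0`.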
@@ -1112,9 +1118,6 @@ int qemu_uuid_parse(const char *str, uint8_t *uuid)
if (ret != 16) {
return -1;
}
#ifdef TARGET_I386
smbios_add_field(1, offsetof(struct smbios_type_1, uuid), uuid, 16);
#endif
return 0;
}
@@ -1125,20 +1128,18 @@ void do_acpitable_option(const QemuOpts *opts)
acpi_table_add(opts, &err);
if (err) {
fprintf(stderr, "Wrong acpi table provided: %s\n",
error_get_pretty(err));
error_report("Wrong acpi table provided: %s",
error_get_pretty(err));
error_free(err);
exit(1);
}
#endif
}
void do_smbios_option(const char *optarg)
void do_smbios_option(QemuOpts *opts)
{
#ifdef TARGET_I386
if (smbios_entry_add(optarg) < 0) {
exit(1);
}
smbios_entry_add(opts);
#endif
}
@@ -1195,15 +1196,14 @@ static void mig_sleep_cpu(void *opq)
much time in the VM. The migration thread will try to catchup.
Workload will experience a performance drop.
*/
static void mig_throttle_cpu_down(CPUState *cpu, void *data)
{
async_run_on_cpu(cpu, mig_sleep_cpu, NULL);
}
static void mig_throttle_guest_down(void)
{
CPUState *cpu;
qemu_mutex_lock_iothread();
qemu_for_each_cpu(mig_throttle_cpu_down, NULL);
CPU_FOREACH(cpu) {
async_run_on_cpu(cpu, mig_sleep_cpu, NULL);
}
qemu_mutex_unlock_iothread();
}
@@ -1217,11 +1217,11 @@ static void check_guest_throttling(void)
}
if (!t0) {
t0 = qemu_get_clock_ns(rt_clock);
t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
return;
}
t1 = qemu_get_clock_ns(rt_clock);
t1 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
/* If it has been more than 40 ms since the last time the guest
* was throttled then do it again.

async.c

@@ -150,7 +150,10 @@ aio_ctx_prepare(GSource *source, gint *timeout)
{
AioContext *ctx = (AioContext *) source;
QEMUBH *bh;
int deadline;
/* We assume there is no timeout already supplied */
*timeout = -1;
for (bh = ctx->first_bh; bh; bh = bh->next) {
if (!bh->deleted && bh->scheduled) {
if (bh->idle) {
@@ -166,6 +169,14 @@ aio_ctx_prepare(GSource *source, gint *timeout)
}
}
deadline = qemu_timeout_ns_to_ms(timerlistgroup_deadline_ns(&ctx->tlg));
if (deadline == 0) {
*timeout = 0;
return true;
} else {
*timeout = qemu_soonest_timeout(*timeout, deadline);
}
return false;
}
@@ -180,7 +191,7 @@ aio_ctx_check(GSource *source)
return true;
}
}
return aio_pending(ctx);
return aio_pending(ctx) || (timerlistgroup_deadline_ns(&ctx->tlg) == 0);
}
static gboolean
@@ -201,10 +212,11 @@ aio_ctx_finalize(GSource *source)
AioContext *ctx = (AioContext *) source;
thread_pool_free(ctx->thread_pool);
aio_set_event_notifier(ctx, &ctx->notifier, NULL, NULL);
aio_set_event_notifier(ctx, &ctx->notifier, NULL);
event_notifier_cleanup(&ctx->notifier);
qemu_mutex_destroy(&ctx->bh_lock);
g_array_free(ctx->pollfds, TRUE);
timerlistgroup_deinit(&ctx->tlg);
}
static GSourceFuncs aio_source_funcs = {
@@ -233,6 +245,11 @@ void aio_notify(AioContext *ctx)
event_notifier_set(&ctx->notifier);
}
static void aio_timerlist_notify(void *opaque)
{
aio_notify(opaque);
}
AioContext *aio_context_new(void)
{
AioContext *ctx;
@@ -243,7 +260,8 @@ AioContext *aio_context_new(void)
event_notifier_init(&ctx->notifier, false);
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_test_and_clear, NULL);
event_notifier_test_and_clear);
timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
return ctx;
}
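In the `aio_ctx_prepare` hunk above, the timer deadline (in nanoseconds) is converted to milliseconds and merged with the timeout that was already chosen, where `-1` appears to mean "no timeout". The sketch below illustrates that arithmetic under stated assumptions: `sketch_ns_to_ms` and `sketch_soonest_timeout` are hypothetical stand-ins for `qemu_timeout_ns_to_ms` and `qemu_soonest_timeout`, whose bodies are not shown in this diff, and the round-up behaviour is an assumption chosen so that a poll never wakes before the deadline.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NS_PER_MS 1000000LL

/* Convert a nanosecond deadline to a millisecond poll timeout.
 * Negative input means "no deadline" and maps to -1 (block indefinitely).
 * Positive input is rounded up so the caller does not wake too early. */
static int sketch_ns_to_ms(int64_t ns)
{
    int64_t ms;

    if (ns < 0) {
        return -1;
    }
    if (ns == 0) {
        return 0;
    }
    /* (ns - 1) / N + 1 rounds up without overflowing for large ns */
    ms = (ns - 1) / SKETCH_NS_PER_MS + 1;
    if (ms > INT32_MAX) {
        ms = INT32_MAX;
    }
    return (int)ms;
}

/* Return the sooner of two timeouts, treating -1 as infinite.
 * Cast to unsigned, -1 becomes the largest value and so never wins. */
static int sketch_soonest_timeout(int a, int b)
{
    return ((uint32_t)a < (uint32_t)b) ? a : b;
}
```

Under these assumptions, an expired timer (deadline 0) yields a timeout of 0, which matches the hunk's early `return true`, while a context without armed timers leaves the earlier `-1` untouched.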

@@ -1124,11 +1124,11 @@ static int audio_is_timer_needed (void)
static void audio_reset_timer (AudioState *s)
{
if (audio_is_timer_needed ()) {
qemu_mod_timer (s->ts,
qemu_get_clock_ns(vm_clock) + conf.period.ticks);
timer_mod (s->ts,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + conf.period.ticks);
}
else {
qemu_del_timer (s->ts);
timer_del (s->ts);
}
}
@@ -1835,7 +1835,7 @@ static void audio_init (void)
QLIST_INIT (&s->cap_head);
atexit (audio_atexit);
s->ts = qemu_new_timer_ns (vm_clock, audio_timer, s);
s->ts = timer_new_ns(QEMU_CLOCK_VIRTUAL, audio_timer, s);
if (!s->ts) {
hw_error("Could not create audio timer\n");
}

@@ -348,7 +348,6 @@ void mixeng_clear (struct st_sample *buf, int len)
void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol)
{
#ifdef CONFIG_MIXEMU
if (vol->mute) {
mixeng_clear (buf, len);
return;
@@ -364,9 +363,4 @@ void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol)
#endif
buf += 1;
}
#else
(void) buf;
(void) len;
(void) vol;
#endif
}

@@ -35,7 +35,7 @@
#define IN_T glue (glue (ITYPE, BSIZE), _t)
#ifdef FLOAT_MIXENG
static mixeng_real inline glue (conv_, ET) (IN_T v)
static inline mixeng_real glue (conv_, ET) (IN_T v)
{
IN_T nv = ENDIAN_CONVERT (v);
@@ -54,7 +54,7 @@ static mixeng_real inline glue (conv_, ET) (IN_T v)
#endif
}
static IN_T inline glue (clip_, ET) (mixeng_real v)
static inline IN_T glue (clip_, ET) (mixeng_real v)
{
if (v >= 0.5) {
return IN_MAX;

@@ -46,7 +46,7 @@ static int no_run_out (HWVoiceOut *hw, int live)
int64_t ticks;
int64_t bytes;
now = qemu_get_clock_ns (vm_clock);
now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
ticks = now - no->old_ticks;
bytes = muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());
bytes = audio_MIN (bytes, INT_MAX);
@@ -102,7 +102,7 @@ static int no_run_in (HWVoiceIn *hw)
int samples = 0;
if (dead) {
int64_t now = qemu_get_clock_ns (vm_clock);
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
int64_t ticks = now - no->old_ticks;
int64_t bytes =
muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());

@@ -932,7 +932,7 @@ struct audio_driver oss_audio_driver = {
.init = oss_audio_init,
.fini = oss_audio_fini,
.pcm_ops = &oss_pcm_ops,
.can_be_default = 1,
.can_be_default = 0,
.max_voices_out = INT_MAX,
.max_voices_in = INT_MAX,
.voice_size_out = sizeof (OSSVoiceOut),

@@ -81,7 +81,7 @@ static void spice_audio_fini (void *opaque)
static void rate_start (SpiceRateCtl *rate)
{
memset (rate, 0, sizeof (*rate));
rate->start_ticks = qemu_get_clock_ns (vm_clock);
rate->start_ticks = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
}
static int rate_get_samples (struct audio_pcm_info *info, SpiceRateCtl *rate)
@@ -91,7 +91,7 @@ static int rate_get_samples (struct audio_pcm_info *info, SpiceRateCtl *rate)
int64_t bytes;
int64_t samples;
now = qemu_get_clock_ns (vm_clock);
now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
ticks = now - rate->start_ticks;
bytes = muldiv64 (ticks, info->bytes_per_second, get_ticks_per_sec ());
samples = (bytes - rate->bytes_sent) >> info->shift;

@@ -52,7 +52,7 @@ static int wav_run_out (HWVoiceOut *hw, int live)
int rpos, decr, samples;
uint8_t *dst;
struct st_sample *src;
int64_t now = qemu_get_clock_ns (vm_clock);
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
int64_t ticks = now - wav->old_ticks;
int64_t bytes =
muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());

@@ -314,9 +314,9 @@ static int baum_eat_packet(BaumDriverState *baum, const uint8_t *buf, int len)
return 0; \
if (*cur++ != ESC) { \
DPRINTF("Broken packet %#2x, tossing\n", req); \
if (qemu_timer_pending(baum->cellCount_timer)) { \
qemu_del_timer(baum->cellCount_timer); \
baum_cellCount_timer_cb(baum); \
if (timer_pending(baum->cellCount_timer)) { \
timer_del(baum->cellCount_timer); \
baum_cellCount_timer_cb(baum); \
} \
return (cur - 2 - buf); \
} \
@@ -334,7 +334,7 @@ static int baum_eat_packet(BaumDriverState *baum, const uint8_t *buf, int len)
int i;
/* Allow 100ms to complete the DisplayData packet */
qemu_mod_timer(baum->cellCount_timer, qemu_get_clock_ns(vm_clock) +
timer_mod(baum->cellCount_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
get_ticks_per_sec() / 10);
for (i = 0; i < baum->x * baum->y ; i++) {
EAT(c);
@@ -348,7 +348,7 @@ static int baum_eat_packet(BaumDriverState *baum, const uint8_t *buf, int len)
c = '?';
text[i] = c;
}
qemu_del_timer(baum->cellCount_timer);
timer_del(baum->cellCount_timer);
memset(zero, 0, sizeof(zero));
@@ -553,7 +553,7 @@ static void baum_close(struct CharDriverState *chr)
{
BaumDriverState *baum = chr->opaque;
qemu_free_timer(baum->cellCount_timer);
timer_free(baum->cellCount_timer);
if (baum->brlapi) {
brlapi__closeConnection(baum->brlapi);
g_free(baum->brlapi);
@@ -588,7 +588,7 @@ CharDriverState *chr_baum_init(void)
goto fail_handle;
}
baum->cellCount_timer = qemu_new_timer_ns(vm_clock, baum_cellCount_timer_cb, baum);
baum->cellCount_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, baum_cellCount_timer_cb, baum);
if (brlapi__getDisplaySize(handle, &baum->x, &baum->y) == -1) {
brlapi_perror("baum_init: brlapi_getDisplaySize");
@@ -614,7 +614,7 @@ CharDriverState *chr_baum_init(void)
return chr;
fail:
qemu_free_timer(baum->cellCount_timer);
timer_free(baum->cellCount_timer);
brlapi__closeConnection(handle);
fail_handle:
g_free(handle);

@@ -91,14 +91,12 @@ static int rng_egd_chr_can_read(void *opaque)
static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
{
RngEgd *s = RNG_EGD(opaque);
size_t buf_offset = 0;
while (size > 0 && s->requests) {
RngRequest *req = s->requests->data;
int len = MIN(size, req->size - req->offset);
memcpy(req->data + req->offset, buf + buf_offset, len);
buf_offset += len;
memcpy(req->data + req->offset, buf, len);
req->offset += len;
size -= len;
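The two variants of `rng_egd_chr_read` in the hunk above differ only in whether the source pointer advances as queued requests are filled from a single incoming buffer. As an illustration of the variant that tracks `buf_offset`, here is a minimal sketch. `SketchRequest` and `sketch_fill_requests` are hypothetical, and a plain array replaces the request list used by the real code; this is not the backend's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified request: destination buffer plus fill level. */
typedef struct SketchRequest {
    uint8_t *data;
    size_t size;
    size_t offset;
} SketchRequest;

/* Distribute buf across reqs[0..nreqs) in order.
 * Returns the number of source bytes consumed. */
static size_t sketch_fill_requests(SketchRequest *reqs, size_t nreqs,
                                   const uint8_t *buf, size_t size)
{
    size_t buf_offset = 0;
    size_t i = 0;

    while (size > 0 && i < nreqs) {
        SketchRequest *req = &reqs[i];
        size_t room, len;

        assert(req->offset <= req->size);
        room = req->size - req->offset;
        len = size < room ? size : room;

        /* Advance through the source so each request gets fresh bytes */
        memcpy(req->data + req->offset, buf + buf_offset, len);
        buf_offset += len;
        req->offset += len;
        size -= len;

        if (req->offset == req->size) {
            i++;    /* request complete, move on to the next one */
        }
    }
    return buf_offset;
}
```

Without the `buf_offset` adjustment, a second request served from the same buffer would receive the same leading bytes as the first.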

@@ -336,8 +336,8 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
bmds->completed_sectors = 0;
bmds->shared_base = block_mig_state.shared_base;
alloc_aio_bitmap(bmds);
drive_get_ref(drive_get_by_blockdev(bs));
bdrv_set_in_use(bs, 1);
bdrv_ref(bs);
block_mig_state.total_sector_sum += sectors;
@@ -575,7 +575,7 @@ static void blk_mig_cleanup(void)
while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
bdrv_set_in_use(bmds->bs, 0);
drive_put_ref(drive_get_by_blockdev(bmds->bs));
bdrv_unref(bmds->bs);
g_free(bmds->aio_bitmap);
g_free(bmds);
}

block.c

File diff suppressed because it is too large.

@@ -1,4 +1,4 @@
block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += raw_bsd.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o

@@ -202,9 +202,9 @@ static void backup_iostatus_reset(BlockJob *job)
bdrv_iostatus_reset(s->target);
}
static const BlockJobType backup_job_type = {
static const BlockJobDriver backup_job_driver = {
.instance_size = sizeof(BackupBlockJob),
.job_type = "backup",
.job_type = BLOCK_JOB_TYPE_BACKUP,
.set_speed = backup_set_speed,
.iostatus_reset = backup_iostatus_reset,
};
@@ -272,9 +272,9 @@ static void coroutine_fn backup_run(void *opaque)
uint64_t delay_ns = ratelimit_calculate_delay(
&job->limit, job->sectors_read);
job->sectors_read = 0;
block_job_sleep_ns(&job->common, rt_clock, delay_ns);
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
} else {
block_job_sleep_ns(&job->common, rt_clock, 0);
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
}
if (block_job_is_cancelled(&job->common)) {
@@ -289,14 +289,14 @@ static void coroutine_fn backup_run(void *opaque)
* backing file. */
for (i = 0; i < BACKUP_SECTORS_PER_CLUSTER;) {
/* bdrv_co_is_allocated() only returns true/false based
* on the first set of sectors it comes accross that
/* bdrv_is_allocated() only returns true/false based
* on the first set of sectors it comes across that
* are are all in the same state.
* For that reason we must verify each sector in the
* backup cluster length. We end up copying more than
* needed but at some point that is always the case. */
alloced =
bdrv_co_is_allocated(bs,
bdrv_is_allocated(bs,
start * BACKUP_SECTORS_PER_CLUSTER + i,
BACKUP_SECTORS_PER_CLUSTER - i, &n);
i += n;
@@ -338,7 +338,7 @@ static void coroutine_fn backup_run(void *opaque)
hbitmap_free(job->bitmap);
bdrv_iostatus_disable(target);
bdrv_delete(target);
bdrv_unref(target);
block_job_completed(&job->common, ret);
}
@@ -370,7 +370,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
return;
}
BackupBlockJob *job = block_job_create(&backup_job_type, bs, speed,
BackupBlockJob *job = block_job_create(&backup_job_driver, bs, speed,
cb, opaque, errp);
if (!job) {
return;

@@ -168,6 +168,7 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
[BLKDBG_REFTABLE_LOAD] = "reftable_load",
[BLKDBG_REFTABLE_GROW] = "reftable_grow",
[BLKDBG_REFTABLE_UPDATE] = "reftable_update",
[BLKDBG_REFBLOCK_LOAD] = "refblock_load",
[BLKDBG_REFBLOCK_UPDATE] = "refblock_update",
@@ -349,7 +350,8 @@ static QemuOptsList runtime_opts = {
},
};
static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags)
static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVBlkdebugState *s = bs->opaque;
QemuOpts *opts;
@@ -360,8 +362,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags)
opts = qemu_opts_create_nofail(&runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
@@ -371,6 +372,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags)
if (config) {
ret = read_config(s, config);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not read blkdebug config file");
goto fail;
}
}
@@ -381,12 +383,14 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags)
/* Open the backing file */
filename = qemu_opt_get(opts, "x-image");
if (filename == NULL) {
error_setg(errp, "Could not retrieve image file name");
ret = -EINVAL;
goto fail;
}
ret = bdrv_file_open(&bs->file, filename, NULL, flags);
ret = bdrv_file_open(&bs->file, filename, NULL, flags, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
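The `blkdebug_open` hunks above replace direct error reporting with an `Error **errp` out-parameter: errors raised locally are set on `errp`, and errors collected from a callee in `local_err` are handed over with `error_propagate`. The sketch below shows that calling convention with a deliberately simplified, hypothetical error type. `SketchError`, `sketch_error_set`, `sketch_error_propagate`, `sketch_file_open`, and `sketch_open` are all illustrative names and are not QEMU's API or implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an error object. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Store a new error in *errp; a NULL errp means the caller does not care. */
static void sketch_error_set(SketchError **errp, const char *msg)
{
    if (errp == NULL) {
        return;
    }
    assert(*errp == NULL);      /* never overwrite an earlier error */
    *errp = calloc(1, sizeof(**errp));
    if (*errp != NULL) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
    }
}

/* Hand a locally collected error to the caller, or free it if unwanted. */
static void sketch_error_propagate(SketchError **dst, SketchError *local)
{
    if (local == NULL) {
        return;
    }
    if (dst != NULL && *dst == NULL) {
        *dst = local;
    } else {
        free(local);
    }
}

/* Hypothetical callee that can fail and report why. */
static int sketch_file_open(const char *filename, SketchError **errp)
{
    if (filename[0] == '\0') {
        sketch_error_set(errp, "empty file name");
        return -ENOENT;
    }
    return 0;
}

/* Mirrors the shape of the converted open function. */
static int sketch_open(const char *filename, SketchError **errp)
{
    SketchError *local_err = NULL;
    int ret;

    if (filename == NULL) {
        sketch_error_set(errp, "Could not retrieve image file name");
        return -EINVAL;
    }
    ret = sketch_file_open(filename, &local_err);
    if (ret < 0) {
        sketch_error_propagate(errp, local_err);
        return ret;
    }
    return 0;
}
```

The design point this illustrates is that the function no longer decides how an error is shown; it only returns a code and an error object, and the caller chooses whether to print, wrap, or ignore it.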

@@ -116,7 +116,8 @@ static QemuOptsList runtime_opts = {
},
};
static int blkverify_open(BlockDriverState *bs, QDict *options, int flags)
static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVBlkverifyState *s = bs->opaque;
QemuOpts *opts;
@@ -127,8 +128,7 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags)
opts = qemu_opts_create_nofail(&runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
@@ -136,26 +136,30 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags)
/* Parse the raw image filename */
raw = qemu_opt_get(opts, "x-raw");
if (raw == NULL) {
error_setg(errp, "Could not retrieve raw image filename");
ret = -EINVAL;
goto fail;
}
ret = bdrv_file_open(&bs->file, raw, NULL, flags);
ret = bdrv_file_open(&bs->file, raw, NULL, flags, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
/* Open the test file */
filename = qemu_opt_get(opts, "x-image");
if (filename == NULL) {
error_setg(errp, "Could not retrieve test image filename");
ret = -EINVAL;
goto fail;
}
s->test_file = bdrv_new("");
ret = bdrv_open(s->test_file, filename, NULL, flags, NULL);
ret = bdrv_open(s->test_file, filename, NULL, flags, NULL, &local_err);
if (ret < 0) {
bdrv_delete(s->test_file);
error_propagate(errp, local_err);
bdrv_unref(s->test_file);
s->test_file = NULL;
goto fail;
}
@@ -169,7 +173,7 @@ static void blkverify_close(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_delete(s->test_file);
bdrv_unref(s->test_file);
s->test_file = NULL;
}
@@ -412,6 +416,8 @@ static BlockDriver bdrv_blkverify = {
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
.bdrv_check_ext_snapshot = bdrv_check_ext_snapshot_forbidden,
};
static void bdrv_blkverify_init(void)

@@ -108,7 +108,8 @@ static int bochs_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static int bochs_open(BlockDriverState *bs, QDict *options, int flags)
static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVBochsState *s = bs->opaque;
int i;

@@ -53,7 +53,8 @@ static int cloop_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static int cloop_open(BlockDriverState *bs, QDict *options, int flags)
static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVCloopState *s = bs->opaque;
uint32_t offsets_size, max_compressed_block_size = 1, i;

@@ -103,14 +103,14 @@ wait:
/* Note that even when no rate limit is applied we need to yield
* with no pending I/O here so that bdrv_drain_all() returns.
*/
block_job_sleep_ns(&s->common, rt_clock, delay_ns);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
/* Copy if allocated above the base */
ret = bdrv_co_is_allocated_above(top, base, sector_num,
COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
&n);
ret = bdrv_is_allocated_above(top, base, sector_num,
COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
&n);
copy = (ret == 1);
trace_commit_one_iteration(s, sector_num, n, ret);
if (copy) {
@@ -173,9 +173,9 @@ static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static const BlockJobType commit_job_type = {
static const BlockJobDriver commit_job_driver = {
.instance_size = sizeof(CommitBlockJob),
.job_type = "commit",
.job_type = BLOCK_JOB_TYPE_COMMIT,
.set_speed = commit_set_speed,
};
@@ -238,7 +238,7 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
}
s = block_job_create(&commit_job_type, bs, speed, cb, opaque, errp);
s = block_job_create(&commit_job_driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}

@@ -58,7 +58,8 @@ static int cow_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static int cow_open(BlockDriverState *bs, QDict *options, int flags)
static int cow_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVCowState *s = bs->opaque;
struct cow_header_v2 cow_header;
@@ -106,7 +107,7 @@ static int cow_open(BlockDriverState *bs, QDict *options, int flags)
* XXX(hch): right now these functions are extremely inefficient.
* We should just read the whole bitmap we'll need in one go instead.
*/
static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum, bool *first)
{
uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
uint8_t bitmap;
@@ -117,27 +118,52 @@ static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
return ret;
}
if (bitmap & (1 << (bitnum % 8))) {
return 0;
}
if (*first) {
ret = bdrv_flush(bs->file);
if (ret < 0) {
return ret;
}
*first = false;
}
bitmap |= (1 << (bitnum % 8));
ret = bdrv_pwrite_sync(bs->file, offset, &bitmap, sizeof(bitmap));
ret = bdrv_pwrite(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
return 0;
}
static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
#define BITS_PER_BITMAP_SECTOR (512 * 8)
/* Cannot use bitmap.c on big-endian machines. */
static int cow_test_bit(int64_t bitnum, const uint8_t *bitmap)
{
uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
uint8_t bitmap;
int ret;
return (bitmap[bitnum / 8] & (1 << (bitnum & 7))) != 0;
}
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
static int cow_find_streak(const uint8_t *bitmap, int value, int start, int nb_sectors)
{
int streak_value = value ? 0xFF : 0;
int last = MIN(start + nb_sectors, BITS_PER_BITMAP_SECTOR);
int bitnum = start;
while (bitnum < last) {
if ((bitnum & 7) == 0 && bitmap[bitnum / 8] == streak_value) {
bitnum += 8;
continue;
}
if (cow_test_bit(bitnum, bitmap) == value) {
bitnum++;
continue;
}
break;
}
return !!(bitmap & (1 << (bitnum % 8)));
return MIN(bitnum, last) - start;
}
/* Return true if first block has been changed (ie. current version is
@@ -146,34 +172,44 @@ static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
uint8_t bitmap[BDRV_SECTOR_SIZE];
int ret;
int changed;
if (nb_sectors == 0) {
*num_same = nb_sectors;
return 0;
}
changed = is_bit_set(bs, sector_num);
if (changed < 0) {
return 0; /* XXX: how to return I/O errors? */
}
for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
if (is_bit_set(bs, sector_num + *num_same) != changed)
break;
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
bitnum &= BITS_PER_BITMAP_SECTOR - 1;
changed = cow_test_bit(bitnum, bitmap);
*num_same = cow_find_streak(bitmap, changed, bitnum, nb_sectors);
return changed;
}
static int64_t coroutine_fn cow_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
BDRVCowState *s = bs->opaque;
int ret = cow_co_is_allocated(bs, sector_num, nb_sectors, num_same);
int64_t offset = s->cow_sectors_offset + (sector_num << BDRV_SECTOR_BITS);
if (ret < 0) {
return ret;
}
return (ret ? BDRV_BLOCK_DATA : 0) | offset | BDRV_BLOCK_OFFSET_VALID;
}
static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
int error = 0;
int i;
bool first = true;
for (i = 0; i < nb_sectors; i++) {
error = cow_set_bit(bs, sector_num + i);
error = cow_set_bit(bs, sector_num + i, &first);
if (error) {
break;
}
@@ -189,7 +225,7 @@ static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num,
int ret, n;
while (nb_sectors > 0) {
ret = bdrv_co_is_allocated(bs, sector_num, nb_sectors, &n);
ret = cow_co_is_allocated(bs, sector_num, nb_sectors, &n);
if (ret < 0) {
return ret;
}
@@ -259,12 +295,14 @@ static void cow_close(BlockDriverState *bs)
{
}
static int cow_create(const char *filename, QEMUOptionParameter *options)
static int cow_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
struct cow_header_v2 cow_header;
struct stat st;
int64_t image_sectors = 0;
const char *image_filename = NULL;
Error *local_err = NULL;
int ret;
BlockDriverState *cow_bs;
@@ -278,13 +316,17 @@ static int cow_create(const char *filename, QEMUOptionParameter *options)
options++;
}
ret = bdrv_create_file(filename, options);
ret = bdrv_create_file(filename, options, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
ret = bdrv_file_open(&cow_bs, filename, NULL, BDRV_O_RDWR);
ret = bdrv_file_open(&cow_bs, filename, NULL, BDRV_O_RDWR, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
@@ -318,7 +360,7 @@ static int cow_create(const char *filename, QEMUOptionParameter *options)
}
exit:
bdrv_delete(cow_bs);
bdrv_unref(cow_bs);
return ret;
}
@@ -348,7 +390,7 @@ static BlockDriver bdrv_cow = {
.bdrv_read = cow_co_read,
.bdrv_write = cow_co_write,
.bdrv_co_is_allocated = cow_co_is_allocated,
.bdrv_co_get_block_status = cow_co_get_block_status,
.create_options = cow_create_options,
};
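
The `cow_co_get_block_status` hunk above returns one `int64_t` that carries an allocation flag, a byte offset into the image file, and an "offset valid" flag. The following is a minimal sketch of that packing, not the QEMU implementation. The `EX_` constants and the helper names are hypothetical, the flag values are assumed for illustration, and the data start offset is assumed to be sector-aligned so the low bits stay free for flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; the real definitions live in
 * QEMU's block layer headers and are not part of this diff. */
#define EX_SECTOR_BITS        9
#define EX_BLOCK_DATA         0x01
#define EX_BLOCK_OFFSET_VALID 0x04
#define EX_OFFSET_MASK        (~(int64_t)((1 << EX_SECTOR_BITS) - 1))

/* Pack "is this sector allocated" plus the sector's byte offset in the
 * image file into a single status word. data_start must be a multiple
 * of the sector size, otherwise the offset would collide with the flags. */
static int64_t pack_block_status(bool allocated, int64_t data_start,
                                 int64_t sector_num)
{
    int64_t offset = data_start + (sector_num << EX_SECTOR_BITS);

    return (allocated ? EX_BLOCK_DATA : 0) | offset | EX_BLOCK_OFFSET_VALID;
}

/* Recover the byte offset by masking off the flag bits. */
static int64_t status_offset(int64_t status)
{
    return status & EX_OFFSET_MASK;
}

/* Test the allocation flag. */
static bool status_has_data(int64_t status)
{
    return (status & EX_BLOCK_DATA) != 0;
}
```

A caller would test the flags first and only trust `status_offset()` when the "offset valid" bit is set.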

@@ -86,7 +86,6 @@ typedef struct BDRVCURLState {
static void curl_clean_state(CURLState *s);
static void curl_multi_do(void *arg);
static int curl_aio_flush(void *opaque);
static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
void *s, void *sp)
@@ -94,17 +93,16 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
switch (action) {
case CURL_POLL_IN:
qemu_aio_set_fd_handler(fd, curl_multi_do, NULL, curl_aio_flush, s);
qemu_aio_set_fd_handler(fd, curl_multi_do, NULL, s);
break;
case CURL_POLL_OUT:
qemu_aio_set_fd_handler(fd, NULL, curl_multi_do, curl_aio_flush, s);
qemu_aio_set_fd_handler(fd, NULL, curl_multi_do, s);
break;
case CURL_POLL_INOUT:
qemu_aio_set_fd_handler(fd, curl_multi_do, curl_multi_do,
curl_aio_flush, s);
qemu_aio_set_fd_handler(fd, curl_multi_do, curl_multi_do, s);
break;
case CURL_POLL_REMOVE:
qemu_aio_set_fd_handler(fd, NULL, NULL, NULL, NULL);
qemu_aio_set_fd_handler(fd, NULL, NULL, NULL);
break;
}
@@ -397,7 +395,8 @@ static QemuOptsList runtime_opts = {
},
};
static int curl_open(BlockDriverState *bs, QDict *options, int flags)
static int curl_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVCURLState *s = bs->opaque;
CURLState *state = NULL;
@@ -495,21 +494,6 @@ out_noclean:
return -EINVAL;
}
static int curl_aio_flush(void *opaque)
{
BDRVCURLState *s = opaque;
int i, j;
for (i=0; i < CURL_NUM_STATES; i++) {
for(j=0; j < CURL_NUM_ACB; j++) {
if (s->states[i].acb[j]) {
return 1;
}
}
}
return 0;
}
static void curl_aio_cancel(BlockDriverAIOCB *blockacb)
{
// Do we have to implement canceling? Seems to work without...
@@ -589,12 +573,6 @@ static BlockDriverAIOCB *curl_aio_readv(BlockDriverState *bs,
acb->nb_sectors = nb_sectors;
acb->bh = qemu_bh_new(curl_readv_bh_cb, acb);
if (!acb->bh) {
DPRINTF("CURL: qemu_bh_new failed\n");
return NULL;
}
qemu_bh_schedule(acb->bh);
return &acb->common;
}

@@ -92,7 +92,8 @@ static int read_uint32(BlockDriverState *bs, int64_t offset, uint32_t *result)
return 0;
}
static int dmg_open(BlockDriverState *bs, QDict *options, int flags)
static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVDMGState *s = bs->opaque;
uint64_t info_begin,info_end,last_in_offset,last_out_offset;

@@ -32,7 +32,6 @@ typedef struct BDRVGlusterState {
struct glfs *glfs;
int fds[2];
struct glfs_fd *fd;
int qemu_aio_count;
int event_reader_pos;
GlusterAIOCB *event_acb;
} BDRVGlusterState;
@@ -247,7 +246,6 @@ static void qemu_gluster_complete_aio(GlusterAIOCB *acb, BDRVGlusterState *s)
ret = -EIO; /* Partial read/write - fail it */
}
s->qemu_aio_count--;
qemu_aio_release(acb);
cb(opaque, ret);
if (finished) {
@@ -275,13 +273,6 @@ static void qemu_gluster_aio_event_reader(void *opaque)
} while (ret < 0 && errno == EINTR);
}
static int qemu_gluster_aio_flush_cb(void *opaque)
{
BDRVGlusterState *s = opaque;
return (s->qemu_aio_count > 0);
}
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "gluster",
@@ -297,7 +288,7 @@ static QemuOptsList runtime_opts = {
};
static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
int bdrv_flags)
int bdrv_flags, Error **errp)
{
BDRVGlusterState *s = bs->opaque;
int open_flags = O_BINARY;
@@ -348,7 +339,7 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
}
fcntl(s->fds[GLUSTER_FD_READ], F_SETFL, O_NONBLOCK);
qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ],
qemu_gluster_aio_event_reader, NULL, qemu_gluster_aio_flush_cb, s);
qemu_gluster_aio_event_reader, NULL, s);
out:
qemu_opts_del(opts);
@@ -366,7 +357,7 @@ out:
}
static int qemu_gluster_create(const char *filename,
QEMUOptionParameter *options)
QEMUOptionParameter *options, Error **errp)
{
struct glfs *glfs;
struct glfs_fd *fd;
@@ -436,22 +427,9 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
/*
* Gluster AIO callback thread failed to notify the waiting
* QEMU thread about IO completion.
*
* Complete this IO request and make the disk inaccessible for
* subsequent reads and writes.
*/
error_report("Gluster failed to notify QEMU about IO completion");
qemu_mutex_lock_iothread(); /* We are in gluster thread context */
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_release(acb);
s->qemu_aio_count--;
close(s->fds[GLUSTER_FD_READ]);
close(s->fds[GLUSTER_FD_WRITE]);
qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL,
NULL);
bs->drv = NULL; /* Make the disk inaccessible */
qemu_mutex_unlock_iothread();
error_report("Gluster AIO completion failed: %s", strerror(errno));
abort();
}
}
@@ -467,7 +445,6 @@ static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs,
offset = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
s->qemu_aio_count++;
acb = qemu_aio_get(&gluster_aiocb_info, bs, cb, opaque);
acb->size = size;
@@ -488,7 +465,6 @@ static BlockDriverAIOCB *qemu_gluster_aio_rw(BlockDriverState *bs,
return &acb->common;
out:
s->qemu_aio_count--;
qemu_aio_release(acb);
return NULL;
}
@@ -531,7 +507,6 @@ static BlockDriverAIOCB *qemu_gluster_aio_flush(BlockDriverState *bs,
acb->size = 0;
acb->ret = 0;
acb->finished = NULL;
s->qemu_aio_count++;
ret = glfs_fsync_async(s->fd, &gluster_finish_aiocb, acb);
if (ret < 0) {
@@ -540,7 +515,6 @@ static BlockDriverAIOCB *qemu_gluster_aio_flush(BlockDriverState *bs,
return &acb->common;
out:
s->qemu_aio_count--;
qemu_aio_release(acb);
return NULL;
}
@@ -563,7 +537,6 @@ static BlockDriverAIOCB *qemu_gluster_aio_discard(BlockDriverState *bs,
acb->size = 0;
acb->ret = 0;
acb->finished = NULL;
s->qemu_aio_count++;
ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
if (ret < 0) {
@@ -572,7 +545,6 @@ static BlockDriverAIOCB *qemu_gluster_aio_discard(BlockDriverState *bs,
return &acb->common;
out:
s->qemu_aio_count--;
qemu_aio_release(acb);
return NULL;
}
@@ -611,7 +583,7 @@ static void qemu_gluster_close(BlockDriverState *bs)
close(s->fds[GLUSTER_FD_READ]);
close(s->fds[GLUSTER_FD_WRITE]);
qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL, NULL);
qemu_aio_set_fd_handler(s->fds[GLUSTER_FD_READ], NULL, NULL, NULL);
if (s->fd) {
glfs_close(s->fd);
@@ -639,6 +611,7 @@ static BlockDriver bdrv_gluster = {
.format_name = "gluster",
.protocol_name = "gluster",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
@@ -659,6 +632,7 @@ static BlockDriver bdrv_gluster_tcp = {
.format_name = "gluster",
.protocol_name = "gluster+tcp",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
@@ -679,6 +653,7 @@ static BlockDriver bdrv_gluster_unix = {
.format_name = "gluster",
.protocol_name = "gluster+unix",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
@@ -699,6 +674,7 @@ static BlockDriver bdrv_gluster_rdma = {
.format_name = "gluster",
.protocol_name = "gluster+rdma",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,

@@ -33,6 +33,8 @@
#include "trace.h"
#include "block/scsi.h"
#include "qemu/iov.h"
#include "sysemu/sysemu.h"
#include "qmp-commands.h"
#include <iscsi/iscsi.h>
#include <iscsi/scsi-lowlevel.h>
@@ -50,8 +52,21 @@ typedef struct IscsiLun {
uint64_t num_blocks;
int events;
QEMUTimer *nop_timer;
uint8_t lbpme;
uint8_t lbprz;
struct scsi_inquiry_logical_block_provisioning lbp;
struct scsi_inquiry_block_limits bl;
} IscsiLun;
typedef struct IscsiTask {
int status;
int complete;
int retries;
int do_retry;
struct scsi_task *task;
Coroutine *co;
} IscsiTask;
typedef struct IscsiAIOCB {
BlockDriverAIOCB common;
QEMUIOVector *qiov;
@@ -72,6 +87,7 @@ typedef struct IscsiAIOCB {
#define NOP_INTERVAL 5000
#define MAX_NOP_FAILURES 3
#define ISCSI_CMD_RETRIES 5
#define ISCSI_MAX_UNMAP 131072
static void
iscsi_bh_cb(void *p)
@@ -105,6 +121,41 @@ iscsi_schedule_bh(IscsiAIOCB *acb)
qemu_bh_schedule(acb->bh);
}
static void
iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
void *command_data, void *opaque)
{
struct IscsiTask *iTask = opaque;
struct scsi_task *task = command_data;
iTask->complete = 1;
iTask->status = status;
iTask->do_retry = 0;
iTask->task = task;
if (iTask->retries-- > 0 && status == SCSI_STATUS_CHECK_CONDITION
&& task->sense.key == SCSI_SENSE_UNIT_ATTENTION) {
iTask->do_retry = 1;
goto out;
}
if (status != SCSI_STATUS_GOOD) {
error_report("iSCSI: Failure. %s", iscsi_get_error(iscsi));
}
out:
if (iTask->co) {
qemu_coroutine_enter(iTask->co, NULL);
}
}
static void iscsi_co_init_iscsitask(IscsiLun *iscsilun, struct IscsiTask *iTask)
{
*iTask = (struct IscsiTask) {
.co = qemu_coroutine_self(),
.retries = ISCSI_CMD_RETRIES,
};
}
static void
iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
@@ -146,13 +197,6 @@ static const AIOCBInfo iscsi_aiocb_info = {
static void iscsi_process_read(void *arg);
static void iscsi_process_write(void *arg);
static int iscsi_process_flush(void *arg)
{
IscsiLun *iscsilun = arg;
return iscsi_queue_length(iscsilun->iscsi) > 0;
}
static void
iscsi_set_events(IscsiLun *iscsilun)
{
@@ -166,7 +210,6 @@ iscsi_set_events(IscsiLun *iscsilun)
qemu_aio_set_fd_handler(iscsi_get_fd(iscsi),
iscsi_process_read,
(ev & POLLOUT) ? iscsi_process_write : NULL,
iscsi_process_flush,
iscsilun);
}
@@ -576,88 +619,6 @@ iscsi_aio_flush(BlockDriverState *bs,
return &acb->common;
}
static int iscsi_aio_discard_acb(IscsiAIOCB *acb);
static void
iscsi_unmap_cb(struct iscsi_context *iscsi, int status,
void *command_data, void *opaque)
{
IscsiAIOCB *acb = opaque;
if (acb->canceled != 0) {
return;
}
acb->status = 0;
if (status != 0) {
if (status == SCSI_STATUS_CHECK_CONDITION
&& acb->task->sense.key == SCSI_SENSE_UNIT_ATTENTION
&& acb->retries-- > 0) {
scsi_free_scsi_task(acb->task);
acb->task = NULL;
if (iscsi_aio_discard_acb(acb) == 0) {
iscsi_set_events(acb->iscsilun);
return;
}
}
error_report("Failed to unmap data on iSCSI lun. %s",
iscsi_get_error(iscsi));
acb->status = -EIO;
}
iscsi_schedule_bh(acb);
}
static int iscsi_aio_discard_acb(IscsiAIOCB *acb) {
struct iscsi_context *iscsi = acb->iscsilun->iscsi;
struct unmap_list list[1];
acb->canceled = 0;
acb->bh = NULL;
acb->status = -EINPROGRESS;
acb->buf = NULL;
list[0].lba = sector_qemu2lun(acb->sector_num, acb->iscsilun);
list[0].num = acb->nb_sectors * BDRV_SECTOR_SIZE / acb->iscsilun->block_size;
acb->task = iscsi_unmap_task(iscsi, acb->iscsilun->lun,
0, 0, &list[0], 1,
iscsi_unmap_cb,
acb);
if (acb->task == NULL) {
error_report("iSCSI: Failed to send unmap command. %s",
iscsi_get_error(iscsi));
return -1;
}
return 0;
}
static BlockDriverAIOCB *
iscsi_aio_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
IscsiLun *iscsilun = bs->opaque;
IscsiAIOCB *acb;
acb = qemu_aio_get(&iscsi_aiocb_info, bs, cb, opaque);
acb->iscsilun = iscsilun;
acb->nb_sectors = nb_sectors;
acb->sector_num = sector_num;
acb->retries = ISCSI_CMD_RETRIES;
if (iscsi_aio_discard_acb(acb) != 0) {
qemu_aio_release(acb);
return NULL;
}
iscsi_set_events(iscsilun);
return &acb->common;
}
#ifdef __linux__
static void
iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
@@ -850,6 +811,171 @@ iscsi_getlength(BlockDriverState *bs)
return len;
}
#if defined(LIBISCSI_FEATURE_IOVECTOR)
static int64_t coroutine_fn iscsi_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
{
IscsiLun *iscsilun = bs->opaque;
struct scsi_get_lba_status *lbas = NULL;
struct scsi_lba_status_descriptor *lbasd = NULL;
struct IscsiTask iTask;
int64_t ret;
iscsi_co_init_iscsitask(iscsilun, &iTask);
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
ret = -EINVAL;
goto out;
}
/* default to all sectors allocated */
ret = BDRV_BLOCK_DATA;
ret |= (sector_num << BDRV_SECTOR_BITS) | BDRV_BLOCK_OFFSET_VALID;
*pnum = nb_sectors;
/* LUN does not support logical block provisioning */
if (iscsilun->lbpme == 0) {
goto out;
}
retry:
if (iscsi_get_lba_status_task(iscsilun->iscsi, iscsilun->lun,
sector_qemu2lun(sector_num, iscsilun),
8 + 16, iscsi_co_generic_cb,
&iTask) == NULL) {
ret = -EIO;
goto out;
}
while (!iTask.complete) {
iscsi_set_events(iscsilun);
qemu_coroutine_yield();
}
if (iTask.do_retry) {
if (iTask.task != NULL) {
scsi_free_scsi_task(iTask.task);
iTask.task = NULL;
}
goto retry;
}
if (iTask.status != SCSI_STATUS_GOOD) {
/* in case the get_lba_status_callout fails (i.e.
* because the device is busy or the cmd is not
* supported) we pretend all blocks are allocated
* for backwards compatibility */
goto out;
}
lbas = scsi_datain_unmarshall(iTask.task);
if (lbas == NULL) {
ret = -EIO;
goto out;
}
lbasd = &lbas->descriptors[0];
if (sector_qemu2lun(sector_num, iscsilun) != lbasd->lba) {
ret = -EIO;
goto out;
}
*pnum = sector_lun2qemu(lbasd->num_blocks, iscsilun);
if (*pnum > nb_sectors) {
*pnum = nb_sectors;
}
if (lbasd->provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED ||
lbasd->provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) {
ret &= ~BDRV_BLOCK_DATA;
if (iscsilun->lbprz) {
ret |= BDRV_BLOCK_ZERO;
}
}
out:
if (iTask.task != NULL) {
scsi_free_scsi_task(iTask.task);
}
return ret;
}
#endif /* LIBISCSI_FEATURE_IOVECTOR */
static int
coroutine_fn iscsi_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
IscsiLun *iscsilun = bs->opaque;
struct IscsiTask iTask;
struct unmap_list list;
uint32_t nb_blocks;
uint32_t max_unmap;
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
return -EINVAL;
}
if (!iscsilun->lbp.lbpu) {
/* UNMAP is not supported by the target */
return 0;
}
list.lba = sector_qemu2lun(sector_num, iscsilun);
nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
max_unmap = iscsilun->bl.max_unmap;
if (max_unmap == 0xffffffff) {
max_unmap = ISCSI_MAX_UNMAP;
}
while (nb_blocks > 0) {
iscsi_co_init_iscsitask(iscsilun, &iTask);
list.num = nb_blocks;
if (list.num > max_unmap) {
list.num = max_unmap;
}
retry:
if (iscsi_unmap_task(iscsilun->iscsi, iscsilun->lun, 0, 0, &list, 1,
iscsi_co_generic_cb, &iTask) == NULL) {
return -EIO;
}
while (!iTask.complete) {
iscsi_set_events(iscsilun);
qemu_coroutine_yield();
}
if (iTask.task != NULL) {
scsi_free_scsi_task(iTask.task);
iTask.task = NULL;
}
if (iTask.do_retry) {
goto retry;
}
if (iTask.status == SCSI_STATUS_CHECK_CONDITION) {
/* the target might fail with a check condition if it
is not happy with the alignment of the UNMAP request
we silently fail in this case */
return 0;
}
if (iTask.status != SCSI_STATUS_GOOD) {
return -EIO;
}
list.lba += list.num;
nb_blocks -= list.num;
}
return 0;
}
static int parse_chap(struct iscsi_context *iscsi, const char *target)
{
QemuOptsList *list;
@@ -930,8 +1056,9 @@ static char *parse_initiator_name(const char *target)
{
QemuOptsList *list;
QemuOpts *opts;
const char *name = NULL;
const char *iscsi_name = qemu_get_vm_name();
const char *name;
char *iscsi_name;
UuidInfo *uuid_info;
list = qemu_find_opts("iscsi");
if (list) {
@@ -941,16 +1068,22 @@ static char *parse_initiator_name(const char *target)
}
if (opts) {
name = qemu_opt_get(opts, "initiator-name");
if (name) {
return g_strdup(name);
}
}
}
if (name) {
return g_strdup(name);
uuid_info = qmp_query_uuid(NULL);
if (strcmp(uuid_info->UUID, UUID_NONE) == 0) {
name = qemu_get_vm_name();
} else {
return g_strdup_printf("iqn.2008-11.org.linux-kvm%s%s",
iscsi_name ? ":" : "",
iscsi_name ? iscsi_name : "");
name = uuid_info->UUID;
}
iscsi_name = g_strdup_printf("iqn.2008-11.org.linux-kvm%s%s",
name ? ":" : "", name ? name : "");
qapi_free_UuidInfo(uuid_info);
return iscsi_name;
}
#if defined(LIBISCSI_FEATURE_NOP_COUNTER)
@@ -968,7 +1101,7 @@ static void iscsi_nop_timed_event(void *opaque)
return;
}
qemu_mod_timer(iscsilun->nop_timer, qemu_get_clock_ms(rt_clock) + NOP_INTERVAL);
timer_mod(iscsilun->nop_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NOP_INTERVAL);
iscsi_set_events(iscsilun);
}
#endif
@@ -998,6 +1131,8 @@ static int iscsi_readcapacity_sync(IscsiLun *iscsilun)
} else {
iscsilun->block_size = rc16->block_length;
iscsilun->num_blocks = rc16->returned_lba + 1;
iscsilun->lbpme = rc16->lbpme;
iscsilun->lbprz = rc16->lbprz;
}
}
break;
@@ -1050,11 +1185,43 @@ static QemuOptsList runtime_opts = {
},
};
static struct scsi_task *iscsi_do_inquiry(struct iscsi_context *iscsi,
int lun, int evpd, int pc) {
int full_size;
struct scsi_task *task = NULL;
task = iscsi_inquiry_sync(iscsi, lun, evpd, pc, 64);
if (task == NULL || task->status != SCSI_STATUS_GOOD) {
goto fail;
}
full_size = scsi_datain_getfullsize(task);
if (full_size > task->datain.size) {
scsi_free_scsi_task(task);
/* we need more data for the full list */
task = iscsi_inquiry_sync(iscsi, lun, evpd, pc, full_size);
if (task == NULL || task->status != SCSI_STATUS_GOOD) {
goto fail;
}
}
return task;
fail:
error_report("iSCSI: Inquiry command failed : %s",
iscsi_get_error(iscsi));
if (task) {
scsi_free_scsi_task(task);
return NULL;
}
return NULL;
}
/*
* We support iscsi url's on the form
* iscsi://[<username>%<password>@]<host>[:<port>]/<targetname>/<lun>
*/
static int iscsi_open(BlockDriverState *bs, QDict *options, int flags)
static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
IscsiLun *iscsilun = bs->opaque;
struct iscsi_context *iscsi = NULL;
@@ -1179,10 +1346,50 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags)
bs->sg = 1;
}
if (iscsilun->lbpme) {
struct scsi_inquiry_logical_block_provisioning *inq_lbp;
task = iscsi_do_inquiry(iscsilun->iscsi, iscsilun->lun, 1,
SCSI_INQUIRY_PAGECODE_LOGICAL_BLOCK_PROVISIONING);
if (task == NULL) {
ret = -EINVAL;
goto out;
}
inq_lbp = scsi_datain_unmarshall(task);
if (inq_lbp == NULL) {
error_report("iSCSI: failed to unmarshall inquiry datain blob");
ret = -EINVAL;
goto out;
}
memcpy(&iscsilun->lbp, inq_lbp,
sizeof(struct scsi_inquiry_logical_block_provisioning));
scsi_free_scsi_task(task);
task = NULL;
}
if (iscsilun->lbp.lbpu || iscsilun->lbp.lbpws) {
struct scsi_inquiry_block_limits *inq_bl;
task = iscsi_do_inquiry(iscsilun->iscsi, iscsilun->lun, 1,
SCSI_INQUIRY_PAGECODE_BLOCK_LIMITS);
if (task == NULL) {
ret = -EINVAL;
goto out;
}
inq_bl = scsi_datain_unmarshall(task);
if (inq_bl == NULL) {
error_report("iSCSI: failed to unmarshall inquiry datain blob");
ret = -EINVAL;
goto out;
}
memcpy(&iscsilun->bl, inq_bl,
sizeof(struct scsi_inquiry_block_limits));
scsi_free_scsi_task(task);
task = NULL;
}
#if defined(LIBISCSI_FEATURE_NOP_COUNTER)
/* Set up a timer for sending out iSCSI NOPs */
iscsilun->nop_timer = qemu_new_timer_ms(rt_clock, iscsi_nop_timed_event, iscsilun);
qemu_mod_timer(iscsilun->nop_timer, qemu_get_clock_ms(rt_clock) + NOP_INTERVAL);
iscsilun->nop_timer = timer_new_ms(QEMU_CLOCK_REALTIME, iscsi_nop_timed_event, iscsilun);
timer_mod(iscsilun->nop_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NOP_INTERVAL);
#endif
out:
@@ -1212,10 +1419,10 @@ static void iscsi_close(BlockDriverState *bs)
struct iscsi_context *iscsi = iscsilun->iscsi;
if (iscsilun->nop_timer) {
qemu_del_timer(iscsilun->nop_timer);
qemu_free_timer(iscsilun->nop_timer);
timer_del(iscsilun->nop_timer);
timer_free(iscsilun->nop_timer);
}
qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL, NULL);
qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), NULL, NULL, NULL);
iscsi_destroy_context(iscsi);
memset(iscsilun, 0, sizeof(IscsiLun));
}
@@ -1245,15 +1452,16 @@ static int iscsi_has_zero_init(BlockDriverState *bs)
return 0;
}
static int iscsi_create(const char *filename, QEMUOptionParameter *options)
static int iscsi_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int ret = 0;
int64_t total_size = 0;
BlockDriverState bs;
BlockDriverState *bs;
IscsiLun *iscsilun = NULL;
QDict *bs_options;
memset(&bs, 0, sizeof(BlockDriverState));
bs = bdrv_new("");
/* Read out options */
while (options && options->name) {
@@ -1263,26 +1471,26 @@ static int iscsi_create(const char *filename, QEMUOptionParameter *options)
options++;
}
bs.opaque = g_malloc0(sizeof(struct IscsiLun));
iscsilun = bs.opaque;
bs->opaque = g_malloc0(sizeof(struct IscsiLun));
iscsilun = bs->opaque;
bs_options = qdict_new();
qdict_put(bs_options, "filename", qstring_from_str(filename));
ret = iscsi_open(&bs, bs_options, 0);
ret = iscsi_open(bs, bs_options, 0, NULL);
QDECREF(bs_options);
if (ret != 0) {
goto out;
}
if (iscsilun->nop_timer) {
qemu_del_timer(iscsilun->nop_timer);
qemu_free_timer(iscsilun->nop_timer);
timer_del(iscsilun->nop_timer);
timer_free(iscsilun->nop_timer);
}
if (iscsilun->type != TYPE_DISK) {
ret = -ENODEV;
goto out;
}
if (bs.total_sectors < total_size) {
if (bs->total_sectors < total_size) {
ret = -ENOSPC;
goto out;
}
@@ -1292,7 +1500,9 @@ out:
if (iscsilun->iscsi != NULL) {
iscsi_destroy_context(iscsilun->iscsi);
}
g_free(bs.opaque);
g_free(bs->opaque);
bs->opaque = NULL;
bdrv_unref(bs);
return ret;
}
@@ -1310,6 +1520,7 @@ static BlockDriver bdrv_iscsi = {
.protocol_name = "iscsi",
.instance_size = sizeof(IscsiLun),
.bdrv_needs_filename = true,
.bdrv_file_open = iscsi_open,
.bdrv_close = iscsi_close,
.bdrv_create = iscsi_create,
@@ -1318,11 +1529,15 @@ static BlockDriver bdrv_iscsi = {
.bdrv_getlength = iscsi_getlength,
.bdrv_truncate = iscsi_truncate,
#if defined(LIBISCSI_FEATURE_IOVECTOR)
.bdrv_co_get_block_status = iscsi_co_get_block_status,
#endif
.bdrv_co_discard = iscsi_co_discard,
.bdrv_aio_readv = iscsi_aio_readv,
.bdrv_aio_writev = iscsi_aio_writev,
.bdrv_aio_flush = iscsi_aio_flush,
.bdrv_aio_discard = iscsi_aio_discard,
.bdrv_has_zero_init = iscsi_has_zero_init,
#ifdef __linux__
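
The new `iscsi_co_discard` in the hunks above splits a discard into UNMAP requests no larger than the target's advertised limit. The loop can be sketched on its own, as an illustration under assumptions rather than the driver code: `unmap_fn`, `unmap_in_chunks`, `unmap_stats` and `count_unmap` are hypothetical names, and the callback stands in for whatever actually sends the request. Unlike the diff, this sketch rejects a limit of zero up front.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback: issue one unmap for blocks [lba, lba + num).
 * Returns 0 on success, non-zero on failure. */
typedef int (*unmap_fn)(uint64_t lba, uint32_t num, void *opaque);

/* Split [lba, lba + nb_blocks) into requests of at most max_unmap blocks.
 * Returns 0 on success or the first non-zero callback result. */
static int unmap_in_chunks(uint64_t lba, uint32_t nb_blocks,
                           uint32_t max_unmap, unmap_fn fn, void *opaque)
{
    assert(max_unmap > 0); /* a zero limit would never make progress */

    while (nb_blocks > 0) {
        uint32_t num = nb_blocks < max_unmap ? nb_blocks : max_unmap;
        int ret = fn(lba, num, opaque);

        if (ret != 0) {
            return ret;
        }
        lba += num;
        nb_blocks -= num;
    }
    return 0;
}

/* Example callback that only counts requests and blocks. */
struct unmap_stats {
    unsigned requests;
    uint64_t blocks;
};

static int count_unmap(uint64_t lba, uint32_t num, void *opaque)
{
    struct unmap_stats *st = opaque;

    (void)lba;
    st->requests++;
    st->blocks += num;
    return 0;
}
```

With a limit of 131072 blocks (the value of `ISCSI_MAX_UNMAP` in the diff), a 300000-block discard becomes three requests.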

@@ -39,7 +39,6 @@ struct qemu_laiocb {
struct qemu_laio_state {
io_context_t ctx;
EventNotifier e;
int count;
};
static inline ssize_t io_event_ret(struct io_event *ev)
@@ -55,8 +54,6 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
{
int ret;
s->count--;
ret = laiocb->ret;
if (ret != -ECANCELED) {
if (ret == laiocb->nbytes) {
@@ -101,13 +98,6 @@ static void qemu_laio_completion_cb(EventNotifier *e)
}
}
static int qemu_laio_flush_cb(EventNotifier *e)
{
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
return (s->count > 0) ? 1 : 0;
}
static void laio_cancel(BlockDriverAIOCB *blockacb)
{
struct qemu_laiocb *laiocb = (struct qemu_laiocb *)blockacb;
@@ -177,14 +167,11 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
goto out_free_aiocb;
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
s->count++;
if (io_submit(s->ctx, 1, &iocbs) < 0)
goto out_dec_count;
goto out_free_aiocb;
return &laiocb->common;
out_dec_count:
s->count--;
out_free_aiocb:
qemu_aio_release(laiocb);
return NULL;
@@ -203,8 +190,7 @@ void *laio_init(void)
goto out_close_efd;
}
qemu_aio_set_event_notifier(&s->e, qemu_laio_completion_cb,
qemu_laio_flush_cb);
qemu_aio_set_event_notifier(&s->e, qemu_laio_completion_cb);
return s;

@@ -338,8 +338,8 @@ static void coroutine_fn mirror_run(void *opaque)
base = s->mode == MIRROR_SYNC_MODE_FULL ? NULL : bs->backing_hd;
for (sector_num = 0; sector_num < end; ) {
int64_t next = (sector_num | (sectors_per_chunk - 1)) + 1;
ret = bdrv_co_is_allocated_above(bs, base,
sector_num, next - sector_num, &n);
ret = bdrv_is_allocated_above(bs, base,
sector_num, next - sector_num, &n);
if (ret < 0) {
goto immediate_exit;
@@ -356,7 +356,7 @@ static void coroutine_fn mirror_run(void *opaque)
}
bdrv_dirty_iter_init(bs, &s->hbi);
last_pause_ns = qemu_get_clock_ns(rt_clock);
last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
for (;;) {
uint64_t delay_ns;
int64_t cnt;
@@ -374,7 +374,7 @@ static void coroutine_fn mirror_run(void *opaque)
* We do so every SLICE_TIME nanoseconds, or when there is an error,
* or when the source is clean, whichever comes first.
*/
if (qemu_get_clock_ns(rt_clock) - last_pause_ns < SLICE_TIME &&
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - last_pause_ns < SLICE_TIME &&
s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
if (s->in_flight == MAX_IN_FLIGHT || s->buf_free_count == 0 ||
(cnt == 0 && s->in_flight > 0)) {
@@ -439,13 +439,13 @@ static void coroutine_fn mirror_run(void *opaque)
delay_ns = 0;
}
block_job_sleep_ns(&s->common, rt_clock, delay_ns);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
} else if (!should_complete) {
delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
block_job_sleep_ns(&s->common, rt_clock, delay_ns);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
} else if (cnt == 0) {
/* The two disks are in sync. Exit and report successful
* completion.
@@ -454,7 +454,7 @@ static void coroutine_fn mirror_run(void *opaque)
s->common.cancelled = false;
break;
}
last_pause_ns = qemu_get_clock_ns(rt_clock);
last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
}
immediate_exit:
@@ -480,7 +480,7 @@ immediate_exit:
bdrv_swap(s->target, s->common.bs);
}
bdrv_close(s->target);
bdrv_delete(s->target);
bdrv_unref(s->target);
block_job_completed(&s->common, ret);
}
@@ -505,14 +505,15 @@ static void mirror_iostatus_reset(BlockJob *job)
static void mirror_complete(BlockJob *job, Error **errp)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
Error *local_err = NULL;
int ret;
ret = bdrv_open_backing_file(s->target, NULL);
ret = bdrv_open_backing_file(s->target, NULL, &local_err);
if (ret < 0) {
char backing_filename[PATH_MAX];
bdrv_get_full_backing_filename(s->target, backing_filename,
sizeof(backing_filename));
error_setg_file_open(errp, -ret, backing_filename);
error_propagate(errp, local_err);
return;
}
if (!s->synced) {
@@ -524,9 +525,9 @@ static void mirror_complete(BlockJob *job, Error **errp)
block_job_resume(job);
}
static const BlockJobType mirror_job_type = {
static const BlockJobDriver mirror_job_driver = {
.instance_size = sizeof(MirrorBlockJob),
.job_type = "mirror",
.job_type = BLOCK_JOB_TYPE_MIRROR,
.set_speed = mirror_set_speed,
.iostatus_reset= mirror_iostatus_reset,
.complete = mirror_complete,
@@ -562,7 +563,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
return;
}
s = block_job_create(&mirror_job_type, bs, speed, cb, opaque, errp);
s = block_job_create(&mirror_job_driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
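
The scan loop in `mirror_run` above advances to the next chunk boundary with `(sector_num | (sectors_per_chunk - 1)) + 1`. A small sketch of that arithmetic follows. It assumes the chunk size is a power of two, which the bit trick needs; `next_chunk_boundary` is a hypothetical helper name, not a function from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Return the first sector of the chunk following the one that contains
 * sector_num. Works only when sectors_per_chunk is a power of two:
 * OR-ing with (size - 1) moves to the last sector of the current chunk,
 * and adding 1 steps onto the next boundary. */
static int64_t next_chunk_boundary(int64_t sector_num,
                                   int64_t sectors_per_chunk)
{
    assert(sectors_per_chunk > 0 &&
           (sectors_per_chunk & (sectors_per_chunk - 1)) == 0);

    return (sector_num | (sectors_per_chunk - 1)) + 1;
}
```

Note that a sector already sitting on a boundary maps to the following boundary, so the loop always makes progress.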

@@ -279,13 +279,6 @@ static void nbd_coroutine_start(BDRVNBDState *s, struct nbd_request *request)
request->handle = INDEX_TO_HANDLE(s, i);
}
static int nbd_have_request(void *opaque)
{
BDRVNBDState *s = opaque;
return s->in_flight > 0;
}
static void nbd_reply_ready(void *opaque)
{
BDRVNBDState *s = opaque;
@@ -341,8 +334,7 @@ static int nbd_co_send_request(BDRVNBDState *s, struct nbd_request *request,
qemu_co_mutex_lock(&s->send_mutex);
s->send_coroutine = qemu_coroutine_self();
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, nbd_restart_write,
nbd_have_request, s);
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, nbd_restart_write, s);
if (qiov) {
if (!s->is_unix) {
socket_set_cork(s->sock, 1);
@@ -361,8 +353,7 @@ static int nbd_co_send_request(BDRVNBDState *s, struct nbd_request *request,
} else {
rc = nbd_send_request(s->sock, request);
}
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL,
nbd_have_request, s);
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL, s);
s->send_coroutine = NULL;
qemu_co_mutex_unlock(&s->send_mutex);
return rc;
@@ -438,8 +429,7 @@ static int nbd_establish_connection(BlockDriverState *bs)
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
qemu_set_nonblock(sock);
qemu_aio_set_fd_handler(sock, nbd_reply_ready, NULL,
nbd_have_request, s);
qemu_aio_set_fd_handler(sock, nbd_reply_ready, NULL, s);
s->sock = sock;
s->size = size;
@@ -459,11 +449,12 @@ static void nbd_teardown_connection(BlockDriverState *bs)
request.len = 0;
nbd_send_request(s->sock, &request);
qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL, NULL);
qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL);
closesocket(s->sock);
}
static int nbd_open(BlockDriverState *bs, QDict *options, int flags)
static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVNBDState *s = bs->opaque;
int result;

@@ -68,7 +68,8 @@ static int parallels_probe(const uint8_t *buf, int buf_size, const char *filenam
return 0;
}
static int parallels_open(BlockDriverState *bs, QDict *options, int flags)
static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVParallelsState *s = bs->opaque;
int i;

@@ -25,6 +25,9 @@
#include "block/qapi.h"
#include "block/block_int.h"
#include "qmp-commands.h"
#include "qapi-visit.h"
#include "qapi/qmp-output-visitor.h"
#include "qapi/qmp/types.h"
/*
* Returns 0 on success, with *p_list either set to describe snapshot
@@ -134,6 +137,9 @@ void bdrv_query_image_info(BlockDriverState *bs,
info->dirty_flag = bdi.is_dirty;
info->has_dirty_flag = true;
}
info->format_specific = bdrv_get_specific_info(bs);
info->has_format_specific = info->format_specific != NULL;
backing_filename = bs->backing_file;
if (backing_filename[0] != '\0') {
info->backing_filename = g_strdup(backing_filename);
@@ -223,18 +229,44 @@ void bdrv_query_info(BlockDriverState *bs,
info->inserted->backing_file_depth = bdrv_get_backing_file_depth(bs);
if (bs->io_limits_enabled) {
info->inserted->bps =
bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL];
info->inserted->bps_rd =
bs->io_limits.bps[BLOCK_IO_LIMIT_READ];
info->inserted->bps_wr =
bs->io_limits.bps[BLOCK_IO_LIMIT_WRITE];
info->inserted->iops =
bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL];
info->inserted->iops_rd =
bs->io_limits.iops[BLOCK_IO_LIMIT_READ];
info->inserted->iops_wr =
bs->io_limits.iops[BLOCK_IO_LIMIT_WRITE];
ThrottleConfig cfg;
throttle_get_config(&bs->throttle_state, &cfg);
info->inserted->bps = cfg.buckets[THROTTLE_BPS_TOTAL].avg;
info->inserted->bps_rd = cfg.buckets[THROTTLE_BPS_READ].avg;
info->inserted->bps_wr = cfg.buckets[THROTTLE_BPS_WRITE].avg;
info->inserted->iops = cfg.buckets[THROTTLE_OPS_TOTAL].avg;
info->inserted->iops_rd = cfg.buckets[THROTTLE_OPS_READ].avg;
info->inserted->iops_wr = cfg.buckets[THROTTLE_OPS_WRITE].avg;
info->inserted->has_bps_max =
cfg.buckets[THROTTLE_BPS_TOTAL].max;
info->inserted->bps_max =
cfg.buckets[THROTTLE_BPS_TOTAL].max;
info->inserted->has_bps_rd_max =
cfg.buckets[THROTTLE_BPS_READ].max;
info->inserted->bps_rd_max =
cfg.buckets[THROTTLE_BPS_READ].max;
info->inserted->has_bps_wr_max =
cfg.buckets[THROTTLE_BPS_WRITE].max;
info->inserted->bps_wr_max =
cfg.buckets[THROTTLE_BPS_WRITE].max;
info->inserted->has_iops_max =
cfg.buckets[THROTTLE_OPS_TOTAL].max;
info->inserted->iops_max =
cfg.buckets[THROTTLE_OPS_TOTAL].max;
info->inserted->has_iops_rd_max =
cfg.buckets[THROTTLE_OPS_READ].max;
info->inserted->iops_rd_max =
cfg.buckets[THROTTLE_OPS_READ].max;
info->inserted->has_iops_wr_max =
cfg.buckets[THROTTLE_OPS_WRITE].max;
info->inserted->iops_wr_max =
cfg.buckets[THROTTLE_OPS_WRITE].max;
info->inserted->has_iops_size = cfg.op_size;
info->inserted->iops_size = cfg.op_size;
}
bs0 = bs;
@@ -397,6 +429,119 @@ void bdrv_snapshot_dump(fprintf_function func_fprintf, void *f,
}
}
static void dump_qdict(fprintf_function func_fprintf, void *f, int indentation,
QDict *dict);
static void dump_qlist(fprintf_function func_fprintf, void *f, int indentation,
QList *list);
static void dump_qobject(fprintf_function func_fprintf, void *f,
int comp_indent, QObject *obj)
{
switch (qobject_type(obj)) {
case QTYPE_QINT: {
QInt *value = qobject_to_qint(obj);
func_fprintf(f, "%" PRId64, qint_get_int(value));
break;
}
case QTYPE_QSTRING: {
QString *value = qobject_to_qstring(obj);
func_fprintf(f, "%s", qstring_get_str(value));
break;
}
case QTYPE_QDICT: {
QDict *value = qobject_to_qdict(obj);
dump_qdict(func_fprintf, f, comp_indent, value);
break;
}
case QTYPE_QLIST: {
QList *value = qobject_to_qlist(obj);
dump_qlist(func_fprintf, f, comp_indent, value);
break;
}
case QTYPE_QFLOAT: {
QFloat *value = qobject_to_qfloat(obj);
func_fprintf(f, "%g", qfloat_get_double(value));
break;
}
case QTYPE_QBOOL: {
QBool *value = qobject_to_qbool(obj);
func_fprintf(f, "%s", qbool_get_int(value) ? "true" : "false");
break;
}
case QTYPE_QERROR: {
QString *value = qerror_human((QError *)obj);
func_fprintf(f, "%s", qstring_get_str(value));
break;
}
case QTYPE_NONE:
break;
case QTYPE_MAX:
default:
abort();
}
}
static void dump_qlist(fprintf_function func_fprintf, void *f, int indentation,
QList *list)
{
const QListEntry *entry;
int i = 0;
for (entry = qlist_first(list); entry; entry = qlist_next(entry), i++) {
qtype_code type = qobject_type(entry->value);
bool composite = (type == QTYPE_QDICT || type == QTYPE_QLIST);
const char *format = composite ? "%*s[%i]:\n" : "%*s[%i]: ";
func_fprintf(f, format, indentation * 4, "", i);
dump_qobject(func_fprintf, f, indentation + 1, entry->value);
if (!composite) {
func_fprintf(f, "\n");
}
}
}
static void dump_qdict(fprintf_function func_fprintf, void *f, int indentation,
QDict *dict)
{
const QDictEntry *entry;
for (entry = qdict_first(dict); entry; entry = qdict_next(dict, entry)) {
qtype_code type = qobject_type(entry->value);
bool composite = (type == QTYPE_QDICT || type == QTYPE_QLIST);
const char *format = composite ? "%*s%s:\n" : "%*s%s: ";
char key[strlen(entry->key) + 1];
int i;
/* replace dashes with spaces in key (variable) names */
for (i = 0; entry->key[i]; i++) {
key[i] = entry->key[i] == '-' ? ' ' : entry->key[i];
}
key[i] = 0;
func_fprintf(f, format, indentation * 4, "", key);
dump_qobject(func_fprintf, f, indentation + 1, entry->value);
if (!composite) {
func_fprintf(f, "\n");
}
}
}
void bdrv_image_info_specific_dump(fprintf_function func_fprintf, void *f,
ImageInfoSpecific *info_spec)
{
Error *local_err = NULL;
QmpOutputVisitor *ov = qmp_output_visitor_new();
QObject *obj, *data;
visit_type_ImageInfoSpecific(qmp_output_get_visitor(ov), &info_spec, NULL,
&local_err);
obj = qmp_output_get_qobject(ov);
assert(qobject_type(obj) == QTYPE_QDICT);
data = qdict_get(qobject_to_qdict(obj), "data");
dump_qobject(func_fprintf, f, 1, data);
qmp_output_visitor_cleanup(ov);
}
void bdrv_image_info_dump(fprintf_function func_fprintf, void *f,
ImageInfo *info)
{
@@ -467,4 +612,9 @@ void bdrv_image_info_dump(fprintf_function func_fprintf, void *f,
func_fprintf(f, "\n");
}
}
if (info->has_format_specific) {
func_fprintf(f, "Format specific information:\n");
bdrv_image_info_specific_dump(func_fprintf, f, info->format_specific);
}
}
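
The key handling in dump_qdict() above can be sketched in isolation. This is a minimal illustration, not QEMU code: `humanize_key` and `format_entry` are hypothetical helper names, the 64-byte scratch buffer is an arbitrary assumption, and output goes to a caller-supplied buffer instead of a `fprintf_function`. It shows the dash-to-space replacement and the `"%*s"` indentation trick, where an empty string is padded to `indentation * 4` columns.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy `key` into `out`, replacing every '-' with ' ', similar to how
 * dump_qdict() prepares key names for human-readable output.
 * `out_size` must be large enough for the key plus the terminating NUL.
 * Returns the number of characters written (excluding the NUL). */
static size_t humanize_key(const char *key, char *out, size_t out_size)
{
    size_t i;

    assert(out_size > strlen(key));
    for (i = 0; key[i]; i++) {
        out[i] = (key[i] == '-') ? ' ' : key[i];
    }
    out[i] = '\0';
    return i;
}

/* Format one "key: value" line with `indentation` levels of four spaces.
 * "%*s" with an empty string argument emits exactly `indentation * 4`
 * spaces. Returns the result of snprintf(). */
static int format_entry(char *buf, size_t buf_size, int indentation,
                        const char *key, const char *value)
{
    char human[64]; /* hypothetical upper bound on key length */

    humanize_key(key, human, sizeof(human));
    return snprintf(buf, buf_size, "%*s%s: %s\n",
                    indentation * 4, "", human, value);
}
```

Calling `format_entry(buf, sizeof(buf), 1, "lazy-refcounts", "false")` would fill `buf` with a line indented by four spaces whose key reads `lazy refcounts`. The key name here is only an example input.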


@@ -92,7 +92,8 @@ static int qcow_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static int qcow_open(BlockDriverState *bs, QDict *options, int flags)
static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVQcowState *s = bs->opaque;
int len, i, shift, ret;
@@ -395,7 +396,7 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
return cluster_offset;
}
static int coroutine_fn qcow_co_is_allocated(BlockDriverState *bs,
static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
{
BDRVQcowState *s = bs->opaque;
@@ -410,7 +411,14 @@ static int coroutine_fn qcow_co_is_allocated(BlockDriverState *bs,
if (n > nb_sectors)
n = nb_sectors;
*pnum = n;
return (cluster_offset != 0);
if (!cluster_offset) {
return 0;
}
if ((cluster_offset & QCOW_OFLAG_COMPRESSED) || s->crypt_method) {
return BDRV_BLOCK_DATA;
}
cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | cluster_offset;
}
static int decompress_buffer(uint8_t *out_buf, int out_buf_size,
@@ -651,7 +659,8 @@ static void qcow_close(BlockDriverState *bs)
error_free(s->migration_blocker);
}
static int qcow_create(const char *filename, QEMUOptionParameter *options)
static int qcow_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int header_size, backing_filename_len, l1_size, shift, i;
QCowHeader header;
@@ -659,6 +668,7 @@ static int qcow_create(const char *filename, QEMUOptionParameter *options)
int64_t total_size = 0;
const char *backing_file = NULL;
int flags = 0;
Error *local_err = NULL;
int ret;
BlockDriverState *qcow_bs;
@@ -674,13 +684,17 @@ static int qcow_create(const char *filename, QEMUOptionParameter *options)
options++;
}
ret = bdrv_create_file(filename, options);
ret = bdrv_create_file(filename, options, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
ret = bdrv_file_open(&qcow_bs, filename, NULL, BDRV_O_RDWR);
ret = bdrv_file_open(&qcow_bs, filename, NULL, BDRV_O_RDWR, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
@@ -751,7 +765,7 @@ static int qcow_create(const char *filename, QEMUOptionParameter *options)
g_free(tmp);
ret = 0;
exit:
bdrv_delete(qcow_bs);
bdrv_unref(qcow_bs);
return ret;
}
@@ -896,7 +910,7 @@ static BlockDriver bdrv_qcow = {
.bdrv_co_readv = qcow_co_readv,
.bdrv_co_writev = qcow_co_writev,
.bdrv_co_is_allocated = qcow_co_is_allocated,
.bdrv_co_get_block_status = qcow_co_get_block_status,
.bdrv_set_key = qcow_set_key,
.bdrv_make_empty = qcow_make_empty,
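
The return value built by qcow_co_get_block_status() in this file packs status flags into the low bits of a sector-aligned host offset. The sketch below illustrates that encoding on its own. The macro names and numeric values (`SECTOR_BITS`, `BLOCK_DATA`, `BLOCK_OFFSET_VALID`, `OFLAG_COMPRESSED`) are assumptions chosen for the example and are not copied from QEMU's headers, and `encode_block_status` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; the real definitions live in
 * QEMU's block layer headers. */
#define SECTOR_BITS         9
#define BLOCK_DATA          0x01
#define BLOCK_OFFSET_VALID  0x04
#define OFLAG_COMPRESSED    (1ULL << 63)

/* Mirror the decision made at the end of qcow_co_get_block_status():
 *  - no cluster allocated          -> 0
 *  - compressed or encrypted data  -> data present, but no usable offset
 *  - plain data                    -> data present, offset valid, with the
 *                                     host offset of the first sector
 *                                     OR-ed into the result */
static int64_t encode_block_status(uint64_t cluster_offset,
                                   int index_in_cluster, int encrypted)
{
    if (!cluster_offset) {
        return 0;
    }
    if ((cluster_offset & OFLAG_COMPRESSED) || encrypted) {
        return BLOCK_DATA;
    }
    cluster_offset |= ((uint64_t)index_in_cluster << SECTOR_BITS);
    return (int64_t)(BLOCK_DATA | BLOCK_OFFSET_VALID | cluster_offset);
}
```

Because the offset is a multiple of the sector size, its low nine bits are zero, which is what leaves room for the flag bits in the same integer.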


@@ -114,6 +114,21 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
return ret;
}
if (c == s->refcount_block_cache) {
ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_REFCOUNT_BLOCK,
c->entries[i].offset, s->cluster_size);
} else if (c == s->l2_table_cache) {
ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_ACTIVE_L2,
c->entries[i].offset, s->cluster_size);
} else {
ret = qcow2_pre_write_overlap_check(bs, 0,
c->entries[i].offset, s->cluster_size);
}
if (ret < 0) {
return ret;
}
if (c == s->refcount_block_cache) {
BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_UPDATE_PART);
} else if (c == s->l2_table_cache) {
@@ -185,6 +200,24 @@ void qcow2_cache_depends_on_flush(Qcow2Cache *c)
c->depends_on_flush = true;
}
int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
{
int ret, i;
ret = qcow2_cache_flush(bs, c);
if (ret < 0) {
return ret;
}
for (i = 0; i < c->size; i++) {
assert(c->entries[i].ref == 0);
c->entries[i].offset = 0;
c->entries[i].cache_hits = 0;
}
return 0;
}
static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
{
int i;


@@ -35,6 +35,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
BDRVQcowState *s = bs->opaque;
int new_l1_size2, ret, i;
uint64_t *new_l1_table;
int64_t old_l1_table_offset, old_l1_size;
int64_t new_l1_table_offset, new_l1_size;
uint8_t data[12];
@@ -80,6 +81,14 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
goto fail;
}
/* the L1 position has not yet been updated, so these clusters must
* indeed be completely free */
ret = qcow2_pre_write_overlap_check(bs, 0, new_l1_table_offset,
new_l1_size2);
if (ret < 0) {
goto fail;
}
BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
for(i = 0; i < s->l1_size; i++)
new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
@@ -92,17 +101,19 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
/* set new table */
BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ACTIVATE_TABLE);
cpu_to_be32w((uint32_t*)data, new_l1_size);
cpu_to_be64wu((uint64_t*)(data + 4), new_l1_table_offset);
stq_be_p(data + 4, new_l1_table_offset);
ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size), data,sizeof(data));
if (ret < 0) {
goto fail;
}
g_free(s->l1_table);
qcow2_free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t),
QCOW2_DISCARD_OTHER);
old_l1_table_offset = s->l1_table_offset;
s->l1_table_offset = new_l1_table_offset;
s->l1_table = new_l1_table;
old_l1_size = s->l1_size;
s->l1_size = new_l1_size;
qcow2_free_clusters(bs, old_l1_table_offset, old_l1_size * sizeof(uint64_t),
QCOW2_DISCARD_OTHER);
return 0;
fail:
g_free(new_l1_table);
@@ -137,7 +148,7 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
* and we really don't want bdrv_pread to perform a read-modify-write)
*/
#define L1_ENTRIES_PER_SECTOR (512 / 8)
static int write_l1_entry(BlockDriverState *bs, int l1_index)
int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
{
BDRVQcowState *s = bs->opaque;
uint64_t buf[L1_ENTRIES_PER_SECTOR];
@@ -149,6 +160,12 @@ static int write_l1_entry(BlockDriverState *bs, int l1_index)
buf[i] = cpu_to_be64(s->l1_table[l1_start_index + i]);
}
ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_ACTIVE_L1,
s->l1_table_offset + 8 * l1_start_index, sizeof(buf));
if (ret < 0) {
return ret;
}
BLKDBG_EVENT(bs->file, BLKDBG_L1_UPDATE);
ret = bdrv_pwrite_sync(bs->file, s->l1_table_offset + 8 * l1_start_index,
buf, sizeof(buf));
@@ -173,7 +190,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
{
BDRVQcowState *s = bs->opaque;
uint64_t old_l2_offset;
uint64_t *l2_table;
uint64_t *l2_table = NULL;
int64_t l2_offset;
int ret;
@@ -185,7 +202,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
l2_offset = qcow2_alloc_clusters(bs, s->l2_size * sizeof(uint64_t));
if (l2_offset < 0) {
return l2_offset;
ret = l2_offset;
goto fail;
}
ret = qcow2_cache_flush(bs, s->refcount_block_cache);
@@ -198,7 +216,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
trace_qcow2_l2_allocate_get_empty(bs, l1_index);
ret = qcow2_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table);
if (ret < 0) {
return ret;
goto fail;
}
l2_table = *table;
@@ -239,7 +257,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
/* update the L1 entry */
trace_qcow2_l2_allocate_write_l1(bs, l1_index);
s->l1_table[l1_index] = l2_offset | QCOW_OFLAG_COPIED;
ret = write_l1_entry(bs, l1_index);
ret = qcow2_write_l1_entry(bs, l1_index);
if (ret < 0) {
goto fail;
}
@@ -250,8 +268,14 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
fail:
trace_qcow2_l2_allocate_done(bs, l1_index, ret);
qcow2_cache_put(bs, s->l2_table_cache, (void**) table);
if (l2_table != NULL) {
qcow2_cache_put(bs, s->l2_table_cache, (void**) table);
}
s->l1_table[l1_index] = old_l2_offset;
if (l2_offset > 0) {
qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t),
QCOW2_DISCARD_ALWAYS);
}
return ret;
}
@@ -263,10 +287,10 @@ fail:
* cluster which may require a different handling)
*/
static int count_contiguous_clusters(uint64_t nb_clusters, int cluster_size,
uint64_t *l2_table, uint64_t start, uint64_t stop_flags)
uint64_t *l2_table, uint64_t stop_flags)
{
int i;
uint64_t mask = stop_flags | L2E_OFFSET_MASK | QCOW_OFLAG_COMPRESSED;
uint64_t mask = stop_flags | L2E_OFFSET_MASK | QCOW2_CLUSTER_COMPRESSED;
uint64_t first_entry = be64_to_cpu(l2_table[0]);
uint64_t offset = first_entry & mask;
@@ -275,14 +299,14 @@ static int count_contiguous_clusters(uint64_t nb_clusters, int cluster_size,
assert(qcow2_get_cluster_type(first_entry) != QCOW2_CLUSTER_COMPRESSED);
for (i = start; i < start + nb_clusters; i++) {
for (i = 0; i < nb_clusters; i++) {
uint64_t l2_entry = be64_to_cpu(l2_table[i]) & mask;
if (offset + (uint64_t) i * cluster_size != l2_entry) {
break;
}
}
return (i - start);
return i;
}
static int count_contiguous_free_clusters(uint64_t nb_clusters, uint64_t *l2_table)
@@ -371,6 +395,12 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
&s->aes_encrypt_key);
}
ret = qcow2_pre_write_overlap_check(bs, 0,
cluster_offset + n_start * BDRV_SECTOR_SIZE, n * BDRV_SECTOR_SIZE);
if (ret < 0) {
goto out;
}
BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
ret = bdrv_co_writev(bs->file, (cluster_offset >> 9) + n_start, n, &qiov);
if (ret < 0) {
@@ -469,8 +499,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
return -EIO;
}
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], 0,
QCOW_OFLAG_COMPRESSED | QCOW_OFLAG_ZERO);
&l2_table[l2_index], QCOW_OFLAG_ZERO);
*cluster_offset = 0;
break;
case QCOW2_CLUSTER_UNALLOCATED:
@@ -481,8 +510,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
case QCOW2_CLUSTER_NORMAL:
/* how many allocated clusters ? */
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], 0,
QCOW_OFLAG_COMPRESSED | QCOW_OFLAG_ZERO);
&l2_table[l2_index], QCOW_OFLAG_ZERO);
*cluster_offset &= L2E_OFFSET_MASK;
break;
default:
@@ -698,6 +726,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
}
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
assert(l2_index + m->nb_clusters <= s->l2_size);
for (i = 0; i < m->nb_clusters; i++) {
/* if two concurrent writes happen to the same unallocated cluster
* each write allocates separate cluster and writes data concurrently.
@@ -911,7 +940,7 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
/* We keep all QCOW_OFLAG_COPIED clusters */
keep_clusters =
count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], 0,
&l2_table[l2_index],
QCOW_OFLAG_COPIED | QCOW_OFLAG_ZERO);
assert(keep_clusters <= nb_clusters);
@@ -1320,7 +1349,7 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
* clusters.
*/
static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
unsigned int nb_clusters)
unsigned int nb_clusters, enum qcow2_discard_type type)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l2_table;
@@ -1349,7 +1378,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
l2_table[l2_index + i] = cpu_to_be64(0);
/* Then decrease the refcount */
qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
qcow2_free_any_clusters(bs, old_offset, 1, type);
}
ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
@@ -1361,7 +1390,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
}
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors)
int nb_sectors, enum qcow2_discard_type type)
{
BDRVQcowState *s = bs->opaque;
uint64_t end_offset;
@@ -1384,7 +1413,7 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
/* Each L2 table is handled by its own loop iteration */
while (nb_clusters > 0) {
ret = discard_single_l2(bs, offset, nb_clusters);
ret = discard_single_l2(bs, offset, nb_clusters, type);
if (ret < 0) {
goto fail;
}
@@ -1479,3 +1508,255 @@ fail:
return ret;
}
/*
* Expands all zero clusters in a specific L1 table (or deallocates them, for
* non-backed non-pre-allocated zero clusters).
*
* expanded_clusters is a bitmap where every bit corresponds to one cluster in
* the image file; a bit gets set if the corresponding cluster has been used for
* zero expansion (i.e., has been filled with zeroes and is referenced from an
* L2 table). nb_clusters contains the total cluster count of the image file,
* i.e., the number of bits in expanded_clusters.
*/
static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
int l1_size, uint8_t **expanded_clusters,
uint64_t *nb_clusters)
{
BDRVQcowState *s = bs->opaque;
bool is_active_l1 = (l1_table == s->l1_table);
uint64_t *l2_table = NULL;
int ret;
int i, j;
if (!is_active_l1) {
/* inactive L2 tables require a buffer to be stored in when loading
* them from disk */
l2_table = qemu_blockalign(bs, s->cluster_size);
}
for (i = 0; i < l1_size; i++) {
uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
bool l2_dirty = false;
if (!l2_offset) {
/* unallocated */
continue;
}
if (is_active_l1) {
/* get active L2 tables from cache */
ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
(void **)&l2_table);
} else {
/* load inactive L2 tables from disk */
ret = bdrv_read(bs->file, l2_offset / BDRV_SECTOR_SIZE,
(void *)l2_table, s->cluster_sectors);
}
if (ret < 0) {
goto fail;
}
for (j = 0; j < s->l2_size; j++) {
uint64_t l2_entry = be64_to_cpu(l2_table[j]);
int64_t offset = l2_entry & L2E_OFFSET_MASK, cluster_index;
int cluster_type = qcow2_get_cluster_type(l2_entry);
bool preallocated = offset != 0;
if (cluster_type == QCOW2_CLUSTER_NORMAL) {
cluster_index = offset >> s->cluster_bits;
assert((cluster_index >= 0) && (cluster_index < *nb_clusters));
if ((*expanded_clusters)[cluster_index / 8] &
(1 << (cluster_index % 8))) {
/* Probably a shared L2 table; this cluster was a zero
* cluster which has been expanded, its refcount
* therefore most likely requires an update. */
ret = qcow2_update_cluster_refcount(bs, cluster_index, 1,
QCOW2_DISCARD_NEVER);
if (ret < 0) {
goto fail;
}
/* Since we just increased the refcount, the COPIED flag may
* no longer be set. */
l2_table[j] = cpu_to_be64(l2_entry & ~QCOW_OFLAG_COPIED);
l2_dirty = true;
}
continue;
}
else if (qcow2_get_cluster_type(l2_entry) != QCOW2_CLUSTER_ZERO) {
continue;
}
if (!preallocated) {
if (!bs->backing_hd) {
/* not backed; therefore we can simply deallocate the
* cluster */
l2_table[j] = 0;
l2_dirty = true;
continue;
}
offset = qcow2_alloc_clusters(bs, s->cluster_size);
if (offset < 0) {
ret = offset;
goto fail;
}
}
ret = qcow2_pre_write_overlap_check(bs, 0, offset, s->cluster_size);
if (ret < 0) {
if (!preallocated) {
qcow2_free_clusters(bs, offset, s->cluster_size,
QCOW2_DISCARD_ALWAYS);
}
goto fail;
}
ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE,
s->cluster_sectors);
if (ret < 0) {
if (!preallocated) {
qcow2_free_clusters(bs, offset, s->cluster_size,
QCOW2_DISCARD_ALWAYS);
}
goto fail;
}
l2_table[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
l2_dirty = true;
cluster_index = offset >> s->cluster_bits;
if (cluster_index >= *nb_clusters) {
uint64_t old_bitmap_size = (*nb_clusters + 7) / 8;
uint64_t new_bitmap_size;
/* The offset may lie beyond the old end of the underlying image
* file for growable files only */
assert(bs->file->growable);
*nb_clusters = size_to_clusters(s, bs->file->total_sectors *
BDRV_SECTOR_SIZE);
new_bitmap_size = (*nb_clusters + 7) / 8;
*expanded_clusters = g_realloc(*expanded_clusters,
new_bitmap_size);
/* clear the newly allocated space */
memset(&(*expanded_clusters)[old_bitmap_size], 0,
new_bitmap_size - old_bitmap_size);
}
assert((cluster_index >= 0) && (cluster_index < *nb_clusters));
(*expanded_clusters)[cluster_index / 8] |= 1 << (cluster_index % 8);
}
if (is_active_l1) {
if (l2_dirty) {
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
qcow2_cache_depends_on_flush(s->l2_table_cache);
}
ret = qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
if (ret < 0) {
l2_table = NULL;
goto fail;
}
} else {
if (l2_dirty) {
ret = qcow2_pre_write_overlap_check(bs,
QCOW2_OL_INACTIVE_L2 | QCOW2_OL_ACTIVE_L2, l2_offset,
s->cluster_size);
if (ret < 0) {
goto fail;
}
ret = bdrv_write(bs->file, l2_offset / BDRV_SECTOR_SIZE,
(void *)l2_table, s->cluster_sectors);
if (ret < 0) {
goto fail;
}
}
}
}
ret = 0;
fail:
if (l2_table) {
if (!is_active_l1) {
qemu_vfree(l2_table);
} else {
if (ret < 0) {
qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
} else {
ret = qcow2_cache_put(bs, s->l2_table_cache,
(void **)&l2_table);
}
}
}
return ret;
}
/*
* For backed images, expands all zero clusters on the image. For non-backed
* images, deallocates all non-pre-allocated zero clusters (and claims the
* allocation for pre-allocated ones). This is important for downgrading to a
* qcow2 version which doesn't yet support metadata zero clusters.
*/
int qcow2_expand_zero_clusters(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l1_table = NULL;
uint64_t nb_clusters;
uint8_t *expanded_clusters;
int ret;
int i, j;
nb_clusters = size_to_clusters(s, bs->file->total_sectors *
BDRV_SECTOR_SIZE);
expanded_clusters = g_malloc0((nb_clusters + 7) / 8);
ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
&expanded_clusters, &nb_clusters);
if (ret < 0) {
goto fail;
}
/* Inactive L1 tables may point to active L2 tables - therefore it is
* necessary to flush the L2 table cache before trying to access the L2
* tables pointed to by inactive L1 entries (else we might try to expand
* zero clusters that have already been expanded); furthermore, it is also
* necessary to empty the L2 table cache, since it may contain tables which
* are now going to be modified directly on disk, bypassing the cache.
* qcow2_cache_empty() does both for us. */
ret = qcow2_cache_empty(bs, s->l2_table_cache);
if (ret < 0) {
goto fail;
}
for (i = 0; i < s->nb_snapshots; i++) {
int l1_sectors = (s->snapshots[i].l1_size * sizeof(uint64_t) +
BDRV_SECTOR_SIZE - 1) / BDRV_SECTOR_SIZE;
l1_table = g_realloc(l1_table, l1_sectors * BDRV_SECTOR_SIZE);
ret = bdrv_read(bs->file, s->snapshots[i].l1_table_offset /
BDRV_SECTOR_SIZE, (void *)l1_table, l1_sectors);
if (ret < 0) {
goto fail;
}
for (j = 0; j < s->snapshots[i].l1_size; j++) {
be64_to_cpus(&l1_table[j]);
}
ret = expand_zero_clusters_in_l1(bs, l1_table, s->snapshots[i].l1_size,
&expanded_clusters, &nb_clusters);
if (ret < 0) {
goto fail;
}
}
ret = 0;
fail:
g_free(expanded_clusters);
g_free(l1_table);
return ret;
}
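
The `expanded_clusters` bitmap described in the comments above (one bit per cluster, grown when the image file grows) can be sketched as a small standalone structure. This is an illustration under assumptions, not the QEMU implementation: `ClusterBitmap` and the `bitmap_*` helpers are hypothetical names, and plain `calloc`/`realloc` stand in for the GLib allocators used in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One bit per cluster of the image file. */
typedef struct ClusterBitmap {
    uint8_t *bits;
    uint64_t nb_clusters;
} ClusterBitmap;

/* Number of bytes needed to hold nb_clusters bits, rounded up. */
static size_t bitmap_bytes(uint64_t nb_clusters)
{
    return (size_t)((nb_clusters + 7) / 8);
}

/* Allocate a zeroed bitmap. Returns 0 on success, -1 on failure. */
static int bitmap_init(ClusterBitmap *bm, uint64_t nb_clusters)
{
    size_t size = bitmap_bytes(nb_clusters);

    bm->bits = calloc(size ? size : 1, 1);
    bm->nb_clusters = nb_clusters;
    return bm->bits ? 0 : -1;
}

/* Grow the bitmap and clear only the newly added bytes, as the patch does
 * after the underlying file has grown. Returns 0 on success, -1 on failure. */
static int bitmap_grow(ClusterBitmap *bm, uint64_t new_nb_clusters)
{
    size_t old_size = bitmap_bytes(bm->nb_clusters);
    size_t new_size = bitmap_bytes(new_nb_clusters);
    uint8_t *p;

    if (new_nb_clusters <= bm->nb_clusters) {
        return 0;
    }
    p = realloc(bm->bits, new_size);
    if (!p) {
        return -1;
    }
    memset(p + old_size, 0, new_size - old_size);
    bm->bits = p;
    bm->nb_clusters = new_nb_clusters;
    return 0;
}

/* Mark a cluster as used for zero expansion. */
static void bitmap_set(ClusterBitmap *bm, uint64_t idx)
{
    assert(idx < bm->nb_clusters);
    bm->bits[idx / 8] |= (uint8_t)(1u << (idx % 8));
}

/* Return 1 if the cluster's bit is set, 0 otherwise. */
static int bitmap_test(const ClusterBitmap *bm, uint64_t idx)
{
    assert(idx < bm->nb_clusters);
    return (bm->bits[idx / 8] >> (idx % 8)) & 1;
}
```

Clearing only the tail after a resize keeps previously recorded clusters intact, which matters here because a later pass over a snapshot's L1 table consults the bits set during earlier passes.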


@@ -25,6 +25,8 @@
#include "qemu-common.h"
#include "block/block_int.h"
#include "block/qcow2.h"
#include "qemu/range.h"
#include "qapi/qmp/types.h"
static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size);
static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
@@ -599,10 +601,10 @@ fail:
* If the return value is non-negative, it is the new refcount of the cluster.
* If it is negative, it is -errno and indicates an error.
*/
static int update_cluster_refcount(BlockDriverState *bs,
int64_t cluster_index,
int addend,
enum qcow2_discard_type type)
int qcow2_update_cluster_refcount(BlockDriverState *bs,
int64_t cluster_index,
int addend,
enum qcow2_discard_type type)
{
BDRVQcowState *s = bs->opaque;
int ret;
@@ -731,8 +733,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
if (free_in_cluster == 0)
s->free_byte_offset = 0;
if ((offset & (s->cluster_size - 1)) != 0)
update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
QCOW2_DISCARD_NEVER);
qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
QCOW2_DISCARD_NEVER);
} else {
offset = qcow2_alloc_clusters(bs, s->cluster_size);
if (offset < 0) {
@@ -742,8 +744,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
if ((cluster_offset + s->cluster_size) == offset) {
/* we are lucky: contiguous data */
offset = s->free_byte_offset;
update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
QCOW2_DISCARD_NEVER);
qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
QCOW2_DISCARD_NEVER);
s->free_byte_offset += size;
} else {
s->free_byte_offset = offset;
@@ -752,8 +754,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
}
/* The cluster refcount was incremented, either by qcow2_alloc_clusters()
* or explicitly by update_cluster_refcount(). Refcount blocks must be
* flushed before the caller's L2 table updates.
* or explicitly by qcow2_update_cluster_refcount(). Refcount blocks must
* be flushed before the caller's L2 table updates.
*/
qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
return offset;
@@ -794,11 +796,13 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
}
break;
case QCOW2_CLUSTER_NORMAL:
qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
nb_clusters << s->cluster_bits, type);
case QCOW2_CLUSTER_ZERO:
if (l2_entry & L2E_OFFSET_MASK) {
qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
nb_clusters << s->cluster_bits, type);
}
break;
case QCOW2_CLUSTER_UNALLOCATED:
case QCOW2_CLUSTER_ZERO:
break;
default:
abort();
@@ -861,15 +865,17 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
}
for(j = 0; j < s->l2_size; j++) {
uint64_t cluster_index;
offset = be64_to_cpu(l2_table[j]);
if (offset != 0) {
old_offset = offset;
offset &= ~QCOW_OFLAG_COPIED;
if (offset & QCOW_OFLAG_COMPRESSED) {
old_offset = offset;
offset &= ~QCOW_OFLAG_COPIED;
switch (qcow2_get_cluster_type(offset)) {
case QCOW2_CLUSTER_COMPRESSED:
nb_csectors = ((offset >> s->csize_shift) &
s->csize_mask) + 1;
if (addend != 0) {
int ret;
ret = update_refcount(bs,
(offset & s->cluster_offset_mask) & ~511,
nb_csectors * 512, addend,
@@ -880,11 +886,20 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
}
/* compressed clusters are never modified */
refcount = 2;
} else {
uint64_t cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
break;
case QCOW2_CLUSTER_NORMAL:
case QCOW2_CLUSTER_ZERO:
cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
if (!cluster_index) {
/* unallocated */
refcount = 0;
break;
}
if (addend != 0) {
refcount = update_cluster_refcount(bs, cluster_index, addend,
QCOW2_DISCARD_SNAPSHOT);
refcount = qcow2_update_cluster_refcount(bs,
cluster_index, addend,
QCOW2_DISCARD_SNAPSHOT);
} else {
refcount = get_refcount(bs, cluster_index);
}
@@ -893,19 +908,26 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
ret = refcount;
goto fail;
}
}
break;
if (refcount == 1) {
offset |= QCOW_OFLAG_COPIED;
}
if (offset != old_offset) {
if (addend > 0) {
qcow2_cache_set_dependency(bs, s->l2_table_cache,
s->refcount_block_cache);
}
l2_table[j] = cpu_to_be64(offset);
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
case QCOW2_CLUSTER_UNALLOCATED:
refcount = 0;
break;
default:
abort();
}
if (refcount == 1) {
offset |= QCOW_OFLAG_COPIED;
}
if (offset != old_offset) {
if (addend > 0) {
qcow2_cache_set_dependency(bs, s->l2_table_cache,
s->refcount_block_cache);
}
l2_table[j] = cpu_to_be64(offset);
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
}
}
@@ -916,8 +938,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
if (addend != 0) {
refcount = update_cluster_refcount(bs, l2_offset >> s->cluster_bits, addend,
QCOW2_DISCARD_SNAPSHOT);
refcount = qcow2_update_cluster_refcount(bs, l2_offset >>
s->cluster_bits, addend, QCOW2_DISCARD_SNAPSHOT);
} else {
refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
}
@@ -1014,7 +1036,6 @@ static void inc_refcounts(BlockDriverState *bs,
/* Flags for check_refcounts_l1() and check_refcounts_l2() */
enum {
CHECK_OFLAG_COPIED = 0x1, /* check QCOW_OFLAG_COPIED matches refcount */
CHECK_FRAG_INFO = 0x2, /* update BlockFragInfo counters */
};
@@ -1033,7 +1054,7 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
BDRVQcowState *s = bs->opaque;
uint64_t *l2_table, l2_entry;
uint64_t next_contiguous_offset = 0;
int i, l2_size, nb_csectors, refcount;
int i, l2_size, nb_csectors;
/* Read L2 table from disk */
l2_size = s->l2_size * sizeof(uint64_t);
@@ -1085,23 +1106,8 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
case QCOW2_CLUSTER_NORMAL:
{
/* QCOW_OFLAG_COPIED must be set iff refcount == 1 */
uint64_t offset = l2_entry & L2E_OFFSET_MASK;
if (flags & CHECK_OFLAG_COPIED) {
refcount = get_refcount(bs, offset >> s->cluster_bits);
if (refcount < 0) {
fprintf(stderr, "Can't get refcount for offset %"
PRIx64 ": %s\n", l2_entry, strerror(-refcount));
goto fail;
}
if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
fprintf(stderr, "ERROR OFLAG_COPIED: offset=%"
PRIx64 " refcount=%d\n", l2_entry, refcount);
res->corruptions++;
}
}
if (flags & CHECK_FRAG_INFO) {
res->bfi.allocated_clusters++;
if (next_contiguous_offset &&
@@ -1158,7 +1164,7 @@ static int check_refcounts_l1(BlockDriverState *bs,
{
BDRVQcowState *s = bs->opaque;
uint64_t *l1_table, l2_offset, l1_size2;
int i, refcount, ret;
int i, ret;
l1_size2 = l1_size * sizeof(uint64_t);
@@ -1182,22 +1188,6 @@ static int check_refcounts_l1(BlockDriverState *bs,
for(i = 0; i < l1_size; i++) {
l2_offset = l1_table[i];
if (l2_offset) {
/* QCOW_OFLAG_COPIED must be set iff refcount == 1 */
if (flags & CHECK_OFLAG_COPIED) {
refcount = get_refcount(bs, (l2_offset & ~QCOW_OFLAG_COPIED)
>> s->cluster_bits);
if (refcount < 0) {
fprintf(stderr, "Can't get refcount for l2_offset %"
PRIx64 ": %s\n", l2_offset, strerror(-refcount));
goto fail;
}
if ((refcount == 1) != ((l2_offset & QCOW_OFLAG_COPIED) != 0)) {
fprintf(stderr, "ERROR OFLAG_COPIED: l2_offset=%" PRIx64
" refcount=%d\n", l2_offset, refcount);
res->corruptions++;
}
}
/* Mark L2 table as used */
l2_offset &= L1E_OFFSET_MASK;
inc_refcounts(bs, res, refcount_table, refcount_table_size,
@@ -1228,6 +1218,238 @@ fail:
return -EIO;
}
/*
* Checks the OFLAG_COPIED flag for all L1 and L2 entries.
*
* This function does not print an error message nor does it increment
* check_errors if get_refcount fails (this is because such an error will have
* been already detected and sufficiently signaled by the calling function
* (qcow2_check_refcounts) by the time this function is called).
*/
static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix)
{
BDRVQcowState *s = bs->opaque;
uint64_t *l2_table = qemu_blockalign(bs, s->cluster_size);
int ret;
int refcount;
int i, j;
for (i = 0; i < s->l1_size; i++) {
uint64_t l1_entry = s->l1_table[i];
uint64_t l2_offset = l1_entry & L1E_OFFSET_MASK;
bool l2_dirty = false;
if (!l2_offset) {
continue;
}
refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
if (refcount < 0) {
/* don't print message nor increment check_errors */
continue;
}
if ((refcount == 1) != ((l1_entry & QCOW_OFLAG_COPIED) != 0)) {
fprintf(stderr, "%s OFLAG_COPIED L2 cluster: l1_index=%d "
"l1_entry=%" PRIx64 " refcount=%d\n",
fix & BDRV_FIX_ERRORS ? "Repairing" :
"ERROR",
i, l1_entry, refcount);
if (fix & BDRV_FIX_ERRORS) {
s->l1_table[i] = refcount == 1
? l1_entry | QCOW_OFLAG_COPIED
: l1_entry & ~QCOW_OFLAG_COPIED;
ret = qcow2_write_l1_entry(bs, i);
if (ret < 0) {
res->check_errors++;
goto fail;
}
res->corruptions_fixed++;
} else {
res->corruptions++;
}
}
ret = bdrv_pread(bs->file, l2_offset, l2_table,
s->l2_size * sizeof(uint64_t));
if (ret < 0) {
fprintf(stderr, "ERROR: Could not read L2 table: %s\n",
strerror(-ret));
res->check_errors++;
goto fail;
}
for (j = 0; j < s->l2_size; j++) {
uint64_t l2_entry = be64_to_cpu(l2_table[j]);
uint64_t data_offset = l2_entry & L2E_OFFSET_MASK;
int cluster_type = qcow2_get_cluster_type(l2_entry);
if ((cluster_type == QCOW2_CLUSTER_NORMAL) ||
((cluster_type == QCOW2_CLUSTER_ZERO) && (data_offset != 0))) {
refcount = get_refcount(bs, data_offset >> s->cluster_bits);
if (refcount < 0) {
/* don't print message nor increment check_errors */
continue;
}
if ((refcount == 1) != ((l2_entry & QCOW_OFLAG_COPIED) != 0)) {
fprintf(stderr, "%s OFLAG_COPIED data cluster: "
"l2_entry=%" PRIx64 " refcount=%d\n",
fix & BDRV_FIX_ERRORS ? "Repairing" :
"ERROR",
l2_entry, refcount);
if (fix & BDRV_FIX_ERRORS) {
l2_table[j] = cpu_to_be64(refcount == 1
? l2_entry | QCOW_OFLAG_COPIED
: l2_entry & ~QCOW_OFLAG_COPIED);
l2_dirty = true;
res->corruptions_fixed++;
} else {
res->corruptions++;
}
}
}
}
if (l2_dirty) {
ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_ACTIVE_L2,
l2_offset, s->cluster_size);
if (ret < 0) {
fprintf(stderr, "ERROR: Could not write L2 table; metadata "
"overlap check failed: %s\n", strerror(-ret));
res->check_errors++;
goto fail;
}
ret = bdrv_pwrite(bs->file, l2_offset, l2_table, s->cluster_size);
if (ret < 0) {
fprintf(stderr, "ERROR: Could not write L2 table: %s\n",
strerror(-ret));
res->check_errors++;
goto fail;
}
}
}
ret = 0;
fail:
qemu_vfree(l2_table);
return ret;
}
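
The rule enforced by check_oflag_copied() is that the COPIED flag must be set if and only if the refcount equals 1. A minimal sketch of the check and the repair follows. `OFLAG_COPIED` is given an assumed bit position for the example, and `copied_flag_consistent` and `repair_copied_flag` are hypothetical helper names, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define OFLAG_COPIED (1ULL << 63)

/* Return 1 when the flag matches the refcount (set iff refcount == 1). */
static int copied_flag_consistent(uint64_t entry, int refcount)
{
    return (refcount == 1) == ((entry & OFLAG_COPIED) != 0);
}

/* Return the entry with the flag forced to the value the refcount
 * requires; all other bits are left untouched. */
static uint64_t repair_copied_flag(uint64_t entry, int refcount)
{
    return refcount == 1 ? (entry | OFLAG_COPIED)
                         : (entry & ~OFLAG_COPIED);
}
```

The same pair of operations applies to both L1 and L2 entries. Only the place the repaired entry is written back differs.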
/*
* Writes one sector of the refcount table to the disk
*/
#define RT_ENTRIES_PER_SECTOR (512 / sizeof(uint64_t))
static int write_reftable_entry(BlockDriverState *bs, int rt_index)
{
BDRVQcowState *s = bs->opaque;
uint64_t buf[RT_ENTRIES_PER_SECTOR];
int rt_start_index;
int i, ret;
rt_start_index = rt_index & ~(RT_ENTRIES_PER_SECTOR - 1);
for (i = 0; i < RT_ENTRIES_PER_SECTOR; i++) {
buf[i] = cpu_to_be64(s->refcount_table[rt_start_index + i]);
}
ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_REFCOUNT_TABLE,
s->refcount_table_offset + rt_start_index * sizeof(uint64_t),
sizeof(buf));
if (ret < 0) {
return ret;
}
BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_UPDATE);
ret = bdrv_pwrite_sync(bs->file, s->refcount_table_offset +
rt_start_index * sizeof(uint64_t), buf, sizeof(buf));
if (ret < 0) {
return ret;
}
return 0;
}
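write_reftable_entry() rewrites the whole 512-byte sector that contains the changed entry, so it first rounds the entry index down to the start of that sector. The sketch below shows only this index arithmetic. All `EX_`/`ex_` names are hypothetical, and the 512-byte sector size is taken from the macro above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_SECTOR_SIZE 512
#define EX_ENTRIES_PER_SECTOR (EX_SECTOR_SIZE / sizeof(uint64_t))

/* Round an entry index down to the first entry of its sector. */
static size_t ex_sector_start_index(size_t index)
{
    /* The mask trick only works for a power-of-two entry count. */
    assert((EX_ENTRIES_PER_SECTOR & (EX_ENTRIES_PER_SECTOR - 1)) == 0);
    return index & ~(EX_ENTRIES_PER_SECTOR - 1);
}

/* Byte offset in the image file at which that sector starts. */
static uint64_t ex_sector_byte_offset(uint64_t table_offset, size_t index)
{
    return table_offset + ex_sector_start_index(index) * sizeof(uint64_t);
}
```

With 64 entries per sector, index 70 maps to start index 64, so the write begins 512 bytes into the table.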
/*
* Allocates a new cluster for the given refcount block (represented by its
* offset in the image file) and copies the current content there. This function
* does _not_ decrement the reference count for the currently occupied cluster.
*
* This function prints an informative message to stderr on error (and returns
* -errno); on success, the offset of the newly allocated cluster is returned.
*/
static int64_t realloc_refcount_block(BlockDriverState *bs, int reftable_index,
uint64_t offset)
{
BDRVQcowState *s = bs->opaque;
int64_t new_offset = 0;
void *refcount_block = NULL;
int ret;
/* allocate new refcount block */
new_offset = qcow2_alloc_clusters(bs, s->cluster_size);
if (new_offset < 0) {
fprintf(stderr, "Could not allocate new cluster: %s\n",
strerror(-new_offset));
ret = new_offset;
goto fail;
}
/* fetch current refcount block content */
ret = qcow2_cache_get(bs, s->refcount_block_cache, offset, &refcount_block);
if (ret < 0) {
fprintf(stderr, "Could not fetch refcount block: %s\n", strerror(-ret));
goto fail;
}
/* the new block has not yet been entered into the refcount table, so as far
* as this check is concerned it is not a refcount block yet */
ret = qcow2_pre_write_overlap_check(bs, 0, new_offset, s->cluster_size);
if (ret < 0) {
fprintf(stderr, "Could not write refcount block; metadata overlap "
"check failed: %s\n", strerror(-ret));
/* the image will be marked corrupt, so don't even attempt to free the
* cluster */
new_offset = 0;
goto fail;
}
/* write to new block */
ret = bdrv_write(bs->file, new_offset / BDRV_SECTOR_SIZE, refcount_block,
s->cluster_sectors);
if (ret < 0) {
fprintf(stderr, "Could not write refcount block: %s\n", strerror(-ret));
goto fail;
}
/* update refcount table */
assert(!(new_offset & (s->cluster_size - 1)));
s->refcount_table[reftable_index] = new_offset;
ret = write_reftable_entry(bs, reftable_index);
if (ret < 0) {
fprintf(stderr, "Could not update refcount table: %s\n",
strerror(-ret));
goto fail;
}
fail:
if (new_offset && (ret < 0)) {
qcow2_free_clusters(bs, new_offset, s->cluster_size,
QCOW2_DISCARD_ALWAYS);
}
if (refcount_block) {
if (ret < 0) {
qcow2_cache_put(bs, s->refcount_block_cache, &refcount_block);
} else {
ret = qcow2_cache_put(bs, s->refcount_block_cache, &refcount_block);
}
}
if (ret < 0) {
return ret;
}
return new_offset;
}
/*
* Checks an image for refcount consistency.
*
@@ -1257,8 +1479,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
/* current L1 table */
ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
s->l1_table_offset, s->l1_size,
CHECK_OFLAG_COPIED | CHECK_FRAG_INFO);
s->l1_table_offset, s->l1_size, CHECK_FRAG_INFO);
if (ret < 0) {
goto fail;
}
@@ -1304,10 +1525,39 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
inc_refcounts(bs, res, refcount_table, nb_clusters,
offset, s->cluster_size);
if (refcount_table[cluster] != 1) {
fprintf(stderr, "ERROR refcount block %" PRId64
fprintf(stderr, "%s refcount block %" PRId64
" refcount=%d\n",
fix & BDRV_FIX_ERRORS ? "Repairing" :
"ERROR",
i, refcount_table[cluster]);
res->corruptions++;
if (fix & BDRV_FIX_ERRORS) {
int64_t new_offset;
new_offset = realloc_refcount_block(bs, i, offset);
if (new_offset < 0) {
res->corruptions++;
continue;
}
/* update refcounts */
if ((new_offset >> s->cluster_bits) >= nb_clusters) {
/* increase refcount_table size if necessary */
int old_nb_clusters = nb_clusters;
nb_clusters = (new_offset >> s->cluster_bits) + 1;
refcount_table = g_realloc(refcount_table,
nb_clusters * sizeof(uint16_t));
memset(&refcount_table[old_nb_clusters], 0, (nb_clusters
- old_nb_clusters) * sizeof(uint16_t));
}
refcount_table[cluster]--;
inc_refcounts(bs, res, refcount_table, nb_clusters,
new_offset, s->cluster_size);
res->corruptions_fixed++;
} else {
res->corruptions++;
}
}
}
}
@@ -1363,6 +1613,12 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
}
}
/* check OFLAG_COPIED */
ret = check_oflag_copied(bs, res, fix);
if (ret < 0) {
goto fail;
}
res->image_end_offset = (highest_cluster + 1) * s->cluster_size;
ret = 0;
@@ -1372,3 +1628,173 @@ fail:
return ret;
}
#define overlaps_with(ofs, sz) \
ranges_overlap(offset, size, ofs, sz)
/*
* Checks if the given offset into the image file is actually free to use by
* looking for overlaps with important metadata sections (L1/L2 tables etc.),
* i.e. a sanity check without relying on the refcount tables.
*
* The ign parameter specifies what checks not to perform (being a bitmask of
* QCow2MetadataOverlap values), i.e., what sections to ignore.
*
* Returns:
* - 0 if writing to this offset will not affect the mentioned metadata
* - a positive QCow2MetadataOverlap value indicating one overlapping section
* - a negative value (-errno) indicating an error while performing a check,
* e.g. when bdrv_pread failed on QCOW2_OL_INACTIVE_L2
*/
int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
int64_t size)
{
BDRVQcowState *s = bs->opaque;
int chk = s->overlap_check & ~ign;
int i, j;
if (!size) {
return 0;
}
if (chk & QCOW2_OL_MAIN_HEADER) {
if (offset < s->cluster_size) {
return QCOW2_OL_MAIN_HEADER;
}
}
/* align range to test to cluster boundaries */
size = align_offset(offset_into_cluster(s, offset) + size, s->cluster_size);
offset = start_of_cluster(s, offset);
if ((chk & QCOW2_OL_ACTIVE_L1) && s->l1_size) {
if (overlaps_with(s->l1_table_offset, s->l1_size * sizeof(uint64_t))) {
return QCOW2_OL_ACTIVE_L1;
}
}
if ((chk & QCOW2_OL_REFCOUNT_TABLE) && s->refcount_table_size) {
if (overlaps_with(s->refcount_table_offset,
s->refcount_table_size * sizeof(uint64_t))) {
return QCOW2_OL_REFCOUNT_TABLE;
}
}
if ((chk & QCOW2_OL_SNAPSHOT_TABLE) && s->snapshots_size) {
if (overlaps_with(s->snapshots_offset, s->snapshots_size)) {
return QCOW2_OL_SNAPSHOT_TABLE;
}
}
if ((chk & QCOW2_OL_INACTIVE_L1) && s->snapshots) {
for (i = 0; i < s->nb_snapshots; i++) {
if (s->snapshots[i].l1_size &&
overlaps_with(s->snapshots[i].l1_table_offset,
s->snapshots[i].l1_size * sizeof(uint64_t))) {
return QCOW2_OL_INACTIVE_L1;
}
}
}
if ((chk & QCOW2_OL_ACTIVE_L2) && s->l1_table) {
for (i = 0; i < s->l1_size; i++) {
if ((s->l1_table[i] & L1E_OFFSET_MASK) &&
overlaps_with(s->l1_table[i] & L1E_OFFSET_MASK,
s->cluster_size)) {
return QCOW2_OL_ACTIVE_L2;
}
}
}
if ((chk & QCOW2_OL_REFCOUNT_BLOCK) && s->refcount_table) {
for (i = 0; i < s->refcount_table_size; i++) {
if ((s->refcount_table[i] & REFT_OFFSET_MASK) &&
overlaps_with(s->refcount_table[i] & REFT_OFFSET_MASK,
s->cluster_size)) {
return QCOW2_OL_REFCOUNT_BLOCK;
}
}
}
if ((chk & QCOW2_OL_INACTIVE_L2) && s->snapshots) {
for (i = 0; i < s->nb_snapshots; i++) {
uint64_t l1_ofs = s->snapshots[i].l1_table_offset;
uint32_t l1_sz = s->snapshots[i].l1_size;
uint64_t l1_sz2 = l1_sz * sizeof(uint64_t);
uint64_t *l1 = g_malloc(l1_sz2);
int ret;
ret = bdrv_pread(bs->file, l1_ofs, l1, l1_sz2);
if (ret < 0) {
g_free(l1);
return ret;
}
for (j = 0; j < l1_sz; j++) {
uint64_t l2_ofs = be64_to_cpu(l1[j]) & L1E_OFFSET_MASK;
if (l2_ofs && overlaps_with(l2_ofs, s->cluster_size)) {
g_free(l1);
return QCOW2_OL_INACTIVE_L2;
}
}
g_free(l1);
}
}
return 0;
}
static const char *metadata_ol_names[] = {
[QCOW2_OL_MAIN_HEADER_BITNR] = "qcow2_header",
[QCOW2_OL_ACTIVE_L1_BITNR] = "active L1 table",
[QCOW2_OL_ACTIVE_L2_BITNR] = "active L2 table",
[QCOW2_OL_REFCOUNT_TABLE_BITNR] = "refcount table",
[QCOW2_OL_REFCOUNT_BLOCK_BITNR] = "refcount block",
[QCOW2_OL_SNAPSHOT_TABLE_BITNR] = "snapshot table",
[QCOW2_OL_INACTIVE_L1_BITNR] = "inactive L1 table",
[QCOW2_OL_INACTIVE_L2_BITNR] = "inactive L2 table",
};
/*
* First performs a check for metadata overlaps (through
* qcow2_check_metadata_overlap); if that fails with a negative value (error
* while performing a check), that value is returned. If an impending overlap
* is detected, the BDS will be made unusable, the qcow2 file marked corrupt
* and -EIO returned.
*
* Returns 0 if there were neither overlaps nor errors while checking for
* overlaps; or a negative value (-errno) on error.
*/
int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
int64_t size)
{
int ret = qcow2_check_metadata_overlap(bs, ign, offset, size);
if (ret < 0) {
return ret;
} else if (ret > 0) {
int metadata_ol_bitnr = ffs(ret) - 1;
char *message;
QObject *data;
assert(metadata_ol_bitnr < QCOW2_OL_MAX_BITNR);
fprintf(stderr, "qcow2: Preventing invalid write on metadata (overlaps "
"with %s); image marked as corrupt.\n",
metadata_ol_names[metadata_ol_bitnr]);
message = g_strdup_printf("Prevented %s overwrite",
metadata_ol_names[metadata_ol_bitnr]);
data = qobject_from_jsonf("{ 'device': %s, 'msg': %s, 'offset': %"
PRId64 ", 'size': %" PRId64 " }", bs->device_name, message,
offset, size);
monitor_protocol_event(QEVENT_BLOCK_IMAGE_CORRUPTED, data);
g_free(message);
qobject_decref(data);
qcow2_mark_corrupt(bs);
bs->drv = NULL; /* make BDS unusable */
return -EIO;
}
return 0;
}
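The overlap check above widens the write range to whole clusters and then tests it against each metadata section that is not masked out by `ign`. The sketch below shows that idea on a flat list of sections. It is a simplification under stated assumptions: `struct ex_section`, the `EX_OL_*` bits and `ex_check_overlap` are hypothetical, and the real function walks the L1, refcount and snapshot tables rather than an array.

```c
#include <assert.h>
#include <stdint.h>

enum { EX_OL_HEADER = 1 << 0, EX_OL_L1 = 1 << 1, EX_OL_REFTABLE = 1 << 2 };

/* One metadata region: which check bit guards it, and where it lives. */
struct ex_section { int bit; uint64_t offset; uint64_t size; };

/* Half-open ranges [a, a+alen) and [b, b+blen) share at least one byte. */
static int ex_ranges_overlap(uint64_t a, uint64_t alen,
                             uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Returns 0 if the write is safe, else the bit of the first section hit. */
static int ex_check_overlap(const struct ex_section *secs, int n, int ignore,
                            uint64_t cluster_size,
                            uint64_t offset, uint64_t size)
{
    assert(cluster_size && (cluster_size & (cluster_size - 1)) == 0);
    if (size == 0) {
        return 0;
    }
    /* Align the tested range to cluster boundaries. */
    uint64_t start = offset & ~(cluster_size - 1);
    uint64_t end = (offset + size + cluster_size - 1) & ~(cluster_size - 1);

    for (int i = 0; i < n; i++) {
        if (secs[i].bit & ignore) {
            continue; /* the caller asked to skip this section */
        }
        if (secs[i].size &&
            ex_ranges_overlap(start, end - start,
                              secs[i].offset, secs[i].size)) {
            return secs[i].bit;
        }
    }
    return 0;
}
```

Passing a section's own bit as `ignore` is how a legitimate update of that section, such as rewriting the active L1 table in place, avoids reporting an overlap with itself.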

View File

@@ -182,13 +182,22 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
offset = snapshots_offset;
if (offset < 0) {
return offset;
ret = offset;
goto fail;
}
ret = bdrv_flush(bs);
if (ret < 0) {
return ret;
goto fail;
}
/* The snapshot list position has not yet been updated, so these clusters
* must indeed be completely free */
ret = qcow2_pre_write_overlap_check(bs, 0, offset, snapshots_size);
if (ret < 0) {
goto fail;
}
/* Write all snapshots to the new list */
for(i = 0; i < s->nb_snapshots; i++) {
sn = s->snapshots + i;
@@ -211,6 +220,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
id_str_size = strlen(sn->id_str);
name_size = strlen(sn->name);
assert(id_str_size <= UINT16_MAX && name_size <= UINT16_MAX);
h.id_str_size = cpu_to_be16(id_str_size);
h.name_size = cpu_to_be16(name_size);
offset = align_offset(offset, 8);
@@ -269,6 +279,10 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
return 0;
fail:
if (snapshots_offset > 0) {
qcow2_free_clusters(bs, snapshots_offset, snapshots_size,
QCOW2_DISCARD_ALWAYS);
}
return ret;
}
@@ -277,7 +291,8 @@ static void find_new_snapshot_id(BlockDriverState *bs,
{
BDRVQcowState *s = bs->opaque;
QCowSnapshot *sn;
int i, id, id_max = 0;
int i;
unsigned long id, id_max = 0;
for(i = 0; i < s->nb_snapshots; i++) {
sn = s->snapshots + i;
@@ -285,34 +300,50 @@ static void find_new_snapshot_id(BlockDriverState *bs,
if (id > id_max)
id_max = id;
}
snprintf(id_str, id_str_size, "%d", id_max + 1);
snprintf(id_str, id_str_size, "%lu", id_max + 1);
}
static int find_snapshot_by_id(BlockDriverState *bs, const char *id_str)
static int find_snapshot_by_id_and_name(BlockDriverState *bs,
const char *id,
const char *name)
{
BDRVQcowState *s = bs->opaque;
int i;
for(i = 0; i < s->nb_snapshots; i++) {
if (!strcmp(s->snapshots[i].id_str, id_str))
return i;
if (id && name) {
for (i = 0; i < s->nb_snapshots; i++) {
if (!strcmp(s->snapshots[i].id_str, id) &&
!strcmp(s->snapshots[i].name, name)) {
return i;
}
}
} else if (id) {
for (i = 0; i < s->nb_snapshots; i++) {
if (!strcmp(s->snapshots[i].id_str, id)) {
return i;
}
}
} else if (name) {
for (i = 0; i < s->nb_snapshots; i++) {
if (!strcmp(s->snapshots[i].name, name)) {
return i;
}
}
}
return -1;
}
static int find_snapshot_by_id_or_name(BlockDriverState *bs, const char *name)
static int find_snapshot_by_id_or_name(BlockDriverState *bs,
const char *id_or_name)
{
BDRVQcowState *s = bs->opaque;
int i, ret;
int ret;
ret = find_snapshot_by_id(bs, name);
if (ret >= 0)
ret = find_snapshot_by_id_and_name(bs, id_or_name, NULL);
if (ret >= 0) {
return ret;
for(i = 0; i < s->nb_snapshots; i++) {
if (!strcmp(s->snapshots[i].name, name))
return i;
}
return -1;
return find_snapshot_by_id_and_name(bs, NULL, id_or_name);
}
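The lookup semantics introduced above are: with both an ID and a name, both must match; with only one of them, that one alone decides; with neither, nothing matches. The sketch below condenses the three loops into one. `struct ex_snapshot`, `ex_find_snapshot` and `ex_find_by_id_or_name` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct ex_snapshot { const char *id; const char *name; };

/* Index of the first snapshot matching every non-NULL key, or -1. */
static int ex_find_snapshot(const struct ex_snapshot *list, int count,
                            const char *id, const char *name)
{
    assert(list != NULL || count == 0);
    if (!id && !name) {
        return -1; /* nothing to search for */
    }
    for (int i = 0; i < count; i++) {
        if (id && strcmp(list[i].id, id) != 0) {
            continue;
        }
        if (name && strcmp(list[i].name, name) != 0) {
            continue;
        }
        return i;
    }
    return -1;
}

/* Legacy behaviour: treat the key as an ID first, then as a name. */
static int ex_find_by_id_or_name(const struct ex_snapshot *list, int count,
                                 const char *key)
{
    int ret = ex_find_snapshot(list, count, key, NULL);
    if (ret >= 0) {
        return ret;
    }
    return ex_find_snapshot(list, count, NULL, key);
}
```

The difference matters when one snapshot's name equals another snapshot's ID: the combined lookup can name exactly one of them, while the id-or-name lookup always prefers the ID match.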
/* if no id is provided, a new one is constructed */
@@ -334,7 +365,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
}
/* Check that the ID is unique */
if (find_snapshot_by_id(bs, sn_info->id_str) >= 0) {
if (find_snapshot_by_id_and_name(bs, sn_info->id_str, NULL) >= 0) {
return -EEXIST;
}
@@ -363,6 +394,12 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
l1_table[i] = cpu_to_be64(s->l1_table[i]);
}
ret = qcow2_pre_write_overlap_check(bs, 0, sn->l1_table_offset,
s->l1_size * sizeof(uint64_t));
if (ret < 0) {
goto fail;
}
ret = bdrv_pwrite(bs->file, sn->l1_table_offset, l1_table,
s->l1_size * sizeof(uint64_t));
if (ret < 0) {
@@ -396,11 +433,19 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
if (ret < 0) {
g_free(s->snapshots);
s->snapshots = old_snapshot_list;
s->nb_snapshots--;
goto fail;
}
g_free(old_snapshot_list);
/* The VM state isn't needed any more in the active L1 table; in fact, it
* hurts by causing expensive COW for the next snapshot. */
qcow2_discard_clusters(bs, qcow2_vm_state_offset(s),
align_offset(sn->vm_state_size, s->cluster_size)
>> BDRV_SECTOR_BITS,
QCOW2_DISCARD_NEVER);
#ifdef DEBUG_ALLOC
{
BdrvCheckResult result = {0};
@@ -475,6 +520,12 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
goto fail;
}
ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_ACTIVE_L1,
s->l1_table_offset, cur_l1_bytes);
if (ret < 0) {
goto fail;
}
ret = bdrv_pwrite_sync(bs->file, s->l1_table_offset, sn_l1_table,
cur_l1_bytes);
if (ret < 0) {
@@ -531,15 +582,19 @@ fail:
return ret;
}
int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
int qcow2_snapshot_delete(BlockDriverState *bs,
const char *snapshot_id,
const char *name,
Error **errp)
{
BDRVQcowState *s = bs->opaque;
QCowSnapshot sn;
int snapshot_index, ret;
/* Search the snapshot */
snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_id);
snapshot_index = find_snapshot_by_id_and_name(bs, snapshot_id, name);
if (snapshot_index < 0) {
error_setg(errp, "Can't find the snapshot");
return -ENOENT;
}
sn = s->snapshots[snapshot_index];
@@ -551,6 +606,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
s->nb_snapshots--;
ret = qcow2_write_snapshots(bs);
if (ret < 0) {
error_setg(errp, "Failed to remove snapshot from snapshot list");
return ret;
}
@@ -568,6 +624,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
ret = qcow2_update_snapshot_refcount(bs, sn.l1_table_offset,
sn.l1_size, -1);
if (ret < 0) {
error_setg(errp, "Failed to free the cluster and L1 table");
return ret;
}
qcow2_free_clusters(bs, sn.l1_table_offset, sn.l1_size * sizeof(uint64_t),
@@ -576,6 +633,7 @@ int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
/* must update the copied flag on the current cluster offsets */
ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0);
if (ret < 0) {
error_setg(errp, "Failed to update snapshot status in disk");
return ret;
}

File diff suppressed because it is too large

View File

@@ -40,11 +40,11 @@
#define QCOW_MAX_CRYPT_CLUSTERS 32
/* indicate that the refcount of the referenced cluster is exactly one. */
#define QCOW_OFLAG_COPIED (1LL << 63)
#define QCOW_OFLAG_COPIED (1ULL << 63)
/* indicate that the cluster is compressed (they never have the copied flag) */
#define QCOW_OFLAG_COMPRESSED (1LL << 62)
#define QCOW_OFLAG_COMPRESSED (1ULL << 62)
/* The cluster reads as all zeros */
#define QCOW_OFLAG_ZERO (1LL << 0)
#define QCOW_OFLAG_ZERO (1ULL << 0)
#define REFCOUNT_SHIFT 1 /* refcount size is 2 bytes */
@@ -63,6 +63,15 @@
#define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
#define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot"
#define QCOW2_OPT_DISCARD_OTHER "pass-discard-other"
#define QCOW2_OPT_OVERLAP "overlap-check"
#define QCOW2_OPT_OVERLAP_MAIN_HEADER "overlap-check.main-header"
#define QCOW2_OPT_OVERLAP_ACTIVE_L1 "overlap-check.active-l1"
#define QCOW2_OPT_OVERLAP_ACTIVE_L2 "overlap-check.active-l2"
#define QCOW2_OPT_OVERLAP_REFCOUNT_TABLE "overlap-check.refcount-table"
#define QCOW2_OPT_OVERLAP_REFCOUNT_BLOCK "overlap-check.refcount-block"
#define QCOW2_OPT_OVERLAP_SNAPSHOT_TABLE "overlap-check.snapshot-table"
#define QCOW2_OPT_OVERLAP_INACTIVE_L1 "overlap-check.inactive-l1"
#define QCOW2_OPT_OVERLAP_INACTIVE_L2 "overlap-check.inactive-l2"
typedef struct QCowHeader {
uint32_t magic;
@@ -86,7 +95,7 @@ typedef struct QCowHeader {
uint32_t refcount_order;
uint32_t header_length;
} QCowHeader;
} QEMU_PACKED QCowHeader;
typedef struct QCowSnapshot {
uint64_t l1_table_offset;
@@ -119,9 +128,12 @@ enum {
/* Incompatible feature bits */
enum {
QCOW2_INCOMPAT_DIRTY_BITNR = 0,
QCOW2_INCOMPAT_CORRUPT_BITNR = 1,
QCOW2_INCOMPAT_DIRTY = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
QCOW2_INCOMPAT_CORRUPT = 1 << QCOW2_INCOMPAT_CORRUPT_BITNR,
QCOW2_INCOMPAT_MASK = QCOW2_INCOMPAT_DIRTY,
QCOW2_INCOMPAT_MASK = QCOW2_INCOMPAT_DIRTY
| QCOW2_INCOMPAT_CORRUPT,
};
/* Compatible feature bits */
@@ -196,9 +208,12 @@ typedef struct BDRVQcowState {
int flags;
int qcow_version;
bool use_lazy_refcounts;
int refcount_order;
bool discard_passthrough[QCOW2_DISCARD_MAX];
int overlap_check; /* bitmask of QCow2MetadataOverlap values */
uint64_t incompatible_features;
uint64_t compatible_features;
uint64_t autoclear_features;
@@ -286,6 +301,45 @@ enum {
QCOW2_CLUSTER_ZERO
};
typedef enum QCow2MetadataOverlap {
QCOW2_OL_MAIN_HEADER_BITNR = 0,
QCOW2_OL_ACTIVE_L1_BITNR = 1,
QCOW2_OL_ACTIVE_L2_BITNR = 2,
QCOW2_OL_REFCOUNT_TABLE_BITNR = 3,
QCOW2_OL_REFCOUNT_BLOCK_BITNR = 4,
QCOW2_OL_SNAPSHOT_TABLE_BITNR = 5,
QCOW2_OL_INACTIVE_L1_BITNR = 6,
QCOW2_OL_INACTIVE_L2_BITNR = 7,
QCOW2_OL_MAX_BITNR = 8,
QCOW2_OL_NONE = 0,
QCOW2_OL_MAIN_HEADER = (1 << QCOW2_OL_MAIN_HEADER_BITNR),
QCOW2_OL_ACTIVE_L1 = (1 << QCOW2_OL_ACTIVE_L1_BITNR),
QCOW2_OL_ACTIVE_L2 = (1 << QCOW2_OL_ACTIVE_L2_BITNR),
QCOW2_OL_REFCOUNT_TABLE = (1 << QCOW2_OL_REFCOUNT_TABLE_BITNR),
QCOW2_OL_REFCOUNT_BLOCK = (1 << QCOW2_OL_REFCOUNT_BLOCK_BITNR),
QCOW2_OL_SNAPSHOT_TABLE = (1 << QCOW2_OL_SNAPSHOT_TABLE_BITNR),
QCOW2_OL_INACTIVE_L1 = (1 << QCOW2_OL_INACTIVE_L1_BITNR),
/* NOTE: Checking overlaps with inactive L2 tables will result in bdrv
* reads. */
QCOW2_OL_INACTIVE_L2 = (1 << QCOW2_OL_INACTIVE_L2_BITNR),
} QCow2MetadataOverlap;
/* Perform all overlap checks which can be done in constant time */
#define QCOW2_OL_CONSTANT \
(QCOW2_OL_MAIN_HEADER | QCOW2_OL_ACTIVE_L1 | QCOW2_OL_REFCOUNT_TABLE | \
QCOW2_OL_SNAPSHOT_TABLE)
/* Perform all overlap checks which don't require disk access */
#define QCOW2_OL_CACHED \
(QCOW2_OL_CONSTANT | QCOW2_OL_ACTIVE_L2 | QCOW2_OL_REFCOUNT_BLOCK | \
QCOW2_OL_INACTIVE_L1)
/* Perform all overlap checks */
#define QCOW2_OL_ALL \
(QCOW2_OL_CACHED | QCOW2_OL_INACTIVE_L2)
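The enum above pairs every check with a bit number and a mask derived from it, so the same constant can select a check in a bitmask and index a table of names. The following is a reduced sketch of that pattern with three checks. All `EX_`/`ex_` names are hypothetical, and the lowest-bit search is written as a portable loop here, whereas the diff itself uses `ffs()`.

```c
#include <assert.h>
#include <stddef.h>

enum {
    EX_OL_MAIN_HEADER_BITNR    = 0,
    EX_OL_ACTIVE_L1_BITNR      = 1,
    EX_OL_REFCOUNT_TABLE_BITNR = 2,
    EX_OL_MAX_BITNR            = 3,

    EX_OL_MAIN_HEADER    = 1 << EX_OL_MAIN_HEADER_BITNR,
    EX_OL_ACTIVE_L1      = 1 << EX_OL_ACTIVE_L1_BITNR,
    EX_OL_REFCOUNT_TABLE = 1 << EX_OL_REFCOUNT_TABLE_BITNR,
};

#define EX_OL_ALL \
    (EX_OL_MAIN_HEADER | EX_OL_ACTIVE_L1 | EX_OL_REFCOUNT_TABLE)

/* Names indexed by bit number, mirroring a designated-initializer table. */
static const char *const ex_ol_names[EX_OL_MAX_BITNR] = {
    [EX_OL_MAIN_HEADER_BITNR]    = "header",
    [EX_OL_ACTIVE_L1_BITNR]      = "active L1 table",
    [EX_OL_REFCOUNT_TABLE_BITNR] = "refcount table",
};

/* Bit number of the lowest set bit below EX_OL_MAX_BITNR, or -1. */
static int ex_lowest_bitnr(int mask)
{
    for (int bit = 0; bit < EX_OL_MAX_BITNR; bit++) {
        if (mask & (1 << bit)) {
            return bit;
        }
    }
    return -1;
}

/* Human-readable name for a single-bit overlap result, or NULL for 0. */
static const char *ex_overlap_name(int mask)
{
    int bit = ex_lowest_bitnr(mask);
    assert(bit < EX_OL_MAX_BITNR);
    return bit < 0 ? NULL : ex_ol_names[bit];
}
```

Composite masks such as `EX_OL_ALL & ~EX_OL_MAIN_HEADER` then express "every check except one", which is the shape of the `chk = s->overlap_check & ~ign` computation in the refcount code.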
#define L1E_OFFSET_MASK 0x00ffffffffffff00ULL
#define L2E_OFFSET_MASK 0x00ffffffffffff00ULL
#define L2E_COMPRESSED_OFFSET_SIZE_MASK 0x3fffffffffffffffULL
@@ -324,6 +378,11 @@ static inline int64_t align_offset(int64_t offset, int n)
return offset;
}
static inline int64_t qcow2_vm_state_offset(BDRVQcowState *s)
{
return (int64_t)s->l1_vm_state_index << (s->cluster_bits + s->l2_bits);
}
static inline int qcow2_get_cluster_type(uint64_t l2_entry)
{
if (l2_entry & QCOW_OFLAG_COMPRESSED) {
@@ -361,12 +420,17 @@ int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
int64_t sector_num, int nb_sectors);
int qcow2_mark_dirty(BlockDriverState *bs);
int qcow2_mark_corrupt(BlockDriverState *bs);
int qcow2_mark_consistent(BlockDriverState *bs);
int qcow2_update_header(BlockDriverState *bs);
/* qcow2-refcount.c functions */
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);
int qcow2_update_cluster_refcount(BlockDriverState *bs, int64_t cluster_index,
int addend, enum qcow2_discard_type type);
int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size);
int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
int nb_clusters);
@@ -385,9 +449,15 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
void qcow2_process_discards(BlockDriverState *bs, int ret);
int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
int64_t size);
int qcow2_pre_write_overlap_check(BlockDriverState *bs, int ign, int64_t offset,
int64_t size);
/* qcow2-cluster.c functions */
int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
bool exact_size);
int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index);
void qcow2_l2_cache_reset(BlockDriverState *bs);
int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
@@ -405,13 +475,18 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m);
int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
int nb_sectors);
int nb_sectors, enum qcow2_discard_type type);
int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
int qcow2_expand_zero_clusters(BlockDriverState *bs);
/* qcow2-snapshot.c functions */
int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);
int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id);
int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id);
int qcow2_snapshot_delete(BlockDriverState *bs,
const char *snapshot_id,
const char *name,
Error **errp);
int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab);
int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name);
@@ -428,6 +503,8 @@ int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
Qcow2Cache *dependency);
void qcow2_cache_depends_on_flush(Qcow2Cache *c);
int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c);
int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
void **table);
int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,

View File

@@ -353,10 +353,10 @@ static void qed_start_need_check_timer(BDRVQEDState *s)
{
trace_qed_start_need_check_timer(s);
/* Use vm_clock so we don't alter the image file while suspended for
/* Use QEMU_CLOCK_VIRTUAL so we don't alter the image file while suspended for
* migration.
*/
qemu_mod_timer(s->need_check_timer, qemu_get_clock_ns(vm_clock) +
timer_mod(s->need_check_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
get_ticks_per_sec() * QED_NEED_CHECK_TIMEOUT);
}
@@ -364,7 +364,7 @@ static void qed_start_need_check_timer(BDRVQEDState *s)
static void qed_cancel_need_check_timer(BDRVQEDState *s)
{
trace_qed_cancel_need_check_timer(s);
qemu_del_timer(s->need_check_timer);
timer_del(s->need_check_timer);
}
static void bdrv_qed_rebind(BlockDriverState *bs)
@@ -373,7 +373,8 @@ static void bdrv_qed_rebind(BlockDriverState *bs)
s->bs = bs;
}
static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags)
static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVQEDState *s = bs->opaque;
QEDHeader le_header;
@@ -494,7 +495,7 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags)
}
}
s->need_check_timer = qemu_new_timer_ns(vm_clock,
s->need_check_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
qed_need_check_timer_cb, s);
out:
@@ -518,7 +519,7 @@ static void bdrv_qed_close(BlockDriverState *bs)
BDRVQEDState *s = bs->opaque;
qed_cancel_need_check_timer(s);
qemu_free_timer(s->need_check_timer);
timer_free(s->need_check_timer);
/* Ensure writes reach stable storage */
bdrv_flush(bs->file);
@@ -550,16 +551,22 @@ static int qed_create(const char *filename, uint32_t cluster_size,
QEDHeader le_header;
uint8_t *l1_table = NULL;
size_t l1_size = header.cluster_size * header.table_size;
Error *local_err = NULL;
int ret = 0;
BlockDriverState *bs = NULL;
ret = bdrv_create_file(filename, NULL);
ret = bdrv_create_file(filename, NULL, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
ret = bdrv_file_open(&bs, filename, NULL, BDRV_O_RDWR | BDRV_O_CACHE_WB);
ret = bdrv_file_open(&bs, filename, NULL, BDRV_O_RDWR | BDRV_O_CACHE_WB,
&local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
return ret;
}
@@ -599,11 +606,12 @@ static int qed_create(const char *filename, uint32_t cluster_size,
ret = 0; /* success */
out:
g_free(l1_table);
bdrv_delete(bs);
bdrv_unref(bs);
return ret;
}
static int bdrv_qed_create(const char *filename, QEMUOptionParameter *options)
static int bdrv_qed_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
uint64_t image_size = 0;
uint32_t cluster_size = QED_DEFAULT_CLUSTER_SIZE;
@@ -652,45 +660,66 @@ static int bdrv_qed_create(const char *filename, QEMUOptionParameter *options)
}
typedef struct {
BlockDriverState *bs;
Coroutine *co;
int is_allocated;
uint64_t pos;
int64_t status;
int *pnum;
} QEDIsAllocatedCB;
static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t len)
{
QEDIsAllocatedCB *cb = opaque;
BDRVQEDState *s = cb->bs->opaque;
*cb->pnum = len / BDRV_SECTOR_SIZE;
cb->is_allocated = (ret == QED_CLUSTER_FOUND || ret == QED_CLUSTER_ZERO);
switch (ret) {
case QED_CLUSTER_FOUND:
offset |= qed_offset_into_cluster(s, cb->pos);
cb->status = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
break;
case QED_CLUSTER_ZERO:
cb->status = BDRV_BLOCK_ZERO;
break;
case QED_CLUSTER_L2:
case QED_CLUSTER_L1:
cb->status = 0;
break;
default:
assert(ret < 0);
cb->status = ret;
break;
}
if (cb->co) {
qemu_coroutine_enter(cb->co, NULL);
}
}
static int coroutine_fn bdrv_qed_co_is_allocated(BlockDriverState *bs,
static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
{
BDRVQEDState *s = bs->opaque;
uint64_t pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE;
size_t len = (size_t)nb_sectors * BDRV_SECTOR_SIZE;
QEDIsAllocatedCB cb = {
.is_allocated = -1,
.bs = bs,
.pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE,
.status = BDRV_BLOCK_OFFSET_MASK,
.pnum = pnum,
};
QEDRequest request = { .l2_table = NULL };
qed_find_cluster(s, &request, pos, len, qed_is_allocated_cb, &cb);
qed_find_cluster(s, &request, cb.pos, len, qed_is_allocated_cb, &cb);
/* Now sleep if the callback wasn't invoked immediately */
while (cb.is_allocated == -1) {
while (cb.status == BDRV_BLOCK_OFFSET_MASK) {
cb.co = qemu_coroutine_self();
qemu_coroutine_yield();
}
qed_unref_l2_cache_entry(request.l2_table);
return cb.is_allocated;
return cb.status;
}
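The change from an `int` "is allocated" result to an `int64_t` block status packs flag bits and a sector-aligned host offset into one value, with negative values reserved for errors. The sketch below illustrates the packing only. The `EX_BLOCK_*` values and the helper names are assumptions made for this example and are not taken from this diff; the real `BDRV_BLOCK_*` constants are defined in QEMU's block headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: flags in the low bits, offset in the sector-aligned rest. */
#define EX_SECTOR_SIZE        512
#define EX_BLOCK_DATA         1
#define EX_BLOCK_ZERO         2
#define EX_BLOCK_OFFSET_VALID 4
#define EX_BLOCK_OFFSET_MASK  (~(int64_t)(EX_SECTOR_SIZE - 1))

/* Status for a data extent stored at a known, sector-aligned host offset. */
static int64_t ex_status_data(int64_t host_offset)
{
    assert(host_offset >= 0 && (host_offset & ~EX_BLOCK_OFFSET_MASK) == 0);
    return EX_BLOCK_DATA | EX_BLOCK_OFFSET_VALID | host_offset;
}

/* Negative status values carry -errno. */
static bool ex_status_is_error(int64_t status)
{
    return status < 0;
}

/* Recover the host offset from a status that advertises a valid offset. */
static int64_t ex_status_offset(int64_t status)
{
    assert(status >= 0 && (status & EX_BLOCK_OFFSET_VALID));
    return status & EX_BLOCK_OFFSET_MASK;
}
```

Because the offset is sector-aligned, its low bits are always zero and can carry the flags without losing information.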
static int bdrv_qed_make_empty(BlockDriverState *bs)
@@ -1526,7 +1555,7 @@ static void bdrv_qed_invalidate_cache(BlockDriverState *bs)
bdrv_qed_close(bs);
memset(s, 0, sizeof(BDRVQEDState));
bdrv_qed_open(bs, NULL, bs->open_flags);
bdrv_qed_open(bs, NULL, bs->open_flags, NULL);
}
static int bdrv_qed_check(BlockDriverState *bs, BdrvCheckResult *result,
@@ -1575,7 +1604,7 @@ static BlockDriver bdrv_qed = {
.bdrv_reopen_prepare = bdrv_qed_reopen_prepare,
.bdrv_create = bdrv_qed_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_is_allocated = bdrv_qed_co_is_allocated,
.bdrv_co_get_block_status = bdrv_qed_co_get_block_status,
.bdrv_make_empty = bdrv_qed_make_empty,
.bdrv_aio_readv = bdrv_qed_aio_readv,
.bdrv_aio_writev = bdrv_qed_aio_writev,

View File

@@ -100,7 +100,7 @@ typedef struct {
/* if (features & QED_F_BACKING_FILE) */
uint32_t backing_filename_offset; /* in bytes from start of header */
uint32_t backing_filename_size; /* in bytes */
} QEDHeader;
} QEMU_PACKED QEDHeader;
typedef struct {
uint64_t offsets[0]; /* in bytes */

View File

@@ -276,7 +276,7 @@ static QemuOptsList raw_runtime_opts = {
};
static int raw_open_common(BlockDriverState *bs, QDict *options,
int bdrv_flags, int open_flags)
int bdrv_flags, int open_flags, Error **errp)
{
BDRVRawState *s = bs->opaque;
QemuOpts *opts;
@@ -287,8 +287,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
opts = qemu_opts_create_nofail(&raw_runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
@@ -297,6 +296,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
ret = raw_normalize_devicepath(&filename);
if (ret != 0) {
error_setg_errno(errp, -ret, "Could not normalize device path");
goto fail;
}
@@ -310,6 +310,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
if (ret == -EROFS) {
ret = -EACCES;
}
error_setg_errno(errp, -ret, "Could not open file");
goto fail;
}
s->fd = fd;
@@ -318,6 +319,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
if (raw_set_aio(&s->aio_ctx, &s->use_aio, bdrv_flags)) {
qemu_close(fd);
ret = -errno;
error_setg_errno(errp, -ret, "Could not set AIO state");
goto fail;
}
#endif
@@ -335,12 +337,19 @@ fail:
return ret;
}
static int raw_open(BlockDriverState *bs, QDict *options, int flags)
static int raw_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRawState *s = bs->opaque;
Error *local_err = NULL;
int ret;
s->type = FTYPE_FILE;
return raw_open_common(bs, options, flags, 0);
ret = raw_open_common(bs, options, flags, 0, &local_err);
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
}
return ret;
}
static int raw_reopen_prepare(BDRVReopenState *state,
@@ -365,6 +374,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
* valid in the 'false' condition even if aio_ctx is set, and raw_set_aio()
* won't override aio_ctx if aio_ctx is non-NULL */
if (raw_set_aio(&s->aio_ctx, &raw_s->use_aio, state->flags)) {
error_setg(errp, "Could not set AIO state");
return -1;
}
#endif
@@ -416,6 +426,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
assert(!(raw_s->open_flags & O_CREAT));
raw_s->fd = qemu_open(state->bs->filename, raw_s->open_flags);
if (raw_s->fd == -1) {
error_setg_errno(errp, errno, "Could not reopen file");
ret = -1;
}
}
@@ -1040,7 +1051,8 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
return (int64_t)st.st_blocks * 512;
}
static int raw_create(const char *filename, QEMUOptionParameter *options)
static int raw_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int fd;
int result = 0;
@@ -1058,12 +1070,15 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
0644);
if (fd < 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not create file");
} else {
if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not resize file");
}
if (qemu_close(fd) != 0) {
result = -errno;
error_setg_errno(errp, -result, "Could not close the new file");
}
}
return result;
@@ -1084,12 +1099,12 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
* 'nb_sectors' is the max value 'pnum' should be set to. If nb_sectors goes
* beyond the end of the disk image it will be clamped.
*/
static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
{
off_t start, data, hole;
int ret;
int64_t ret;
ret = fd_open(bs);
if (ret < 0) {
@@ -1097,6 +1112,7 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
}
start = sector_num * BDRV_SECTOR_SIZE;
ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
#ifdef CONFIG_FIEMAP
@@ -1114,7 +1130,7 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) {
/* Assume everything is allocated. */
*pnum = nb_sectors;
return 1;
return ret;
}
if (f.fm.fm_mapped_extents == 0) {
@@ -1127,6 +1143,9 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
} else {
data = f.fe.fe_logical;
hole = f.fe.fe_logical + f.fe.fe_length;
if (f.fe.fe_flags & FIEMAP_EXTENT_UNWRITTEN) {
ret |= BDRV_BLOCK_ZERO;
}
}
#elif defined SEEK_HOLE && defined SEEK_DATA
@@ -1141,7 +1160,7 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
/* Most likely EINVAL. Assume everything is allocated. */
*pnum = nb_sectors;
return 1;
return ret;
}
if (hole > start) {
@@ -1154,19 +1173,21 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
}
}
#else
*pnum = nb_sectors;
return 1;
data = 0;
hole = start + nb_sectors * BDRV_SECTOR_SIZE;
#endif
if (data <= start) {
/* On a data extent, compute sectors to the end of the extent. */
*pnum = MIN(nb_sectors, (hole - start) / BDRV_SECTOR_SIZE);
return 1;
} else {
/* On a hole, compute sectors to the beginning of the next extent. */
*pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
return 0;
ret &= ~BDRV_BLOCK_DATA;
ret |= BDRV_BLOCK_ZERO;
}
return ret;
}
static coroutine_fn BlockDriverAIOCB *raw_aio_discard(BlockDriverState *bs,
@@ -1192,6 +1213,7 @@ static BlockDriver bdrv_file = {
.format_name = "file",
.protocol_name = "file",
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_probe = NULL, /* no probe for protocols */
.bdrv_file_open = raw_open,
.bdrv_reopen_prepare = raw_reopen_prepare,
@@ -1200,7 +1222,7 @@ static BlockDriver bdrv_file = {
.bdrv_close = raw_close,
.bdrv_create = raw_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_is_allocated = raw_co_is_allocated,
.bdrv_co_get_block_status = raw_co_get_block_status,
.bdrv_aio_readv = raw_aio_readv,
.bdrv_aio_writev = raw_aio_writev,
@@ -1325,9 +1347,11 @@ static int check_hdev_writable(BDRVRawState *s)
return 0;
}
static int hdev_open(BlockDriverState *bs, QDict *options, int flags)
static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRawState *s = bs->opaque;
Error *local_err = NULL;
int ret;
const char *filename = qdict_get_str(options, "filename");
@@ -1371,8 +1395,11 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags)
}
#endif
ret = raw_open_common(bs, options, flags, 0);
ret = raw_open_common(bs, options, flags, 0, &local_err);
if (ret < 0) {
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
}
return ret;
}
@@ -1380,6 +1407,7 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags)
ret = check_hdev_writable(s);
if (ret < 0) {
raw_close(bs);
error_setg_errno(errp, -ret, "The device is not writable");
return ret;
}
}
@@ -1498,7 +1526,8 @@ static coroutine_fn BlockDriverAIOCB *hdev_aio_discard(BlockDriverState *bs,
cb, opaque, QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
}
static int hdev_create(const char *filename, QEMUOptionParameter *options)
static int hdev_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int fd;
int ret = 0;
@@ -1514,15 +1543,23 @@ static int hdev_create(const char *filename, QEMUOptionParameter *options)
}
fd = qemu_open(filename, O_WRONLY | O_BINARY);
if (fd < 0)
return -errno;
if (fstat(fd, &stat_buf) < 0)
if (fd < 0) {
ret = -errno;
else if (!S_ISBLK(stat_buf.st_mode) && !S_ISCHR(stat_buf.st_mode))
error_setg_errno(errp, -ret, "Could not open device");
return ret;
}
if (fstat(fd, &stat_buf) < 0) {
ret = -errno;
error_setg_errno(errp, -ret, "Could not stat device");
} else if (!S_ISBLK(stat_buf.st_mode) && !S_ISCHR(stat_buf.st_mode)) {
error_setg(errp,
"The given file is neither a block nor a character device");
ret = -ENODEV;
else if (lseek(fd, 0, SEEK_END) < total_size * BDRV_SECTOR_SIZE)
} else if (lseek(fd, 0, SEEK_END) < total_size * BDRV_SECTOR_SIZE) {
error_setg(errp, "Device is too small");
ret = -ENOSPC;
}
qemu_close(fd);
return ret;
@@ -1532,6 +1569,7 @@ static BlockDriver bdrv_host_device = {
.format_name = "host_device",
.protocol_name = "host_device",
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_probe_device = hdev_probe_device,
.bdrv_file_open = hdev_open,
.bdrv_close = raw_close,
@@ -1559,17 +1597,23 @@ static BlockDriver bdrv_host_device = {
};
#ifdef __linux__
static int floppy_open(BlockDriverState *bs, QDict *options, int flags)
static int floppy_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRawState *s = bs->opaque;
Error *local_err = NULL;
int ret;
s->type = FTYPE_FD;
/* open will not fail even if no floppy is inserted, so add O_NONBLOCK */
ret = raw_open_common(bs, options, flags, O_NONBLOCK);
if (ret)
ret = raw_open_common(bs, options, flags, O_NONBLOCK, &local_err);
if (ret) {
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
}
return ret;
}
/* close fd so that we can reopen it as needed */
qemu_close(s->fd);
@@ -1656,6 +1700,7 @@ static BlockDriver bdrv_host_floppy = {
.format_name = "host_floppy",
.protocol_name = "host_floppy",
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_probe_device = floppy_probe_device,
.bdrv_file_open = floppy_open,
.bdrv_close = raw_close,
@@ -1670,7 +1715,8 @@ static BlockDriver bdrv_host_floppy = {
.bdrv_aio_flush = raw_aio_flush,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
.bdrv_getlength = raw_getlength,
.has_variable_length = true,
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
@@ -1680,14 +1726,21 @@ static BlockDriver bdrv_host_floppy = {
.bdrv_eject = floppy_eject,
};
static int cdrom_open(BlockDriverState *bs, QDict *options, int flags)
static int cdrom_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRawState *s = bs->opaque;
Error *local_err = NULL;
int ret;
s->type = FTYPE_CD;
/* open will not fail even if no CD is inserted, so add O_NONBLOCK */
return raw_open_common(bs, options, flags, O_NONBLOCK);
ret = raw_open_common(bs, options, flags, O_NONBLOCK, &local_err);
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
}
return ret;
}
static int cdrom_probe_device(const char *filename)
@@ -1757,6 +1810,7 @@ static BlockDriver bdrv_host_cdrom = {
.format_name = "host_cdrom",
.protocol_name = "host_cdrom",
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_probe_device = cdrom_probe_device,
.bdrv_file_open = cdrom_open,
.bdrv_close = raw_close,
@@ -1771,7 +1825,8 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_aio_flush = raw_aio_flush,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
.bdrv_getlength = raw_getlength,
.has_variable_length = true,
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
@@ -1790,13 +1845,18 @@ static BlockDriver bdrv_host_cdrom = {
static int cdrom_open(BlockDriverState *bs, QDict *options, int flags)
{
BDRVRawState *s = bs->opaque;
Error *local_err = NULL;
int ret;
s->type = FTYPE_CD;
ret = raw_open_common(bs, options, flags, 0);
if (ret)
ret = raw_open_common(bs, options, flags, 0, &local_err);
if (ret) {
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
}
return ret;
}
/* make sure the door isn't locked at this time */
ioctl(s->fd, CDIOCALLOW);
@@ -1878,6 +1938,7 @@ static BlockDriver bdrv_host_cdrom = {
.format_name = "host_cdrom",
.protocol_name = "host_cdrom",
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_probe_device = cdrom_probe_device,
.bdrv_file_open = cdrom_open,
.bdrv_close = raw_close,
@@ -1892,7 +1953,8 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_aio_flush = raw_aio_flush,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
.bdrv_getlength = raw_getlength,
.has_variable_length = true,
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
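
The raw_co_get_block_status hunks above switch the return type to int64_t so that one value can carry both status flags and a sector-aligned byte offset. What follows is a minimal sketch of that packing, not QEMU's code: the SK_* constants and helper names are hypothetical placeholders, and the flag values are assumptions chosen only so that they fit below the sector size.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration only (assumed, not taken from QEMU). */
#define SK_SECTOR_BITS   9
#define SK_SECTOR_SIZE   (1LL << SK_SECTOR_BITS)
#define SK_DATA          0x01
#define SK_ZERO          0x02
#define SK_OFFSET_VALID  0x04
#define SK_OFFSET_MASK   (~(SK_SECTOR_SIZE - 1))

/* Pack flags and the byte offset of sector_num into one 64-bit value. */
static int64_t sk_pack_status(int64_t sector_num, int is_hole)
{
    int64_t start = sector_num * SK_SECTOR_SIZE;
    int64_t ret = SK_DATA | SK_OFFSET_VALID | start;

    /* The offset is sector aligned, so its low bits are free for flags. */
    assert((start & ~SK_OFFSET_MASK) == 0);

    if (is_hole) {
        /* Same adjustment as the hole branch in the hunk above. */
        ret &= ~SK_DATA;
        ret |= SK_ZERO;
    }
    return ret;
}

/* Recover the byte offset from a packed status value. */
static int64_t sk_status_offset(int64_t status)
{
    return status & SK_OFFSET_MASK;
}

/* Recover the flag bits from a packed status value. */
static int sk_status_flags(int64_t status)
{
    return (int)(status & ~SK_OFFSET_MASK);
}
```

A caller of such a function would mask the result once to get the offset and once to get the flags, which is why the offset has to stay a multiple of the sector size.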


@@ -85,6 +85,7 @@ static size_t handle_aiocb_rw(RawWin32AIOData *aiocb)
ret_count = 0;
}
if (ret_count != len) {
offset += ret_count;
break;
}
offset += len;
@@ -234,7 +235,8 @@ static QemuOptsList raw_runtime_opts = {
},
};
static int raw_open(BlockDriverState *bs, QDict *options, int flags)
static int raw_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRawState *s = bs->opaque;
int access_flags;
@@ -249,8 +251,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags)
opts = qemu_opts_create_nofail(&raw_runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
@@ -262,6 +263,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags)
if ((flags & BDRV_O_NATIVE_AIO) && aio == NULL) {
aio = win32_aio_init();
if (aio == NULL) {
error_setg(errp, "Could not initialize AIO");
ret = -EINVAL;
goto fail;
}
@@ -278,6 +280,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags)
} else {
ret = -EINVAL;
}
error_setg_errno(errp, -ret, "Could not open file");
goto fail;
}
@@ -285,6 +288,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags)
ret = win32_aio_attach(aio, s->hfile);
if (ret < 0) {
CloseHandle(s->hfile);
error_setg_errno(errp, -ret, "Could not enable AIO");
goto fail;
}
s->aio = aio;
@@ -420,7 +424,8 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
return st.st_size;
}
static int raw_create(const char *filename, QEMUOptionParameter *options)
static int raw_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int fd;
int64_t total_size = 0;
@@ -435,8 +440,10 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
if (fd < 0)
if (fd < 0) {
error_setg_errno(errp, errno, "Could not create file");
return -EIO;
}
set_sparse(fd);
ftruncate(fd, total_size * 512);
qemu_close(fd);
@@ -456,6 +463,7 @@ static BlockDriver bdrv_file = {
.format_name = "file",
.protocol_name = "file",
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_file_open = raw_open,
.bdrv_close = raw_close,
.bdrv_create = raw_create,
@@ -531,7 +539,8 @@ static int hdev_probe_device(const char *filename)
return 0;
}
static int hdev_open(BlockDriverState *bs, QDict *options, int flags)
static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRawState *s = bs->opaque;
int access_flags, create_flags;
@@ -545,8 +554,7 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags)
QemuOpts *opts = qemu_opts_create_nofail(&raw_runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (error_is_set(&local_err)) {
qerror_report_err(local_err);
error_free(local_err);
error_propagate(errp, local_err);
ret = -EINVAL;
goto done;
}
@@ -555,6 +563,7 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags)
if (strstart(filename, "/dev/cdrom", NULL)) {
if (find_cdrom(device_name, sizeof(device_name)) < 0) {
error_setg(errp, "Could not open CD-ROM drive");
ret = -ENOENT;
goto done;
}
@@ -583,8 +592,9 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags)
if (err == ERROR_ACCESS_DENIED) {
ret = -EACCES;
} else {
ret = -1;
ret = -EINVAL;
}
error_setg_errno(errp, -ret, "Could not open device");
goto done;
}
@@ -597,6 +607,7 @@ static BlockDriver bdrv_host_device = {
.format_name = "host_device",
.protocol_name = "host_device",
.instance_size = sizeof(BDRVRawState),
.bdrv_needs_filename = true,
.bdrv_probe_device = hdev_probe_device,
.bdrv_file_open = hdev_open,
.bdrv_close = raw_close,
@@ -605,7 +616,9 @@ static BlockDriver bdrv_host_device = {
.bdrv_aio_writev = raw_aio_writev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_getlength = raw_getlength,
.bdrv_getlength = raw_getlength,
.has_variable_length = true,
.bdrv_get_allocated_file_size
= raw_get_allocated_file_size,
};
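
Most of the open and create hunks above follow one pattern: the callee reports into a local error object, and the caller hands that object to its own errp only when the call failed. A minimal sketch of the pattern, assuming a stand-in SkError type and sk_* helpers that are hypothetical and only mimic the shape of the Error calls seen in the diff:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error. */
typedef struct SkError {
    char msg[128];
} SkError;

/* Store a new error in *errp; a NULL errp means the caller ignores errors. */
static void sk_error_set(SkError **errp, const char *msg)
{
    SkError *e;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);          /* never overwrite a pending error */
    e = calloc(1, sizeof(*e));
    if (!e) {
        abort();
    }
    snprintf(e->msg, sizeof(e->msg), "%s", msg);
    *errp = e;
}

/* Hand local_err to the caller, or free it if the caller cannot take it. */
static void sk_error_propagate(SkError **dst, SkError *local_err)
{
    if (!local_err) {
        return;
    }
    if (dst && !*dst) {
        *dst = local_err;
    } else {
        free(local_err);
    }
}

/* Inner helper that may fail (simulated through the 'fail' argument). */
static int sk_open_common(int fail, SkError **errp)
{
    if (fail) {
        sk_error_set(errp, "Could not open file");
        return -1;
    }
    return 0;
}

/* Outer function: collect the error locally, then propagate on failure. */
static int sk_open_device(int fail, SkError **errp)
{
    SkError *local_err = NULL;
    int ret = sk_open_common(fail, &local_err);

    if (ret < 0) {
        sk_error_propagate(errp, local_err);
        return ret;
    }
    return 0;
}
```

Collecting into a local object first lets the outer function inspect or discard the error before deciding whether its own caller should see it.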


@@ -1,13 +1,17 @@
/*
* Block driver for RAW format
/* BlockDriver implementation for "raw"
*
* Copyright (c) 2006 Fabrice Bellard
* Copyright (C) 2010, 2013, Red Hat, Inc.
* Copyright (C) 2010, Blue Swirl <blauwirbel@gmail.com>
* Copyright (C) 2009, Anthony Liguori <aliguori@us.ibm.com>
*
* Author:
* Laszlo Ersek <lersek@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* of this software and associated documentation files (the "Software"), to
* deal in the Software without restriction, including without limitation the
* rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
* sell copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
@@ -15,27 +19,27 @@
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/option.h"
static int raw_open(BlockDriverState *bs, QDict *options, int flags)
{
bs->sg = bs->file->sg;
return 0;
}
static QEMUOptionParameter raw_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ 0 }
};
/* We have nothing to do for raw reopen, stubs just return
* success */
static int raw_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
static int raw_reopen_prepare(BDRVReopenState *reopen_state,
BlockReopenQueue *queue, Error **errp)
{
return 0;
}
@@ -54,45 +58,42 @@ static int coroutine_fn raw_co_writev(BlockDriverState *bs, int64_t sector_num,
return bdrv_co_writev(bs->file, sector_num, nb_sectors, qiov);
}
static void raw_close(BlockDriverState *bs)
{
}
static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
{
return bdrv_co_is_allocated(bs->file, sector_num, nb_sectors, pnum);
*pnum = nb_sectors;
return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
}
static int coroutine_fn raw_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors)
int64_t sector_num, int nb_sectors)
{
return bdrv_co_write_zeroes(bs->file, sector_num, nb_sectors);
}
static int coroutine_fn raw_co_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors)
{
return bdrv_co_discard(bs->file, sector_num, nb_sectors);
}
static int64_t raw_getlength(BlockDriverState *bs)
{
return bdrv_getlength(bs->file);
}
static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
return bdrv_get_info(bs->file, bdi);
}
static int raw_truncate(BlockDriverState *bs, int64_t offset)
{
return bdrv_truncate(bs->file, offset);
}
static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
{
return 1; /* everything can be opened as raw image */
}
static int coroutine_fn raw_co_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors)
{
return bdrv_co_discard(bs->file, sector_num, nb_sectors);
}
static int raw_is_inserted(BlockDriverState *bs)
{
return bdrv_is_inserted(bs->file);
@@ -115,73 +116,78 @@ static void raw_lock_medium(BlockDriverState *bs, bool locked)
static int raw_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
{
return bdrv_ioctl(bs->file, req, buf);
return bdrv_ioctl(bs->file, req, buf);
}
static BlockDriverAIOCB *raw_aio_ioctl(BlockDriverState *bs,
unsigned long int req, void *buf,
BlockDriverCompletionFunc *cb, void *opaque)
unsigned long int req, void *buf,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return bdrv_aio_ioctl(bs->file, req, buf, cb, opaque);
return bdrv_aio_ioctl(bs->file, req, buf, cb, opaque);
}
static int raw_create(const char *filename, QEMUOptionParameter *options)
{
return bdrv_create_file(filename, options);
}
static QEMUOptionParameter raw_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{ NULL }
};
static int raw_has_zero_init(BlockDriverState *bs)
{
return bdrv_has_zero_init(bs->file);
}
static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
static int raw_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
return bdrv_get_info(bs->file, bdi);
Error *local_err = NULL;
int ret;
ret = bdrv_create_file(filename, options, &local_err);
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
}
return ret;
}
static int raw_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
bs->sg = bs->file->sg;
return 0;
}
static void raw_close(BlockDriverState *bs)
{
}
static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
{
/* smallest possible positive score so that raw is used if and only if no
* other block driver works
*/
return 1;
}
static BlockDriver bdrv_raw = {
.format_name = "raw",
/* It's really 0, but we need to make g_malloc() happy */
.instance_size = 1,
.bdrv_open = raw_open,
.bdrv_close = raw_close,
.bdrv_reopen_prepare = raw_reopen_prepare,
.bdrv_co_readv = raw_co_readv,
.bdrv_co_writev = raw_co_writev,
.bdrv_co_is_allocated = raw_co_is_allocated,
.bdrv_co_write_zeroes = raw_co_write_zeroes,
.bdrv_co_discard = raw_co_discard,
.bdrv_probe = raw_probe,
.bdrv_getlength = raw_getlength,
.bdrv_get_info = raw_get_info,
.bdrv_truncate = raw_truncate,
.bdrv_is_inserted = raw_is_inserted,
.bdrv_media_changed = raw_media_changed,
.bdrv_eject = raw_eject,
.bdrv_lock_medium = raw_lock_medium,
.bdrv_ioctl = raw_ioctl,
.bdrv_aio_ioctl = raw_aio_ioctl,
.bdrv_create = raw_create,
.create_options = raw_create_options,
.bdrv_has_zero_init = raw_has_zero_init,
.format_name = "raw",
.bdrv_probe = &raw_probe,
.bdrv_reopen_prepare = &raw_reopen_prepare,
.bdrv_open = &raw_open,
.bdrv_close = &raw_close,
.bdrv_create = &raw_create,
.bdrv_co_readv = &raw_co_readv,
.bdrv_co_writev = &raw_co_writev,
.bdrv_co_write_zeroes = &raw_co_write_zeroes,
.bdrv_co_discard = &raw_co_discard,
.bdrv_co_get_block_status = &raw_co_get_block_status,
.bdrv_truncate = &raw_truncate,
.bdrv_getlength = &raw_getlength,
.has_variable_length = true,
.bdrv_get_info = &raw_get_info,
.bdrv_is_inserted = &raw_is_inserted,
.bdrv_media_changed = &raw_media_changed,
.bdrv_eject = &raw_eject,
.bdrv_lock_medium = &raw_lock_medium,
.bdrv_ioctl = &raw_ioctl,
.bdrv_aio_ioctl = &raw_aio_ioctl,
.create_options = &raw_create_options[0],
.bdrv_has_zero_init = &raw_has_zero_init
};
static void bdrv_raw_init(void)


@@ -100,7 +100,6 @@ typedef struct BDRVRBDState {
rados_ioctx_t io_ctx;
rbd_image_t image;
char name[RBD_MAX_IMAGE_NAME_SIZE];
int qemu_aio_count;
char *snap;
int event_reader_pos;
RADOSCB *event_rcb;
@@ -288,7 +287,8 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf)
return ret;
}
static int qemu_rbd_create(const char *filename, QEMUOptionParameter *options)
static int qemu_rbd_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int64_t bytes = 0;
int64_t objsize;
@@ -428,19 +428,11 @@ static void qemu_rbd_aio_event_reader(void *opaque)
if (s->event_reader_pos == sizeof(s->event_rcb)) {
s->event_reader_pos = 0;
qemu_rbd_complete_aio(s->event_rcb);
s->qemu_aio_count--;
}
}
} while (ret < 0 && errno == EINTR);
}
static int qemu_rbd_aio_flush_cb(void *opaque)
{
BDRVRBDState *s = opaque;
return (s->qemu_aio_count > 0);
}
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "rbd",
@@ -455,7 +447,8 @@ static QemuOptsList runtime_opts = {
},
};
static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags)
static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
char pool[RBD_MAX_POOL_NAME_SIZE];
@@ -554,7 +547,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags)
fcntl(s->fds[0], F_SETFL, O_NONBLOCK);
fcntl(s->fds[1], F_SETFL, O_NONBLOCK);
qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], qemu_rbd_aio_event_reader,
NULL, qemu_rbd_aio_flush_cb, s);
NULL, s);
qemu_opts_del(opts);
@@ -578,7 +571,7 @@ static void qemu_rbd_close(BlockDriverState *bs)
close(s->fds[0]);
close(s->fds[1]);
qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], NULL, NULL, NULL, NULL);
qemu_aio_set_fd_handler(s->fds[RBD_FD_READ], NULL, NULL, NULL);
rbd_close(s->image);
rados_ioctx_destroy(s->io_ctx);
@@ -741,8 +734,6 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
off = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
s->qemu_aio_count++; /* All the RADOSCB */
rcb = g_malloc(sizeof(RADOSCB));
rcb->done = 0;
rcb->acb = acb;
@@ -779,7 +770,6 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
failed:
g_free(rcb);
s->qemu_aio_count--;
qemu_aio_release(acb);
return NULL;
}
@@ -903,12 +893,31 @@ static int qemu_rbd_snap_create(BlockDriverState *bs,
}
static int qemu_rbd_snap_remove(BlockDriverState *bs,
const char *snapshot_name)
const char *snapshot_id,
const char *snapshot_name,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
int r;
if (!snapshot_name) {
error_setg(errp, "rbd need a valid snapshot name");
return -EINVAL;
}
/* If snapshot_id is specified, it must be equal to name, see
qemu_rbd_snap_list() */
if (snapshot_id && strcmp(snapshot_id, snapshot_name)) {
error_setg(errp,
"rbd do not support snapshot id, it should be NULL or "
"equal to snapshot name");
return -EINVAL;
}
r = rbd_snap_remove(s->image, snapshot_name);
if (r < 0) {
error_setg_errno(errp, -r, "Failed to remove the snapshot");
}
return r;
}
@@ -994,6 +1003,7 @@ static QEMUOptionParameter qemu_rbd_create_options[] = {
static BlockDriver bdrv_rbd = {
.format_name = "rbd",
.instance_size = sizeof(BDRVRBDState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_rbd_open,
.bdrv_close = qemu_rbd_close,
.bdrv_create = qemu_rbd_create,


@@ -125,8 +125,9 @@ typedef struct SheepdogObjReq {
uint32_t data_length;
uint64_t oid;
uint64_t cow_oid;
uint32_t copies;
uint32_t rsvd;
uint8_t copies;
uint8_t copy_policy;
uint8_t reserved[6];
uint64_t offset;
} SheepdogObjReq;
@@ -138,7 +139,9 @@ typedef struct SheepdogObjRsp {
uint32_t id;
uint32_t data_length;
uint32_t result;
uint32_t copies;
uint8_t copies;
uint8_t copy_policy;
uint8_t reserved[2];
uint32_t pad[6];
} SheepdogObjRsp;
@@ -151,7 +154,9 @@ typedef struct SheepdogVdiReq {
uint32_t data_length;
uint64_t vdi_size;
uint32_t vdi_id;
uint32_t copies;
uint8_t copies;
uint8_t copy_policy;
uint8_t reserved[2];
uint32_t snapid;
uint32_t pad[3];
} SheepdogVdiReq;
@@ -222,6 +227,11 @@ static inline uint64_t data_oid_to_idx(uint64_t oid)
return oid & (MAX_DATA_OBJS - 1);
}
static inline uint32_t oid_to_vid(uint64_t oid)
{
return (oid & ~VDI_BIT) >> VDI_SPACE_SHIFT;
}
static inline uint64_t vid_to_vdi_oid(uint32_t vid)
{
return VDI_BIT | ((uint64_t)vid << VDI_SPACE_SHIFT);
@@ -289,11 +299,14 @@ struct SheepdogAIOCB {
Coroutine *coroutine;
void (*aio_done_func)(SheepdogAIOCB *);
bool canceled;
bool cancelable;
bool *finished;
int nr_pending;
};
typedef struct BDRVSheepdogState {
BlockDriverState *bs;
SheepdogInode inode;
uint32_t min_dirty_data_idx;
@@ -313,8 +326,11 @@ typedef struct BDRVSheepdogState {
Coroutine *co_recv;
uint32_t aioreq_seq_num;
/* Every aio request must be linked to either of these queues. */
QLIST_HEAD(inflight_aio_head, AIOReq) inflight_aio_head;
QLIST_HEAD(pending_aio_head, AIOReq) pending_aio_head;
QLIST_HEAD(failed_aio_head, AIOReq) failed_aio_head;
} BDRVSheepdogState;
static const char * sd_strerror(int err)
@@ -403,6 +419,7 @@ static inline void free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
{
SheepdogAIOCB *acb = aio_req->aiocb;
acb->cancelable = false;
QLIST_REMOVE(aio_req, aio_siblings);
g_free(aio_req);
@@ -411,23 +428,68 @@ static inline void free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
static void coroutine_fn sd_finish_aiocb(SheepdogAIOCB *acb)
{
if (!acb->canceled) {
qemu_coroutine_enter(acb->coroutine, NULL);
qemu_coroutine_enter(acb->coroutine, NULL);
if (acb->finished) {
*acb->finished = true;
}
qemu_aio_release(acb);
}
/*
* Check whether the specified acb can be canceled
*
* We can cancel aio when any request belonging to the acb is:
* - Not processed by the sheepdog server.
* - Not linked to the inflight queue.
*/
static bool sd_acb_cancelable(const SheepdogAIOCB *acb)
{
BDRVSheepdogState *s = acb->common.bs->opaque;
AIOReq *aioreq;
if (!acb->cancelable) {
return false;
}
QLIST_FOREACH(aioreq, &s->inflight_aio_head, aio_siblings) {
if (aioreq->aiocb == acb) {
return false;
}
}
return true;
}
static void sd_aio_cancel(BlockDriverAIOCB *blockacb)
{
SheepdogAIOCB *acb = (SheepdogAIOCB *)blockacb;
BDRVSheepdogState *s = acb->common.bs->opaque;
AIOReq *aioreq, *next;
bool finished = false;
/*
* Sheepdog cannot cancel the requests which are already sent to
* the servers, so we just complete the request with -EIO here.
*/
acb->ret = -EIO;
qemu_coroutine_enter(acb->coroutine, NULL);
acb->canceled = true;
acb->finished = &finished;
while (!finished) {
if (sd_acb_cancelable(acb)) {
/* Remove outstanding requests from pending and failed queues. */
QLIST_FOREACH_SAFE(aioreq, &s->pending_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
}
}
QLIST_FOREACH_SAFE(aioreq, &s->failed_aio_head, aio_siblings,
next) {
if (aioreq->aiocb == acb) {
free_aio_req(s, aioreq);
}
}
assert(acb->nr_pending == 0);
sd_finish_aiocb(acb);
return;
}
qemu_aio_wait();
}
}
static const AIOCBInfo sd_aiocb_info = {
@@ -448,7 +510,8 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
acb->nb_sectors = nb_sectors;
acb->aio_done_func = NULL;
acb->canceled = false;
acb->cancelable = true;
acb->finished = NULL;
acb->coroutine = qemu_coroutine_self();
acb->ret = 0;
acb->nr_pending = 0;
@@ -489,13 +552,13 @@ static coroutine_fn int send_co_req(int sockfd, SheepdogReq *hdr, void *data,
int ret;
ret = qemu_co_send(sockfd, hdr, sizeof(*hdr));
if (ret < sizeof(*hdr)) {
if (ret != sizeof(*hdr)) {
error_report("failed to send a req, %s", strerror(errno));
return ret;
}
ret = qemu_co_send(sockfd, data, *wlen);
if (ret < *wlen) {
if (ret != *wlen) {
error_report("failed to send a req, %s", strerror(errno));
}
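
The hunk above replaces `ret < sizeof(*hdr)` with `ret != sizeof(*hdr)`. One effect that can be checked in isolation: when a signed int is compared with a size_t, C converts the int to unsigned first, so a negative return value compares as a very large number and the `<` test does not fire. The sketch below spells that conversion out with explicit casts; the helper names are hypothetical, and it assumes size_t is at least as wide as int, as on mainstream platforms. It makes no claim about the patch author's stated motivation.

```c
#include <assert.h>
#include <stddef.h>

/* Old-style check: "did we transfer fewer bytes than wanted?"
 * The cast makes explicit what the implicit conversion does. */
static int sk_short_lt(int ret, size_t want)
{
    return (size_t)ret < want;
}

/* New-style check: anything other than the exact length is a failure. */
static int sk_short_ne(int ret, size_t want)
{
    return (size_t)ret != want;
}

/* Small demonstration of the difference for an error return of -1. */
static void sk_compare_demo(void)
{
    assert(sk_short_lt(-1, 48) == 0);   /* error goes unnoticed */
    assert(sk_short_ne(-1, 48) == 1);   /* error is caught */
    assert(sk_short_lt(10, 48) == 1);   /* short transfer: both agree */
    assert(sk_short_ne(10, 48) == 1);
}
```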
@@ -509,13 +572,6 @@ static void restart_co_req(void *opaque)
qemu_coroutine_enter(co, NULL);
}
static int have_co_req(void *opaque)
{
/* this handler is set only when there is a pending request, so
* always returns 1. */
return 1;
}
typedef struct SheepdogReqCo {
int sockfd;
SheepdogReq *hdr;
@@ -538,17 +594,17 @@ static coroutine_fn void do_co_req(void *opaque)
unsigned int *rlen = srco->rlen;
co = qemu_coroutine_self();
qemu_aio_set_fd_handler(sockfd, NULL, restart_co_req, have_co_req, co);
qemu_aio_set_fd_handler(sockfd, NULL, restart_co_req, co);
ret = send_co_req(sockfd, hdr, data, wlen);
if (ret < 0) {
goto out;
}
qemu_aio_set_fd_handler(sockfd, restart_co_req, NULL, have_co_req, co);
qemu_aio_set_fd_handler(sockfd, restart_co_req, NULL, co);
ret = qemu_co_recv(sockfd, hdr, sizeof(*hdr));
if (ret < sizeof(*hdr)) {
if (ret != sizeof(*hdr)) {
error_report("failed to get a rsp, %s", strerror(errno));
ret = -errno;
goto out;
@@ -560,7 +616,7 @@ static coroutine_fn void do_co_req(void *opaque)
if (*rlen) {
ret = qemu_co_recv(sockfd, data, *rlen);
if (ret < *rlen) {
if (ret != *rlen) {
error_report("failed to get the data, %s", strerror(errno));
ret = -errno;
goto out;
@@ -570,7 +626,7 @@ static coroutine_fn void do_co_req(void *opaque)
out:
/* there is at most one request for this sockfd, so it is safe to
* set each handler to NULL. */
qemu_aio_set_fd_handler(sockfd, NULL, NULL, NULL, NULL);
qemu_aio_set_fd_handler(sockfd, NULL, NULL, NULL);
srco->ret = ret;
srco->finished = true;
@@ -603,11 +659,13 @@ static int do_req(int sockfd, SheepdogReq *hdr, void *data,
return srco.ret;
}
static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
struct iovec *iov, int niov, bool create,
enum AIOCBState aiocb_type);
static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req);
static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req);
static int reload_inode(BDRVSheepdogState *s, uint32_t snapid, const char *tag);
static int get_sheep_fd(BDRVSheepdogState *s);
static void co_write_request(void *opaque);
static AIOReq *find_pending_req(BDRVSheepdogState *s, uint64_t oid)
{
@@ -630,22 +688,59 @@ static void coroutine_fn send_pending_req(BDRVSheepdogState *s, uint64_t oid)
{
AIOReq *aio_req;
SheepdogAIOCB *acb;
int ret;
while ((aio_req = find_pending_req(s, oid)) != NULL) {
acb = aio_req->aiocb;
/* move aio_req from pending list to inflight one */
QLIST_REMOVE(aio_req, aio_siblings);
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
ret = add_aio_request(s, aio_req, acb->qiov->iov,
acb->qiov->niov, false, acb->aiocb_type);
if (ret < 0) {
error_report("add_aio_request is failed");
free_aio_req(s, aio_req);
if (!acb->nr_pending) {
sd_finish_aiocb(acb);
}
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, false,
acb->aiocb_type);
}
}
static coroutine_fn void reconnect_to_sdog(void *opaque)
{
BDRVSheepdogState *s = opaque;
AIOReq *aio_req, *next;
qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL);
close(s->fd);
s->fd = -1;
/* Wait for outstanding write requests to be completed. */
while (s->co_send != NULL) {
co_write_request(opaque);
}
/* Try to reconnect the sheepdog server every one second. */
while (s->fd < 0) {
s->fd = get_sheep_fd(s);
if (s->fd < 0) {
DPRINTF("Wait for connection to be established\n");
co_aio_sleep_ns(bdrv_get_aio_context(s->bs), QEMU_CLOCK_REALTIME,
1000000000ULL);
}
};
/*
* Now we have to resend all the request in the inflight queue. However,
* resend_aioreq() can yield and newly created requests can be added to the
* inflight queue before the coroutine is resumed. To avoid mixing them, we
* have to move all the inflight requests to the failed queue before
* resend_aioreq() is called.
*/
QLIST_FOREACH_SAFE(aio_req, &s->inflight_aio_head, aio_siblings, next) {
QLIST_REMOVE(aio_req, aio_siblings);
QLIST_INSERT_HEAD(&s->failed_aio_head, aio_req, aio_siblings);
}
/* Resend all the failed aio requests. */
while (!QLIST_EMPTY(&s->failed_aio_head)) {
aio_req = QLIST_FIRST(&s->failed_aio_head);
QLIST_REMOVE(aio_req, aio_siblings);
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
resend_aioreq(s, aio_req);
}
}
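
The comment in reconnect_to_sdog() above explains why inflight requests are first moved to a separate failed queue: resending can yield, and requests created in the meantime land on the inflight queue and must not be resent. A minimal sketch of that two-phase move, using a hand-rolled singly linked list instead of QEMU's QLIST macros; all sk_* names and the callback shape are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SkReq {
    int id;
    int resent;                 /* how many times this request was resent */
    struct SkReq *next;
} SkReq;

typedef struct SkQueue {
    SkReq *head;
} SkQueue;

static void sk_push(SkQueue *q, SkReq *r)
{
    assert(r != NULL);
    r->next = q->head;
    q->head = r;
}

static SkReq *sk_pop(SkQueue *q)
{
    SkReq *r = q->head;

    if (r) {
        q->head = r->next;
        r->next = NULL;
    }
    return r;
}

/* Phase 1: drain inflight into failed. Phase 2: move each failed request
 * back to inflight and resend it. Requests that 'resend' adds to inflight
 * during phase 2 are never on the failed queue, so they are not resent. */
static void sk_resend_all(SkQueue *inflight, SkQueue *failed,
                          void (*resend)(SkReq *, void *), void *opaque)
{
    SkReq *r;

    while ((r = sk_pop(inflight)) != NULL) {
        sk_push(failed, r);
    }
    while ((r = sk_pop(failed)) != NULL) {
        sk_push(inflight, r);
        resend(r, opaque);
    }
}

/* Demo callback: marks the request and, once, queues a brand-new request
 * on inflight to imitate work arriving while the resend is in progress. */
typedef struct SkDemoCtx {
    SkQueue *inflight;
    SkReq *late;
} SkDemoCtx;

static void sk_demo_resend(SkReq *r, void *opaque)
{
    SkDemoCtx *c = opaque;

    r->resent++;
    if (c->late) {
        sk_push(c->inflight, c->late);
        c->late = NULL;
    }
}
```

Draining into a second queue before resending gives the loop a fixed set of work, so it terminates even if new requests keep arriving on the inflight queue.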
@@ -665,15 +760,11 @@ static void coroutine_fn aio_read_response(void *opaque)
SheepdogAIOCB *acb;
uint64_t idx;
if (QLIST_EMPTY(&s->inflight_aio_head)) {
goto out;
}
/* read a header */
ret = qemu_co_recv(fd, &rsp, sizeof(rsp));
if (ret < 0) {
if (ret != sizeof(rsp)) {
error_report("failed to get the header, %s", strerror(errno));
goto out;
goto err;
}
/* find the right aio_req from the inflight aio list */
@@ -684,7 +775,7 @@ static void coroutine_fn aio_read_response(void *opaque)
}
if (!aio_req) {
error_report("cannot find aio_req %x", rsp.id);
goto out;
goto err;
}
acb = aio_req->aiocb;
@@ -722,9 +813,9 @@ static void coroutine_fn aio_read_response(void *opaque)
case AIOCB_READ_UDATA:
ret = qemu_co_recvv(fd, acb->qiov->iov, acb->qiov->niov,
aio_req->iov_offset, rsp.data_length);
if (ret < 0) {
if (ret != rsp.data_length) {
error_report("failed to get the data, %s", strerror(errno));
goto out;
goto err;
}
break;
case AIOCB_FLUSH_CACHE:
@@ -755,11 +846,20 @@ static void coroutine_fn aio_read_response(void *opaque)
case SD_RES_SUCCESS:
break;
case SD_RES_READONLY:
ret = resend_aioreq(s, aio_req);
if (ret == SD_RES_SUCCESS) {
goto out;
if (s->inode.vdi_id == oid_to_vid(aio_req->oid)) {
ret = reload_inode(s, 0, "");
if (ret < 0) {
goto err;
}
}
/* fall through */
if (is_data_obj(aio_req->oid)) {
aio_req->oid = vid_to_data_oid(s->inode.vdi_id,
data_oid_to_idx(aio_req->oid));
} else {
aio_req->oid = vid_to_vdi_oid(s->inode.vdi_id);
}
resend_aioreq(s, aio_req);
goto out;
default:
acb->ret = -EIO;
error_report("%s", sd_strerror(rsp.result));
@@ -776,6 +876,10 @@ static void coroutine_fn aio_read_response(void *opaque)
}
out:
s->co_recv = NULL;
return;
err:
s->co_recv = NULL;
reconnect_to_sdog(opaque);
}
static void co_read_response(void *opaque)
@@ -796,14 +900,6 @@ static void co_write_request(void *opaque)
qemu_coroutine_enter(s->co_send, NULL);
}
static int aio_flush_request(void *opaque)
{
BDRVSheepdogState *s = opaque;
return !QLIST_EMPTY(&s->inflight_aio_head) ||
!QLIST_EMPTY(&s->pending_aio_head);
}
/*
* Return a socket descriptor to read/write objects.
*
@@ -819,7 +915,7 @@ static int get_sheep_fd(BDRVSheepdogState *s)
return fd;
}
qemu_aio_set_fd_handler(fd, co_read_response, NULL, aio_flush_request, s);
qemu_aio_set_fd_handler(fd, co_read_response, NULL, s);
return fd;
}
@@ -1012,7 +1108,7 @@ out:
return ret;
}
static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
struct iovec *iov, int niov, bool create,
enum AIOCBState aiocb_type)
{
@@ -1069,36 +1165,30 @@ static int coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
qemu_co_mutex_lock(&s->lock);
s->co_send = qemu_coroutine_self();
qemu_aio_set_fd_handler(s->fd, co_read_response, co_write_request,
aio_flush_request, s);
qemu_aio_set_fd_handler(s->fd, co_read_response, co_write_request, s);
socket_set_cork(s->fd, 1);
/* send a header */
ret = qemu_co_send(s->fd, &hdr, sizeof(hdr));
if (ret < 0) {
qemu_co_mutex_unlock(&s->lock);
if (ret != sizeof(hdr)) {
error_report("failed to send a req, %s", strerror(errno));
return -errno;
goto out;
}
if (wlen) {
ret = qemu_co_sendv(s->fd, iov, niov, aio_req->iov_offset, wlen);
if (ret < 0) {
qemu_co_mutex_unlock(&s->lock);
if (ret != wlen) {
error_report("failed to send a data, %s", strerror(errno));
return -errno;
}
}
out:
socket_set_cork(s->fd, 0);
qemu_aio_set_fd_handler(s->fd, co_read_response, NULL,
aio_flush_request, s);
qemu_aio_set_fd_handler(s->fd, co_read_response, NULL, s);
s->co_send = NULL;
qemu_co_mutex_unlock(&s->lock);
return 0;
}
static int read_write_object(int fd, char *buf, uint64_t oid, int copies,
static int read_write_object(int fd, char *buf, uint64_t oid, uint8_t copies,
unsigned int datalen, uint64_t offset,
bool write, bool create, uint32_t cache_flags)
{
@@ -1146,7 +1236,7 @@ static int read_write_object(int fd, char *buf, uint64_t oid, int copies,
}
}
static int read_object(int fd, char *buf, uint64_t oid, int copies,
static int read_object(int fd, char *buf, uint64_t oid, uint8_t copies,
unsigned int datalen, uint64_t offset,
uint32_t cache_flags)
{
@@ -1154,7 +1244,7 @@ static int read_object(int fd, char *buf, uint64_t oid, int copies,
false, cache_flags);
}
static int write_object(int fd, char *buf, uint64_t oid, int copies,
static int write_object(int fd, char *buf, uint64_t oid, uint8_t copies,
unsigned int datalen, uint64_t offset, bool create,
uint32_t cache_flags)
{
@@ -1198,51 +1288,62 @@ out:
return ret;
}
static int coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
/* Return true if the specified request is linked to the pending list. */
static bool check_simultaneous_create(BDRVSheepdogState *s, AIOReq *aio_req)
{
AIOReq *areq;
QLIST_FOREACH(areq, &s->inflight_aio_head, aio_siblings) {
if (areq != aio_req && areq->oid == aio_req->oid) {
/*
* Sheepdog cannot handle simultaneous create requests to the same
* object, so we cannot send the request until the previous request
* finishes.
*/
DPRINTF("simultaneous create to %" PRIx64 "\n", aio_req->oid);
aio_req->flags = 0;
aio_req->base_oid = 0;
QLIST_REMOVE(aio_req, aio_siblings);
QLIST_INSERT_HEAD(&s->pending_aio_head, aio_req, aio_siblings);
return true;
}
}
return false;
}
static void coroutine_fn resend_aioreq(BDRVSheepdogState *s, AIOReq *aio_req)
{
SheepdogAIOCB *acb = aio_req->aiocb;
bool create = false;
int ret;
ret = reload_inode(s, 0, "");
if (ret < 0) {
return ret;
}
aio_req->oid = vid_to_data_oid(s->inode.vdi_id,
data_oid_to_idx(aio_req->oid));
/* check whether this request becomes a CoW one */
if (acb->aiocb_type == AIOCB_WRITE_UDATA) {
if (acb->aiocb_type == AIOCB_WRITE_UDATA && is_data_obj(aio_req->oid)) {
int idx = data_oid_to_idx(aio_req->oid);
AIOReq *areq;
if (s->inode.data_vdi_id[idx] == 0) {
create = true;
goto out;
}
if (is_data_obj_writable(&s->inode, idx)) {
goto out;
}
/* link to the pending list if there is another CoW request to
* the same object */
QLIST_FOREACH(areq, &s->inflight_aio_head, aio_siblings) {
if (areq != aio_req && areq->oid == aio_req->oid) {
DPRINTF("simultaneous CoW to %" PRIx64 "\n", aio_req->oid);
QLIST_REMOVE(aio_req, aio_siblings);
QLIST_INSERT_HEAD(&s->pending_aio_head, aio_req, aio_siblings);
return SD_RES_SUCCESS;
}
if (check_simultaneous_create(s, aio_req)) {
return;
}
aio_req->base_oid = vid_to_data_oid(s->inode.data_vdi_id[idx], idx);
aio_req->flags |= SD_FLAG_CMD_COW;
if (s->inode.data_vdi_id[idx]) {
aio_req->base_oid = vid_to_data_oid(s->inode.data_vdi_id[idx], idx);
aio_req->flags |= SD_FLAG_CMD_COW;
}
create = true;
}
out:
return add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
create, acb->aiocb_type);
if (is_data_obj(aio_req->oid)) {
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, create,
acb->aiocb_type);
} else {
struct iovec iov;
iov.iov_base = &s->inode;
iov.iov_len = sizeof(s->inode);
add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
}
}
/* TODO Convert to fine grained options */
@@ -1259,7 +1360,8 @@ static QemuOptsList runtime_opts = {
},
};
static int sd_open(BlockDriverState *bs, QDict *options, int flags)
static int sd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
int ret, fd;
uint32_t vid = 0;
@@ -1271,6 +1373,8 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags)
Error *local_err = NULL;
const char *filename;
s->bs = bs;
opts = qemu_opts_create_nofail(&runtime_opts);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (error_is_set(&local_err)) {
@@ -1284,6 +1388,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags)
QLIST_INIT(&s->inflight_aio_head);
QLIST_INIT(&s->pending_aio_head);
QLIST_INIT(&s->failed_aio_head);
s->fd = -1;
memset(vdi, 0, sizeof(vdi));
@@ -1350,7 +1455,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags)
g_free(buf);
return 0;
out:
qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL, NULL);
qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL);
if (s->fd >= 0) {
closesocket(s->fd);
}
@@ -1360,7 +1465,8 @@ out:
}
static int do_sd_create(BDRVSheepdogState *s, char *filename, int64_t vdi_size,
uint32_t base_vid, uint32_t *vdi_id, int snapshot)
uint32_t base_vid, uint32_t *vdi_id, int snapshot,
uint8_t copy_policy)
{
SheepdogVdiReq hdr;
SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
@@ -1390,6 +1496,7 @@ static int do_sd_create(BDRVSheepdogState *s, char *filename, int64_t vdi_size,
hdr.data_length = wlen;
hdr.vdi_size = vdi_size;
hdr.copy_policy = copy_policy;
ret = do_req(fd, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
@@ -1417,10 +1524,13 @@ static int sd_prealloc(const char *filename)
uint32_t idx, max_idx;
int64_t vdi_size;
void *buf = g_malloc0(SD_DATA_OBJ_SIZE);
Error *local_err = NULL;
int ret;
ret = bdrv_file_open(&bs, filename, NULL, BDRV_O_RDWR);
ret = bdrv_file_open(&bs, filename, NULL, BDRV_O_RDWR, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
goto out;
}
@@ -1447,14 +1557,15 @@ static int sd_prealloc(const char *filename)
}
out:
if (bs) {
bdrv_delete(bs);
bdrv_unref(bs);
}
g_free(buf);
return ret;
}
static int sd_create(const char *filename, QEMUOptionParameter *options)
static int sd_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int ret = 0;
uint32_t vid = 0, base_vid = 0;
@@ -1464,6 +1575,7 @@ static int sd_create(const char *filename, QEMUOptionParameter *options)
char vdi[SD_MAX_VDI_LEN], tag[SD_MAX_VDI_TAG_LEN];
uint32_t snapid;
bool prealloc = false;
Error *local_err = NULL;
s = g_malloc0(sizeof(BDRVSheepdogState));
@@ -1517,8 +1629,10 @@ static int sd_create(const char *filename, QEMUOptionParameter *options)
goto out;
}
ret = bdrv_file_open(&bs, backing_file, NULL, 0);
ret = bdrv_file_open(&bs, backing_file, NULL, 0, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
goto out;
}
@@ -1526,16 +1640,17 @@ static int sd_create(const char *filename, QEMUOptionParameter *options)
if (!is_snapshot(&s->inode)) {
error_report("cannot clone from a non snapshot vdi");
bdrv_delete(bs);
bdrv_unref(bs);
ret = -EINVAL;
goto out;
}
base_vid = s->inode.vdi_id;
bdrv_delete(bs);
bdrv_unref(bs);
}
ret = do_sd_create(s, vdi, vdi_size, base_vid, &vid, 0);
/* TODO: allow users to specify copy number */
ret = do_sd_create(s, vdi, vdi_size, base_vid, &vid, 0, 0);
if (!prealloc || ret) {
goto out;
}
@@ -1578,7 +1693,7 @@ static void sd_close(BlockDriverState *bs)
error_report("%s, %s", sd_strerror(rsp->result), s->name);
}
qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL, NULL);
qemu_aio_set_fd_handler(s->fd, NULL, NULL, NULL);
closesocket(s->fd);
g_free(s->host_spec);
}
@@ -1630,7 +1745,6 @@ static int sd_truncate(BlockDriverState *bs, int64_t offset)
*/
static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
{
int ret;
BDRVSheepdogState *s = acb->common.bs->opaque;
struct iovec iov;
AIOReq *aio_req;
@@ -1652,18 +1766,13 @@ static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
aio_req = alloc_aio_req(s, acb, vid_to_vdi_oid(s->inode.vdi_id),
data_len, offset, 0, 0, offset);
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
ret = add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
if (ret) {
free_aio_req(s, aio_req);
acb->ret = -EIO;
goto out;
}
add_aio_request(s, aio_req, &iov, 1, false, AIOCB_WRITE_UDATA);
acb->aio_done_func = sd_finish_aiocb;
acb->aiocb_type = AIOCB_WRITE_UDATA;
return;
}
out:
sd_finish_aiocb(acb);
}
@@ -1725,7 +1834,7 @@ static int sd_create_branch(BDRVSheepdogState *s)
*/
deleted = sd_delete(s);
ret = do_sd_create(s, s->name, s->inode.vdi_size, s->inode.vdi_id, &vid,
!deleted);
!deleted, s->inode.copy_policy);
if (ret) {
goto out;
}
@@ -1849,35 +1958,16 @@ static int coroutine_fn sd_co_rw_vector(void *p)
}
aio_req = alloc_aio_req(s, acb, oid, len, offset, flags, old_oid, done);
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
if (create) {
AIOReq *areq;
QLIST_FOREACH(areq, &s->inflight_aio_head, aio_siblings) {
if (areq->oid == oid) {
/*
* Sheepdog cannot handle simultaneous create
* requests to the same object. So we cannot send
* the request until the previous request
* finishes.
*/
aio_req->flags = 0;
aio_req->base_oid = 0;
QLIST_INSERT_HEAD(&s->pending_aio_head, aio_req,
aio_siblings);
goto done;
}
if (check_simultaneous_create(s, aio_req)) {
goto done;
}
}
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
ret = add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
create, acb->aiocb_type);
if (ret < 0) {
error_report("add_aio_request is failed");
free_aio_req(s, aio_req);
acb->ret = -EIO;
goto out;
}
add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov, create,
acb->aiocb_type);
done:
offset = 0;
idx++;
@@ -1945,7 +2035,6 @@ static int coroutine_fn sd_co_flush_to_disk(BlockDriverState *bs)
BDRVSheepdogState *s = bs->opaque;
SheepdogAIOCB *acb;
AIOReq *aio_req;
int ret;
if (s->cache_flags != SD_FLAG_CMD_CACHE) {
return 0;
@@ -1958,13 +2047,7 @@ static int coroutine_fn sd_co_flush_to_disk(BlockDriverState *bs)
aio_req = alloc_aio_req(s, acb, vid_to_vdi_oid(s->inode.vdi_id),
0, 0, 0, 0, 0);
QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
ret = add_aio_request(s, aio_req, NULL, 0, false, acb->aiocb_type);
if (ret < 0) {
error_report("add_aio_request is failed");
free_aio_req(s, aio_req);
qemu_aio_release(acb);
return ret;
}
add_aio_request(s, aio_req, NULL, 0, false, acb->aiocb_type);
qemu_coroutine_yield();
return acb->ret;
@@ -2015,7 +2098,7 @@ static int sd_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
}
ret = do_sd_create(s, s->name, s->inode.vdi_size, s->inode.vdi_id, &new_vid,
1);
1, s->inode.copy_policy);
if (ret < 0) {
error_report("failed to create inode for snapshot. %s",
strerror(errno));
@@ -2089,7 +2172,10 @@ out:
return ret;
}
static int sd_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
static int sd_snapshot_delete(BlockDriverState *bs,
const char *snapshot_id,
const char *name,
Error **errp)
{
/* FIXME: Delete specified snapshot id. */
return 0;
@@ -2287,9 +2373,9 @@ static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num,
return acb->ret;
}
static coroutine_fn int
sd_co_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
int *pnum)
static coroutine_fn int64_t
sd_co_get_block_status(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
int *pnum)
{
BDRVSheepdogState *s = bs->opaque;
SheepdogInode *inode = &s->inode;
@@ -2297,7 +2383,7 @@ sd_co_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
end = DIV_ROUND_UP((sector_num + nb_sectors) *
BDRV_SECTOR_SIZE, SD_DATA_OBJ_SIZE);
unsigned long idx;
int ret = 1;
int64_t ret = BDRV_BLOCK_DATA;
for (idx = start; idx < end; idx++) {
if (inode->data_vdi_id[idx] == 0) {
@@ -2344,6 +2430,7 @@ static BlockDriver bdrv_sheepdog = {
.format_name = "sheepdog",
.protocol_name = "sheepdog",
.instance_size = sizeof(BDRVSheepdogState),
.bdrv_needs_filename = true,
.bdrv_file_open = sd_open,
.bdrv_close = sd_close,
.bdrv_create = sd_create,
@@ -2355,7 +2442,7 @@ static BlockDriver bdrv_sheepdog = {
.bdrv_co_writev = sd_co_writev,
.bdrv_co_flush_to_disk = sd_co_flush_to_disk,
.bdrv_co_discard = sd_co_discard,
.bdrv_co_is_allocated = sd_co_is_allocated,
.bdrv_co_get_block_status = sd_co_get_block_status,
.bdrv_snapshot_create = sd_snapshot_create,
.bdrv_snapshot_goto = sd_snapshot_goto,
@@ -2372,6 +2459,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
.format_name = "sheepdog",
.protocol_name = "sheepdog+tcp",
.instance_size = sizeof(BDRVSheepdogState),
.bdrv_needs_filename = true,
.bdrv_file_open = sd_open,
.bdrv_close = sd_close,
.bdrv_create = sd_create,
@@ -2383,7 +2471,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
.bdrv_co_writev = sd_co_writev,
.bdrv_co_flush_to_disk = sd_co_flush_to_disk,
.bdrv_co_discard = sd_co_discard,
.bdrv_co_is_allocated = sd_co_is_allocated,
.bdrv_co_get_block_status = sd_co_get_block_status,
.bdrv_snapshot_create = sd_snapshot_create,
.bdrv_snapshot_goto = sd_snapshot_goto,
@@ -2400,6 +2488,7 @@ static BlockDriver bdrv_sheepdog_unix = {
.format_name = "sheepdog",
.protocol_name = "sheepdog+unix",
.instance_size = sizeof(BDRVSheepdogState),
.bdrv_needs_filename = true,
.bdrv_file_open = sd_open,
.bdrv_close = sd_close,
.bdrv_create = sd_create,
@@ -2411,7 +2500,7 @@ static BlockDriver bdrv_sheepdog_unix = {
.bdrv_co_writev = sd_co_writev,
.bdrv_co_flush_to_disk = sd_co_flush_to_disk,
.bdrv_co_discard = sd_co_discard,
.bdrv_co_is_allocated = sd_co_is_allocated,
.bdrv_co_get_block_status = sd_co_get_block_status,
.bdrv_snapshot_create = sd_snapshot_create,
.bdrv_snapshot_goto = sd_snapshot_goto,
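
The reconnect path in this diff first moves every in-flight request to a separate failed queue and only then resends them, so that requests created while a resend yields are not resent a second time. The following is a minimal standalone sketch of that two-step drain, not the driver's code: `struct req`, `struct queue`, `resend_all` and the callback are hypothetical names, and a plain singly linked list stands in for QEMU's `QLIST` macros.

```c
#include <stddef.h>

/* Hypothetical request and queue types; not the driver's AIOReq/QLIST. */
struct req {
    int id;
    int resent;              /* set to 1 once the request has been resent */
    struct req *next;
};

struct queue {
    struct req *head;
};

static void queue_push(struct queue *q, struct req *r)
{
    r->next = q->head;
    q->head = r;
}

static struct req *queue_pop(struct queue *q)
{
    struct req *r = q->head;

    if (r != NULL) {
        q->head = r->next;
        r->next = NULL;
    }
    return r;
}

/* Resend callback; it may enqueue new requests on the inflight queue. */
typedef void (*resend_fn)(struct queue *inflight, struct req *r);

/* Returns the number of requests that were resent. */
static int resend_all(struct queue *inflight, struct queue *failed,
                      resend_fn resend)
{
    struct req *r;
    int count = 0;

    /* Step 1: move everything that is currently in flight aside. */
    while ((r = queue_pop(inflight)) != NULL) {
        queue_push(failed, r);
    }

    /* Step 2: resend only what was set aside; requests that arrive
     * during a resend land on inflight and are left alone. */
    while ((r = queue_pop(failed)) != NULL) {
        queue_push(inflight, r);
        r->resent = 1;
        resend(inflight, r);
        count++;
    }
    return count;
}

/* Example callback: simulates one new request arriving during a resend. */
static struct req late_req = { 99, 0, NULL };
static int late_added;

static void resend_with_late_arrival(struct queue *inflight, struct req *r)
{
    (void)r;
    if (!late_added) {
        late_added = 1;
        queue_push(inflight, &late_req);
    }
}
```

With two requests queued before the call, `resend_all` resends exactly those two, and the late arrival stays on the in-flight queue untouched.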


@@ -48,6 +48,79 @@ int bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info,
return ret;
}
/**
* Look up an internal snapshot by @id and @name.
* @bs: block device to search
* @id: unique snapshot ID, or NULL
* @name: snapshot name, or NULL
* @sn_info: location to store information on the snapshot found
* @errp: location to store error, will be set only for exception
*
* This function traverses the snapshot list in @bs to find a match;
* @id and @name are the matching conditions:
* If both @id and @name are specified, find the first one with id @id and
* name @name.
* If only @id is specified, find the first one with id @id.
* If only @name is specified, find the first one with name @name.
* If neither is specified, abort().
*
* Returns: true when a snapshot is found, in which case @sn_info is filled;
* false on error or when nothing is found. If all operations succeed but no
* match is found, @errp will NOT be set.
*/
bool bdrv_snapshot_find_by_id_and_name(BlockDriverState *bs,
const char *id,
const char *name,
QEMUSnapshotInfo *sn_info,
Error **errp)
{
QEMUSnapshotInfo *sn_tab, *sn;
int nb_sns, i;
bool ret = false;
assert(id || name);
nb_sns = bdrv_snapshot_list(bs, &sn_tab);
if (nb_sns < 0) {
error_setg_errno(errp, -nb_sns, "Failed to get a snapshot list");
return false;
} else if (nb_sns == 0) {
return false;
}
if (id && name) {
for (i = 0; i < nb_sns; i++) {
sn = &sn_tab[i];
if (!strcmp(sn->id_str, id) && !strcmp(sn->name, name)) {
*sn_info = *sn;
ret = true;
break;
}
}
} else if (id) {
for (i = 0; i < nb_sns; i++) {
sn = &sn_tab[i];
if (!strcmp(sn->id_str, id)) {
*sn_info = *sn;
ret = true;
break;
}
}
} else if (name) {
for (i = 0; i < nb_sns; i++) {
sn = &sn_tab[i];
if (!strcmp(sn->name, name)) {
*sn_info = *sn;
ret = true;
break;
}
}
}
g_free(sn_tab);
return ret;
}
int bdrv_can_snapshot(BlockDriverState *bs)
{
BlockDriver *drv = bs->drv;
@@ -97,9 +170,9 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
if (bs->file) {
drv->bdrv_close(bs);
ret = bdrv_snapshot_goto(bs->file, snapshot_id);
open_ret = drv->bdrv_open(bs, NULL, bs->open_flags);
open_ret = drv->bdrv_open(bs, NULL, bs->open_flags, NULL);
if (open_ret < 0) {
bdrv_delete(bs->file);
bdrv_unref(bs->file);
bs->drv = NULL;
return open_ret;
}
@@ -109,21 +182,73 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
return -ENOTSUP;
}
int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
/**
* Delete an internal snapshot by @snapshot_id and @name.
* @bs: block device used in the operation
* @snapshot_id: unique snapshot ID, or NULL
* @name: snapshot name, or NULL
* @errp: location to store error
*
* If both @snapshot_id and @name are specified, delete the first one with
* id @snapshot_id and name @name.
* If only @snapshot_id is specified, delete the first one with id
* @snapshot_id.
* If only @name is specified, delete the first one with name @name.
* If neither is specified, return -EINVAL.
*
* Returns: 0 on success, -errno on failure. If @bs is not inserted, return
* -ENOMEDIUM. If @snapshot_id and @name are both NULL, return -EINVAL. If @bs
* does not support internal snapshot deletion, return -ENOTSUP. If @bs does
* not support parameter @snapshot_id or @name, or one of them is not correctly
* specified, return -EINVAL. If @bs can't find one matching @snapshot_id and @name,
* return -ENOENT. If @errp != NULL, it will always be filled with error
* message on failure.
*/
int bdrv_snapshot_delete(BlockDriverState *bs,
const char *snapshot_id,
const char *name,
Error **errp)
{
BlockDriver *drv = bs->drv;
if (!drv) {
error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, bdrv_get_device_name(bs));
return -ENOMEDIUM;
}
if (!snapshot_id && !name) {
error_setg(errp, "snapshot_id and name are both NULL");
return -EINVAL;
}
if (drv->bdrv_snapshot_delete) {
return drv->bdrv_snapshot_delete(bs, snapshot_id);
return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
}
if (bs->file) {
return bdrv_snapshot_delete(bs->file, snapshot_id);
return bdrv_snapshot_delete(bs->file, snapshot_id, name, errp);
}
error_set(errp, QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
drv->format_name, bdrv_get_device_name(bs),
"internal snapshot deletion");
return -ENOTSUP;
}
void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
const char *id_or_name,
Error **errp)
{
int ret;
Error *local_err = NULL;
ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
if (ret == -ENOENT || ret == -EINVAL) {
error_free(local_err);
local_err = NULL;
ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
}
if (ret < 0) {
error_propagate(errp, local_err);
}
}
int bdrv_snapshot_list(BlockDriverState *bs,
QEMUSnapshotInfo **psn_info)
{
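
The lookup rules documented in this file (match on both ID and name, on ID only, or on name only) and the "try as ID, then as name" fallback can be sketched outside QEMU as follows. This is an illustrative sketch only: `struct snap`, `snap_find` and `snap_find_by_id_or_name` are hypothetical names, the three loops of the original are folded into one, and where the original asserts that at least one key is given, this sketch returns NULL instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for QEMUSnapshotInfo: only the two keys. */
struct snap {
    const char *id_str;
    const char *name;
};

/*
 * Return the first entry matching every non-NULL key, or NULL.
 * A NULL key means "do not filter on this field".
 */
static const struct snap *snap_find(const struct snap *tab, size_t n,
                                    const char *id, const char *name)
{
    size_t i;

    if (id == NULL && name == NULL) {
        return NULL;    /* assumption: the original asserts here instead */
    }
    for (i = 0; i < n; i++) {
        if (id != NULL && strcmp(tab[i].id_str, id) != 0) {
            continue;
        }
        if (name != NULL && strcmp(tab[i].name, name) != 0) {
            continue;
        }
        return &tab[i];
    }
    return NULL;
}

/* Treat the string as an ID first; fall back to treating it as a name. */
static const struct snap *snap_find_by_id_or_name(const struct snap *tab,
                                                  size_t n,
                                                  const char *id_or_name)
{
    const struct snap *s = snap_find(tab, n, id_or_name, NULL);

    if (s == NULL) {
        s = snap_find(tab, n, NULL, id_or_name);
    }
    return s;
}
```

Trying the ID first means a string that is both one snapshot's ID and another snapshot's name resolves to the ID match.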


@@ -608,7 +608,8 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
return ret;
}
static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags)
static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
Error **errp)
{
BDRVSSHState *s = bs->opaque;
int ret;
@@ -650,7 +651,8 @@ static QEMUOptionParameter ssh_create_options[] = {
{ NULL }
};
static int ssh_create(const char *filename, QEMUOptionParameter *options)
static int ssh_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int r, ret;
Error *local_err = NULL;
@@ -740,14 +742,6 @@ static void restart_coroutine(void *opaque)
qemu_coroutine_enter(co, NULL);
}
/* Always true because when we have called set_fd_handler there is
* always a request being processed.
*/
static int return_true(void *opaque)
{
return 1;
}
static coroutine_fn void set_fd_handler(BDRVSSHState *s)
{
int r;
@@ -766,13 +760,13 @@ static coroutine_fn void set_fd_handler(BDRVSSHState *s)
DPRINTF("s->sock=%d rd_handler=%p wr_handler=%p", s->sock,
rd_handler, wr_handler);
qemu_aio_set_fd_handler(s->sock, rd_handler, wr_handler, return_true, co);
qemu_aio_set_fd_handler(s->sock, rd_handler, wr_handler, co);
}
static coroutine_fn void clear_fd_handler(BDRVSSHState *s)
{
DPRINTF("s->sock=%d", s->sock);
qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL, NULL);
qemu_aio_set_fd_handler(s->sock, NULL, NULL, NULL);
}
/* A non-blocking call returned EAGAIN, so yield, ensuring the


@@ -57,6 +57,11 @@ static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
BlockDriverState *intermediate;
intermediate = top->backing_hd;
/* Must assign before bdrv_delete() to prevent traversing dangling pointer
* while we delete backing image instances.
*/
top->backing_hd = base;
while (intermediate) {
BlockDriverState *unused;
@@ -68,9 +73,8 @@ static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
unused = intermediate;
intermediate = intermediate->backing_hd;
unused->backing_hd = NULL;
bdrv_delete(unused);
bdrv_unref(unused);
}
top->backing_hd = base;
}
static void coroutine_fn stream_run(void *opaque)
@@ -110,21 +114,22 @@ wait:
/* Note that even when no rate limit is applied we need to yield
* with no pending I/O here so that bdrv_drain_all() returns.
*/
block_job_sleep_ns(&s->common, rt_clock, delay_ns);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
ret = bdrv_co_is_allocated(bs, sector_num,
STREAM_BUFFER_SIZE / BDRV_SECTOR_SIZE, &n);
copy = false;
ret = bdrv_is_allocated(bs, sector_num,
STREAM_BUFFER_SIZE / BDRV_SECTOR_SIZE, &n);
if (ret == 1) {
/* Allocated in the top, no need to copy. */
copy = false;
} else if (ret >= 0) {
/* Copy if allocated in the intermediate images. Limit to the
* known-unallocated area [sector_num, sector_num+n). */
ret = bdrv_co_is_allocated_above(bs->backing_hd, base,
sector_num, n, &n);
ret = bdrv_is_allocated_above(bs->backing_hd, base,
sector_num, n, &n);
/* Finish early if end of backing file has been reached */
if (ret == 0 && n == 0) {
@@ -134,7 +139,7 @@ wait:
copy = (ret == 1);
}
trace_stream_one_iteration(s, sector_num, n, ret);
if (ret >= 0 && copy) {
if (copy) {
if (s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, n);
if (delay_ns > 0) {
@@ -198,9 +203,9 @@ static void stream_set_speed(BlockJob *job, int64_t speed, Error **errp)
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static const BlockJobType stream_job_type = {
static const BlockJobDriver stream_job_driver = {
.instance_size = sizeof(StreamBlockJob),
.job_type = "stream",
.job_type = BLOCK_JOB_TYPE_STREAM,
.set_speed = stream_set_speed,
};
@@ -219,7 +224,7 @@ void stream_start(BlockDriverState *bs, BlockDriverState *base,
return;
}
s = block_job_create(&stream_job_type, bs, speed, cb, opaque, errp);
s = block_job_create(&stream_job_driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
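
The copy decision in the streaming loop resets `copy` on every iteration, and only sets it when the sector range is unallocated in the top image but allocated in an intermediate one. A minimal sketch of that decision as a pure function, assuming the two allocation queries have already been made; `stream_should_copy` and its parameters are hypothetical names, not part of the block job API:

```c
#include <stdbool.h>

/*
 * top_ret:   result of the allocation query on the top image
 *            (1 = allocated, 0 = not allocated, < 0 = error)
 * above_ret: result of the query on the intermediate images,
 *            consulted only when the top image is not allocated
 *
 * Returns the status the loop would carry forward and stores the
 * copy decision in *copy.
 */
static int stream_should_copy(int top_ret, int above_ret, bool *copy)
{
    *copy = false;              /* reset on every iteration */

    if (top_ret == 1) {
        /* Allocated in the top image, nothing to copy. */
        return top_ret;
    }
    if (top_ret >= 0) {
        /* Copy only if an intermediate image holds the data. */
        *copy = (above_ret == 1);
        return above_ret;
    }
    return top_ret;             /* error from the first query */
}
```

Because `copy` starts out false, an error from either query can never leave a stale true value behind, which is why the later check can test `copy` alone.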


@@ -165,7 +165,7 @@ typedef struct {
uuid_t uuid_link;
uuid_t uuid_parent;
uint64_t unused2[7];
} VdiHeader;
} QEMU_PACKED VdiHeader;
typedef struct {
/* The block map entries are little endian (even in memory). */
@@ -364,7 +364,8 @@ static int vdi_probe(const uint8_t *buf, int buf_size, const char *filename)
return result;
}
static int vdi_open(BlockDriverState *bs, QDict *options, int flags)
static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVVdiState *s = bs->opaque;
VdiHeader header;
@@ -470,7 +471,7 @@ static int vdi_reopen_prepare(BDRVReopenState *state,
return 0;
}
static int coroutine_fn vdi_co_is_allocated(BlockDriverState *bs,
static int64_t coroutine_fn vdi_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
{
/* TODO: Check for too large sector_num (in bdrv_is_allocated or here). */
@@ -479,12 +480,23 @@ static int coroutine_fn vdi_co_is_allocated(BlockDriverState *bs,
size_t sector_in_block = sector_num % s->block_sectors;
int n_sectors = s->block_sectors - sector_in_block;
uint32_t bmap_entry = le32_to_cpu(s->bmap[bmap_index]);
uint64_t offset;
int result;
logout("%p, %" PRId64 ", %d, %p\n", bs, sector_num, nb_sectors, pnum);
if (n_sectors > nb_sectors) {
n_sectors = nb_sectors;
}
*pnum = n_sectors;
return VDI_IS_ALLOCATED(bmap_entry);
result = VDI_IS_ALLOCATED(bmap_entry);
if (!result) {
return 0;
}
offset = s->header.offset_data +
(uint64_t)bmap_entry * s->block_size +
sector_in_block * SECTOR_SIZE;
return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
}
static int vdi_co_read(BlockDriverState *bs,
@@ -633,7 +645,8 @@ static int vdi_co_write(BlockDriverState *bs,
return ret;
}
static int vdi_create(const char *filename, QEMUOptionParameter *options)
static int vdi_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int fd;
int result = 0;
@@ -780,7 +793,7 @@ static BlockDriver bdrv_vdi = {
.bdrv_reopen_prepare = vdi_reopen_prepare,
.bdrv_create = vdi_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_is_allocated = vdi_co_is_allocated,
.bdrv_co_get_block_status = vdi_co_get_block_status,
.bdrv_make_empty = vdi_make_empty,
.bdrv_read = vdi_co_read,
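
The new block-status callback in this file returns flag bits combined with the byte offset of the data in the image file. The sketch below shows how such a value can be packed and unpacked. It rests on several assumptions: the `EX_` constants are hypothetical stand-ins (the flag values and the sentinel for unallocated map entries are not taken from this diff), and the offset is assumed to be sector aligned so that the low bits are free for flags.

```c
#include <stdint.h>

/* Hypothetical constants; the real values live in QEMU's headers. */
#define EX_BLOCK_DATA          0x01
#define EX_BLOCK_OFFSET_VALID  0x04
#define EX_SECTOR_SIZE         512
#define EX_OFFSET_MASK         (~(int64_t)(EX_SECTOR_SIZE - 1))
#define EX_UNALLOCATED_MIN     0xfffffffeU  /* entries >= this hold no data */

/*
 * Pack the status for one sector: 0 if the block is not allocated,
 * otherwise the flags ORed with the byte offset of the sector.
 * offset_data must be a multiple of EX_SECTOR_SIZE.
 */
static int64_t ex_block_status(uint32_t bmap_entry, uint64_t offset_data,
                               uint32_t block_size, uint32_t sector_in_block)
{
    uint64_t offset;

    if (bmap_entry >= EX_UNALLOCATED_MIN) {
        return 0;
    }
    offset = offset_data +
             (uint64_t)bmap_entry * block_size +
             (uint64_t)sector_in_block * EX_SECTOR_SIZE;
    return EX_BLOCK_DATA | EX_BLOCK_OFFSET_VALID | (int64_t)offset;
}
```

A caller recovers the offset with `status & EX_OFFSET_MASK` and tests the flags with the individual bit constants.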


@@ -20,6 +20,7 @@
#include "qemu/module.h"
#include "qemu/crc32c.h"
#include "block/vhdx.h"
#include "migration/migration.h"
/* Several metadata and region table data entries are identified by
@@ -159,6 +160,7 @@ typedef struct BDRVVHDXState {
VHDXParentLocatorHeader parent_header;
VHDXParentLocatorEntry *parent_entries;
Error *migration_blocker;
} BDRVVHDXState;
uint32_t vhdx_checksum_calc(uint32_t crc, uint8_t *buf, size_t size,
@@ -715,7 +717,8 @@ exit:
}
static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVVHDXState *s = bs->opaque;
int ret = 0;
@@ -805,6 +808,12 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags)
/* TODO: differencing files, write */
/* Disable migration when VHDX images are used */
error_set(&s->migration_blocker,
QERR_BLOCK_FORMAT_FEATURE_NOT_SUPPORTED,
"vhdx", bs->device_name, "live migration");
migrate_add_blocker(s->migration_blocker);
return 0;
fail:
qemu_vfree(s->headers[0]);
@@ -951,6 +960,8 @@ static void vhdx_close(BlockDriverState *bs)
qemu_vfree(s->headers[1]);
qemu_vfree(s->bat);
qemu_vfree(s->parent_entries);
migrate_del_blocker(s->migration_blocker);
error_free(s->migration_blocker);
}
static BlockDriver bdrv_vhdx = {


@@ -106,17 +106,21 @@ typedef struct VmdkExtent {
uint32_t l2_cache_counts[L2_CACHE_SIZE];
int64_t cluster_sectors;
char *type;
} VmdkExtent;
typedef struct BDRVVmdkState {
CoMutex lock;
uint64_t desc_offset;
bool cid_updated;
bool cid_checked;
uint32_t cid;
uint32_t parent_cid;
int num_extents;
/* Extent array with num_extents entries, in ascending order of address */
VmdkExtent *extents;
Error *migration_blocker;
char *create_type;
} BDRVVmdkState;
typedef struct VmdkMetaData {
@@ -197,8 +201,6 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename)
}
}
#define CHECK_CID 1
#define SECTOR_SIZE 512
#define DESC_SIZE (20 * SECTOR_SIZE) /* 20 sectors of 512 bytes each */
#define BUF_SIZE 4096
@@ -215,8 +217,9 @@ static void vmdk_free_extents(BlockDriverState *bs)
g_free(e->l1_table);
g_free(e->l2_cache);
g_free(e->l1_backup_table);
g_free(e->type);
if (e->file != bs->file) {
bdrv_delete(e->file);
bdrv_unref(e->file);
}
}
g_free(s->extents);
@@ -301,19 +304,18 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
static int vmdk_is_cid_valid(BlockDriverState *bs)
{
#ifdef CHECK_CID
BDRVVmdkState *s = bs->opaque;
BlockDriverState *p_bs = bs->backing_hd;
uint32_t cur_pcid;
if (p_bs) {
if (!s->cid_checked && p_bs) {
cur_pcid = vmdk_read_cid(p_bs, 0);
if (s->parent_cid != cur_pcid) {
/* CID not valid */
return 0;
}
}
#endif
s->cid_checked = true;
/* CID valid */
return 1;
}
@@ -331,8 +333,7 @@ static int vmdk_reopen_prepare(BDRVReopenState *state,
assert(state->bs != NULL);
if (queue == NULL) {
error_set(errp, ERROR_CLASS_GENERIC_ERROR,
"No reopen queue for VMDK extents");
error_setg(errp, "No reopen queue for VMDK extents");
goto exit;
}
@@ -391,15 +392,24 @@ static int vmdk_add_extent(BlockDriverState *bs,
int64_t l1_offset, int64_t l1_backup_offset,
uint32_t l1_size,
int l2_size, uint64_t cluster_sectors,
VmdkExtent **new_extent)
VmdkExtent **new_extent,
Error **errp)
{
VmdkExtent *extent;
BDRVVmdkState *s = bs->opaque;
if (cluster_sectors > 0x200000) {
/* 0x200000 * 512Bytes = 1GB for one cluster is unrealistic */
error_report("invalid granularity, image may be corrupt");
return -EINVAL;
error_setg(errp, "Invalid granularity, image may be corrupt");
return -EFBIG;
}
if (l1_size > 512 * 1024 * 1024) {
/* Although with big capacity and small l1_entry_sectors, we can get a
* big l1_size, we don't want unbounded value to allocate the table.
* Limit it to 512M, which is 16PB for default cluster and L2 table
* size */
error_setg(errp, "L1 size too big");
return -EFBIG;
}
s->extents = g_realloc(s->extents,
@@ -430,7 +440,8 @@ static int vmdk_add_extent(BlockDriverState *bs,
return 0;
}
static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent)
static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent,
Error **errp)
{
int ret;
int l1_size, i;
@@ -439,10 +450,13 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent)
l1_size = extent->l1_size * sizeof(uint32_t);
extent->l1_table = g_malloc(l1_size);
ret = bdrv_pread(extent->file,
extent->l1_table_offset,
extent->l1_table,
l1_size);
extent->l1_table_offset,
extent->l1_table,
l1_size);
if (ret < 0) {
error_setg_errno(errp, -ret,
"Could not read l1 table from extent '%s'",
extent->file->filename);
goto fail_l1;
}
for (i = 0; i < extent->l1_size; i++) {
@@ -452,10 +466,13 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent)
if (extent->l1_backup_table_offset) {
extent->l1_backup_table = g_malloc(l1_size);
ret = bdrv_pread(extent->file,
extent->l1_backup_table_offset,
extent->l1_backup_table,
l1_size);
extent->l1_backup_table_offset,
extent->l1_backup_table,
l1_size);
if (ret < 0) {
error_setg_errno(errp, -ret,
"Could not read l1 backup table from extent '%s'",
extent->file->filename);
goto fail_l1b;
}
for (i = 0; i < extent->l1_size; i++) {
@@ -473,9 +490,9 @@ static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent)
return ret;
}
static int vmdk_open_vmdk3(BlockDriverState *bs,
BlockDriverState *file,
int flags)
static int vmdk_open_vmfs_sparse(BlockDriverState *bs,
BlockDriverState *file,
int flags, Error **errp)
{
int ret;
uint32_t magic;
@@ -484,20 +501,24 @@ static int vmdk_open_vmdk3(BlockDriverState *bs,
ret = bdrv_pread(file, sizeof(magic), &header, sizeof(header));
if (ret < 0) {
error_setg_errno(errp, -ret,
"Could not read header from file '%s'",
file->filename);
return ret;
}
ret = vmdk_add_extent(bs,
bs->file, false,
le32_to_cpu(header.disk_sectors),
le32_to_cpu(header.l1dir_offset) << 9,
0, 1 << 6, 1 << 9,
le32_to_cpu(header.granularity),
&extent);
ret = vmdk_add_extent(bs, file, false,
le32_to_cpu(header.disk_sectors),
le32_to_cpu(header.l1dir_offset) << 9,
0,
le32_to_cpu(header.l1dir_size),
4096,
le32_to_cpu(header.granularity),
&extent,
errp);
if (ret < 0) {
return ret;
}
ret = vmdk_init_tables(bs, extent);
ret = vmdk_init_tables(bs, extent, errp);
if (ret) {
/* free extent allocated by vmdk_add_extent */
vmdk_free_last_extent(bs);
@@ -506,30 +527,37 @@ static int vmdk_open_vmdk3(BlockDriverState *bs,
}
static int vmdk_open_desc_file(BlockDriverState *bs, int flags,
uint64_t desc_offset);
uint64_t desc_offset, Error **errp);
static int vmdk_open_vmdk4(BlockDriverState *bs,
BlockDriverState *file,
int flags)
int flags, Error **errp)
{
int ret;
uint32_t magic;
uint32_t l1_size, l1_entry_sectors;
VMDK4Header header;
VmdkExtent *extent;
BDRVVmdkState *s = bs->opaque;
int64_t l1_backup_offset = 0;
ret = bdrv_pread(file, sizeof(magic), &header, sizeof(header));
if (ret < 0) {
return ret;
error_setg_errno(errp, -ret,
"Could not read header from file '%s'",
file->filename);
}
if (header.capacity == 0) {
uint64_t desc_offset = le64_to_cpu(header.desc_offset);
if (desc_offset) {
return vmdk_open_desc_file(bs, flags, desc_offset << 9);
return vmdk_open_desc_file(bs, flags, desc_offset << 9, errp);
}
}
if (!s->create_type) {
s->create_type = g_strdup("monolithicSparse");
}
if (le64_to_cpu(header.gd_offset) == VMDK4_GD_AT_END) {
/*
* The footer takes precedence over the header, so read it in. The
@@ -598,14 +626,6 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
}
l1_size = (le64_to_cpu(header.capacity) + l1_entry_sectors - 1)
/ l1_entry_sectors;
if (l1_size > 512 * 1024 * 1024) {
/* although with big capacity and small l1_entry_sectors, we can get a
* big l1_size, we don't want unbounded value to allocate the table.
* Limit it to 512M, which is 16PB for default cluster and L2 table
* size */
error_report("L1 size too big");
return -EFBIG;
}
if (le32_to_cpu(header.flags) & VMDK4_FLAG_RGD) {
l1_backup_offset = le64_to_cpu(header.rgd_offset) << 9;
}
@@ -616,7 +636,8 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
l1_size,
le32_to_cpu(header.num_gtes_per_gt),
le64_to_cpu(header.granularity),
&extent);
&extent,
errp);
if (ret < 0) {
return ret;
}
@@ -625,7 +646,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
extent->has_marker = le32_to_cpu(header.flags) & VMDK4_FLAG_MARKER;
extent->version = le32_to_cpu(header.version);
extent->has_zero_grain = le32_to_cpu(header.flags) & VMDK4_FLAG_ZERO_GRAIN;
ret = vmdk_init_tables(bs, extent);
ret = vmdk_init_tables(bs, extent, errp);
if (ret) {
/* free extent allocated by vmdk_add_extent */
vmdk_free_last_extent(bs);
@@ -663,7 +684,7 @@ static int vmdk_parse_description(const char *desc, const char *opt_name,
/* Open an extent file and append to bs array */
static int vmdk_open_sparse(BlockDriverState *bs,
BlockDriverState *file,
int flags)
int flags, Error **errp)
{
uint32_t magic;
@@ -674,10 +695,10 @@ static int vmdk_open_sparse(BlockDriverState *bs,
magic = be32_to_cpu(magic);
switch (magic) {
case VMDK3_MAGIC:
return vmdk_open_vmdk3(bs, file, flags);
return vmdk_open_vmfs_sparse(bs, file, flags, errp);
break;
case VMDK4_MAGIC:
return vmdk_open_vmdk4(bs, file, flags);
return vmdk_open_vmdk4(bs, file, flags, errp);
break;
default:
return -EMEDIUMTYPE;
@@ -686,7 +707,7 @@ static int vmdk_open_sparse(BlockDriverState *bs,
}
static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
const char *desc_file_path)
const char *desc_file_path, Error **errp)
{
int ret;
char access[11];
@@ -697,6 +718,8 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
int64_t flat_offset;
char extent_path[PATH_MAX];
BlockDriverState *extent_file;
BDRVVmdkState *s = bs->opaque;
VmdkExtent *extent;
while (*p) {
/* parse extent line:
@@ -711,48 +734,54 @@ static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
goto next_line;
} else if (!strcmp(type, "FLAT")) {
if (ret != 5 || flat_offset < 0) {
error_setg(errp, "Invalid extent lines: \n%s", p);
return -EINVAL;
}
} else if (!strcmp(type, "VMFS")) {
flat_offset = 0;
} else if (ret != 4) {
error_setg(errp, "Invalid extent lines: \n%s", p);
return -EINVAL;
}
if (sectors <= 0 ||
(strcmp(type, "FLAT") && strcmp(type, "SPARSE")) ||
(strcmp(type, "FLAT") && strcmp(type, "SPARSE") &&
strcmp(type, "VMFS") && strcmp(type, "VMFSSPARSE")) ||
(strcmp(access, "RW"))) {
goto next_line;
}
path_combine(extent_path, sizeof(extent_path),
desc_file_path, fname);
ret = bdrv_file_open(&extent_file, extent_path, NULL, bs->open_flags);
ret = bdrv_file_open(&extent_file, extent_path, NULL, bs->open_flags,
errp);
if (ret) {
return ret;
}
/* save to extents array */
if (!strcmp(type, "FLAT")) {
if (!strcmp(type, "FLAT") || !strcmp(type, "VMFS")) {
/* FLAT extent */
VmdkExtent *extent;
ret = vmdk_add_extent(bs, extent_file, true, sectors,
0, 0, 0, 0, 0, &extent);
0, 0, 0, 0, 0, &extent, errp);
if (ret < 0) {
return ret;
}
extent->flat_start_offset = flat_offset << 9;
} else if (!strcmp(type, "SPARSE")) {
/* SPARSE extent */
ret = vmdk_open_sparse(bs, extent_file, bs->open_flags);
} else if (!strcmp(type, "SPARSE") || !strcmp(type, "VMFSSPARSE")) {
/* SPARSE extent and VMFSSPARSE extent are both "COWD" sparse file*/
ret = vmdk_open_sparse(bs, extent_file, bs->open_flags, errp);
if (ret) {
bdrv_delete(extent_file);
bdrv_unref(extent_file);
return ret;
}
extent = &s->extents[s->num_extents - 1];
} else {
fprintf(stderr,
"VMDK: Not supported extent type \"%s\""".\n", type);
error_setg(errp, "Unsupported extent type '%s'", type);
return -ENOTSUP;
}
extent->type = g_strdup(type);
next_line:
/* move to next line */
while (*p) {
@@ -767,7 +796,7 @@ next_line:
}
static int vmdk_open_desc_file(BlockDriverState *bs, int flags,
uint64_t desc_offset)
uint64_t desc_offset, Error **errp)
{
int ret;
char *buf = NULL;
@@ -792,29 +821,32 @@ static int vmdk_open_desc_file(BlockDriverState *bs, int flags,
goto exit;
}
if (strcmp(ct, "monolithicFlat") &&
strcmp(ct, "vmfs") &&
strcmp(ct, "vmfsSparse") &&
strcmp(ct, "twoGbMaxExtentSparse") &&
strcmp(ct, "twoGbMaxExtentFlat")) {
fprintf(stderr,
"VMDK: Not supported image type \"%s\""".\n", ct);
error_setg(errp, "Unsupported image type '%s'", ct);
ret = -ENOTSUP;
goto exit;
}
s->create_type = g_strdup(ct);
s->desc_offset = 0;
ret = vmdk_parse_extents(buf, bs, bs->file->filename);
ret = vmdk_parse_extents(buf, bs, bs->file->filename, errp);
exit:
g_free(buf);
return ret;
}
static int vmdk_open(BlockDriverState *bs, QDict *options, int flags)
static int vmdk_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
int ret;
BDRVVmdkState *s = bs->opaque;
if (vmdk_open_sparse(bs, bs->file, flags) == 0) {
if (vmdk_open_sparse(bs, bs->file, flags, errp) == 0) {
s->desc_offset = 0x200;
} else {
ret = vmdk_open_desc_file(bs, flags, 0);
ret = vmdk_open_desc_file(bs, flags, 0, errp);
if (ret) {
goto fail;
}
@@ -824,6 +856,7 @@ static int vmdk_open(BlockDriverState *bs, QDict *options, int flags)
if (ret) {
goto fail;
}
s->cid = vmdk_read_cid(bs, 0);
s->parent_cid = vmdk_read_cid(bs, 1);
qemu_co_mutex_init(&s->lock);
@@ -836,6 +869,8 @@ static int vmdk_open(BlockDriverState *bs, QDict *options, int flags)
return 0;
fail:
g_free(s->create_type);
s->create_type = NULL;
vmdk_free_extents(bs);
return ret;
}
@@ -1042,7 +1077,7 @@ static VmdkExtent *find_extent(BDRVVmdkState *s,
return NULL;
}
static int coroutine_fn vmdk_co_is_allocated(BlockDriverState *bs,
static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
{
BDRVVmdkState *s = bs->opaque;
@@ -1059,7 +1094,24 @@ static int coroutine_fn vmdk_co_is_allocated(BlockDriverState *bs,
sector_num * 512, 0, &offset);
qemu_co_mutex_unlock(&s->lock);
ret = (ret == VMDK_OK || ret == VMDK_ZEROED);
switch (ret) {
case VMDK_ERROR:
ret = -EIO;
break;
case VMDK_UNALLOC:
ret = 0;
break;
case VMDK_ZEROED:
ret = BDRV_BLOCK_ZERO;
break;
case VMDK_OK:
ret = BDRV_BLOCK_DATA;
if (extent->file == bs->file) {
ret |= BDRV_BLOCK_OFFSET_VALID | offset;
}
break;
}
index_in_cluster = sector_num % extent->cluster_sectors;
n = extent->cluster_sectors - index_in_cluster;
@@ -1264,8 +1316,7 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
VmdkMetaData m_data;
if (sector_num > bs->total_sectors) {
fprintf(stderr,
"(VMDK) Wrong offset: sector_num=0x%" PRIx64
error_report("Wrong offset: sector_num=0x%" PRIx64
" total_sectors=0x%" PRIx64 "\n",
sector_num, bs->total_sectors);
return -EIO;
@@ -1285,9 +1336,8 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
if (extent->compressed) {
if (ret == VMDK_OK) {
/* Refuse write to allocated cluster for streamOptimized */
fprintf(stderr,
"VMDK: can't write to allocated cluster"
" for streamOptimized\n");
error_report("Could not write to allocated cluster"
" for streamOptimized");
return -EIO;
} else {
/* allocate */
@@ -1384,7 +1434,6 @@ static int coroutine_fn vmdk_co_write_zeroes(BlockDriverState *bs,
return ret;
}
static int vmdk_create_extent(const char *filename, int64_t filesize,
bool flat, bool compress, bool zeroed_grain)
{
@@ -1496,12 +1545,12 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
}
static int filename_decompose(const char *filename, char *path, char *prefix,
char *postfix, size_t buf_len)
char *postfix, size_t buf_len, Error **errp)
{
const char *p, *q;
if (filename == NULL || !strlen(filename)) {
fprintf(stderr, "Vmdk: no filename provided.\n");
error_setg(errp, "No filename provided");
return VMDK_ERROR;
}
p = strrchr(filename, '/');
@@ -1535,10 +1584,11 @@ static int filename_decompose(const char *filename, char *path, char *prefix,
return VMDK_OK;
}
static int vmdk_create(const char *filename, QEMUOptionParameter *options)
static int vmdk_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
int fd, idx = 0;
char *desc = NULL;
char desc[BUF_SIZE];
int64_t total_size = 0, filesize;
const char *adapter_type = NULL;
const char *backing_file = NULL;
@@ -1546,7 +1596,7 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
int flags = 0;
int ret = 0;
bool flat, split, compress;
GString *ext_desc_lines;
char ext_desc_lines[BUF_SIZE] = "";
char path[PATH_MAX], prefix[PATH_MAX], postfix[PATH_MAX];
const int64_t split_size = 0x80000000; /* VMDK has constant split size */
const char *desc_extent_line;
@@ -1574,11 +1624,8 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
"ddb.geometry.sectors = \"63\"\n"
"ddb.adapterType = \"%s\"\n";
ext_desc_lines = g_string_new(NULL);
if (filename_decompose(filename, path, prefix, postfix, PATH_MAX)) {
ret = -EINVAL;
goto exit;
if (filename_decompose(filename, path, prefix, postfix, PATH_MAX, errp)) {
return -EINVAL;
}
/* Read out options */
while (options && options->name) {
@@ -1603,9 +1650,8 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
strcmp(adapter_type, "buslogic") &&
strcmp(adapter_type, "lsilogic") &&
strcmp(adapter_type, "legacyESX")) {
fprintf(stderr, "VMDK: Unknown adapter type: '%s'.\n", adapter_type);
ret = -EINVAL;
goto exit;
error_setg(errp, "Unknown adapter type: '%s'", adapter_type);
return -EINVAL;
}
if (strcmp(adapter_type, "ide") != 0) {
/* that's the number of heads with which vmware operates when
@@ -1620,9 +1666,8 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
strcmp(fmt, "twoGbMaxExtentSparse") &&
strcmp(fmt, "twoGbMaxExtentFlat") &&
strcmp(fmt, "streamOptimized")) {
fprintf(stderr, "VMDK: Unknown subformat: %s\n", fmt);
ret = -EINVAL;
goto exit;
error_setg(errp, "Unknown subformat: '%s'", fmt);
return -EINVAL;
}
split = !(strcmp(fmt, "twoGbMaxExtentFlat") &&
strcmp(fmt, "twoGbMaxExtentSparse"));
@@ -1635,24 +1680,26 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
desc_extent_line = "RW %lld SPARSE \"%s\"\n";
}
if (flat && backing_file) {
/* not supporting backing file for flat image */
ret = -ENOTSUP;
goto exit;
error_setg(errp, "Flat image can't have backing file");
return -ENOTSUP;
}
if (flat && zeroed_grain) {
error_setg(errp, "Flat image can't enable zeroed grain");
return -ENOTSUP;
}
if (backing_file) {
BlockDriverState *bs = bdrv_new("");
ret = bdrv_open(bs, backing_file, NULL, 0, NULL);
ret = bdrv_open(bs, backing_file, NULL, 0, NULL, errp);
if (ret != 0) {
bdrv_delete(bs);
goto exit;
bdrv_unref(bs);
return ret;
}
if (strcmp(bs->drv->format_name, "vmdk")) {
bdrv_delete(bs);
ret = -EINVAL;
goto exit;
bdrv_unref(bs);
return -EINVAL;
}
parent_cid = vmdk_read_cid(bs, 0);
bdrv_delete(bs);
bdrv_unref(bs);
snprintf(parent_desc_line, sizeof(parent_desc_line),
"parentFileNameHint=\"%s\"", backing_file);
}
@@ -1683,27 +1730,25 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
if (vmdk_create_extent(ext_filename, size,
flat, compress, zeroed_grain)) {
ret = -EINVAL;
goto exit;
return -EINVAL;
}
filesize -= size;
/* Format description line */
snprintf(desc_line, sizeof(desc_line),
desc_extent_line, size / 512, desc_filename);
g_string_append(ext_desc_lines, desc_line);
pstrcat(ext_desc_lines, sizeof(ext_desc_lines), desc_line);
}
/* generate descriptor file */
desc = g_strdup_printf(desc_template,
(unsigned int)time(NULL),
parent_cid,
fmt,
parent_desc_line,
ext_desc_lines->str,
(flags & BLOCK_FLAG_COMPAT6 ? 6 : 4),
total_size / (int64_t)(63 * number_heads * 512),
number_heads,
adapter_type);
snprintf(desc, sizeof(desc), desc_template,
(unsigned int)time(NULL),
parent_cid,
fmt,
parent_desc_line,
ext_desc_lines,
(flags & BLOCK_FLAG_COMPAT6 ? 6 : 4),
total_size / (int64_t)(63 * number_heads * 512), number_heads,
adapter_type);
if (split || flat) {
fd = qemu_open(filename,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
@@ -1714,25 +1759,21 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options)
0644);
}
if (fd < 0) {
ret = -errno;
goto exit;
return -errno;
}
/* the descriptor offset = 0x200 */
if (!split && !flat && 0x200 != lseek(fd, 0x200, SEEK_SET)) {
ret = -errno;
goto close_exit;
goto exit;
}
ret = qemu_write_full(fd, desc, strlen(desc));
if (ret != strlen(desc)) {
ret = -errno;
goto close_exit;
goto exit;
}
ret = 0;
close_exit:
qemu_close(fd);
exit:
g_free(desc);
g_string_free(ext_desc_lines, true);
qemu_close(fd);
return ret;
}
@@ -1741,6 +1782,7 @@ static void vmdk_close(BlockDriverState *bs)
BDRVVmdkState *s = bs->opaque;
vmdk_free_extents(bs);
g_free(s->create_type);
migrate_del_blocker(s->migration_blocker);
error_free(s->migration_blocker);
@@ -1802,6 +1844,54 @@ static int vmdk_has_zero_init(BlockDriverState *bs)
return 1;
}
static ImageInfo *vmdk_get_extent_info(VmdkExtent *extent)
{
ImageInfo *info = g_new0(ImageInfo, 1);
*info = (ImageInfo){
.filename = g_strdup(extent->file->filename),
.format = g_strdup(extent->type),
.virtual_size = extent->sectors * BDRV_SECTOR_SIZE,
.compressed = extent->compressed,
.has_compressed = extent->compressed,
.cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE,
.has_cluster_size = !extent->flat,
};
return info;
}
static ImageInfoSpecific *vmdk_get_specific_info(BlockDriverState *bs)
{
int i;
BDRVVmdkState *s = bs->opaque;
ImageInfoSpecific *spec_info = g_new0(ImageInfoSpecific, 1);
ImageInfoList **next;
*spec_info = (ImageInfoSpecific){
.kind = IMAGE_INFO_SPECIFIC_KIND_VMDK,
{
.vmdk = g_new0(ImageInfoSpecificVmdk, 1),
},
};
*spec_info->vmdk = (ImageInfoSpecificVmdk) {
.create_type = g_strdup(s->create_type),
.cid = s->cid,
.parent_cid = s->parent_cid,
};
next = &spec_info->vmdk->extents;
for (i = 0; i < s->num_extents; i++) {
*next = g_new0(ImageInfoList, 1);
(*next)->value = vmdk_get_extent_info(&s->extents[i]);
(*next)->next = NULL;
next = &(*next)->next;
}
return spec_info;
}
static QEMUOptionParameter vmdk_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
@@ -1851,9 +1941,10 @@ static BlockDriver bdrv_vmdk = {
.bdrv_close = vmdk_close,
.bdrv_create = vmdk_create,
.bdrv_co_flush_to_disk = vmdk_co_flush,
.bdrv_co_is_allocated = vmdk_co_is_allocated,
.bdrv_co_get_block_status = vmdk_co_get_block_status,
.bdrv_get_allocated_file_size = vmdk_get_allocated_file_size,
.bdrv_has_zero_init = vmdk_has_zero_init,
.bdrv_get_specific_info = vmdk_get_specific_info,
.create_options = vmdk_create_options,
};
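
The two size limits enforced in `vmdk_add_extent` above (the 0x200000-sector granularity cap and the 512M-entry L1 cap) can be checked with plain arithmetic. The following is a minimal sketch, not code from the patch: the helper names are hypothetical, and the "default" geometry of 128 sectors per cluster and 512 entries per L2 table is an assumption about common VMDK images that the diff itself does not state.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512ULL

/* Size in bytes of one cluster (grain) of cluster_sectors sectors. */
static uint64_t cluster_bytes(uint64_t cluster_sectors)
{
    return cluster_sectors * SECTOR_SIZE;
}

/* Bytes of guest data addressed by a single L1 entry: an L1 entry points
 * at one L2 table, and every L2 entry maps one cluster. */
static uint64_t l1_entry_bytes(uint64_t l2_size, uint64_t cluster_sectors)
{
    return l2_size * cluster_bytes(cluster_sectors);
}

/* Largest capacity reachable with max_l1_entries L1 entries.
 * Assumed geometry in the checks: l2_size = 512, cluster_sectors = 128. */
static uint64_t max_capacity_bytes(uint64_t max_l1_entries,
                                   uint64_t l2_size,
                                   uint64_t cluster_sectors)
{
    return max_l1_entries * l1_entry_bytes(l2_size, cluster_sectors);
}
```

Under those assumptions one L1 entry covers 32 MiB, so 512M entries cover 2^54 bytes, which matches the "16PB" figure in the comment; a 0x200000-sector cluster would be 2^30 bytes, matching the "1GB" remark.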


@@ -46,7 +46,7 @@ enum vhd_type {
#define VHD_TIMESTAMP_BASE 946684800
// always big-endian
struct vhd_footer {
typedef struct vhd_footer {
char creator[8]; // "conectix"
uint32_t features;
uint32_t version;
@@ -79,9 +79,9 @@ struct vhd_footer {
uint8_t uuid[16];
uint8_t in_saved_state;
};
} QEMU_PACKED VHDFooter;
struct vhd_dyndisk_header {
typedef struct vhd_dyndisk_header {
char magic[8]; // "cxsparse"
// Offset of next header structure, 0xFFFFFFFF if none
@@ -111,7 +111,7 @@ struct vhd_dyndisk_header {
uint32_t reserved;
uint64_t data_offset;
} parent_locator[8];
};
} QEMU_PACKED VHDDynDiskHeader;
typedef struct BDRVVPCState {
CoMutex lock;
@@ -155,12 +155,13 @@ static int vpc_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static int vpc_open(BlockDriverState *bs, QDict *options, int flags)
static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVVPCState *s = bs->opaque;
int i;
struct vhd_footer* footer;
struct vhd_dyndisk_header* dyndisk_header;
VHDFooter *footer;
VHDDynDiskHeader *dyndisk_header;
uint8_t buf[HEADER_SIZE];
uint32_t checksum;
int disk_type = VHD_DYNAMIC;
@@ -171,7 +172,7 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags)
goto fail;
}
footer = (struct vhd_footer*) s->footer_buf;
footer = (VHDFooter *) s->footer_buf;
if (strncmp(footer->creator, "conectix", 8)) {
int64_t offset = bdrv_getlength(bs->file);
if (offset < 0) {
@@ -223,7 +224,7 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags)
goto fail;
}
dyndisk_header = (struct vhd_dyndisk_header *) buf;
dyndisk_header = (VHDDynDiskHeader *) buf;
if (strncmp(dyndisk_header->magic, "cxsparse", 8)) {
ret = -EINVAL;
@@ -259,6 +260,13 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags)
}
}
if (s->free_data_block_offset > bdrv_getlength(bs->file)) {
error_setg(errp, "block-vpc: free_data_block_offset points after "
"the end of file. The image has been truncated.");
ret = -EINVAL;
goto fail;
}
s->last_bitmap_offset = (int64_t) -1;
#ifdef CACHE
@@ -445,7 +453,7 @@ static int vpc_read(BlockDriverState *bs, int64_t sector_num,
int ret;
int64_t offset;
int64_t sectors, sectors_per_block;
struct vhd_footer *footer = (struct vhd_footer *) s->footer_buf;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) == VHD_FIXED) {
return bdrv_read(bs->file, sector_num, buf, nb_sectors);
@@ -494,7 +502,7 @@ static int vpc_write(BlockDriverState *bs, int64_t sector_num,
int64_t offset;
int64_t sectors, sectors_per_block;
int ret;
struct vhd_footer *footer = (struct vhd_footer *) s->footer_buf;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) == VHD_FIXED) {
return bdrv_write(bs->file, sector_num, buf, nb_sectors);
@@ -596,8 +604,8 @@ static int calculate_geometry(int64_t total_sectors, uint16_t* cyls,
static int create_dynamic_disk(int fd, uint8_t *buf, int64_t total_sectors)
{
struct vhd_dyndisk_header* dyndisk_header =
(struct vhd_dyndisk_header*) buf;
VHDDynDiskHeader *dyndisk_header =
(VHDDynDiskHeader *) buf;
size_t block_size, num_bat_entries;
int i;
int ret = -EIO;
@@ -683,10 +691,11 @@ static int create_fixed_disk(int fd, uint8_t *buf, int64_t total_size)
return ret;
}
static int vpc_create(const char *filename, QEMUOptionParameter *options)
static int vpc_create(const char *filename, QEMUOptionParameter *options,
Error **errp)
{
uint8_t buf[1024];
struct vhd_footer *footer = (struct vhd_footer *) buf;
VHDFooter *footer = (VHDFooter *) buf;
QEMUOptionParameter *disk_type_param;
int fd, i;
uint16_t cyls = 0;
@@ -789,7 +798,7 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
static int vpc_has_zero_init(BlockDriverState *bs)
{
BDRVVPCState *s = bs->opaque;
struct vhd_footer *footer = (struct vhd_footer *) s->footer_buf;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
if (cpu_to_be32(footer->type) == VHD_FIXED) {
return bdrv_has_zero_init(bs->file);


@@ -1065,7 +1065,8 @@ static void vvfat_parse_filename(const char *filename, QDict *options,
qdict_put(options, "rw", qbool_from_int(rw));
}
static int vvfat_open(BlockDriverState *bs, QDict *options, int flags)
static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVVVFATState *s = bs->opaque;
int cyls, heads, secs;
@@ -2874,16 +2875,17 @@ static coroutine_fn int vvfat_co_write(BlockDriverState *bs, int64_t sector_num,
return ret;
}
static int coroutine_fn vvfat_co_is_allocated(BlockDriverState *bs,
static int64_t coroutine_fn vvfat_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int* n)
{
BDRVVVFATState* s = bs->opaque;
*n = s->sector_count - sector_num;
if (*n > nb_sectors)
*n = nb_sectors;
else if (*n < 0)
return 0;
return 1;
if (*n > nb_sectors) {
*n = nb_sectors;
} else if (*n < 0) {
return 0;
}
return BDRV_BLOCK_DATA;
}
static int write_target_commit(BlockDriverState *bs, int64_t sector_num,
@@ -2894,7 +2896,7 @@ static int write_target_commit(BlockDriverState *bs, int64_t sector_num,
static void write_target_close(BlockDriverState *bs) {
BDRVVVFATState* s = *((BDRVVVFATState**) bs->opaque);
bdrv_delete(s->qcow);
bdrv_unref(s->qcow);
g_free(s->qcow_filename);
}
@@ -2908,6 +2910,7 @@ static int enable_write_target(BDRVVVFATState *s)
{
BlockDriver *bdrv_qcow;
QEMUOptionParameter *options;
Error *local_err = NULL;
int ret;
int size = sector2cluster(s, s->sector_count);
s->used_clusters = calloc(size, 1);
@@ -2925,17 +2928,22 @@ static int enable_write_target(BDRVVVFATState *s)
set_option_parameter_int(options, BLOCK_OPT_SIZE, s->sector_count * 512);
set_option_parameter(options, BLOCK_OPT_BACKING_FILE, "fat:");
ret = bdrv_create(bdrv_qcow, s->qcow_filename, options);
ret = bdrv_create(bdrv_qcow, s->qcow_filename, options, &local_err);
if (ret < 0) {
qerror_report_err(local_err);
error_free(local_err);
goto err;
}
s->qcow = bdrv_new("");
ret = bdrv_open(s->qcow, s->qcow_filename, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH, bdrv_qcow);
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH, bdrv_qcow,
&local_err);
if (ret < 0) {
bdrv_delete(s->qcow);
qerror_report_err(local_err);
error_free(local_err);
bdrv_unref(s->qcow);
goto err;
}
@@ -2943,7 +2951,7 @@ static int enable_write_target(BDRVVVFATState *s)
unlink(s->qcow_filename);
#endif
s->bs->backing_hd = calloc(sizeof(BlockDriverState), 1);
s->bs->backing_hd = bdrv_new("");
s->bs->backing_hd->drv = &vvfat_write_target;
s->bs->backing_hd->opaque = g_malloc(sizeof(void*));
*(void**)s->bs->backing_hd->opaque = s;
@@ -2984,7 +2992,7 @@ static BlockDriver bdrv_vvfat = {
.bdrv_read = vvfat_co_read,
.bdrv_write = vvfat_co_write,
.bdrv_co_is_allocated = vvfat_co_is_allocated,
.bdrv_co_get_block_status = vvfat_co_get_block_status,
};
static void bdrv_vvfat_init(void)
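
Both `*_co_get_block_status` conversions in this comparison replace a boolean "allocated" result with a flag word. The sketch below restates that mapping in self-contained C. It is an illustration under assumptions: the `BLK_*` values and the `lookup_result` enum are hypothetical stand-ins for QEMU's `BDRV_BLOCK_*` and `VMDK_*` constants, and `flat_status` reports zero sectors past the end rather than a negative count.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative flag values, assumed for this sketch only. */
#define BLK_DATA         0x01
#define BLK_ZERO         0x02
#define BLK_OFFSET_VALID 0x04

/* Hypothetical stand-in for the cluster lookup results. */
enum lookup_result { LOOKUP_OK, LOOKUP_ERROR, LOOKUP_UNALLOC, LOOKUP_ZEROED };

/* Map a cluster lookup result to a status word. host_offset must be
 * sector-aligned so that OR-ing it in does not clobber the flag bits. */
static int64_t to_block_status(enum lookup_result r, int64_t host_offset,
                               int same_file)
{
    switch (r) {
    case LOOKUP_ERROR:
        return -EIO;
    case LOOKUP_UNALLOC:
        return 0;
    case LOOKUP_ZEROED:
        return BLK_ZERO;
    case LOOKUP_OK:
        if (same_file) {
            return BLK_DATA | BLK_OFFSET_VALID | host_offset;
        }
        return BLK_DATA;
    }
    return -EINVAL;
}

/* vvfat-style status: everything inside the device is data; *n is the
 * number of sectors the answer applies to, clamped to nb_sectors. */
static int64_t flat_status(int64_t sector_count, int64_t sector_num,
                           int nb_sectors, int *n)
{
    int64_t remaining = sector_count - sector_num;

    if (remaining < 0) {
        *n = 0;
        return 0;
    }
    if (remaining > nb_sectors) {
        remaining = nb_sectors;
    }
    *n = (int)remaining;
    return BLK_DATA;
}
```

The offset is only reported when the data lives in the same file as the image itself, which mirrors the `extent->file == bs->file` test in the VMDK hunk.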


@@ -105,13 +105,6 @@ static void win32_aio_completion_cb(EventNotifier *e)
}
}
static int win32_aio_flush_cb(EventNotifier *e)
{
QEMUWin32AIOState *s = container_of(e, QEMUWin32AIOState, e);
return (s->count > 0) ? 1 : 0;
}
static void win32_aio_cancel(BlockDriverAIOCB *blockacb)
{
QEMUWin32AIOCB *waiocb = (QEMUWin32AIOCB *)blockacb;
@@ -201,8 +194,7 @@ QEMUWin32AIOState *win32_aio_init(void)
goto out_close_efd;
}
qemu_aio_set_event_notifier(&s->e, win32_aio_completion_cb,
win32_aio_flush_cb);
qemu_aio_set_event_notifier(&s->e, win32_aio_completion_cb);
return s;


@@ -69,12 +69,6 @@ static void nbd_close_notifier(Notifier *n, void *data)
g_free(cn);
}
static void nbd_server_put_ref(NBDExport *exp)
{
BlockDriverState *bs = nbd_export_get_blockdev(exp);
drive_put_ref(drive_get_by_blockdev(bs));
}
void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
Error **errp)
{
@@ -105,11 +99,9 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
writable = false;
}
exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY,
nbd_server_put_ref);
exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY, NULL);
nbd_export_set_name(exp, device);
drive_get_ref(drive_get_by_blockdev(bs));
n = g_malloc0(sizeof(NBDCloseNotifier));
n->n.notify = nbd_close_notifier;

1248
blockdev.c

File diff suppressed because it is too large


@@ -35,7 +35,7 @@
#include "qmp-commands.h"
#include "qemu/timer.h"
void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
int64_t speed, BlockDriverCompletionFunc *cb,
void *opaque, Error **errp)
{
@@ -45,10 +45,11 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
return NULL;
}
bdrv_ref(bs);
bdrv_set_in_use(bs, 1);
job = g_malloc0(job_type->instance_size);
job->job_type = job_type;
job = g_malloc0(driver->instance_size);
job->driver = driver;
job->bs = bs;
job->cb = cb;
job->opaque = opaque;
@@ -86,11 +87,11 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
{
Error *local_err = NULL;
if (!job->job_type->set_speed) {
if (!job->driver->set_speed) {
error_set(errp, QERR_NOT_SUPPORTED);
return;
}
job->job_type->set_speed(job, speed, &local_err);
job->driver->set_speed(job, speed, &local_err);
if (error_is_set(&local_err)) {
error_propagate(errp, local_err);
return;
@@ -101,12 +102,12 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
void block_job_complete(BlockJob *job, Error **errp)
{
if (job->paused || job->cancelled || !job->job_type->complete) {
if (job->paused || job->cancelled || !job->driver->complete) {
error_set(errp, QERR_BLOCK_JOB_NOT_READY, job->bs->device_name);
return;
}
job->job_type->complete(job, errp);
job->driver->complete(job, errp);
}
void block_job_pause(BlockJob *job)
@@ -142,8 +143,8 @@ bool block_job_is_cancelled(BlockJob *job)
void block_job_iostatus_reset(BlockJob *job)
{
job->iostatus = BLOCK_DEVICE_IO_STATUS_OK;
if (job->job_type->iostatus_reset) {
job->job_type->iostatus_reset(job);
if (job->driver->iostatus_reset) {
job->driver->iostatus_reset(job);
}
}
@@ -187,7 +188,7 @@ int block_job_cancel_sync(BlockJob *job)
return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
}
void block_job_sleep_ns(BlockJob *job, QEMUClock *clock, int64_t ns)
void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
{
assert(job->busy);
@@ -200,7 +201,7 @@ void block_job_sleep_ns(BlockJob *job, QEMUClock *clock, int64_t ns)
if (block_job_is_paused(job)) {
qemu_coroutine_yield();
} else {
co_sleep_ns(clock, ns);
co_sleep_ns(type, ns);
}
job->busy = true;
}
@@ -208,7 +209,7 @@ void block_job_sleep_ns(BlockJob *job, QEMUClock *clock, int64_t ns)
BlockJobInfo *block_job_query(BlockJob *job)
{
BlockJobInfo *info = g_new0(BlockJobInfo, 1);
info->type = g_strdup(job->job_type->job_type);
info->type = g_strdup(BlockJobType_lookup[job->driver->job_type]);
info->device = g_strdup(bdrv_get_device_name(job->bs));
info->len = job->len;
info->busy = job->busy;
@@ -235,7 +236,7 @@ QObject *qobject_from_block_job(BlockJob *job)
"'len': %" PRId64 ","
"'offset': %" PRId64 ","
"'speed': %" PRId64 " }",
job->job_type->job_type,
BlockJobType_lookup[job->driver->job_type],
bdrv_get_device_name(job->bs),
job->len,
job->offset,
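
The rename from `job_type` to `driver` in the hunks above goes together with looking up the job's printable name through an enum-indexed table (`BlockJobType_lookup[job->driver->job_type]`) instead of storing a string in the driver. A minimal sketch of that pattern follows. All type and function names here are hypothetical and only loosely modelled on the diff; this is not QEMU's actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Enum of job kinds plus a name table indexed by it (names assumed). */
typedef enum { JOB_TYPE_COMMIT, JOB_TYPE_STREAM, JOB_TYPE_MAX } JobType;
static const char *const job_type_lookup[JOB_TYPE_MAX + 1] = {
    "commit", "stream", NULL
};

typedef struct Job Job;

/* Per-kind driver: static data plus optional callbacks. */
typedef struct JobDriver {
    size_t instance_size;
    JobType job_type;
    int (*set_speed)(Job *job, long speed); /* may be NULL */
} JobDriver;

struct Job {
    const JobDriver *driver;
    long speed;
};

static int stream_set_speed(Job *job, long speed)
{
    job->speed = speed;
    return 0;
}

static const JobDriver stream_driver = {
    .instance_size = sizeof(Job),
    .job_type = JOB_TYPE_STREAM,
    .set_speed = stream_set_speed,
};

/* A driver that does not implement the optional callback. */
static const JobDriver commit_driver = {
    .instance_size = sizeof(Job),
    .job_type = JOB_TYPE_COMMIT,
};

/* Dispatch through the driver; -1 means "not supported". */
static int job_set_speed(Job *job, long speed)
{
    if (!job->driver->set_speed) {
        return -1;
    }
    return job->driver->set_speed(job, speed);
}

static const char *job_type_name(const Job *job)
{
    return job_type_lookup[job->driver->job_type];
}
```

Keeping the type as an enum means the string form exists in exactly one table, so query code and event code cannot drift apart.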


@@ -323,9 +323,9 @@ abi_long copy_from_user(void *hptr, abi_ulong gaddr, size_t len);
abi_long copy_to_user(abi_ulong gaddr, void *hptr, size_t len);
/* Functions for accessing guest memory. The tget and tput functions
read/write single values, byteswapping as necessary. The lock_user
read/write single values, byteswapping as necessary. The lock_user function
gets a pointer to a contiguous area of guest memory, but does not perform
and byteswapping. lock_user may return either a pointer to the guest
any byteswapping. lock_user may return either a pointer to the guest
memory, or a temporary buffer. */
/* Lock an area of guest memory into the host. If copy is true then the
@@ -381,7 +381,7 @@ static inline void *lock_user_string(abi_ulong guest_addr)
return lock_user(VERIFY_READ, guest_addr, (long)(len + 1), 1);
}
/* Helper macros for locking/ulocking a target struct. */
/* Helper macros for locking/unlocking a target struct. */
#define lock_user_struct(type, host_ptr, guest_addr, copy) \
(host_ptr = lock_user(type, guest_addr, sizeof(*host_ptr), copy))
#define unlock_user_struct(host_ptr, guest_addr, copy) \

308
configure vendored

@@ -119,6 +119,7 @@ path_of() {
# default parameters
source_path=`dirname "$0"`
cpu=""
iasl="iasl"
interp_prefix="/usr/gnemul/qemu-%M"
static="no"
cross_prefix=""
@@ -215,7 +216,6 @@ linux_user="no"
bsd_user="no"
guest_base="yes"
uname_release=""
mixemu="no"
aix="no"
blobs="yes"
pkgversion=""
@@ -232,6 +232,9 @@ usb_redir=""
glx=""
zlib="yes"
guest_agent=""
guest_agent_with_vss="no"
vss_win32_sdk=""
win_sdk="no"
want_tools="yes"
libiscsi=""
coroutine=""
@@ -253,6 +256,10 @@ for opt do
;;
--cc=*) CC="$optarg"
;;
--cxx=*) CXX="$optarg"
;;
--iasl=*) iasl="$optarg"
;;
--source-path=*) source_path="$optarg"
;;
--cpu=*) cpu="$optarg"
@@ -283,6 +290,12 @@ else
cc="${CC-${cross_prefix}gcc}"
fi
if test -z "${CXX}${cross_prefix}"; then
cxx="c++"
else
cxx="${CXX-${cross_prefix}g++}"
fi
ar="${AR-${cross_prefix}ar}"
as="${AS-${cross_prefix}as}"
cpp="${CPP-$cc -E}"
@@ -298,9 +311,6 @@ query_pkg_config() {
pkg_config=query_pkg_config
sdl_config="${SDL_CONFIG-${cross_prefix}sdl-config}"
# If the user hasn't specified ARFLAGS, default to 'rv', just as make does.
ARFLAGS="${ARFLAGS-rv}"
# default flags for all hosts
QEMU_CFLAGS="-fno-strict-aliasing $QEMU_CFLAGS"
QEMU_CFLAGS="-Wall -Wundef -Wwrite-strings -Wmissing-prototypes $QEMU_CFLAGS"
@@ -366,7 +376,11 @@ if test ! -z "$cpu" ; then
elif check_define __i386__ ; then
cpu="i386"
elif check_define __x86_64__ ; then
cpu="x86_64"
if check_define __ILP32__ ; then
cpu="x32"
else
cpu="x86_64"
fi
elif check_define __sparc__ ; then
if check_define __arch64__ ; then
cpu="sparc64"
@@ -403,7 +417,7 @@ ARCH=
# Normalise host CPU name and set ARCH.
# Note that this case should only have supported host CPUs, not guests.
case "$cpu" in
ia64|ppc|ppc64|s390|s390x|sparc64)
ia64|ppc|ppc64|s390|s390x|sparc64|x32)
cpu="$cpu"
;;
i386|i486|i586|i686|i86pc|BePC)
@@ -418,9 +432,6 @@ case "$cpu" in
aarch64)
cpu="aarch64"
;;
hppa|parisc|parisc64)
cpu="hppa"
;;
mips*)
cpu="mips"
;;
@@ -550,11 +561,10 @@ Haiku)
audio_possible_drivers="oss alsa sdl esd pa"
linux="yes"
linux_user="yes"
usb="linux"
kvm="yes"
vhost_net="yes"
vhost_scsi="yes"
if [ "$cpu" = "i386" -o "$cpu" = "x86_64" ] ; then
if [ "$cpu" = "i386" -o "$cpu" = "x86_64" -o "$cpu" = "x32" ] ; then
audio_possible_drivers="$audio_possible_drivers fmod"
fi
QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers $QEMU_INCLUDES"
@@ -563,16 +573,13 @@ esac
if [ "$bsd" = "yes" ] ; then
if [ "$darwin" != "yes" ] ; then
if [ "$targetos" != "FreeBSD" ]; then
usb="bsd"
fi
bsd_user="yes"
fi
fi
: ${make=${MAKE-make}}
: ${install=${INSTALL-install}}
: ${python=${PYTHON-python}}
: ${python=${PYTHON-python -B}}
: ${smbd=${SMBD-/usr/sbin/smbd}}
# Default objcc to clang if available, otherwise use CC
@@ -626,6 +633,8 @@ for opt do
;;
--host-cc=*) host_cc="$optarg"
;;
--cxx=*)
;;
--objcc=*) objcc="$optarg"
;;
--make=*) make="$optarg"
@@ -859,8 +868,6 @@ for opt do
;;
--enable-fdt) fdt="yes"
;;
--enable-mixemu) mixemu="yes"
;;
--disable-linux-aio) linux_aio="no"
;;
--enable-linux-aio) linux_aio="yes"
@@ -921,6 +928,18 @@ for opt do
;;
--disable-guest-agent) guest_agent="no"
;;
--with-vss-sdk) vss_win32_sdk=""
;;
--with-vss-sdk=*) vss_win32_sdk="$optarg"
;;
--without-vss-sdk) vss_win32_sdk="no"
;;
--with-win-sdk) win_sdk=""
;;
--with-win-sdk=*) win_sdk="$optarg"
;;
--without-win-sdk) win_sdk="no"
;;
--enable-tools) want_tools="yes"
;;
--disable-tools) want_tools="no"
@@ -959,6 +978,14 @@ for opt do
done
case "$cpu" in
ppc)
CPU_CFLAGS="-m32"
LDFLAGS="-m32 $LDFLAGS"
;;
ppc64)
CPU_CFLAGS="-m64"
LDFLAGS="-m64 $LDFLAGS"
;;
sparc)
LDFLAGS="-m32 $LDFLAGS"
CPU_CFLAGS="-m32 -mcpu=ultrasparc"
@@ -985,6 +1012,11 @@ case "$cpu" in
LDFLAGS="-m64 $LDFLAGS"
cc_i386='$(CC) -m32'
;;
x32)
CPU_CFLAGS="-mx32"
LDFLAGS="-mx32 $LDFLAGS"
cc_i386='$(CC) -m32'
;;
# No special flags required for other host CPUs
esac
@@ -1029,8 +1061,10 @@ echo "Advanced options (experts only):"
echo " --source-path=PATH path of source code [$source_path]"
echo " --cross-prefix=PREFIX use PREFIX for compile tools [$cross_prefix]"
echo " --cc=CC use C compiler CC [$cc]"
echo " --iasl=IASL use ACPI compiler IASL [$iasl]"
echo " --host-cc=CC use C compiler CC [$host_cc] for code run at"
echo " build time"
echo " --cxx=CXX use C++ compiler CXX [$cxx]"
echo " --objcc=OBJCC use Objective-C compiler OBJCC [$objcc]"
echo " --extra-cflags=CFLAGS append extra C compiler flags QEMU_CFLAGS"
echo " --extra-ldflags=LDFLAGS append extra linker flags LDFLAGS"
@@ -1075,7 +1109,6 @@ echo " (affects only QEMU, not qemu-img)"
echo " --block-drv-ro-whitelist=L"
echo " set block driver read-only whitelist"
echo " (affects only QEMU, not qemu-img)"
echo " --enable-mixemu enable mixer emulation"
echo " --disable-xen disable xen backend driver support"
echo " --enable-xen enable xen backend driver support"
echo " --disable-xen-pci-passthrough"
@@ -1156,6 +1189,8 @@ echo " --disable-usb-redir disable usb network redirection support"
echo " --enable-usb-redir enable usb network redirection support"
echo " --disable-guest-agent disable building of the QEMU Guest Agent"
echo " --enable-guest-agent enable building of the QEMU Guest Agent"
echo " --with-vss-sdk=SDK-path enable Windows VSS support in QEMU Guest Agent"
echo " --with-win-sdk=SDK-path path to Windows Platform SDK (to build VSS .tlb)"
echo " --disable-seccomp disable seccomp support"
echo " --enable-seccomp enables seccomp support"
echo " --with-coroutine=BACKEND coroutine backend. Supported options:"
@@ -1214,6 +1249,7 @@ gcc_flags="-Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers $gcc_
gcc_flags="-Wmissing-include-dirs -Wempty-body -Wnested-externs $gcc_flags"
gcc_flags="-Wendif-labels $gcc_flags"
gcc_flags="-Wno-initializer-overrides $gcc_flags"
gcc_flags="-Wno-string-plus-int $gcc_flags"
# Note that we do not add -Werror to gcc_flags here, because that would
# enable it for all configure tests. If a configure test failed due
# to -Werror this would just silently disable some features,
@@ -1261,7 +1297,7 @@ fi
if test "$pie" = ""; then
case "$cpu-$targetos" in
i386-Linux|x86_64-Linux|i386-OpenBSD|x86_64-OpenBSD)
i386-Linux|x86_64-Linux|x32-Linux|i386-OpenBSD|x86_64-OpenBSD)
;;
*)
pie="no"
@@ -1358,7 +1394,7 @@ fi
# Note that if the Python conditional here evaluates True we will exit
# with status 1 which is a shell 'false' value.
if ! "$python" -c 'import sys; sys.exit(sys.version_info < (2,4) or sys.version_info >= (3,))'; then
if ! $python -c 'import sys; sys.exit(sys.version_info < (2,4) or sys.version_info >= (3,))'; then
error_exit "Cannot use '$python', Python 2.4 or later is required." \
"Note that Python 3 or later is not yet supported." \
"Use --python=/path/to/python to specify a supported Python."
@@ -1467,7 +1503,7 @@ libs_softmmu="$libs_softmmu -lz"
# libseccomp check
if test "$seccomp" != "no" ; then
if $pkg_config --atleast-version=2.1.0 libseccomp --modversion >/dev/null 2>&1; then
if $pkg_config --atleast-version=2.1.0 libseccomp; then
libs_softmmu="$libs_softmmu `$pkg_config --libs libseccomp`"
QEMU_CFLAGS="$QEMU_CFLAGS `$pkg_config --cflags libseccomp`"
seccomp="yes"
@@ -1701,10 +1737,10 @@ if test "$gtk" != "no"; then
fi
gtk="no"
else
gtk_cflags=`$pkg_config --cflags $gtkpackage 2>/dev/null`
gtk_libs=`$pkg_config --libs $gtkpackage 2>/dev/null`
vte_cflags=`$pkg_config --cflags $vtepackage 2>/dev/null`
vte_libs=`$pkg_config --libs $vtepackage 2>/dev/null`
gtk_cflags=`$pkg_config --cflags $gtkpackage`
gtk_libs=`$pkg_config --libs $gtkpackage`
vte_cflags=`$pkg_config --cflags $vtepackage`
vte_libs=`$pkg_config --libs $vtepackage`
libs_softmmu="$gtk_libs $vte_libs $libs_softmmu"
gtk="yes"
fi
@@ -1719,7 +1755,7 @@ if test "`basename $sdl_config`" != sdl-config && ! has ${sdl_config}; then
sdl_config=sdl-config
fi
if $pkg_config sdl --modversion >/dev/null 2>&1; then
if $pkg_config sdl --exists; then
sdlconfig="$pkg_config sdl"
_sdlversion=`$sdlconfig --modversion 2>/dev/null | sed 's/[^0-9]//g'`
elif has ${sdl_config}; then
@@ -1905,9 +1941,9 @@ int main(void) {
return png_ptr != 0;
}
EOF
if $pkg_config libpng --modversion >/dev/null 2>&1; then
vnc_png_cflags=`$pkg_config libpng --cflags 2> /dev/null`
vnc_png_libs=`$pkg_config libpng --libs 2> /dev/null`
if $pkg_config libpng --exists; then
vnc_png_cflags=`$pkg_config libpng --cflags`
vnc_png_libs=`$pkg_config libpng --libs`
else
vnc_png_cflags=""
vnc_png_libs="-lpng"
@@ -2184,7 +2220,7 @@ fi
##########################################
# curl probe
if test "$curl" != "no" ; then
if $pkg_config libcurl --modversion >/dev/null 2>&1; then
if $pkg_config libcurl --exists; then
curlconfig="$pkg_config libcurl"
else
curlconfig=curl-config
@@ -2236,10 +2272,9 @@ if test "$mingw32" = yes; then
else
glib_req_ver=2.12
fi
if $pkg_config --atleast-version=$glib_req_ver gthread-2.0 > /dev/null 2>&1
then
glib_cflags=`$pkg_config --cflags gthread-2.0 2>/dev/null`
glib_libs=`$pkg_config --libs gthread-2.0 2>/dev/null`
if $pkg_config --atleast-version=$glib_req_ver gthread-2.0; then
glib_cflags=`$pkg_config --cflags gthread-2.0`
glib_libs=`$pkg_config --libs gthread-2.0`
LIBS="$glib_libs $LIBS"
libs_qga="$glib_libs $libs_qga"
else
@@ -2268,8 +2303,8 @@ if test "$pixman" = "none"; then
pixman_cflags=
pixman_libs=
elif test "$pixman" = "system"; then
pixman_cflags=`$pkg_config --cflags pixman-1 2>/dev/null`
pixman_libs=`$pkg_config --libs pixman-1 2>/dev/null`
pixman_cflags=`$pkg_config --cflags pixman-1`
pixman_libs=`$pkg_config --libs pixman-1`
else
if test ! -d ${source_path}/pixman/pixman; then
error_exit "pixman not present. Your options:" \
@@ -2368,8 +2403,7 @@ fi
# libssh2 probe
min_libssh2_version=1.2.8
if test "$libssh2" != "no" ; then
if $pkg_config --atleast-version=$min_libssh2_version libssh2 >/dev/null 2>&1
then
if $pkg_config --atleast-version=$min_libssh2_version libssh2; then
libssh2_cflags=`$pkg_config libssh2 --cflags`
libssh2_libs=`$pkg_config libssh2 --libs`
libssh2=yes
@@ -2512,7 +2546,7 @@ fi
fdt_required=no
for target in $target_list; do
case $target in
arm*-softmmu|ppc*-softmmu|microblaze*-softmmu)
aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu)
fdt_required=yes
;;
esac
@@ -2587,14 +2621,14 @@ fi
##########################################
# glusterfs probe
if test "$glusterfs" != "no" ; then
if $pkg_config --atleast-version=3 glusterfs-api >/dev/null 2>&1; then
if $pkg_config --atleast-version=3 glusterfs-api; then
glusterfs="yes"
glusterfs_cflags=`$pkg_config --cflags glusterfs-api 2>/dev/null`
glusterfs_libs=`$pkg_config --libs glusterfs-api 2>/dev/null`
glusterfs_cflags=`$pkg_config --cflags glusterfs-api`
glusterfs_libs=`$pkg_config --libs glusterfs-api`
CFLAGS="$CFLAGS $glusterfs_cflags"
libs_tools="$glusterfs_libs $libs_tools"
libs_softmmu="$glusterfs_libs $libs_softmmu"
if $pkg_config --atleast-version=5 glusterfs-api >/dev/null 2>&1; then
if $pkg_config --atleast-version=5 glusterfs-api; then
glusterfs_discard="yes"
fi
else
@@ -2816,6 +2850,37 @@ if compile_prog "" "" ; then
dup3=yes
fi
# check for ppoll support
ppoll=no
cat > $TMPC << EOF
#include <poll.h>
int main(void)
{
struct pollfd pfd = { .fd = 0, .events = 0, .revents = 0 };
ppoll(&pfd, 1, 0, 0);
return 0;
}
EOF
if compile_prog "" "" ; then
ppoll=yes
fi
# check for prctl(PR_SET_TIMERSLACK , ... ) support
prctl_pr_set_timerslack=no
cat > $TMPC << EOF
#include <sys/prctl.h>
int main(void)
{
prctl(PR_SET_TIMERSLACK, 1, 0, 0, 0);
return 0;
}
EOF
if compile_prog "" "" ; then
prctl_pr_set_timerslack=yes
fi
# check for epoll support
epoll=no
cat > $TMPC << EOF
@@ -2926,10 +2991,10 @@ if test "$libiscsi" != "no" ; then
#include <iscsi/iscsi.h>
int main(void) { iscsi_unmap_sync(NULL,0,0,0,NULL,0); return 0; }
EOF
if $pkg_config --atleast-version=1.7.0 libiscsi --modversion >/dev/null 2>&1; then
if $pkg_config --atleast-version=1.7.0 libiscsi; then
libiscsi="yes"
libiscsi_cflags=$($pkg_config --cflags libiscsi 2>/dev/null)
libiscsi_libs=$($pkg_config --libs libiscsi 2>/dev/null)
libiscsi_cflags=$($pkg_config --cflags libiscsi)
libiscsi_libs=$($pkg_config --libs libiscsi)
CFLAGS="$CFLAGS $libiscsi_cflags"
LIBS="$LIBS $libiscsi_libs"
elif compile_prog "" "-liscsi" ; then
@@ -2996,8 +3061,8 @@ int main(void) { spice_server_new(); return 0; }
EOF
spice_cflags=$($pkg_config --cflags spice-protocol spice-server 2>/dev/null)
spice_libs=$($pkg_config --libs spice-protocol spice-server 2>/dev/null)
if $pkg_config --atleast-version=0.12.0 spice-server >/dev/null 2>&1 && \
$pkg_config --atleast-version=0.12.3 spice-protocol > /dev/null 2>&1 && \
if $pkg_config --atleast-version=0.12.0 spice-server && \
$pkg_config --atleast-version=0.12.3 spice-protocol && \
compile_prog "$spice_cflags" "$spice_libs" ; then
spice="yes"
libs_softmmu="$libs_softmmu $spice_libs"
@@ -3032,7 +3097,7 @@ EOF
test_cflags="-Werror $test_cflags"
fi
if test -n "$libtool" &&
$pkg_config --atleast-version=3.12.8 nss >/dev/null 2>&1 && \
$pkg_config --atleast-version=3.12.8 nss && \
compile_prog "$test_cflags" "$libcacard_libs"; then
smartcard_nss="yes"
QEMU_CFLAGS="$QEMU_CFLAGS $libcacard_cflags"
@@ -3048,11 +3113,10 @@ fi
# check for libusb
if test "$libusb" != "no" ; then
if $pkg_config --atleast-version=1.0.13 libusb-1.0 >/dev/null 2>&1 ; then
if $pkg_config --atleast-version=1.0.13 libusb-1.0; then
libusb="yes"
usb="libusb"
libusb_cflags=$($pkg_config --cflags libusb-1.0 2>/dev/null)
libusb_libs=$($pkg_config --libs libusb-1.0 2>/dev/null)
libusb_cflags=$($pkg_config --cflags libusb-1.0)
libusb_libs=$($pkg_config --libs libusb-1.0)
QEMU_CFLAGS="$QEMU_CFLAGS $libusb_cflags"
libs_softmmu="$libs_softmmu $libusb_libs"
else
@@ -3065,10 +3129,10 @@ fi
# check for usbredirparser for usb network redirection support
if test "$usb_redir" != "no" ; then
if $pkg_config --atleast-version=0.6 libusbredirparser-0.5 >/dev/null 2>&1 ; then
if $pkg_config --atleast-version=0.6 libusbredirparser-0.5; then
usb_redir="yes"
usb_redir_cflags=$($pkg_config --cflags libusbredirparser-0.5 2>/dev/null)
usb_redir_libs=$($pkg_config --libs libusbredirparser-0.5 2>/dev/null)
usb_redir_cflags=$($pkg_config --cflags libusbredirparser-0.5)
usb_redir_libs=$($pkg_config --libs libusbredirparser-0.5)
QEMU_CFLAGS="$QEMU_CFLAGS $usb_redir_cflags"
libs_softmmu="$libs_softmmu $usb_redir_libs"
else
@@ -3079,6 +3143,61 @@ if test "$usb_redir" != "no" ; then
fi
fi
##########################################
# check if we have VSS SDK headers for win
if test "$mingw32" = "yes" -a "$guest_agent" != "no" -a "$vss_win32_sdk" != "no" ; then
case "$vss_win32_sdk" in
"") vss_win32_include="-I$source_path" ;;
*\ *) # The SDK is installed in "Program Files" by default, but we cannot
# handle path with spaces. So we symlink the headers into ".sdk/vss".
vss_win32_include="-I$source_path/.sdk/vss"
symlink "$vss_win32_sdk/inc" "$source_path/.sdk/vss/inc"
;;
*) vss_win32_include="-I$vss_win32_sdk"
esac
cat > $TMPC << EOF
#define __MIDL_user_allocate_free_DEFINED__
#include <inc/win2003/vss.h>
int main(void) { return VSS_CTX_BACKUP; }
EOF
if compile_prog "$vss_win32_include" "" ; then
guest_agent_with_vss="yes"
QEMU_CFLAGS="$QEMU_CFLAGS $vss_win32_include"
libs_qga="-lole32 -loleaut32 -lshlwapi -luuid -lstdc++ -Wl,--enable-stdcall-fixup $libs_qga"
else
if test "$vss_win32_sdk" != "" ; then
echo "ERROR: Please download and install Microsoft VSS SDK:"
echo "ERROR: http://www.microsoft.com/en-us/download/details.aspx?id=23490"
echo "ERROR: On POSIX-systems, you can extract the SDK headers by:"
echo "ERROR: scripts/extract-vsssdk-headers setup.exe"
echo "ERROR: The headers are extracted in the directory \`inc'."
feature_not_found "VSS support"
fi
guest_agent_with_vss="no"
fi
fi
##########################################
# lookup Windows platform SDK (if not specified)
# The SDK is needed only to build .tlb (type library) file of guest agent
# VSS provider from the source. It is usually unnecessary because the
# pre-compiled .tlb file is included.
if test "$mingw32" = "yes" -a "$guest_agent" != "no" -a "$guest_agent_with_vss" = "yes" ; then
if test -z "$win_sdk"; then
programfiles="$PROGRAMFILES"
test -n "$PROGRAMW6432" && programfiles="$PROGRAMW6432"
if test -n "$programfiles"; then
win_sdk=$(ls -d "$programfiles/Microsoft SDKs/Windows/v"* | tail -1) 2>/dev/null
else
feature_not_found "Windows SDK"
fi
elif test "$win_sdk" = "no"; then
win_sdk=""
fi
fi
##########################################
##########################################
@@ -3389,7 +3508,7 @@ if test "$gcov" = "yes" ; then
CFLAGS="-fprofile-arcs -ftest-coverage -g $CFLAGS"
LDFLAGS="-fprofile-arcs -ftest-coverage $LDFLAGS"
elif test "$debug" = "no" ; then
CFLAGS="-O2 -D_FORTIFY_SOURCE=2 $CFLAGS"
CFLAGS="-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 $CFLAGS"
fi
@@ -3455,8 +3574,11 @@ if test "$softmmu" = yes ; then
fi
fi
if [ "$guest_agent" != "no" ]; then
if [ "$linux" = "yes" -o "$bsd" = "yes" -o "$solaris" = "yes" ] ; then
if [ "$linux" = "yes" -o "$bsd" = "yes" -o "$solaris" = "yes" -o "$mingw32" = "yes" ] ; then
tools="qemu-ga\$(EXESUF) $tools"
if [ "$mingw32" = "yes" -a "$guest_agent_with_vss" = "yes" ]; then
tools="qga/vss-win32/qga-vss.dll qga/vss-win32/qga-vss.tlb $tools"
fi
guest_agent=yes
elif [ "$guest_agent" != yes ]; then
guest_agent=no
@@ -3484,7 +3606,7 @@ fi
if test "$pie" = "no" ; then
textseg_addr=
case "$cpu" in
arm | hppa | i386 | m68k | ppc | ppc64 | s390* | sparc | sparc64 | x86_64)
arm | hppa | i386 | m68k | ppc | ppc64 | s390* | sparc | sparc64 | x86_64 | x32)
textseg_addr=0x60000000
;;
mips)
@@ -3527,12 +3649,13 @@ echo "Manual directory `eval echo $mandir`"
echo "ELF interp prefix $interp_prefix"
else
echo "local state directory queried at runtime"
echo "Windows SDK $win_sdk"
fi
echo "Source path $source_path"
echo "C compiler $cc"
echo "Host C compiler $host_cc"
echo "C++ compiler $cxx"
echo "Objective-C compiler $objcc"
echo "ARFLAGS $ARFLAGS"
echo "CFLAGS $CFLAGS"
echo "QEMU_CFLAGS $QEMU_CFLAGS"
echo "LDFLAGS $LDFLAGS"
@@ -3564,7 +3687,6 @@ echo "mingw32 support $mingw32"
echo "Audio drivers $audio_drv_list"
echo "Block whitelist (rw) $block_drv_rw_whitelist"
echo "Block whitelist (ro) $block_drv_ro_whitelist"
echo "Mixer emulation $mixemu"
echo "VirtFS support $virtfs"
echo "VNC support $vnc"
if test "$vnc" = "yes" ; then
@@ -3613,6 +3735,7 @@ echo "usb net redir $usb_redir"
echo "GLX support $glx"
echo "libiscsi support $libiscsi"
echo "build guest agent $guest_agent"
echo "QGA VSS support $guest_agent_with_vss"
echo "seccomp support $seccomp"
echo "coroutine backend $coroutine"
echo "coroutine pool $coroutine_pool"
@@ -3660,14 +3783,6 @@ echo "libs_softmmu=$libs_softmmu" >> $config_host_mak
echo "ARCH=$ARCH" >> $config_host_mak
case "$cpu" in
arm|i386|x86_64|ppc|aarch64)
# The TCG interpreter currently does not support ld/st optimization.
if test "$tcg_interpreter" = "no" ; then
echo "CONFIG_QEMU_LDST_OPTIMIZATION=y" >> $config_host_mak
fi
;;
esac
if test "$debug_tcg" = "yes" ; then
echo "CONFIG_DEBUG_TCG=y" >> $config_host_mak
fi
@@ -3688,6 +3803,10 @@ if test "$mingw32" = "yes" ; then
version_micro=0
echo "CONFIG_FILEVERSION=$version_major,$version_minor,$version_subminor,$version_micro" >> $config_host_mak
echo "CONFIG_PRODUCTVERSION=$version_major,$version_minor,$version_subminor,$version_micro" >> $config_host_mak
if test "$guest_agent_with_vss" = "yes" ; then
echo "CONFIG_QGA_VSS=y" >> $config_host_mak
echo "WIN_SDK=\"$win_sdk\"" >> $config_host_mak
fi
else
echo "CONFIG_POSIX=y" >> $config_host_mak
fi
@@ -3746,9 +3865,6 @@ if test "$audio_win_int" = "yes" ; then
fi
echo "CONFIG_BDRV_RW_WHITELIST=$block_drv_rw_whitelist" >> $config_host_mak
echo "CONFIG_BDRV_RO_WHITELIST=$block_drv_ro_whitelist" >> $config_host_mak
if test "$mixemu" = "yes" ; then
echo "CONFIG_MIXEMU=y" >> $config_host_mak
fi
if test "$vnc" = "yes" ; then
echo "CONFIG_VNC=y" >> $config_host_mak
fi
@@ -3825,6 +3941,12 @@ fi
if test "$dup3" = "yes" ; then
echo "CONFIG_DUP3=y" >> $config_host_mak
fi
if test "$ppoll" = "yes" ; then
echo "CONFIG_PPOLL=y" >> $config_host_mak
fi
if test "$prctl_pr_set_timerslack" = "yes" ; then
echo "CONFIG_PRCTL_PR_SET_TIMERSLACK=y" >> $config_host_mak
fi
if test "$epoll" = "yes" ; then
echo "CONFIG_EPOLL=y" >> $config_host_mak
fi
@@ -4020,24 +4142,11 @@ if test "$virtio_blk_data_plane" = "yes" ; then
fi
# USB host support
case "$usb" in
linux)
echo "HOST_USB=linux legacy" >> $config_host_mak
;;
bsd)
echo "HOST_USB=bsd" >> $config_host_mak
;;
libusb)
if test "$linux" = "yes"; then
echo "HOST_USB=libusb linux legacy" >> $config_host_mak
else
echo "HOST_USB=libusb legacy" >> $config_host_mak
fi
;;
*)
if test "$libusb" = "yes"; then
echo "HOST_USB=libusb legacy" >> $config_host_mak
else
echo "HOST_USB=stub" >> $config_host_mak
;;
esac
fi
# TPM passthrough support?
if test "$tpm" = "yes"; then
@@ -4095,7 +4204,7 @@ elif test "$ARCH" = "sparc64" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/sparc $QEMU_INCLUDES"
elif test "$ARCH" = "s390x" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/s390 $QEMU_INCLUDES"
elif test "$ARCH" = "x86_64" ; then
elif test "$ARCH" = "x86_64" -o "$ARCH" = "x32" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/i386 $QEMU_INCLUDES"
else
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/\$(ARCH) $QEMU_INCLUDES"
@@ -4117,11 +4226,14 @@ else
fi
echo "PYTHON=$python" >> $config_host_mak
echo "CC=$cc" >> $config_host_mak
if $iasl -h > /dev/null 2>&1; then
echo "IASL=$iasl" >> $config_host_mak
fi
echo "CC_I386=$cc_i386" >> $config_host_mak
echo "HOST_CC=$host_cc" >> $config_host_mak
echo "CXX=$cxx" >> $config_host_mak
echo "OBJCC=$objcc" >> $config_host_mak
echo "AR=$ar" >> $config_host_mak
echo "ARFLAGS=$ARFLAGS" >> $config_host_mak
echo "AS=$as" >> $config_host_mak
echo "CPP=$cpp" >> $config_host_mak
echo "OBJCOPY=$objcopy" >> $config_host_mak
@@ -4158,7 +4270,7 @@ fi
if test "$linux" = "yes" ; then
mkdir -p linux-headers
case "$cpu" in
i386|x86_64)
i386|x86_64|x32)
linux_arch=x86
;;
ppcemb|ppc|ppc64)
@@ -4244,6 +4356,11 @@ case "$target_name" in
bflt="yes"
gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
;;
aarch64)
TARGET_BASE_ARCH=arm
bflt="yes"
gdb_xml_files="aarch64-core.xml"
;;
cris)
;;
lm32)
@@ -4424,7 +4541,7 @@ for i in $ARCH $TARGET_BASE_ARCH ; do
echo "CONFIG_HPPA_DIS=y" >> $config_target_mak
echo "CONFIG_HPPA_DIS=y" >> config-all-disas.mak
;;
i386|x86_64)
i386|x86_64|x32)
echo "CONFIG_I386_DIS=y" >> $config_target_mak
echo "CONFIG_I386_DIS=y" >> config-all-disas.mak
;;
@@ -4524,7 +4641,8 @@ if [ "$dtc_internal" = "yes" ]; then
fi
# build tree in object directory in case the source is not in the current directory
DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos tests/qapi-schema tests/tcg/xtensa"
DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos tests/qapi-schema tests/tcg/xtensa tests/qemu-iotests"
DIRS="$DIRS fsdev"
DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas pc-bios/s390-ccw"
DIRS="$DIRS roms/seabios roms/vgabios"
DIRS="$DIRS qapi-generated"
@@ -4564,7 +4682,7 @@ for rom in seabios vgabios ; do
echo "BCC=bcc" >> $config_mak
echo "CPP=$cpp" >> $config_mak
echo "OBJCOPY=objcopy" >> $config_mak
echo "IASL=iasl" >> $config_mak
echo "IASL=$iasl" >> $config_mak
echo "LD=$ld" >> $config_mak
done


@@ -53,7 +53,7 @@ void cpu_resume_from_signal(CPUArchState *env, void *puc)
static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, uint8_t *tb_ptr)
{
CPUArchState *env = cpu->env_ptr;
tcg_target_ulong next_tb = tcg_qemu_tb_exec(env, tb_ptr);
uintptr_t next_tb = tcg_qemu_tb_exec(env, tb_ptr);
if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) {
/* We didn't start executing this TB (eg because the instruction
* counter hit zero); we must restore the guest PC to the address
@@ -209,7 +209,7 @@ int cpu_exec(CPUArchState *env)
int ret, interrupt_request;
TranslationBlock *tb;
uint8_t *tc_ptr;
tcg_target_ulong next_tb;
uintptr_t next_tb;
if (cpu->halted) {
if (!cpu_has_work(cpu)) {
@@ -681,6 +681,10 @@ int cpu_exec(CPUArchState *env)
* local variables as longjmp is marked 'noreturn'. */
cpu = current_cpu;
env = cpu->env_ptr;
#if !(defined(CONFIG_USER_ONLY) && \
(defined(TARGET_M68K) || defined(TARGET_PPC) || defined(TARGET_S390X)))
cc = CPU_GET_CLASS(cpu);
#endif
}
} /* for(;;) */
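
The hunks above change `next_tb` from `tcg_target_ulong` to `uintptr_t`. The value is masked with `TB_EXIT_MASK`, which suggests it is a host pointer with an exit index stored in its low bits, so a pointer-sized integer is the natural type for it. The following is a minimal sketch of that packing idea, not QEMU's code: `EXIT_MASK`, `struct block`, and the three helper functions are hypothetical names, and the mask value 3 is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for TB_EXIT_MASK; the value 3 is an assumption. */
#define EXIT_MASK ((uintptr_t)3)

/* Hypothetical stand-in for a translation block. */
struct block {
    int id;
};

/* Store a small exit index in the low bits of an aligned pointer. */
static uintptr_t pack_exit(const struct block *b, unsigned exit_idx)
{
    /* The low bits must be free, i.e. the object is at least 4-byte aligned. */
    assert(((uintptr_t)b & EXIT_MASK) == 0);
    assert(exit_idx <= EXIT_MASK);
    return (uintptr_t)b | exit_idx;
}

/* Recover the pointer by clearing the low bits. */
static const struct block *unpack_block(uintptr_t v)
{
    return (const struct block *)(v & ~EXIT_MASK);
}

/* Recover the exit index from the low bits. */
static unsigned unpack_exit(uintptr_t v)
{
    return (unsigned)(v & EXIT_MASK);
}
```

Because `uintptr_t` is defined to round-trip a pointer, this works the same way on hosts where pointers are narrower than general-purpose registers, which is presumably the case the x32 changes in `configure` care about.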

cpus.c

@@ -37,6 +37,7 @@
#include "sysemu/qtest.h"
#include "qemu/main-loop.h"
#include "qemu/bitmap.h"
#include "qemu/seqlock.h"
#ifndef _WIN32
#include "qemu/compatfd.h"
@@ -62,12 +63,17 @@
static CPUState *next_cpu;
bool cpu_is_stopped(CPUState *cpu)
{
return cpu->stopped || !runstate_is_running();
}
static bool cpu_thread_is_idle(CPUState *cpu)
{
if (cpu->stop || cpu->queued_work_first) {
return false;
}
if (cpu->stopped || !runstate_is_running()) {
if (cpu_is_stopped(cpu)) {
return true;
}
if (!cpu->halted || qemu_cpu_has_work(cpu) ||
@@ -81,7 +87,7 @@ static bool all_cpu_threads_idle(void)
{
CPUState *cpu;
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
if (!cpu_thread_is_idle(cpu)) {
return false;
}
@@ -92,21 +98,32 @@ static bool all_cpu_threads_idle(void)
/***********************************************************/
/* guest cycle counter */
/* Protected by TimersState seqlock */
/* Compensate for varying guest execution speed. */
static int64_t qemu_icount_bias;
static int64_t vm_clock_warp_start;
/* Conversion factor from emulated instructions to virtual clock ticks. */
static int icount_time_shift;
/* Arbitrarily pick 1MIPS as the minimum allowable speed. */
#define MAX_ICOUNT_SHIFT 10
/* Compensate for varying guest execution speed. */
static int64_t qemu_icount_bias;
/* Only written by TCG thread */
static int64_t qemu_icount;
static QEMUTimer *icount_rt_timer;
static QEMUTimer *icount_vm_timer;
static QEMUTimer *icount_warp_timer;
static int64_t vm_clock_warp_start;
static int64_t qemu_icount;
typedef struct TimersState {
/* Protected by BQL. */
int64_t cpu_ticks_prev;
int64_t cpu_ticks_offset;
/* cpu_clock_offset can be read out of BQL, so protect it with
* this lock.
*/
QemuSeqLock vm_clock_seqlock;
int64_t cpu_clock_offset;
int32_t cpu_ticks_enabled;
int64_t dummy;
@@ -115,7 +132,7 @@ typedef struct TimersState {
static TimersState timers_state;
/* Return the virtual CPU time, based on the instruction counter. */
int64_t cpu_get_icount(void)
static int64_t cpu_get_icount_locked(void)
{
int64_t icount;
CPUState *cpu = current_cpu;
@@ -131,58 +148,100 @@ int64_t cpu_get_icount(void)
return qemu_icount_bias + (icount << icount_time_shift);
}
int64_t cpu_get_icount(void)
{
int64_t icount;
unsigned start;
do {
start = seqlock_read_begin(&timers_state.vm_clock_seqlock);
icount = cpu_get_icount_locked();
} while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
return icount;
}
/* return the host CPU cycle counter and handle stop/restart */
/* Caller must hold the BQL */
int64_t cpu_get_ticks(void)
{
int64_t ticks;
if (use_icount) {
return cpu_get_icount();
}
if (!timers_state.cpu_ticks_enabled) {
return timers_state.cpu_ticks_offset;
} else {
int64_t ticks;
ticks = cpu_get_real_ticks();
if (timers_state.cpu_ticks_prev > ticks) {
/* Note: non increasing ticks may happen if the host uses
software suspend */
timers_state.cpu_ticks_offset += timers_state.cpu_ticks_prev - ticks;
}
timers_state.cpu_ticks_prev = ticks;
return ticks + timers_state.cpu_ticks_offset;
ticks = timers_state.cpu_ticks_offset;
if (timers_state.cpu_ticks_enabled) {
ticks += cpu_get_real_ticks();
}
if (timers_state.cpu_ticks_prev > ticks) {
/* Note: non increasing ticks may happen if the host uses
software suspend */
timers_state.cpu_ticks_offset += timers_state.cpu_ticks_prev - ticks;
ticks = timers_state.cpu_ticks_prev;
}
timers_state.cpu_ticks_prev = ticks;
return ticks;
}
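
The rewritten `cpu_get_ticks()` never lets the returned value go backwards: if the computed value is below `cpu_ticks_prev`, the difference is folded into the offset and the previous value is returned. The sketch below isolates that clamp so it can be tried on its own. `TickState` and `ticks_read` are hypothetical names, and the raw counter is passed in as a parameter instead of being read from `cpu_get_real_ticks()`.

```c
#include <stdint.h>

/* Hypothetical reduced version of the tick-related TimersState fields. */
typedef struct {
    int64_t prev;    /* last value handed out */
    int64_t offset;  /* correction added to the raw counter */
    int enabled;     /* nonzero while the clock is running */
} TickState;

/* Return a non-decreasing tick value for the given raw counter reading. */
static int64_t ticks_read(TickState *s, int64_t raw)
{
    int64_t ticks = s->offset;

    if (s->enabled) {
        ticks += raw;
    }
    if (s->prev > ticks) {
        /* The raw counter went backwards (e.g. after a host suspend):
         * absorb the gap into the offset and repeat the previous value. */
        s->offset += s->prev - ticks;
        ticks = s->prev;
    }
    s->prev = ticks;
    return ticks;
}
```

For example, starting from a zeroed state with `enabled = 1`, raw readings of 100, 40, 50 yield 100, 100, 110: the drop from 100 to 40 adds 60 to the offset, and later readings continue from there.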
static int64_t cpu_get_clock_locked(void)
{
int64_t ticks;
ticks = timers_state.cpu_clock_offset;
if (timers_state.cpu_ticks_enabled) {
ticks += get_clock();
}
return ticks;
}
/* return the host CPU monotonic timer and handle stop/restart */
int64_t cpu_get_clock(void)
{
int64_t ti;
if (!timers_state.cpu_ticks_enabled) {
return timers_state.cpu_clock_offset;
} else {
ti = get_clock();
return ti + timers_state.cpu_clock_offset;
}
unsigned start;
do {
start = seqlock_read_begin(&timers_state.vm_clock_seqlock);
ti = cpu_get_clock_locked();
} while (seqlock_read_retry(&timers_state.vm_clock_seqlock, start));
return ti;
}
/* enable cpu_get_ticks() */
/* enable cpu_get_ticks()
* Caller must hold the BQL, which serves as the mutex for vm_clock_seqlock.
*/
void cpu_enable_ticks(void)
{
/* Here, what the seqlock really protects is cpu_clock_offset. */
seqlock_write_lock(&timers_state.vm_clock_seqlock);
if (!timers_state.cpu_ticks_enabled) {
timers_state.cpu_ticks_offset -= cpu_get_real_ticks();
timers_state.cpu_clock_offset -= get_clock();
timers_state.cpu_ticks_enabled = 1;
}
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
}
/* disable cpu_get_ticks() : the clock is stopped. You must not call
cpu_get_ticks() after that. */
* cpu_get_ticks() after that.
* Caller must hold the BQL, which serves as the mutex for vm_clock_seqlock.
*/
void cpu_disable_ticks(void)
{
/* Here, what the seqlock really protects is cpu_clock_offset. */
seqlock_write_lock(&timers_state.vm_clock_seqlock);
if (timers_state.cpu_ticks_enabled) {
timers_state.cpu_ticks_offset = cpu_get_ticks();
timers_state.cpu_clock_offset = cpu_get_clock();
timers_state.cpu_ticks_offset += cpu_get_real_ticks();
timers_state.cpu_clock_offset = cpu_get_clock_locked();
timers_state.cpu_ticks_enabled = 0;
}
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
}
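
The functions above follow a seqlock pattern: writers (serialized by the BQL) bracket their updates with `seqlock_write_lock()`/`seqlock_write_unlock()`, while readers loop on `seqlock_read_begin()`/`seqlock_read_retry()` until they see a consistent snapshot. The sketch below shows the general technique with C11 atomics. It is not QEMU's `qemu/seqlock.h` implementation: `ClockState`, `clock_write`, and `clock_read_offset` are hypothetical names, and sequentially consistent atomics are used throughout for simplicity where a tuned implementation would use explicit barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical state guarded by a sequence counter. */
typedef struct {
    atomic_uint seq;          /* odd while a write is in progress */
    _Atomic int64_t offset;   /* protected value */
    atomic_int enabled;       /* protected flag, updated together with offset */
} ClockState;

/* Writer side. Assumption: writers are serialized by an external mutex,
 * in the same way the diff relies on the BQL. */
static void clock_write(ClockState *s, int64_t offset, int enabled)
{
    atomic_fetch_add(&s->seq, 1);      /* sequence becomes odd */
    atomic_store(&s->offset, offset);
    atomic_store(&s->enabled, enabled);
    atomic_fetch_add(&s->seq, 1);      /* sequence becomes even again */
}

/* Reader side: retry until no write overlapped the read. */
static int64_t clock_read_offset(ClockState *s, int *enabled)
{
    unsigned start;
    int64_t offset;

    do {
        /* Wait for an even sequence number (no write in progress). */
        do {
            start = atomic_load(&s->seq);
        } while (start & 1);
        offset = atomic_load(&s->offset);
        *enabled = atomic_load(&s->enabled);
    } while (atomic_load(&s->seq) != start);

    return offset;
}
```

Readers never block a writer; they only repeat their read when a write intervened, which suits values such as `cpu_clock_offset` that are read often and written rarely.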
/* Correlation between real and virtual time is always going to be
@@ -196,13 +255,19 @@ static void icount_adjust(void)
int64_t cur_time;
int64_t cur_icount;
int64_t delta;
/* Protected by TimersState mutex. */
static int64_t last_delta;
/* If the VM is not running, then do nothing. */
if (!runstate_is_running()) {
return;
}
cur_time = cpu_get_clock();
cur_icount = qemu_get_clock_ns(vm_clock);
seqlock_write_lock(&timers_state.vm_clock_seqlock);
cur_time = cpu_get_clock_locked();
cur_icount = cpu_get_icount_locked();
delta = cur_icount - cur_time;
/* FIXME: This is a very crude algorithm, somewhat prone to oscillation. */
if (delta > 0
@@ -219,19 +284,21 @@ static void icount_adjust(void)
}
last_delta = delta;
qemu_icount_bias = cur_icount - (qemu_icount << icount_time_shift);
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
}
static void icount_adjust_rt(void *opaque)
{
qemu_mod_timer(icount_rt_timer,
qemu_get_clock_ms(rt_clock) + 1000);
timer_mod(icount_rt_timer,
qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + 1000);
icount_adjust();
}
static void icount_adjust_vm(void *opaque)
{
qemu_mod_timer(icount_vm_timer,
qemu_get_clock_ns(vm_clock) + get_ticks_per_sec() / 10);
timer_mod(icount_vm_timer,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
get_ticks_per_sec() / 10);
icount_adjust();
}
@@ -242,48 +309,59 @@ static int64_t qemu_icount_round(int64_t count)
static void icount_warp_rt(void *opaque)
{
if (vm_clock_warp_start == -1) {
/* The icount_warp_timer is rescheduled soon after vm_clock_warp_start
* changes from -1 to another value, so the race here is okay.
*/
if (atomic_read(&vm_clock_warp_start) == -1) {
return;
}
seqlock_write_lock(&timers_state.vm_clock_seqlock);
if (runstate_is_running()) {
int64_t clock = qemu_get_clock_ns(rt_clock);
int64_t warp_delta = clock - vm_clock_warp_start;
if (use_icount == 1) {
qemu_icount_bias += warp_delta;
} else {
int64_t clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
int64_t warp_delta;
warp_delta = clock - vm_clock_warp_start;
if (use_icount == 2) {
/*
* In adaptive mode, do not let the vm_clock run too
* In adaptive mode, do not let QEMU_CLOCK_VIRTUAL run too
* far ahead of real time.
*/
int64_t cur_time = cpu_get_clock();
int64_t cur_icount = qemu_get_clock_ns(vm_clock);
int64_t cur_time = cpu_get_clock_locked();
int64_t cur_icount = cpu_get_icount_locked();
int64_t delta = cur_time - cur_icount;
qemu_icount_bias += MIN(warp_delta, delta);
}
if (qemu_clock_expired(vm_clock)) {
qemu_notify_event();
warp_delta = MIN(warp_delta, delta);
}
qemu_icount_bias += warp_delta;
}
vm_clock_warp_start = -1;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) {
qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
}
}
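
In the new `icount_warp_rt()`, the bias always advances by `warp_delta`, and in adaptive mode (`use_icount == 2`) that delta is first capped by how far the instruction-count clock lags behind the CPU clock. The arithmetic can be sketched as a pure function. `warp_bias` and its parameters are hypothetical names, with the clock readings passed in as plain integers.

```c
#include <stdint.h>

static int64_t min_i64(int64_t a, int64_t b)
{
    return a < b ? a : b;
}

/* Compute the new icount bias after a warp.
 * mode 2 stands for the adaptive mode described in the diff;
 * any other nonzero mode applies the full real-time delta. */
static int64_t warp_bias(int64_t bias, int64_t rt_now, int64_t warp_start,
                         int mode, int64_t cur_time, int64_t cur_icount)
{
    int64_t warp_delta = rt_now - warp_start;

    if (mode == 2) {
        /* Do not let virtual time run further ahead than real time allows. */
        int64_t delta = cur_time - cur_icount;
        warp_delta = min_i64(warp_delta, delta);
    }
    return bias + warp_delta;
}
```

With a bias of 1000 and a real-time gap of 50, a non-adaptive call returns 1050, while an adaptive call where the CPU clock is only 20 ahead of the instruction-count clock returns 1020.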
void qtest_clock_warp(int64_t dest)
{
int64_t clock = qemu_get_clock_ns(vm_clock);
int64_t clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
assert(qtest_enabled());
while (clock < dest) {
int64_t deadline = qemu_clock_deadline(vm_clock);
int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
int64_t warp = MIN(dest - clock, deadline);
seqlock_write_lock(&timers_state.vm_clock_seqlock);
qemu_icount_bias += warp;
qemu_run_timers(vm_clock);
clock = qemu_get_clock_ns(vm_clock);
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
}
qemu_notify_event();
qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
}
void qemu_clock_warp(QEMUClock *clock)
void qemu_clock_warp(QEMUClockType type)
{
int64_t clock;
int64_t deadline;
/*
@@ -291,20 +369,20 @@ void qemu_clock_warp(QEMUClock *clock)
* applicable to other clocks. But a clock argument removes the
* need for if statements all over the place.
*/
if (clock != vm_clock || !use_icount) {
if (type != QEMU_CLOCK_VIRTUAL || !use_icount) {
return;
}
/*
* If the CPUs have been sleeping, advance the vm_clock timer now. This
* ensures that the deadline for the timer is computed correctly below.
* If the CPUs have been sleeping, advance QEMU_CLOCK_VIRTUAL timer now.
* This ensures that the deadline for the timer is computed correctly below.
* This also makes sure that the insn counter is synchronized before the
* CPU starts running, in case the CPU is woken by an event other than
* the earliest vm_clock timer.
* the earliest QEMU_CLOCK_VIRTUAL timer.
*/
icount_warp_rt(NULL);
if (!all_cpu_threads_idle() || !qemu_clock_has_timers(vm_clock)) {
qemu_del_timer(icount_warp_timer);
timer_del(icount_warp_timer);
if (!all_cpu_threads_idle()) {
return;
}
@@ -313,28 +391,39 @@ void qemu_clock_warp(QEMUClock *clock)
return;
}
vm_clock_warp_start = qemu_get_clock_ns(rt_clock);
deadline = qemu_clock_deadline(vm_clock);
/* We want to use the earliest deadline from ALL vm_clocks */
clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
if (deadline < 0) {
return;
}
if (deadline > 0) {
/*
* Ensure the vm_clock proceeds even when the virtual CPU goes to
* Ensure QEMU_CLOCK_VIRTUAL proceeds even when the virtual CPU goes to
* sleep. Otherwise, the CPU might be waiting for a future timer
* interrupt to wake it up, but the interrupt never comes because
* the vCPU isn't running any insns and thus doesn't advance the
* vm_clock.
* QEMU_CLOCK_VIRTUAL.
*
* An extreme solution for this problem would be to never let VCPUs
* sleep in icount mode if there is a pending vm_clock timer; rather
* time could just advance to the next vm_clock event. Instead, we
* do stop VCPUs and only advance vm_clock after some "real" time,
* (related to the time left until the next event) has passed. This
* rt_clock timer will do this. This avoids that the warps are too
* visible externally---for example, you will not be sending network
* packets continuously instead of every 100ms.
* sleep in icount mode if there is a pending QEMU_CLOCK_VIRTUAL
* timer; rather time could just advance to the next QEMU_CLOCK_VIRTUAL
* event. Instead, we do stop VCPUs and only advance QEMU_CLOCK_VIRTUAL
* after some "real" time, (related to the time left until the next
* event) has passed. The QEMU_CLOCK_REALTIME timer will do this.
* This avoids that the warps are visible externally; for example,
* you will not be sending network packets continuously instead of
* every 100ms.
*/
qemu_mod_timer(icount_warp_timer, vm_clock_warp_start + deadline);
} else {
qemu_notify_event();
seqlock_write_lock(&timers_state.vm_clock_seqlock);
if (vm_clock_warp_start == -1 || vm_clock_warp_start > clock) {
vm_clock_warp_start = clock;
}
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
timer_mod_anticipate(icount_warp_timer, clock + deadline);
} else if (deadline == 0) {
qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
}
}
@@ -353,12 +442,14 @@ static const VMStateDescription vmstate_timers = {
void configure_icount(const char *option)
{
seqlock_init(&timers_state.vm_clock_seqlock, NULL);
vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
if (!option) {
return;
}
icount_warp_timer = qemu_new_timer_ns(rt_clock, icount_warp_rt, NULL);
icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
icount_warp_rt, NULL);
if (strcmp(option, "auto") != 0) {
icount_time_shift = strtol(option, NULL, 0);
use_icount = 1;
@@ -376,12 +467,15 @@ void configure_icount(const char *option)
the virtual time trigger catches emulated time passing too fast.
Realtime triggers occur even when idle, so use them less frequently
than VM triggers. */
icount_rt_timer = qemu_new_timer_ms(rt_clock, icount_adjust_rt, NULL);
qemu_mod_timer(icount_rt_timer,
qemu_get_clock_ms(rt_clock) + 1000);
icount_vm_timer = qemu_new_timer_ns(vm_clock, icount_adjust_vm, NULL);
qemu_mod_timer(icount_vm_timer,
qemu_get_clock_ns(vm_clock) + get_ticks_per_sec() / 10);
icount_rt_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
icount_adjust_rt, NULL);
timer_mod(icount_rt_timer,
qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + 1000);
icount_vm_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
icount_adjust_vm, NULL);
timer_mod(icount_vm_timer,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
get_ticks_per_sec() / 10);
}
/***********************************************************/
@@ -394,7 +488,7 @@ void hw_error(const char *fmt, ...)
fprintf(stderr, "qemu: hardware error: ");
vfprintf(stderr, fmt, ap);
fprintf(stderr, "\n");
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
fprintf(stderr, "CPU #%d:\n", cpu->cpu_index);
cpu_dump_state(cpu, stderr, fprintf, CPU_DUMP_FPU);
}
@@ -406,7 +500,7 @@ void cpu_synchronize_all_states(void)
{
CPUState *cpu;
for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
cpu_synchronize_state(cpu);
}
}
@@ -415,7 +509,7 @@ void cpu_synchronize_all_post_reset(void)
{
CPUState *cpu;
for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
cpu_synchronize_post_reset(cpu);
}
}
@@ -424,16 +518,11 @@ void cpu_synchronize_all_post_init(void)
{
CPUState *cpu;
for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
cpu_synchronize_post_init(cpu);
}
}
bool cpu_is_stopped(CPUState *cpu)
{
return !runstate_is_running() || cpu->stopped;
}
static int do_vm_stop(RunState state)
{
int ret = 0;
@@ -457,7 +546,7 @@ static bool cpu_can_run(CPUState *cpu)
if (cpu->stop) {
return false;
}
if (cpu->stopped || !runstate_is_running()) {
if (cpu_is_stopped(cpu)) {
return false;
}
return true;
@@ -735,7 +824,7 @@ static void qemu_tcg_wait_io_event(void)
while (all_cpu_threads_idle()) {
/* Start accounting real time to the virtual clock if the CPUs
are idle. */
qemu_clock_warp(vm_clock);
qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
qemu_cond_wait(tcg_halt_cond, &qemu_global_mutex);
}
@@ -743,7 +832,7 @@ static void qemu_tcg_wait_io_event(void)
qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
}
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
qemu_wait_io_event_common(cpu);
}
}
@@ -837,12 +926,6 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
static void tcg_exec_all(void);
static void tcg_signal_cpu_creation(CPUState *cpu, void *data)
{
cpu->thread_id = qemu_get_thread_id();
cpu->created = true;
}
static void *qemu_tcg_cpu_thread_fn(void *arg)
{
CPUState *cpu = arg;
@@ -851,23 +934,31 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
qemu_thread_get_self(cpu->thread);
qemu_mutex_lock(&qemu_global_mutex);
qemu_for_each_cpu(tcg_signal_cpu_creation, NULL);
CPU_FOREACH(cpu) {
cpu->thread_id = qemu_get_thread_id();
cpu->created = true;
}
qemu_cond_signal(&qemu_cpu_cond);
/* wait for initial kick-off after machine start */
while (first_cpu->stopped) {
while (QTAILQ_FIRST(&cpus)->stopped) {
qemu_cond_wait(tcg_halt_cond, &qemu_global_mutex);
/* process any pending work */
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
qemu_wait_io_event_common(cpu);
}
}
while (1) {
tcg_exec_all();
if (use_icount && qemu_clock_deadline(vm_clock) <= 0) {
qemu_notify_event();
if (use_icount) {
int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
if (deadline == 0) {
qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
}
}
qemu_tcg_wait_io_event();
}
@@ -969,13 +1060,12 @@ void qemu_mutex_unlock_iothread(void)
static int all_vcpus_paused(void)
{
CPUState *cpu = first_cpu;
CPUState *cpu;
while (cpu) {
CPU_FOREACH(cpu) {
if (!cpu->stopped) {
return 0;
}
cpu = cpu->next_cpu;
}
return 1;
@@ -983,23 +1073,20 @@ static int all_vcpus_paused(void)
void pause_all_vcpus(void)
{
CPUState *cpu = first_cpu;
CPUState *cpu;
qemu_clock_enable(vm_clock, false);
while (cpu) {
qemu_clock_enable(QEMU_CLOCK_VIRTUAL, false);
CPU_FOREACH(cpu) {
cpu->stop = true;
qemu_cpu_kick(cpu);
cpu = cpu->next_cpu;
}
if (qemu_in_vcpu_thread()) {
cpu_stop_current();
if (!kvm_enabled()) {
cpu = first_cpu;
while (cpu) {
CPU_FOREACH(cpu) {
cpu->stop = false;
cpu->stopped = true;
cpu = cpu->next_cpu;
}
return;
}
@@ -1007,10 +1094,8 @@ void pause_all_vcpus(void)
while (!all_vcpus_paused()) {
qemu_cond_wait(&qemu_pause_cond, &qemu_global_mutex);
cpu = first_cpu;
while (cpu) {
CPU_FOREACH(cpu) {
qemu_cpu_kick(cpu);
cpu = cpu->next_cpu;
}
}
}
@@ -1024,12 +1109,11 @@ void cpu_resume(CPUState *cpu)
void resume_all_vcpus(void)
{
CPUState *cpu = first_cpu;
CPUState *cpu;
qemu_clock_enable(vm_clock, true);
while (cpu) {
qemu_clock_enable(QEMU_CLOCK_VIRTUAL, true);
CPU_FOREACH(cpu) {
cpu_resume(cpu);
cpu = cpu->next_cpu;
}
}
@@ -1145,11 +1229,23 @@ static int tcg_cpu_exec(CPUArchState *env)
#endif
if (use_icount) {
int64_t count;
int64_t deadline;
int decr;
qemu_icount -= (env->icount_decr.u16.low + env->icount_extra);
env->icount_decr.u16.low = 0;
env->icount_extra = 0;
count = qemu_icount_round(qemu_clock_deadline(vm_clock));
deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
/* Maintain prior (possibly buggy) behaviour where if no deadline
* was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than
* INT32_MAX nanoseconds ahead, we still use INT32_MAX
* nanoseconds.
*/
if ((deadline < 0) || (deadline > INT32_MAX)) {
deadline = INT32_MAX;
}
count = qemu_icount_round(deadline);
qemu_icount += count;
decr = (count > 0xffff) ? 0xffff : count;
count -= decr;
@@ -1175,17 +1271,17 @@ static void tcg_exec_all(void)
{
int r;
/* Account partial waits to the vm_clock. */
qemu_clock_warp(vm_clock);
/* Account partial waits to QEMU_CLOCK_VIRTUAL. */
qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
if (next_cpu == NULL) {
next_cpu = first_cpu;
}
for (; next_cpu != NULL && !exit_request; next_cpu = next_cpu->next_cpu) {
for (; next_cpu != NULL && !exit_request; next_cpu = CPU_NEXT(next_cpu)) {
CPUState *cpu = next_cpu;
CPUArchState *env = cpu->env_ptr;
qemu_clock_enable(vm_clock,
qemu_clock_enable(QEMU_CLOCK_VIRTUAL,
(cpu->singlestep_enabled & SSTEP_NOTIMER) == 0);
if (cpu_can_run(cpu)) {
@@ -1206,7 +1302,7 @@ void set_numa_modes(void)
CPUState *cpu;
int i;
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
for (i = 0; i < nb_numa_nodes; i++) {
if (test_bit(cpu->cpu_index, node_cpumask[i])) {
cpu->numa_node = i;
@@ -1228,7 +1324,7 @@ CpuInfoList *qmp_query_cpus(Error **errp)
CpuInfoList *head = NULL, *cur_item = NULL;
CPUState *cpu;
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
CpuInfoList *info;
#if defined(TARGET_I386)
X86CPU *x86_cpu = X86_CPU(cpu);
@@ -1309,7 +1405,10 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename,
l = sizeof(buf);
if (l > size)
l = size;
cpu_memory_rw_debug(cpu, addr, buf, l, 0);
if (cpu_memory_rw_debug(cpu, addr, buf, l, 0) != 0) {
error_setg(errp, "Invalid addr 0x%016" PRIx64 " specified", addr);
goto exit;
}
if (fwrite(buf, 1, l, f) != l) {
error_set(errp, QERR_IO_ERROR);
goto exit;
@@ -1357,7 +1456,7 @@ void qmp_inject_nmi(Error **errp)
#if defined(TARGET_I386)
CPUState *cs;
for (cs = first_cpu; cs != NULL; cs = cs->next_cpu) {
CPU_FOREACH(cs) {
X86CPU *cpu = X86_CPU(cs);
CPUX86State *env = &cpu->env;
@@ -1367,6 +1466,20 @@ void qmp_inject_nmi(Error **errp)
apic_deliver_nmi(env->apic_state);
}
}
#elif defined(TARGET_S390X)
CPUState *cs;
S390CPU *cpu;
CPU_FOREACH(cs) {
cpu = S390_CPU(cs);
if (cpu->env.cpu_num == monitor_get_cpu_index()) {
if (s390_cpu_restart(S390_CPU(cs)) == -1) {
error_set(errp, QERR_UNSUPPORTED);
return;
}
break;
}
}
#else
error_set(errp, QERR_UNSUPPORTED);
#endif


@@ -169,27 +169,12 @@ static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
return ram_addr;
}
static inline void tlb_update_dirty(CPUTLBEntry *tlb_entry)
{
ram_addr_t ram_addr;
void *p;
if (tlb_is_dirty_ram(tlb_entry)) {
p = (void *)(uintptr_t)((tlb_entry->addr_write & TARGET_PAGE_MASK)
+ tlb_entry->addend);
ram_addr = qemu_ram_addr_from_host_nofail(p);
if (!cpu_physical_memory_is_dirty(ram_addr)) {
tlb_entry->addr_write |= TLB_NOTDIRTY;
}
}
}
void cpu_tlb_reset_dirty_all(ram_addr_t start1, ram_addr_t length)
{
CPUState *cpu;
CPUArchState *env;
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
int mmu_idx;
env = cpu->env_ptr;


@@ -1,3 +1 @@
# Default configuration for arm-linux-user
CONFIG_GDBSTUB_XML=y


@@ -2,7 +2,6 @@
include pci.mak
include usb.mak
CONFIG_GDBSTUB_XML=y
CONFIG_VGA=y
CONFIG_ISA_MMIO=y
CONFIG_NAND=y
@@ -34,9 +33,9 @@ CONFIG_PFLASH_CFI02=y
CONFIG_MICRODRIVE=y
CONFIG_USB_MUSB=y
CONFIG_ARM9MPCORE=y
CONFIG_ARM11MPCORE=y
CONFIG_ARM15MPCORE=y
CONFIG_A9MPCORE=y
CONFIG_A15MPCORE=y
CONFIG_ARM_GIC=y
CONFIG_ARM_GIC_KVM=$(CONFIG_KVM)
@@ -62,6 +61,7 @@ CONFIG_BITBANG_I2C=y
CONFIG_FRAMEBUFFER=y
CONFIG_XILINX_SPIPS=y
CONFIG_ARM11SCU=y
CONFIG_A9SCU=y
CONFIG_MARVELL_88W8618=y
CONFIG_OMAP=y
@@ -80,3 +80,4 @@ CONFIG_VERSATILE_PCI=y
CONFIG_VERSATILE_I2C=y
CONFIG_SDHCI=y
CONFIG_INTEGRATOR_DEBUG=y


@@ -1,3 +1 @@
# Default configuration for armeb-linux-user
CONFIG_GDBSTUB_XML=y


@@ -1,3 +1 @@
# Default configuration for m68k-linux-user
CONFIG_GDBSTUB_XML=y


@@ -3,5 +3,4 @@
include pci.mak
include usb.mak
CONFIG_COLDFIRE=y
CONFIG_GDBSTUB_XML=y
CONFIG_PTIMER=y


@@ -1,3 +1 @@
# Default configuration for ppc-linux-user
CONFIG_GDBSTUB_XML=y


@@ -3,7 +3,6 @@
include pci.mak
include sound.mak
include usb.mak
CONFIG_GDBSTUB_XML=y
CONFIG_ISA_MMIO=y
CONFIG_ESCC=y
CONFIG_M48T59=y


@@ -1,3 +1 @@
# Default configuration for ppc64-linux-user
CONFIG_GDBSTUB_XML=y


@@ -3,7 +3,6 @@
include pci.mak
include sound.mak
include usb.mak
CONFIG_GDBSTUB_XML=y
CONFIG_ISA_MMIO=y
CONFIG_ESCC=y
CONFIG_M48T59=y
@@ -47,6 +46,7 @@ CONFIG_E500=y
CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
# For pSeries
CONFIG_XICS=$(CONFIG_PSERIES)
CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
# For PReP
CONFIG_I82378=y
CONFIG_I8259=y


@@ -1,3 +1 @@
# Default configuration for ppc64abi32-linux-user
CONFIG_GDBSTUB_XML=y


@@ -3,7 +3,6 @@
include pci.mak
include sound.mak
include usb.mak
CONFIG_GDBSTUB_XML=y
CONFIG_ISA_MMIO=y
CONFIG_ESCC=y
CONFIG_M48T59=y

disas.c

@@ -158,6 +158,35 @@ print_insn_thumb1(bfd_vma pc, disassemble_info *info)
}
#endif
static int print_insn_objdump(bfd_vma pc, disassemble_info *info,
const char *prefix)
{
int i, n = info->buffer_length;
uint8_t *buf = g_malloc(n);
info->read_memory_func(pc, buf, n, info);
for (i = 0; i < n; ++i) {
if (i % 32 == 0) {
info->fprintf_func(info->stream, "\n%s: ", prefix);
}
info->fprintf_func(info->stream, "%02x", buf[i]);
}
g_free(buf);
return n;
}
static int print_insn_od_host(bfd_vma pc, disassemble_info *info)
{
return print_insn_objdump(pc, info, "OBJD-H");
}
static int print_insn_od_target(bfd_vma pc, disassemble_info *info)
{
return print_insn_objdump(pc, info, "OBJD-T");
}
/* Disassemble this for me please... (debugging). 'flags' has the following
values:
i386 - 1 means 16 bit code, 2 means 64 bit code
@@ -171,7 +200,7 @@ void target_disas(FILE *out, CPUArchState *env, target_ulong code,
target_ulong pc;
int count;
CPUDebug s;
int (*print_insn)(bfd_vma pc, disassemble_info *info);
int (*print_insn)(bfd_vma pc, disassemble_info *info) = NULL;
INIT_DISASSEMBLE_INFO(s.info, out, fprintf);
@@ -263,11 +292,10 @@ void target_disas(FILE *out, CPUArchState *env, target_ulong code,
#elif defined(TARGET_LM32)
s.info.mach = bfd_mach_lm32;
print_insn = print_insn_lm32;
#else
fprintf(out, "0x" TARGET_FMT_lx
": Asm output not supported on this arch\n", code);
return;
#endif
if (print_insn == NULL) {
print_insn = print_insn_od_target;
}
for (pc = code; size > 0; pc += count, size -= count) {
fprintf(out, "0x" TARGET_FMT_lx ": ", pc);
@@ -303,7 +331,7 @@ void disas(FILE *out, void *code, unsigned long size)
uintptr_t pc;
int count;
CPUDebug s;
int (*print_insn)(bfd_vma pc, disassemble_info *info);
int (*print_insn)(bfd_vma pc, disassemble_info *info) = NULL;
INIT_DISASSEMBLE_INFO(s.info, out, fprintf);
s.info.print_address_func = generic_print_host_address;
@@ -347,11 +375,10 @@ void disas(FILE *out, void *code, unsigned long size)
print_insn = print_insn_hppa;
#elif defined(__ia64__)
print_insn = print_insn_ia64;
#else
fprintf(out, "0x%lx: Asm output not supported on this arch\n",
(long) code);
return;
#endif
if (print_insn == NULL) {
print_insn = print_insn_od_host;
}
for (pc = (uintptr_t)code; size > 0; pc += count, size -= count) {
fprintf(out, "0x%08" PRIxPTR ": ", pc);
count = print_insn(pc, &s.info);


@@ -5157,7 +5157,8 @@ int
print_insn_ppc (bfd_vma memaddr, struct disassemble_info *info)
{
int dialect = (char *) info->private_data - (char *) 0;
return print_insn_powerpc (memaddr, info, 1, dialect);
return print_insn_powerpc (memaddr, info, info->endian == BFD_ENDIAN_BIG,
dialect);
}
/* Print a big endian PowerPC instruction. */


@@ -11,6 +11,7 @@
#include "trace.h"
#include "qemu/range.h"
#include "qemu/thread.h"
#include "qemu/main-loop.h"
/* #define DEBUG_IOMMU */


@@ -52,7 +52,7 @@ Configuring and building:
Assuming you have a working smartcard on the host with the current
user, using NSS, qemu acts as another NSS client using ccid-card-emulated:
qemu -usb -device usb-ccid -device ccid-card-emualated
qemu -usb -device usb-ccid -device ccid-card-emulated
4. Using ccid-card-emulated with certificates


@@ -52,6 +52,15 @@ MemoryRegion):
hole". Aliases may point to any type of region, including other aliases,
but an alias may not point back to itself, directly or indirectly.
It is valid to add subregions to a region which is not a pure container
(that is, to an MMIO, RAM or ROM region). This means that the region
will act like a container, except that any addresses within the container's
region which are not claimed by any subregion are handled by the
container itself (ie by its MMIO callbacks or RAM backing). However
it is generally possible to achieve the same effect with a pure container
one of whose subregions is a low priority "background" region covering
the whole address range; this is often clearer and is preferred.
Subregions cannot be added to an alias region.
Region names
------------
@@ -80,6 +89,53 @@ guest. This is done with memory_region_add_subregion_overlap(), which
allows the region to overlap any other region in the same container, and
specifies a priority that allows the core to decide which of two regions at
the same address are visible (highest wins).
Priority values are signed, and the default value is zero. This means that
you can use memory_region_add_subregion_overlap() both to specify a region
that must sit 'above' any others (with a positive priority) and also a
background region that sits 'below' others (with a negative priority).
If the higher priority region in an overlap is a container or alias, then
the lower priority region will appear in any "holes" that the higher priority
region has left by not mapping subregions to that area of its address range.
(This applies recursively -- if the subregions are themselves containers or
aliases that leave holes then the lower priority region will appear in these
holes too.)
For example, suppose we have a container A of size 0x8000 with two subregions
B and C. B is a container mapped at 0x2000, size 0x4000, priority 2; C is
an MMIO region mapped at 0x0, size 0x6000, priority 1. B currently has two
of its own subregions: D of size 0x1000 at offset 0 and E of size 0x1000 at
offset 0x2000. As a diagram:
        0      1000   2000   3000   4000   5000   6000   7000   8000
        |------|------|------|------|------|------|------|------|
  A:    [                                                       ]
  C:    [CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC]
  B:                  [                           ]
  D:                  [DDDDD]
  E:                                [EEEEE]
The regions that will be seen within this address range then are:
[CCCCCCCCCCCC][DDDDD][CCCCC][EEEEE][CCCCC]
Since B has higher priority than C, its subregions appear in the flat map
even where they overlap with C. In ranges where B has not mapped anything
C's region appears.
If B had provided its own MMIO operations (ie it was not a pure container)
then these would be used for any addresses in its range not handled by
D or E, and the result would be:
[CCCCCCCCCCCC][DDDDD][BBBBB][EEEEE][BBBBB]
Priority values are local to a container, because the priorities of two
regions are only compared when they are both children of the same container.
This means that the device in charge of the container (typically modelling
a bus or a memory controller) can use them to manage the interaction of
its child regions without any side effects on other parts of the system.
In the example above, the priorities of D and E are unimportant because
they do not overlap each other. It is the relative priority of B and C
that causes D and E to appear on top of C: D and E's priorities are never
compared against the priority of C.
Visibility
----------
@@ -90,11 +146,19 @@ guest accesses an address:
descending priority order
- if the address lies outside the region offset/size, the subregion is
discarded
- if the subregion is a leaf (RAM or MMIO), the search terminates
- if the subregion is a leaf (RAM or MMIO), the search terminates, returning
this leaf region
- if the subregion is a container, the same algorithm is used within the
subregion (after the address is adjusted by the subregion offset)
- if the subregion is an alias, the search is continues at the alias target
- if the subregion is an alias, the search is continued at the alias target
(after the address is adjusted by the subregion offset and alias offset)
- if a recursive search within a container or alias subregion does not
find a match (because of a "hole" in the container's coverage of its
address range), then if this is a container with its own MMIO or RAM
backing the search terminates, returning the container itself. Otherwise
we continue with the next subregion in priority order
- if none of the subregions match the address then the search terminates
with no match found
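
The lookup rules above can be sketched in C. This is a minimal illustration under stated assumptions, not QEMU's implementation: the Region struct, region_lookup() and the reg_* objects are hypothetical names, aliases are left out, and each subs array is assumed to be already sorted by descending priority. The data models the A/B/C/D/E example, with B as the higher-priority sibling of C, as the explanatory text describes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified region model (not QEMU's MemoryRegion). */
typedef struct Region Region;
struct Region {
    const char *name;
    uint64_t offset;            /* offset inside the parent container */
    uint64_t size;
    int priority;               /* higher wins among siblings */
    int has_backing;            /* 1: leaf, or container with own MMIO/RAM */
    const Region *const *subs;  /* assumed sorted by descending priority */
    size_t nsubs;
};

/* addr is relative to the start of r; returns the region that handles it,
 * or NULL if nothing claims the address. */
static const Region *region_lookup(const Region *r, uint64_t addr)
{
    size_t i;

    for (i = 0; i < r->nsubs; i++) {
        const Region *s = r->subs[i];
        const Region *hit;

        if (addr < s->offset || addr - s->offset >= s->size) {
            continue;           /* outside this subregion: discard it */
        }
        hit = region_lookup(s, addr - s->offset);
        if (hit != NULL) {
            return hit;
        }
        /* a hole in a pure container: carry on with the next sibling */
    }
    /* No subregion matched: r answers itself only if it has backing. */
    return r->has_backing ? r : NULL;
}

/* The example layout: B (priority 2) sits above C (priority 1) inside A. */
static const Region reg_D = { "D", 0x0000, 0x1000, 0, 1, NULL, 0 };
static const Region reg_E = { "E", 0x2000, 0x1000, 0, 1, NULL, 0 };
static const Region *const reg_B_subs[] = { &reg_D, &reg_E };
static const Region reg_B = { "B", 0x2000, 0x4000, 2, 0, reg_B_subs, 2 };
static const Region reg_C = { "C", 0x0000, 0x6000, 1, 1, NULL, 0 };
static const Region *const reg_A_subs[] = { &reg_B, &reg_C };
static const Region reg_A = { "A", 0x0000, 0x8000, 0, 0, reg_A_subs, 2 };
```

With these definitions, region_lookup(&reg_A, 0x3800) falls through the hole in B between D and E and resolves to C. Defining reg_B with has_backing set to 1 instead would make the same address resolve to B, which corresponds to the second flat map in the priority example.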
Example memory map
------------------


@@ -91,6 +91,29 @@
port = "4"
chassis = "4"
##
# Example PCIe switch with two downstream ports
#
#[device "pcie-switch-upstream-port-1"]
# driver = "x3130-upstream"
# bus = "ich9-pcie-port-4"
# addr = "00.0"
#
#[device "pcie-switch-downstream-port-1-1"]
# driver = "xio3130-downstream"
# multifunction = "on"
# bus = "pcie-switch-upstream-port-1"
# addr = "00.0"
# port = "1"
# chassis = "5"
#
#[device "pcie-switch-downstream-port-1-2"]
# driver = "xio3130-downstream"
# multifunction = "on"
# bus = "pcie-switch-upstream-port-1"
# addr = "00.1"
# port = "1"
# chassis = "6"
[device "ich9-ehci-1"]
driver = "ich9-usb-ehci1"


@@ -53,6 +53,23 @@ The use of '*' as a prefix to the name means the member is optional. Optional
members should always be added to the end of the dictionary to preserve
backwards compatibility.
A complex type definition can specify another complex type as its base.
In this case, the fields of the base type are included as top-level fields
of the new complex type's dictionary in the QMP wire format. An example
definition is:
{ 'type': 'BlockdevOptionsGenericFormat', 'data': { 'file': 'str' } }
{ 'type': 'BlockdevOptionsGenericCOWFormat',
'base': 'BlockdevOptionsGenericFormat',
'data': { '*backing': 'str' } }
An example BlockdevOptionsGenericCOWFormat object on the wire could use
both fields like this:
{ "file": "/some/place/my-image",
"backing": "/some/place/my-backing-file" }
=== Enumeration types ===
An enumeration type is a dictionary containing a single key whose value is a
@@ -147,7 +164,7 @@ This example allows using both of the following example objects:
{ "file": "my_existing_block_device_id" }
{ "file": { "driver": "file",
"readonly": false,
'filename': "/tmp/mydisk.qcow2" } }
"filename": "/tmp/mydisk.qcow2" } }
=== Commands ===

docs/qmp/README

@@ -0,0 +1,87 @@
QEMU Machine Protocol
=====================
Introduction
------------
The QEMU Machine Protocol (QMP) allows applications to operate a
QEMU instance.
QMP is JSON[1] based and features the following:
- Lightweight, text-based, easy to parse data format
- Asynchronous messages support (i.e. events)
- Capabilities Negotiation
For detailed information on QMP's usage, please refer to the following files:
o qmp-spec.txt QEMU Machine Protocol current specification
o qmp-commands.txt QMP supported commands (auto-generated at build-time)
o qmp-events.txt List of available asynchronous events
[1] http://www.json.org
Usage
-----
You can use the -qmp option to enable QMP. For example, the following
makes QMP available on localhost port 4444:
$ qemu [...] -qmp tcp:localhost:4444,server,nowait
However, for more flexibility and to make use of more options, the -mon
command-line option should be used. For instance, the following example
creates one HMP instance (human monitor) on stdio and one QMP instance
on localhost port 4444:
$ qemu [...] -chardev stdio,id=mon0 -mon chardev=mon0,mode=readline \
-chardev socket,id=mon1,host=localhost,port=4444,server,nowait \
-mon chardev=mon1,mode=control,pretty=on
Please refer to QEMU's manpage for more information.
Simple Testing
--------------
To manually test QMP one can connect with telnet and issue commands by hand:
$ telnet localhost 4444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
{
"QMP": {
"version": {
"qemu": {
"micro": 50,
"minor": 6,
"major": 1
},
"package": ""
},
"capabilities": [
]
}
}
{ "execute": "qmp_capabilities" }
{
"return": {
}
}
{ "execute": "query-status" }
{
"return": {
"status": "prelaunch",
"singlestep": false,
"running": false
}
}
Please refer to the qapi-schema.json file for a complete command reference.
QMP wiki page
-------------
http://wiki.qemu-project.org/QMP


@@ -1,4 +1,4 @@
QEMU Monitor Protocol Events
QEMU Machine Protocol Events
============================
BALLOON_CHANGE
@@ -18,6 +18,28 @@ Example:
"data": { "actual": 944766976 },
"timestamp": { "seconds": 1267020223, "microseconds": 435656 } }
BLOCK_IMAGE_CORRUPTED
---------------------
Emitted when a disk image is being marked corrupt.
Data:
- "device": Device name (json-string)
- "msg": Informative message (e.g., reason for the corruption) (json-string)
- "offset": If the corruption resulted from an image access, this is the access
offset into the image (json-int)
- "size": If the corruption resulted from an image access, this is the access
size (json-int)
Example:
{ "event": "BLOCK_IMAGE_CORRUPTED",
"data": { "device": "ide0-hd0",
"msg": "Prevented active L1 table overwrite", "offset": 196608,
"size": 65536 },
"timestamp": { "seconds": 1378126126, "microseconds": 966463 } }
BLOCK_IO_ERROR
--------------
@@ -137,7 +159,7 @@ Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
event.
DEVICE_DELETED
-----------------
--------------
Emitted whenever the device removal completion is acknowledged
by the guest.
@@ -172,8 +194,22 @@ Data:
},
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
GUEST_PANICKED
--------------
Emitted when guest OS panic is detected.
Data:
- "action": Action that has been taken (json-string, currently always "pause").
Example:
{ "event": "GUEST_PANICKED",
"data": { "action": "pause" } }
NIC_RX_FILTER_CHANGED
-----------------
---------------------
The event is emitted once until the query command is executed,
the first event will always be emitted.
@@ -464,17 +500,3 @@ Example:
Note: If action is "reset", "shutdown", or "pause" the WATCHDOG event is
followed respectively by the RESET, SHUTDOWN, or STOP events.
GUEST_PANICKED
--------------
Emitted when guest OS panic is detected.
Data:
- "action": Action that has been taken (json-string, currently always "pause").
Example:
{ "event": "GUEST_PANICKED",
"data": { "action": "pause" } }


@@ -1,21 +1,17 @@
QEMU Monitor Protocol Specification - Version 0.1
QEMU Machine Protocol Specification
1. Introduction
===============
This document specifies the QEMU Monitor Protocol (QMP), a JSON-based protocol
which is available for applications to control QEMU at the machine-level.
To enable QMP support, QEMU has to be run in "control mode". This is done by
starting QEMU with the appropriate command-line options. Please, refer to the
QEMU manual page for more information.
This document specifies the QEMU Machine Protocol (QMP), a JSON-based protocol
which is available for applications to operate QEMU at the machine-level.
2. Protocol Specification
=========================
This section details the protocol format. For the purpose of this document
"Client" is any application which is communicating with QEMU in control mode,
and "Server" is QEMU itself.
"Client" is any application which is using QMP to communicate with QEMU and
"Server" is QEMU itself.
JSON data structures, when mentioned in this document, are always in the
following format:
@@ -47,14 +43,14 @@ that the connection has been successfully established and that the Server is
ready for capabilities negotiation (for more information refer to section
'4. Capabilities Negotiation').
The format is:
The greeting message format is:
{ "QMP": { "version": json-object, "capabilities": json-array } }
Where,
- The "version" member contains the Server's version information (the format
is the same of the 'query-version' command)
is the same of the query-version command)
- The "capabilities" member specifies the availability of features beyond the
baseline specification
@@ -83,10 +79,7 @@ of a command execution: success or error.
2.4.1 success
-------------
The success response is issued when the command execution has finished
without errors.
The format is:
The format of a success response is:
{ "return": json-object, "id": json-value }
@@ -96,15 +89,12 @@ The format is:
in a per-command basis or an empty json-object if the command does not
return data
- The "id" member contains the transaction identification associated
with the command execution (if issued by the Client)
with the command execution if issued by the Client
2.4.2 error
-----------
The error response is issued when the command execution could not be
completed because of an error condition.
The format is:
The format of an error response is:
{ "error": { "class": json-string, "desc": json-string }, "id": json-value }
@@ -114,7 +104,7 @@ The format is:
- The "desc" member is a human-readable error message. Clients should
not attempt to parse this message.
- The "id" member contains the transaction identification associated with
the command execution (if issued by the Client)
the command execution if issued by the Client
NOTE: Some errors can occur before the Server is able to read the "id" member,
in these cases the "id" member will not be part of the error response, even
@@ -124,9 +114,9 @@ if provided by the client.
-----------------------
As a result of state changes, the Server may send messages unilaterally
to the Client at any time. They are called 'asynchronous events'.
to the Client at any time. They are called "asynchronous events".
The format is:
The format of asynchronous events is:
{ "event": json-string, "data": json-object,
"timestamp": { "seconds": json-number, "microseconds": json-number } }
@@ -147,36 +137,37 @@ qmp-events.txt file.
===============
This section provides some examples of real QMP usage, in all of them
'C' stands for 'Client' and 'S' stands for 'Server'.
"C" stands for "Client" and "S" stands for "Server".
3.1 Server greeting
-------------------
S: {"QMP": {"version": {"qemu": "0.12.50", "package": ""}, "capabilities": []}}
S: { "QMP": { "version": { "qemu": { "micro": 50, "minor": 6, "major": 1 },
"package": ""}, "capabilities": []}}
3.2 Simple 'stop' execution
---------------------------
C: { "execute": "stop" }
S: {"return": {}}
S: { "return": {} }
3.3 KVM information
-------------------
C: { "execute": "query-kvm", "id": "example" }
S: {"return": {"enabled": true, "present": true}, "id": "example"}
S: { "return": { "enabled": true, "present": true }, "id": "example"}
3.4 Parsing error
------------------
C: { "execute": }
S: {"error": {"class": "GenericError", "desc": "Invalid JSON syntax" } }
S: { "error": { "class": "GenericError", "desc": "Invalid JSON syntax" } }
3.5 Powerdown event
-------------------
S: {"timestamp": {"seconds": 1258551470, "microseconds": 802384}, "event":
"POWERDOWN"}
S: { "timestamp": { "seconds": 1258551470, "microseconds": 802384 },
"event": "POWERDOWN" }
4. Capabilities Negotiation
----------------------------
@@ -184,17 +175,17 @@ S: {"timestamp": {"seconds": 1258551470, "microseconds": 802384}, "event":
When a Client successfully establishes a connection, the Server is in
Capabilities Negotiation mode.
In this mode only the 'qmp_capabilities' command is allowed to run, all
other commands will return the CommandNotFound error. Asynchronous messages
are not delivered either.
In this mode only the qmp_capabilities command is allowed to run, all
other commands will return the CommandNotFound error. Asynchronous
messages are not delivered either.
Clients should use the 'qmp_capabilities' command to enable capabilities
Clients should use the qmp_capabilities command to enable capabilities
advertised in the Server's greeting (section '2.2 Server Greeting') they
support.
When the 'qmp_capabilities' command is issued, and if it does not return an
When the qmp_capabilities command is issued, and if it does not return an
error, the Server enters in Command mode where capabilities changes take
effect, all commands (except 'qmp_capabilities') are allowed and asynchronous
effect, all commands (except qmp_capabilities) are allowed and asynchronous
messages are delivered.
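
The mode switch described in this section can be sketched as a small state machine. This is a minimal illustration, not QEMU's monitor code: `QmpSession`, `qmp_session_init`, `qmp_session_delivers_events` and `qmp_session_dispatch` are hypothetical names. Answering a repeated qmp_capabilities in Command mode with CommandNotFound is an assumption; the text only says the command is not allowed there.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical types: they model only the two modes named in the spec text. */
typedef enum { QMP_MODE_NEGOTIATION, QMP_MODE_COMMAND } QmpMode;
typedef enum { QMP_OK, QMP_ERR_COMMAND_NOT_FOUND } QmpStatus;

typedef struct {
    QmpMode mode;
} QmpSession;

/* A freshly connected client starts in Capabilities Negotiation mode. */
static void qmp_session_init(QmpSession *s)
{
    assert(s != NULL);
    s->mode = QMP_MODE_NEGOTIATION;
}

/* Asynchronous messages are delivered only once Command mode is entered. */
static bool qmp_session_delivers_events(const QmpSession *s)
{
    assert(s != NULL);
    return s->mode == QMP_MODE_COMMAND;
}

/* Decide whether a command may run in the current mode. */
static QmpStatus qmp_session_dispatch(QmpSession *s, const char *command)
{
    assert(s != NULL && command != NULL);
    bool is_caps = strcmp(command, "qmp_capabilities") == 0;

    if (s->mode == QMP_MODE_NEGOTIATION) {
        if (!is_caps) {
            return QMP_ERR_COMMAND_NOT_FOUND;
        }
        /* Successful negotiation switches the session to Command mode. */
        s->mode = QMP_MODE_COMMAND;
        return QMP_OK;
    }

    /* Command mode. Assumption: qmp_capabilities is rejected the same way
     * other commands are rejected during negotiation. */
    return is_caps ? QMP_ERR_COMMAND_NOT_FOUND : QMP_OK;
}
```

In this sketch a client that sends "stop" before qmp_capabilities gets CommandNotFound, and the same command succeeds once negotiation has completed.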
5 Compatibility Considerations
@@ -245,7 +236,7 @@ arguments, errors, asynchronous events, and so forth.
Any new names downstream wishes to add must begin with '__'. To
ensure compatibility with other downstreams, it is strongly
recommended that you prefix your downstram names with '__RFQDN_' where
recommended that you prefix your downstream names with '__RFQDN_' where
RFQDN is a valid, reverse fully qualified domain name which you
control. For example, a qemu-kvm specific monitor command would be:


@@ -1,7 +1,7 @@
(RDMA: Remote Direct Memory Access)
RDMA Live Migration Specification, Version # 1
==============================================
Wiki: http://wiki.qemu.org/Features/RDMALiveMigration
Wiki: http://wiki.qemu-project.org/Features/RDMALiveMigration
Github: git@github.com:hinesmr/qemu.git, 'rdma' branch
Copyright (C) 2013 Michael R. Hines <mrhines@us.ibm.com>


@@ -10,7 +10,7 @@ ACPI GPE block (IO ports 0xafe0-0xafe3, byte access):
Generic ACPI GPE block. Bit 2 (GPE.2) used to notify CPU
hot-add/remove event to ACPI BIOS, via SCI interrupt.
CPU present bitmap (IO port 0xaf00-0xae1f, 1-byte access):
CPU present bitmap (IO port 0xaf00-0xaf1f, 1-byte access):
---------------------------------------------------------------
One bit per CPU. Bit position reflects corresponding CPU APIC ID.
Read-only.
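
The corrected range 0xaf00-0xaf1f covers 32 byte-wide ports, which gives room for 256 CPU bits. A minimal sketch of the lookup arithmetic follows. The helper names are hypothetical. It is also an assumption that byte N holds APIC IDs 8N to 8N+7 with the lowest ID in the least significant bit; the text only says that bit position reflects the APIC ID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPU_PRESENT_BASE 0xaf00u /* first I/O port of the bitmap */
#define CPU_PRESENT_LEN  32u     /* ports 0xaf00-0xaf1f, one byte each */

/* I/O port whose byte contains the present bit for apic_id. */
static uint16_t cpu_present_port(unsigned apic_id)
{
    assert(apic_id < CPU_PRESENT_LEN * 8);
    return (uint16_t)(CPU_PRESENT_BASE + apic_id / 8);
}

/* Mask selecting apic_id's bit inside that byte (assumed LSB-first). */
static uint8_t cpu_present_mask(unsigned apic_id)
{
    return (uint8_t)(1u << (apic_id % 8));
}

/* Test a bit in a snapshot of the 32 bytes read from the port range. */
static bool cpu_is_present(const uint8_t bitmap[CPU_PRESENT_LEN],
                           unsigned apic_id)
{
    assert(apic_id < CPU_PRESENT_LEN * 8);
    return (bitmap[apic_id / 8] & cpu_present_mask(apic_id)) != 0;
}
```

Under these assumptions the highest representable APIC ID, 255, maps to port 0xaf1f, the last port of the corrected range.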


@@ -80,7 +80,12 @@ in the description of a field.
tables to repair refcounts before accessing the
image.
Bits 1-63: Reserved (set to 0)
Bit 1: Corrupt bit. If this bit is set then any data
structure may be corrupt and the image must not
be written to (unless for regaining
consistency).
Bits 2-63: Reserved (set to 0)
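
A reader of the incompatible_features field could test the bits above roughly as follows. This is a sketch with hypothetical helper names, not the qcow2 driver's code. Bit 0 is the dirty bit that the preceding lines describe (refcounts need repair) and bit 1 is the new corrupt bit. Treating a set reserved bit as an unknown incompatible feature that rules out normal writes is a simplification of the rule that images with unknown incompatible bits must not be opened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a single-bit mask; shifting a 64-bit value by 64 or more is undefined. */
static uint64_t qcow2_feature_bit(unsigned n)
{
    assert(n < 64);
    return UINT64_C(1) << n;
}

/* Bit 0: refcounts may be inconsistent and must be repaired before access. */
static bool qcow2_needs_refcount_repair(uint64_t incompatible_features)
{
    return (incompatible_features & qcow2_feature_bit(0)) != 0;
}

/* Bit 1: any data structure may be corrupt. */
static bool qcow2_is_marked_corrupt(uint64_t incompatible_features)
{
    return (incompatible_features & qcow2_feature_bit(1)) != 0;
}

/* Bits 2-63 are reserved here, so any of them being set is unknown to us. */
static bool qcow2_has_unknown_incompat(uint64_t incompatible_features)
{
    uint64_t known = qcow2_feature_bit(0) | qcow2_feature_bit(1);
    return (incompatible_features & ~known) != 0;
}

/* Simplified policy: ordinary writes only when the image is neither
 * marked corrupt nor using features this reader does not understand.
 * Writes made purely to regain consistency are outside this sketch. */
static bool qcow2_normal_writes_allowed(uint64_t incompatible_features)
{
    return !qcow2_is_marked_corrupt(incompatible_features) &&
           !qcow2_has_unknown_incompat(incompatible_features);
}
```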
80 - 87: compatible_features
Bitmask of compatible features. An implementation can
@@ -350,3 +355,6 @@ Snapshot table entry:
variable: Unique ID string for the snapshot (not null terminated)
variable: Name of the snapshot (not null terminated)
variable: Padding to round up the snapshot table entry size to the
next multiple of 8.

14
dump.c

@@ -66,7 +66,7 @@ typedef struct DumpState {
uint32_t sh_info;
bool have_section;
bool resume;
size_t note_size;
ssize_t note_size;
hwaddr memory_offset;
int fd;
@@ -277,7 +277,7 @@ static int write_elf64_notes(DumpState *s)
int ret;
int id;
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
id = cpu_index(cpu);
ret = cpu_write_elf64_note(fd_write_vmcore, cpu, id, s);
if (ret < 0) {
@@ -286,7 +286,7 @@ static int write_elf64_notes(DumpState *s)
}
}
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
ret = cpu_write_elf64_qemunote(fd_write_vmcore, cpu, s);
if (ret < 0) {
dump_error(s, "dump: failed to write CPU status.\n");
@@ -327,7 +327,7 @@ static int write_elf32_notes(DumpState *s)
int ret;
int id;
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
id = cpu_index(cpu);
ret = cpu_write_elf32_note(fd_write_vmcore, cpu, id, s);
if (ret < 0) {
@@ -336,7 +336,7 @@ static int write_elf32_notes(DumpState *s)
}
}
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
ret = cpu_write_elf32_qemunote(fd_write_vmcore, cpu, s);
if (ret < 0) {
dump_error(s, "dump: failed to write CPU status.\n");
@@ -734,7 +734,7 @@ static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
*/
cpu_synchronize_all_states();
nr_cpus = 0;
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
nr_cpus++;
}
@@ -765,7 +765,7 @@ static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
s->note_size = cpu_get_note_size(s->dump_info.d_class,
s->dump_info.d_machine, nr_cpus);
if (ret < 0) {
if (s->note_size < 0) {
error_set(errp, QERR_UNSUPPORTED);
goto cleanup;
}
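
The two note_size changes in dump.c belong together: the error check now tests s->note_size, and that test can only fire because the field became signed. A minimal sketch of the underlying C behaviour follows. `get_note_size` is a hypothetical stand-in for cpu_get_note_size(), and it is an assumption that failure is reported as -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h> /* ssize_t (POSIX) */

/* Hypothetical stand-in: returns -1 when the dump format is unsupported. */
static ssize_t get_note_size(bool supported)
{
    return supported ? 128 : -1;
}

/* Old pattern: storing the result in a size_t turns -1 into SIZE_MAX,
 * so the failure looks like an enormous but valid size. */
static size_t note_size_as_unsigned(bool supported)
{
    return (size_t)get_note_size(supported);
}

/* New pattern: keep the value signed so the error is still visible. */
static bool note_size_is_valid(ssize_t note_size)
{
    return note_size >= 0;
}

/* Convert to an unsigned size only after the sign has been checked. */
static size_t note_size_checked(ssize_t note_size)
{
    assert(note_size_is_valid(note_size));
    return (size_t)note_size;
}
```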

315
exec.c

@@ -69,7 +69,7 @@ static MemoryRegion io_mem_unassigned;
#endif
CPUState *first_cpu;
struct CPUTailQ cpus = QTAILQ_HEAD_INITIALIZER(cpus);
/* current CPU in the current thread. It is only valid inside
cpu_exec() */
DEFINE_TLS(CPUState *, current_cpu);
@@ -129,7 +129,6 @@ static PhysPageMap next_map;
static void io_mem_init(void);
static void memory_map_init(void);
static void *qemu_safe_ram_ptr(ram_addr_t addr);
static MemoryRegion io_mem_watch;
#endif
@@ -350,45 +349,30 @@ const VMStateDescription vmstate_cpu_common = {
#endif
CPUState *qemu_get_cpu(int index)
{
CPUState *cpu = first_cpu;
while (cpu) {
if (cpu->cpu_index == index) {
break;
}
cpu = cpu->next_cpu;
}
return cpu;
}
void qemu_for_each_cpu(void (*func)(CPUState *cpu, void *data), void *data)
{
CPUState *cpu;
cpu = first_cpu;
while (cpu) {
func(cpu, data);
cpu = cpu->next_cpu;
CPU_FOREACH(cpu) {
if (cpu->cpu_index == index) {
return cpu;
}
}
return NULL;
}
void cpu_exec_init(CPUArchState *env)
{
CPUState *cpu = ENV_GET_CPU(env);
CPUClass *cc = CPU_GET_CLASS(cpu);
CPUState **pcpu;
CPUState *some_cpu;
int cpu_index;
#if defined(CONFIG_USER_ONLY)
cpu_list_lock();
#endif
cpu->next_cpu = NULL;
pcpu = &first_cpu;
cpu_index = 0;
while (*pcpu != NULL) {
pcpu = &(*pcpu)->next_cpu;
CPU_FOREACH(some_cpu) {
cpu_index++;
}
cpu->cpu_index = cpu_index;
@@ -398,7 +382,7 @@ void cpu_exec_init(CPUArchState *env)
#ifndef CONFIG_USER_ONLY
cpu->thread_id = qemu_get_thread_id();
#endif
*pcpu = cpu;
QTAILQ_INSERT_TAIL(&cpus, cpu, node);
#if defined(CONFIG_USER_ONLY)
cpu_list_unlock();
#endif
@@ -425,10 +409,8 @@ static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
#else
static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
{
hwaddr phys = cpu_get_phys_page_debug(cpu, pc);
if (phys != -1) {
tb_invalidate_phys_addr(phys | (pc & ~TARGET_PAGE_MASK));
}
tb_invalidate_phys_addr(cpu_get_phys_page_debug(cpu, pc) |
(pc & ~TARGET_PAGE_MASK));
}
#endif
#endif /* TARGET_HAS_ICE */
@@ -642,55 +624,40 @@ void cpu_abort(CPUArchState *env, const char *fmt, ...)
abort();
}
CPUArchState *cpu_copy(CPUArchState *env)
#if !defined(CONFIG_USER_ONLY)
static RAMBlock *qemu_get_ram_block(ram_addr_t addr)
{
CPUArchState *new_env = cpu_init(env->cpu_model_str);
#if defined(TARGET_HAS_ICE)
CPUBreakpoint *bp;
CPUWatchpoint *wp;
#endif
RAMBlock *block;
/* Reset non arch specific state */
cpu_reset(ENV_GET_CPU(new_env));
/* Copy arch specific state into the new CPU */
memcpy(new_env, env, sizeof(CPUArchState));
/* Clone all break/watchpoints.
Note: Once we support ptrace with hw-debug register access, make sure
BP_CPU break/watchpoints are handled correctly on clone. */
QTAILQ_INIT(&env->breakpoints);
QTAILQ_INIT(&env->watchpoints);
#if defined(TARGET_HAS_ICE)
QTAILQ_FOREACH(bp, &env->breakpoints, entry) {
cpu_breakpoint_insert(new_env, bp->pc, bp->flags, NULL);
/* The list is protected by the iothread lock here. */
block = ram_list.mru_block;
if (block && addr - block->offset < block->length) {
goto found;
}
QTAILQ_FOREACH(wp, &env->watchpoints, entry) {
cpu_watchpoint_insert(new_env, wp->vaddr, (~wp->len_mask) + 1,
wp->flags, NULL);
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (addr - block->offset < block->length) {
goto found;
}
}
#endif
return new_env;
fprintf(stderr, "Bad ram offset %" PRIx64 "\n", (uint64_t)addr);
abort();
found:
ram_list.mru_block = block;
return block;
}
#if !defined(CONFIG_USER_ONLY)
static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t end,
uintptr_t length)
{
uintptr_t start1;
RAMBlock *block;
ram_addr_t start1;
/* we modify the TLB cache so that the dirty bit will be set again
when accessing the range */
start1 = (uintptr_t)qemu_safe_ram_ptr(start);
/* Check that we don't span multiple blocks - this breaks the
address comparisons below. */
if ((uintptr_t)qemu_safe_ram_ptr(end - 1) - start1
!= (end - 1) - start) {
abort();
}
block = qemu_get_ram_block(start);
assert(block == qemu_get_ram_block(end - 1));
start1 = (uintptr_t)block->host + (start - block->offset);
cpu_tlb_reset_dirty_all(start1, length);
}
/* Note: start and end must be within the same ram block. */
@@ -766,6 +733,18 @@ static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end,
uint16_t section);
static subpage_t *subpage_init(AddressSpace *as, hwaddr base);
static void *(*phys_mem_alloc)(size_t size) = qemu_anon_ram_alloc;
/*
* Set a custom physical guest memory alloator.
* Accelerators with unusual needs may need this. Hopefully, we can
* get rid of it eventually.
*/
void phys_mem_set_alloc(void *(*alloc)(size_t))
{
phys_mem_alloc = alloc;
}
static uint16_t phys_section_add(MemoryRegionSection *section)
{
/* The physical section number is ORed with a page-aligned
@@ -897,7 +876,7 @@ void qemu_mutex_unlock_ramlist(void)
qemu_mutex_unlock(&ram_list.mutex);
}
#if defined(__linux__) && !defined(TARGET_S390X)
#ifdef __linux__
#include <sys/vfs.h>
@@ -1000,6 +979,14 @@ static void *file_ram_alloc(RAMBlock *block,
block->fd = fd;
return area;
}
#else
static void *file_ram_alloc(RAMBlock *block,
ram_addr_t memory,
const char *path)
{
fprintf(stderr, "-mem-path not supported on this host\n");
exit(1);
}
#endif
static ram_addr_t find_ram_offset(ram_addr_t size)
@@ -1116,6 +1103,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
size = TARGET_PAGE_ALIGN(size);
new_block = g_malloc0(sizeof(*new_block));
new_block->fd = -1;
/* This assumes the iothread lock is taken here too. */
qemu_mutex_lock_ramlist();
@@ -1124,26 +1112,32 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
if (host) {
new_block->host = host;
new_block->flags |= RAM_PREALLOC_MASK;
} else if (xen_enabled()) {
if (mem_path) {
fprintf(stderr, "-mem-path not supported with Xen\n");
exit(1);
}
xen_ram_alloc(new_block->offset, size, mr);
} else {
if (mem_path) {
#if defined (__linux__) && !defined(TARGET_S390X)
new_block->host = file_ram_alloc(new_block, size, mem_path);
if (!new_block->host) {
new_block->host = qemu_anon_ram_alloc(size);
memory_try_enable_merging(new_block->host, size);
if (phys_mem_alloc != qemu_anon_ram_alloc) {
/*
* file_ram_alloc() needs to allocate just like
* phys_mem_alloc, but we haven't bothered to provide
* a hook there.
*/
fprintf(stderr,
"-mem-path not supported with this accelerator\n");
exit(1);
}
#else
fprintf(stderr, "-mem-path option unsupported\n");
exit(1);
#endif
} else {
if (xen_enabled()) {
xen_ram_alloc(new_block->offset, size, mr);
} else if (kvm_enabled()) {
/* some s390/kvm configurations have special constraints */
new_block->host = kvm_ram_alloc(size);
} else {
new_block->host = qemu_anon_ram_alloc(size);
new_block->host = file_ram_alloc(new_block, size, mem_path);
}
if (!new_block->host) {
new_block->host = phys_mem_alloc(size);
if (!new_block->host) {
fprintf(stderr, "Cannot set up guest memory '%s': %s\n",
new_block->mr->name, strerror(errno));
exit(1);
}
memory_try_enable_merging(new_block->host, size);
}
@@ -1218,23 +1212,15 @@ void qemu_ram_free(ram_addr_t addr)
ram_list.version++;
if (block->flags & RAM_PREALLOC_MASK) {
;
} else if (mem_path) {
#if defined (__linux__) && !defined(TARGET_S390X)
if (block->fd) {
munmap(block->host, block->length);
close(block->fd);
} else {
qemu_anon_ram_free(block->host, block->length);
}
#else
abort();
} else if (xen_enabled()) {
xen_invalidate_map_cache_entry(block->host);
#ifndef _WIN32
} else if (block->fd >= 0) {
munmap(block->host, block->length);
close(block->fd);
#endif
} else {
if (xen_enabled()) {
xen_invalidate_map_cache_entry(block->host);
} else {
qemu_anon_ram_free(block->host, block->length);
}
qemu_anon_ram_free(block->host, block->length);
}
g_free(block);
break;
@@ -1258,38 +1244,31 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
vaddr = block->host + offset;
if (block->flags & RAM_PREALLOC_MASK) {
;
} else if (xen_enabled()) {
abort();
} else {
flags = MAP_FIXED;
munmap(vaddr, length);
if (mem_path) {
#if defined(__linux__) && !defined(TARGET_S390X)
if (block->fd) {
if (block->fd >= 0) {
#ifdef MAP_POPULATE
flags |= mem_prealloc ? MAP_POPULATE | MAP_SHARED :
MAP_PRIVATE;
flags |= mem_prealloc ? MAP_POPULATE | MAP_SHARED :
MAP_PRIVATE;
#else
flags |= MAP_PRIVATE;
#endif
area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
flags, block->fd, offset);
} else {
flags |= MAP_PRIVATE | MAP_ANONYMOUS;
area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
flags, -1, 0);
}
#else
abort();
flags |= MAP_PRIVATE;
#endif
area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
flags, block->fd, offset);
} else {
#if defined(TARGET_S390X) && defined(CONFIG_KVM)
flags |= MAP_SHARED | MAP_ANONYMOUS;
area = mmap(vaddr, length, PROT_EXEC|PROT_READ|PROT_WRITE,
flags, -1, 0);
#else
/*
* Remap needs to match alloc. Accelerators that
* set phys_mem_alloc never remap. If they did,
* we'd need a remap hook here.
*/
assert(phys_mem_alloc == qemu_anon_ram_alloc);
flags |= MAP_PRIVATE | MAP_ANONYMOUS;
area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
flags, -1, 0);
#endif
}
if (area != vaddr) {
fprintf(stderr, "Could not remap addr: "
@@ -1306,29 +1285,6 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
}
#endif /* !_WIN32 */
static RAMBlock *qemu_get_ram_block(ram_addr_t addr)
{
RAMBlock *block;
/* The list is protected by the iothread lock here. */
block = ram_list.mru_block;
if (block && addr - block->offset < block->length) {
goto found;
}
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (addr - block->offset < block->length) {
goto found;
}
}
fprintf(stderr, "Bad ram offset %" PRIx64 "\n", (uint64_t)addr);
abort();
found:
ram_list.mru_block = block;
return block;
}
/* Return a host pointer to ram allocated with qemu_ram_alloc.
With the exception of the softmmu code in this file, this should
only be used for local memory (e.g. video ram) that the device owns,
@@ -1356,40 +1312,6 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
return block->host + (addr - block->offset);
}
/* Return a host pointer to ram allocated with qemu_ram_alloc. Same as
* qemu_get_ram_ptr but do not touch ram_list.mru_block.
*
* ??? Is this still necessary?
*/
static void *qemu_safe_ram_ptr(ram_addr_t addr)
{
RAMBlock *block;
/* The list is protected by the iothread lock here. */
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (addr - block->offset < block->length) {
if (xen_enabled()) {
/* We need to check if the requested address is in the RAM
* because we don't want to map the entire memory in QEMU.
* In that case just map until the end of the page.
*/
if (block->offset == 0) {
return xen_map_cache(addr, 0, 0);
} else if (block->host == NULL) {
block->host =
xen_map_cache(block->offset, block->length, 1);
}
}
return block->host + (addr - block->offset);
}
}
fprintf(stderr, "Bad ram offset %" PRIx64 "\n", (uint64_t)addr);
abort();
return NULL;
}
/* Return a host pointer to guest's ram. Similar to qemu_get_ram_ptr
* but takes a size argument */
static void *qemu_ram_ptr_length(ram_addr_t addr, hwaddr *size)
@@ -1578,7 +1500,7 @@ static uint64_t subpage_read(void *opaque, hwaddr addr,
uint8_t buf[4];
#if defined(DEBUG_SUBPAGE)
printf("%s: subpage %p len %d addr " TARGET_FMT_plx "\n", __func__,
printf("%s: subpage %p len %u addr " TARGET_FMT_plx "\n", __func__,
subpage, len, addr);
#endif
address_space_read(subpage->as, addr + subpage->base, buf, len);
@@ -1601,7 +1523,7 @@ static void subpage_write(void *opaque, hwaddr addr,
uint8_t buf[4];
#if defined(DEBUG_SUBPAGE)
printf("%s: subpage %p len %d addr " TARGET_FMT_plx
printf("%s: subpage %p len %u addr " TARGET_FMT_plx
" value %"PRIx64"\n",
__func__, subpage, len, addr, value);
#endif
@@ -1622,16 +1544,16 @@ static void subpage_write(void *opaque, hwaddr addr,
}
static bool subpage_accepts(void *opaque, hwaddr addr,
unsigned size, bool is_write)
unsigned len, bool is_write)
{
subpage_t *subpage = opaque;
#if defined(DEBUG_SUBPAGE)
printf("%s: subpage %p %c len %d addr " TARGET_FMT_plx "\n",
printf("%s: subpage %p %c len %u addr " TARGET_FMT_plx "\n",
__func__, subpage, is_write ? 'w' : 'r', len, addr);
#endif
return address_space_access_valid(subpage->as, addr + subpage->base,
size, is_write);
len, is_write);
}
static const MemoryRegionOps subpage_ops = {
@@ -1651,8 +1573,8 @@ static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end,
idx = SUBPAGE_IDX(start);
eidx = SUBPAGE_IDX(end);
#if defined(DEBUG_SUBPAGE)
printf("%s: %p start %08x end %08x idx %08x eidx %08x mem %ld\n", __func__,
mmio, start, end, idx, eidx, memory);
printf("%s: %p start %08x end %08x idx %08x eidx %08x section %d\n",
__func__, mmio, start, end, idx, eidx, section);
#endif
for (; idx <= eidx; idx++) {
mmio->sub_section[idx] = section;
@@ -1673,8 +1595,8 @@ static subpage_t *subpage_init(AddressSpace *as, hwaddr base)
"subpage", TARGET_PAGE_SIZE);
mmio->iomem.subpage = true;
#if defined(DEBUG_SUBPAGE)
printf("%s: %p base " TARGET_FMT_plx " len %08x %d\n", __func__,
mmio, base, TARGET_PAGE_SIZE, subpage_memory);
printf("%s: %p base " TARGET_FMT_plx " len %08x\n", __func__,
mmio, base, TARGET_PAGE_SIZE);
#endif
subpage_register(mmio, 0, TARGET_PAGE_SIZE-1, PHYS_SECTION_UNASSIGNED);
@@ -1765,7 +1687,7 @@ static void tcg_commit(MemoryListener *listener)
/* since each CPU stores ram addresses in its TLB cache, we must
reset the modified entries */
/* XXX: slow ! */
for (cpu = first_cpu; cpu != NULL; cpu = cpu->next_cpu) {
CPU_FOREACH(cpu) {
CPUArchState *env = cpu->env_ptr;
tlb_flush(env, 1);
@@ -1819,7 +1741,12 @@ void address_space_destroy_dispatch(AddressSpace *as)
static void memory_map_init(void)
{
system_memory = g_malloc(sizeof(*system_memory));
memory_region_init(system_memory, NULL, "system", INT64_MAX);
assert(TARGET_PHYS_ADDR_SPACE_BITS <= 64);
memory_region_init(system_memory, NULL, "system",
TARGET_PHYS_ADDR_SPACE_BITS == 64 ?
UINT64_MAX : (0x1ULL << TARGET_PHYS_ADDR_SPACE_BITS));
address_space_init(&address_space_memory, system_memory, "memory");
system_io = g_malloc(sizeof(*system_io));
@@ -1828,7 +1755,9 @@ static void memory_map_init(void)
address_space_init(&address_space_io, system_io, "I/O");
memory_listener_register(&core_memory_listener, &address_space_memory);
memory_listener_register(&tcg_memory_listener, &address_space_memory);
if (tcg_enabled()) {
memory_listener_register(&tcg_memory_listener, &address_space_memory);
}
}
MemoryRegion *get_system_memory(void)
@@ -2175,7 +2104,9 @@ void *address_space_map(AddressSpace *as,
if (bounce.buffer) {
return NULL;
}
bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, TARGET_PAGE_SIZE);
/* Avoid unbounded allocations */
l = MIN(l, TARGET_PAGE_SIZE);
bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, l);
bounce.addr = addr;
bounce.len = l;

Some files were not shown because too many files have changed in this diff.