Compare commits

..

242 Commits

Author SHA1 Message Date
Michael Roth
f4c53d94d2 update VERSION for v1.2.1
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:47:44 -05:00
David Gibson
159fe82dea pseries: Don't test for MSR_PR for hypercalls under KVM
PAPR hypercalls should only be invoked from the guest kernel, not guest
user programs, that is, with MSR[PR]=0.  Currently we check this in
spapr_hypercall, returning H_PRIVILEGE if MSR[PR]=1.

However, under KVM the state of MSR[PR] is already checked by the host
kernel before passing the hypercall to qemu, making this check redundant.
Worse, however, we don't generally synchronize KVM and qemu state on the
hypercall path, meaning that qemu could incorrectly reject a hypercall
because it has a stale MSR value.

This patch fixes the problem by moving the privilege test exclusively to
the TCG hypercall path.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
CC: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit efcb9383b9)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Peter Maydell
a8f2299d1b fpu/softfloat.c: Return correctly signed values from uint64_to_float32
The uint64_to_float32() conversion function was incorrectly always
returning numbers with the sign bit set (ie negative numbers). Correct
this so we return positive numbers instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit e744c06fca)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Eduardo Habkost
68fc5d1d27 i386: kvm: bit 10 of CPUID[8000_0001].EDX is reserved
Bit 10 of CPUID[8000_0001].EDX is not defined as an alias of
CPUID[1].EDX[10], so do not duplicate it on
kvm_arch_get_supported_cpuid().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Don Slutz <Don@CloudSwitch.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit b1f4679392)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Francesco Lavra
3e1954cb4a Versatile Express: Fix NOR flash 0 address and remove flash alias
In the A series memory map (implemented in the Cortex A15 CoreTile), the
first NOR flash bank (flash 0) is mapped to address 0x08000000, while
address 0x00000000 can be configured as alias to either the first or the
second flash bank. This patch fixes the definition of flash 0 address,
and for simplicity removes the alias definition.

Signed-off-by: Francesco Lavra <francescolavra.fl@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 661bafb3e1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Meador Inge
14bdf978ef hw/armv7m_nvic: Correctly register GIC region when setting up NVIC
When setting up the NVIC memory regions the memory range
0x100..0xcff is aliased to an IO memory region that belongs
to the ARM GIC.  This aliased region should be added to the
NVIC memory container, but the actual GIC IO memory region
was being added instead.  This mixup was causing the wrong
IO memory access functions to be called when accessing parts
of the NVIC memory.

Signed-off-by: Meador Inge <meadori@codesourcery.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 9892cae395)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Brendan Fennell
806d996789 pl190: fix read of VECTADDR
Reading VECTADDR was causing us to set the current priority to
the wrong value, the most obvious effect of which was that we
would return the vector for the wrong interrupt as the result
of the read.

Signed-off-by: Brendan Fennell <bfennell@skynet.ie>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 14c126baf1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Orit Wasserman
d893e56f72 Clear handler only for valid fd
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 3202becaa2)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Orit Wasserman
240f68ce5b Fix address handling in inet_nonblocking_connect
getaddrinfo can give us a list of addresses, but we only try to
connect to the first one. If that fails we never proceed to
the next one.  This is common on desktop setups that often have ipv6
configured but not actually working.

To fix this make inet_connect_nonblocking retry connection with a different
address.
callers on inet_nonblocking_connect register a callback function that will
be called when connect opertion completes, in case of failure the fd will have
a negative value

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 233aa5c2d1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Orit Wasserman
ee0999b16a Separate inet_connect into inet_connect (blocking) and inet_nonblocking_connect
No need to add non blocking parameters to the blocking inet_connect
add block parameter for inet_connect_opts instead of using QemuOpt "block".

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 5db5f44cb4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Michael S. Tsirkin
57f8a4c507 Refactor inet_connect_opts function
refactor address resolution code to fix nonblocking connect
remove getnameinfo call

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 05bc1d8a4b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Stefan Weil
b83465006d configure: Allow builds without any system or user emulation
The old code aborted configure when no emulation target was selected.
Even after removing the 'exit 1', it tried to read from STDIN
when QEMU was configured with

    configure' '--disable-user' '--disable-system'

This is fixed here.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 8bdd3d499f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:21 -05:00
Jeff Cody
d80210efbc block: correctly set the keep_read_only flag
I believe the bs->keep_read_only flag is supposed to reflect
the initial open state of the device. If the device is initially
opened R/O, then commit operations, or reopen operations changing
to R/W, are prohibited.

Currently, the keep_read_only flag is only accurate for the active
layer, and its backing file. Subsequent images end up always having
the keep_read_only flag set.

For instance, what happens now:

[  base  ]  kro = 1, ro = 1
    |
    v
[ snap-1 ]  kro = 1, ro = 1
    |
    v
[ snap-2 ]  kro = 0, ro = 1
    |
    v
[ active ]  kro = 0, ro = 0

What we want:

[  base  ]  kro = 0, ro = 1
    |
    v
[ snap-1 ]  kro = 0, ro = 1
    |
    v
[ snap-2 ]  kro = 0, ro = 1
    |
    v
[ active ]  kro = 0, ro = 0

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit be028adced)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Kevin Shanahan
513cdda198 blockdev: preserve readonly and snapshot states across media changes
If readonly=on is given at device creation time, the ->readonly flag
needs to be set in the block driver state for this device so that
readonly-ness is preserved across media changes (qmp change command).
Similarly, to preserve the snapshot property requires ->open_flags to
be correct.

Signed-off-by: Kevin Shanahan <kmshanah@disenchant.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 80dd1aae36)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
eea54caab0 w32: Add implementation of gmtime_r, localtime_r
Those functions are missing in MinGW.

Some versions of MinGW-w64 include defines for gmtime_r and localtime_r.
Older versions of these macros are buggy (they return a pointer to a
static variable), therefore we don't want them. Newer versions are
similar to the code used here, but without the memset.

The implementation which is used here is not strictly reentrant,
but sufficiently good for QEMU on w32 or w64.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
[blauwirbel@gmail.com: added comment about locking]
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit d3e8f95753)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
78fd27b3b6 w32: Always use standard instead of native format strings
GLib 2.0 include files use __printf__ for the format attribute
which resolves to native format strings on w32 hosts.

QEMU wants standard format strings instead of native format
strings, so we simply change any declaration with __printf__
to use __gnu_printf__.

This works because all basic printf functions support both
kinds of format strings.

This fixes a compiler warning:

qapi/string-output-visitor.c: In function ‘print_type_int’:
qapi/string-output-visitor.c:34:5: warning: unknown conversion type character ‘l’ in format [-Wformat]
qapi/string-output-visitor.c:34:5: warning: too many arguments for format [-Wformat-extra-args]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 95df51a4a0)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
d4413913e1 net/socket: Fix compiler warning (regression for MinGW)
Commit 213fd5087e removed a type cast
which is needed for MinGW:

net/socket.c:136: warning:
 pointer targets in passing argument 2 of ‘sendto’ differ in signedness
/usr/lib/gcc/amd64-mingw32msvc/4.4.4/../../../../amd64-mingw32msvc/include/winsock2.h:1313: note:
 expected ‘const char *’ but argument is of type ‘const uint8_t *’

Add a 'qemu_sendto' macro which provides that type cast where needed
and use the new macro instead of 'sendto'.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 73062dfe6b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
61dea8e914 linux-user: Remove redundant null check and replace free by g_free
Report from smatch:

linux-user/syscall.c:3632 do_ioctl_dm(220) info:
 redundant null check on big_buf calling free()

'big_buf' was allocated by g_malloc0, therefore free was also
replaced by g_free.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit ad11ad7774)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Laszlo Ersek
a2aad5fdc4 TextConsole: saturate escape parameter in TTY_STATE_CSI
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit c10600af60)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Hitoshi Mitake
5730401442 curses: don't initialize curses when qemu is daemonized
Current qemu initializes curses even if -daemonize option is
passed. This cause problem because shell prompt appears without
calling endwin().

This patch adds new function, is_daemonized(), to OS dependent
code. With this function, curses_display_init() can check that qemu is
daemonized or not. If daemonized, curses_display_init() isn't called
and the problem is avoided.

Of course, -daemonize && -curses doesn't make sense. Users shouldn't
pass the arguments at the same time. But the problem is very painful
because Ctrl-C cannot be delivered to the terminal.

Cc: Andrzej Zaborowski  <balrog@zabor.org>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 995ee2bf46)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
f3e9893078 pflash_cfi01: Fix warning caused by unreachable code
Report from smatch:
hw/pflash_cfi01.c:431 pflash_write(180) info: ignoring unreachable code.

Instead of removing the return statement after the switch statement,
the patch replaces the return statements in the switch statement by
break statements. Other switch statements in the same code do it also
like that.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 12dabc79f9)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
4d0fcc1968 ioh3420: Remove unreachable code
Report from smatch:
hw/ioh3420.c:128 ioh3420_initfn(35) info: ignoring unreachable code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 997f15672a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
2b9a5aca51 lm4549: Fix buffer overflow
Report from smatch:
lm4549.c:234 lm4549_write_samples(14) error:
 buffer overflow 's->buffer' 1024 <= 1024

There must be enough space to add two entries starting with index
s->buffer_level, therefore the old check was wrong.

[Peter Maydell <peter.maydell@linaro.org> clarifies the nature of the
analyser warning:

I don't object to making the change to placate the analyser,
but I don't think this is actually a buffer overrun. We always
add and remove samples from the buffer two at a time, so it's
not possible to get here with s->buffer_level == BUFFER_SIZE-1
(which is the only case where the old and new conditions
give different answers).]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 8139626643)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
96d90d32ac cadence_uart: Fix buffer overflow
Report from smatch:
hw/cadence_uart.c:413 uart_read(13) error: buffer overflow 's->r' 18 <= 18

This fixes read access to s->r[R_MAX] which is behind the limits of s->r.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 5d40097fc0)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
3c980e22ac qemu-sockets: Fix potential memory leak
The old code leaks variable 'peer'.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 39b384591f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
907a34c9d7 qemu-ga: Remove unreachable code after g_error
Report from smatch:
qemu-ga.c:117 register_signal_handlers(11) info: ignoring unreachable code.
qemu-ga.c:122 register_signal_handlers(16) info: ignoring unreachable code.

g_error calls abort which terminates the program.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit b548828862)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Stefan Weil
b79ba45af1 audio: Fix warning from static code analysis
smatch report:
audio/audio_template.h:416 AUD_open_out(18) warn:
 variable dereferenced before check 'as' (see line 414)

Moving the ldebug statement after the statement which checks 'as'
fixes that warning.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: malc <av1474@comtv.ru>
(cherry picked from commit 93b6599734)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Ronnie Sahlberg
75a44ae297 SCSI: Standard INQUIRY data should report HiSup flag as set.
QEMU as far as I know only reports LUN numbers using the modes that
are described in SAM4.
As such, since all LUN numbers generated by the SCSI emulation in QEMU
follow SAM4, we should set the HiSup bit in the standard INQUIRY data
to indicate such.

From SAM4:
  4.6.3 LUNs overview
  All LUN formats described in this standard are hierarchical in
  structure even when only a single level in that hierarchy is used.
  The HISUP bit shall be set to one in the standard INQUIRY data
  (see SPC-4) when any LUN format described in this standard is used.
  Non-hierarchical formats are outside the scope of this standard.

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
(cherry picked from commit 1109c89405)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:20 -05:00
Paolo Bonzini
ff498e4a0d scsi-disk: fix check for out-of-range LBA
This fix is needed to correctly handle 0-block read and writes.
Without it, a 0-block access at LBA 0 would underflow.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 12ca76fc48)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Paolo Bonzini
394a2f28cb scsi-disk: introduce check_lba_range
Abstract the test for an out-of-range (starting block, block count)
pair.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 444bc90861)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Ronnie Sahlberg
60088e5b83 iSCSI: We dont need to explicitely call qemu_notify_event() any more
We no longer need to explicitely call qemu_notify_event() any more
since this is now done automatically any time the filehandles we listen
to change.

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 40a13ca8d2)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Ronnie Sahlberg
ae06193d73 iSCSI: We need to support SG_IO also from iscsi_ioctl()
We need to support SG_IO from the synchronous iscsi_ioctl() since
scsi-block uses this to do an INQ to the device to discover its properties
This patch makes scsi-block work with iscsi.

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f1a12821d7)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Andreas Färber
e8ca4a8ac7 MAINTAINERS: Add entry for QOM CPU
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit f2ca052414)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Aurelien Jarno
68958c31b5 pflash_cfi01: fix vendor specific extended query
pflash_cfi01 announces a version number of 1.1, which implies
"Protection Register Information" and "Burst Read information"
sections, which are not provided.

Decrease the version number to 1.0 so that only the "Protection
Register Information" section is needed.

Set the number of protection fields (0x3f) to 0x01, as 0x00 means 256
protections field, which makes the CFI table bigger than the current
implementation, causing some kernels to fail to read it.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 262e1eaafa)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Chris Wulff
584717a39a xilinx_timer: Fix a compile error if debug enabled
There was a missing include of qemu-log and a variable name in a printf was out
of date.

Signed-off-by: Chris Wulff <crwulff@gmail.com>
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
(cherry picked from commit 8354cd722e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Peter A. G. Crosthwaite
85368b79a0 xilinx.h: Error check when setting links
Assert that the ethernet and dma controller are sucessfully linked to their
peers.

Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
(cherry picked from commit 4b5e52101f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Peter A. G. Crosthwaite
36d7fb299a xilinx_timer: Send dbg msgs to stderr not stdout
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
(cherry picked from commit e03377ae75)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Peter A. G. Crosthwaite
3412955066 xilinx_timer: Removed comma in device name
Fixes an error in a61e4b07a3

Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
(cherry picked from commit c0a1dcb9f0)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Peter Maydell
b52c8f78f2 arch_init.c: Improve '-soundhw help' for non-HAS_AUDIO_CHOICE archs
For architectures which don't set HAS_AUDIO_CHOICE, improve the
'-soundhw help' message so that it doesn't simply print an empty
list, implying no sound support at all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: malc <av1474@comtv.ru>
(cherry picked from commit 55d4fd3c24)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
David Gibson
668194d5a5 cpu_physical_memory_write_rom() needs to do TB invalidates
cpu_physical_memory_write_rom(), despite the name, can also be used to
write images into RAM - and will often be used that way if the machine
uses load_image_targphys() into RAM addresses.

However, cpu_physical_memory_write_rom(), unlike cpu_physical_memory_rw()
doesn't invalidate any cached TBs which might be affected by the region
written.

This was breaking reset (under full emu) on the pseries machine - we loaded
our firmware image into RAM, and while executing it rewrite the code at
the entry point (correctly causing a TB invalidate/refresh).  When we
reset the firmware image was reloaded, but the TB from the rewrite was
still active and caused us to get an illegal instruction trap.

This patch fixes the bug by duplicating the tb invalidate code from
cpu_physical_memory_rw() in cpu_physical_memory_write_rom().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 0b57e28713)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
David Gibson
8484fead29 qemu-char: BUGFIX, don't call FD_ISSET with negative fd
tcp_chr_connect(), unlike for example udp_chr_update_read_handler() does
not check if the fd it is using is valid (>= 0) before passing it to
qemu_set_fd_handler2().  If using e.g. a TCP serial port, which is not
initially connected, this can result in -1 being passed to FD_ISSET, which
has undefined behaviour.  On x86 it seems to harmlessly return 0, but on
PowerPC, it causes a fortify buffer overflow error to be thrown.

This patch fixes this by putting an extra test in tcp_chr_connect(), and
also adds an assert qemu_set_fd_handler2() to catch other such errors on
all platforms, rather than just some.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit bbdd2ad081)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Anthony Liguori
57683d6354 Revert 455aa1e08 and c3767ed0eb
commit c3767ed0eb
    qemu-char: (Re-)connect for tcp_chr_write() unconnected writing

Has no hope of working because tcp_chr_connect() does not actually connect.

455aa1e08 just fixes the SEGV with server() but the attempt to connect a client
socket is still completely broken.

This patch reverts both.

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 6db0fdce02)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:19 -05:00
Natanael Copa
5d6ad6262b configure: properly check if -lrt and -lm is needed
Fixes build against uClibc.

uClibc provides 2 versions of clock_gettime(), one with realtime
support and one without (this is so you can avoid linking in -lrt
unless actually needed). This means that the clock_gettime() don't
need -lrt. We still need it for timer_create() so we check for this
function in addition.

We also need check if -lm is needed for isnan().

Both -lm and -lrt are needed for libs_qga.

Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 8bacde8d86)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Yann E. MORIN
a63eb7a227 configure: fix seccomp check
Currently, if libseccomp is missing but the user explicitly requested
seccomp support using --enable-seccomp, configure silently ignores the
situation and disables seccomp support.

This is unlike all other tests that explicitly fail in such situation.

Fix that.

Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit e84d5956cc)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Stefan Hajnoczi
a638498f51 net: EAGAIN handling for net/socket.c TCP
Replace spinning send_all() with a proper non-blocking send.  When the
socket write buffer limit is reached, we should stop trying to send and
wait for the socket to become writable again.

Non-blocking TCP sockets can return in two different ways when the write
buffer limit is reached:

1. ret = -1 and errno = EAGAIN/EWOULDBLOCK.  No data has been written.

2. ret < total_size.  Short write, only part of the message was
   transmitted.

Handle both cases and keep track of how many bytes have been written in
s->send_index.  (This includes the 'length' header before the actual
payload buffer.)

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 45a7f54a8b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Stefan Hajnoczi
9d72a090b2 net: EAGAIN handling for net/socket.c UDP
Implement asynchronous send for UDP (or other SOCK_DGRAM) sockets.  If
send fails with EAGAIN we wait for the socket to become writable again.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 213fd5087e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Stefan Hajnoczi
ab52df7712 net: asynchronous send/receive infrastructure for net/socket.c
The net/socket.c net client is not truly asynchronous.  This patch
borrows the qemu_set_fd_handler2() code from net/tap.c as the basis for
proper asynchronous send/receive.

Only read packets from the socket when the peer is able to receive.
This avoids needless queuing.

Later patches implement asynchronous send.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 863f678fba)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Stefan Hajnoczi
5c8240c09c net: broadcast hub packets if at least one port can receive
In commit 60c07d933c ("net: fix
qemu_can_send_packet logic") the "VLAN" broadcast behavior was changed
to queue packets if any net client cannot receive.  It turns out that
this was not actually the right fix and just hides the real bug that
hw/usb/dev-network.c:usbnet_receive() clobbers its receive buffer when
called multiple times in a row.  The commit also introduced a new bug
that "VLAN" packets would not be sent if one of multiple net clients was
down.

The hw/usb/dev-network.c bug has since been fixed, so this patch reverts
broadcast behavior to send packets as long as one net client can
receive.  Packets simply get queued for the net clients that are
temporarily unable to receive.

Reported-by: Roy.Li <rongqing.li@windriver.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 61518a74ca)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Stefan Hajnoczi
d8f4811880 net: fix usbnet_receive() packet drops
The USB network interface has a single buffer which the guest reads
from.  This patch prevents multiple calls to usbnet_receive() from
clobbering the input buffer.  Instead we queue packets until buffer
space becomes available again.

This is inspired by virtio-net and e1000 rxbuf handling.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 190563f9a9)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Stefan Hajnoczi
2747a3eee3 net: clean up usbnet_receive()
The USB network interface has two code paths depending on whether or not
RNDIS mode is enabled.  Refactor usbnet_receive() so that there is a
common path throughout the function instead of duplicating everything
across if (is_rndis(s)) ... else ... code paths.

Clean up coding style and 80 character line wrap along the way.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit f237ddbb89)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Stefan Hajnoczi
aa69dd05ce net: add -netdev options to man page
Document the -netdev syntax which supercedes the older -net syntax.
This patch is a first step to making -netdev prominent in the QEMU
manual.

Reported-by: Anatoly Techtonik <techtonik@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 08d12022c7)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Stefan Hajnoczi
a9d8f7b1c4 net: do not report queued packets as sent
Net send functions have a return value where 0 means the packet has not
been sent and will be queued.  A non-zero value means the packet was
sent or an error caused the packet to be dropped.

This patch fixes two instances where packets are queued but we return
their size.  This causes callers to believe the packets were sent.  When
the caller uses the async send interface this creates a real problem
because the callback will be invoked for a packet that the caller
believed to be already sent.  This bug can cause double-frees in the
caller.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 06b5f36d05)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Stefan Hajnoczi
0d91cd5aa7 net: add receive_disabled logic to iov delivery path
This patch adds the missing NetClient->receive_disabled logic in the
sendv delivery code path.  It seems that commit
893379efd0 ("net: disable receiving if
client returns zero") only added the logic to qemu_deliver_packet() and
not qemu_deliver_packet_iov().

The receive_disabled flag should be automatically set when .receive(),
.receive_raw(), or .receive_iov() return 0.  No further packets will be
delivered to the NetClient until the receive_disabled flag is cleared
again by calling qemu_flush_queued_packets().

Typically the NetClient will wait until its file descriptor becomes
writable and then invoke qemu_flush_queued_packets() to resume
transmission.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit c67f5dc105)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Bo Yang
bfa23099aa eepro100: Fix network hang when rx buffers run out
This is reported by QA. When installing os with pxe, after the initial
kernel and initrd are loaded, the procedure tries to copy files from install
server to local harddisk, the network becomes stall because of running out of
receive descriptor.

[Whitespace fixes and removed qemu_notify_event() because Paolo's
earlier net patches have moved it into qemu_flush_queued_packets().

Additional info:

I can reproduce the network hang with a tap device doing a iPXE HTTP
boot as follows:

  $ qemu -enable-kvm -m 1024 \
    -netdev tap,id=netdev0,script=no,downscript=no \
    -device i82559er,netdev=netdev0,romfile=80861209.rom \
    -drive if=virtio,cache=none,file=test.img
  iPXE> ifopen net0
  iPXE> config # set static network configuration
  iPXE> kernel http://mirror.bytemark.co.uk/fedora/linux/releases/17/Fedora/x86_64/os/images/pxeboot/vmlinuz

I needed a vanilla iPXE ROM to get to the iPXE prompt.  I think the boot
prompt has been disabled in the ROMs that ship with QEMU to reduce boot
time.

During the vmlinuz HTTP download there is a network hang.  hw/eepro100.c
has reached the end of the rx descriptor list.  When the iPXE driver
replenishes the rx descriptor list we don't kick the QEMU net subsystem
and event loop, thereby leaving the tap netdev without its file
descriptor in select(2).

Stefan Hajnoczi <stefanha@gmail.com>]

Signed-off-by: Bo Yang <boyang@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 1069985fb1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Paolo Bonzini
65a2f8e27c xen: flush queue when getting an event
xen does not have a register that, when written, will cause can_receive
to go from false to true.  However, flushing the queue can be attempted
whenever the front-end raises its side of the Xen event channel.  There
is a single event channel for tx and rx.

Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit a98b140223)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Paolo Bonzini
7575a6d2f2 e1000: flush queue whenever can_receive can go from false to true
When the guests replenish the receive ring buffer, the network device
should flush its queue of pending packets.  This is done with
qemu_flush_queued_packets.

e1000's can_receive can go from false to true when RCTL or RDT are
modified.

Reported-by: Luigi Rizzo <rizzo@iet.unipi.it>
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Jan Kiszka <jan.kiszka@siemens.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit e8b4c680b4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:18 -05:00
Paolo Bonzini
8e9fdc4a88 net: notify iothread after flushing queue
virtio-net has code to flush the queue and notify the iothread
whenever new receive buffers are added by the guest.  That is
fine, and indeed we need to do the same in all other drivers.
However, notifying the iothread should be work for the network
subsystem.  And since we are at it we can add a little smartness:
if some of the queued packets already could not be delivered,
there is no need to notify the iothread.

Reported-by: Luigi Rizzo <rizzo@iet.unipi.it>
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Jan Kiszka <jan.kiszka@siemens.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 987a9b4800)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Igor Mitsyanko
ca54dc1e75 arch_init.c: add missing '%' symbols before PRIu64 in debug printfs
'%' symbols were missing in front of PRIu64 macros in DPRINTF() messages in
arch_init.c, this caused compilation warnings when compiled with DEBUG_ARCH_INIT defined.

Signed-off-by: Igor Mitsyanko <i.mitsyanko@samsung.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit ef37a699a0)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Stefan Weil
a78cedc09e kvm: Fix warning from static code analysis
Report from smatch:

kvm-all.c:1373 kvm_init(135) warn:
 variable dereferenced before check 's' (see line 1360)

's' cannot by NULL (it was alloced using g_malloc0), so there is no need
to check it here.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 6d1cc3210c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Lei Li
59a01f73c3 qapi: Fix enumeration typo error
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
(cherry picked from commit 6932a69b20)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
BALATON Zoltan
c6553fee81 console: Clean up bytes per pixel calculation
Division with round up is the correct way to compute this even if the
only case where division with round down gives incorrect result is
probably 15 bpp. This case was explicitely patched up in one of these
functions but was unhandled in the other. (I'm not sure about setting
16 bpp for the 15bpp case either but I left that there for now.)

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit feadf1a4de)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Stefan Weil
a0d718a395 Spelling fixes in comments and documentation
These wrong spellings were detected by codespell:

* successully -> successfully

* alot -> a lot

* wanna -> want to

* infomation -> information

* occured -> occurred

["also is" -> "is also" and "ressources" -> "resources" suggested by
Peter Maydell <peter.maydell@linaro.org>]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 0546b8c2f0)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Stefan Weil
34173da021 srp: Don't use QEMU_PACKED for single elements of a structured type
QEMU_PACKED results in a MinGW compiler warning when it is
used for single structure elements:

warning: 'gcc_struct' attribute ignored

Using QEMU_PACKED for the whole structure avoids the compiler warning
without changing the memory layout.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
(cherry picked from commit 93d3ad2a80)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Hervé Poussineau
c23314334d slirp: Implement TFTP Blocksize option
This option is described in RFC 1783. As this is only an optional field,
we may ignore it in some situations and handle it in some others.

However, MS Windows 2003 PXE boot client requests a block size of the MTU
(most of the times 1472 bytes), and doesn't work if the option is not
acknowledged (with whatever value).

According to the RFC 1783, we cannot acknowledge the option with a bigger
value than the requested one.

As current implementation is using 512 bytes by block, accept the option
with a value of 512 if the option was specified, and don't acknowledge it
if it is not present or less than 512 bytes.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
(cherry picked from commit 95b1ad7ad8)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Hervé Poussineau
b83f395ba1 slirp: Handle more than 65535 blocks in TFTP transfers
RFC 1350 does not mention block count roll-over. However, a lot of TFTP servers
implement it to be able to transmit big files, so do it also.

Current block size is 512 bytes, so TFTP files were limited to 32 MB.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
(cherry picked from commit 4aa401f39e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Hervé Poussineau
34f7e690c3 slirp: improve TFTP performance
When transferring a file, keep it open during the whole transfer,
instead of opening/closing it for each block.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
(cherry picked from commit 78be056628)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Stefan Weil
6116adbc0e slirp: Fix error reported by static code analysis
Report from smatch:

slirp/tcp_subr.c:127 tcp_respond(17) error:
 we previously assumed 'tp' could be null (see line 124)

Return if 'tp' is NULL.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
(cherry picked from commit e56afbc54a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Stefan Weil
3ed791b42d slirp: Remove wrong type casts ins debug statements
The type casts of pointers to long are not allowed
when sizeof(pointer) != sizeof(long).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
(cherry picked from commit c4d12a743c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Hans de Goede
2ee160f8e1 uhci: Don't queue up packets after one with the SPD flag set
Don't queue up packets after a packet with the SPD (short packet detect)
flag set. Since we won't know if the packet will actually be short until it
has completed, and if it is short we should stop the queue.

This fixes a miniature photoframe emulating a USB cdrom with the windows
software for it not working.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 72a04d0c17)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Hans de Goede
981e213cf7 usb-redir: Revert usb-redir part of commit 93bfef4c
Commit 93bfef4c6e makes qemu-devices
which report the qemu version string to the guest in some way use a
qemu_get_version function which reports a machine-specific version string.

However usb-redir does not expose the qemu version to the guest, only to
the usbredir-host as part of the initial handshake. This can then be logged
on the usbredir-host side for debugging purposes and is otherwise completely
unused! For debugging purposes it is important to have the real qemu version
in there, rather then the machine-specific version.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 35efba2cc6)

Conflicts:

	hw/usb/redirect.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Hans de Goede
d56d7e6ef8 ehci: Walk async schedule before and after migration
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ceab6f9645)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Hans de Goede
ad2cea3878 ehci: Don't set seen to 0 when removing unseen queue-heads
When removing unseen queue-heads from the async queue list, we should not
set the seen flag to 0, as this may cause them to be removed by
ehci_queues_rip_unused() during the next call to ehci_advance_async_state()
if the timer is late or running at a low frequency.

Note:
1) This *may* have caused the instant unlink / relinks described in commit
   9bc3a3a216

2) Rather then putting more if-s inside ehci_queues_rip_unused, this patch
   instead introduces a new ehci_queues_rip_unseen function.

3) This patch also makes it save to call ehci_queues_rip_unseen() multiple
   times, which gets used in the folluw up patch titled:
   "ehci: Walk async schedule before and after migration"

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8f5457eb04)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:17 -05:00
Aurelien Jarno
d28ed2ffbb configure: usbredir fixes
usbredir is only used by system emulation, so add the libraries to
libs_softmmu instead of LIBS.

Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 56ab2ad177)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Alon Levy
bffb221775 hw/qxl: tracing fixes
Add two new trace events:
qxl_send_events(int qid, uint32_t events) "%d %d"
qxl_set_guest_bug(int qid) "%d"

Change qxl_io_unexpected_vga_mode parameters to be equivalent to those
of qxl_io_write for easier grouping under a single systemtap probe.

Change d to qxl in one place.

Signed-off-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 917ae08ca1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Dunrong Huang
9cf4af689b block: Don't forget to delete temporary file
The caller would not delete temporary file after failed get_tmp_filename().

Signed-off-by: Dunrong Huang <riegamaths@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit fe235a06e1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Daniel P. Berrange
f9f552e80b Don't require encryption password for 'qemu-img info' command
The encryption password is only required if I/O is going to be
performed on a disk image. The 'qemu-img info' command merely
reports metadata, so it should not ask for a decryption password

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f0536bb848)

Conflicts:

	qemu-img.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Jason Baron
7cf7e305cc ahci: properly reset PxCMD on HBA reset
While testing q35, I found that windows 7 (specifically, windows 7 ultimate
with sp1 x64), wouldn't install because it can't find the cdrom or disk drive.
The failure message is: 'A required cd/dvd device driver is missing. If you
have a driver floppy disk, CD, DVD, or USB flash drive, please insert it now.'
This can also be reproduced on piix by adding an ahci controller, and
observing that windows 7 does not see any devices behind it.

The problem is that when windows issues a HBA reset, qemu does not reset the
individual ports' PxCMD register. Windows 7 then reads back the PxCMD register
and presumably assumes that the ahci controller has already been initialized.
Windows then never sets up the PxIE register to enable interrupts, and thus it
never gets irqs back when it sends ata device inquiry commands.

This change brings qemu into ahci 1.3 specification compliance.

Section 10.4.3 HBA Reset:

"
When GHC.HR is set to '1', GHC.AE, GHC.IE, the IS register, and all port
register fields (except PxFB/PxFBU/PxCLB/PxCLBU) that are not HwInit in the
HBA's register memory space are reset.
"

I've also re-tested Fedora 16 and 17 to verify that they continue to work with
this change.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2a4f4f34e6)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Pavel Hrdina
84a2ca9f90 block: fix block tray status
The tray status should change also if you eject empty block device.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9ca111544c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Stefan Weil
1c224f0ab3 vdi: Fix warning from clang
ccc-analyzer reports these warnings:

block/vdi.c:704:13: warning: Dereference of null pointer
            bmap[i] = VDI_UNALLOCATED;
            ^
block/vdi.c:702:13: warning: Dereference of null pointer
            bmap[i] = i;
            ^

Moving some code into the if block fixes this.
It also avoids calling function write with 0 bytes of data.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 514f21a5d4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Stefan Weil
1e32a1651a block/curl: Fix wrong free statement
Report from smatch:
block/curl.c:546 curl_close(21) info: redundant null check on s->url calling free()

The check was redundant, and free was also wrong because the memory
was allocated using g_strdup.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 45724d6d02)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Stefan Weil
3900d1d992 ide: Fix error messages from static code analysis (no real error)
Report from smatch:
hw/ide/core.c:1472 ide_exec_cmd(423) error: buffer overflow 'smart_attributes' 8 <= 29
hw/ide/core.c:1474 ide_exec_cmd(425) error: buffer overflow 'smart_attributes' 8 <= 29
hw/ide/core.c:1475 ide_exec_cmd(426) error: buffer overflow 'smart_attributes' 8 <= 29
...

The upper limit of 30 was never reached because both for loops terminated
when 'smart_attributes' reached end of list, so there was no real buffer
overflow.

Nevertheless, changing the code not only fixes the error report, but also
reduces the size of smart_attributes and simplifies the for loops.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 1e53537fda)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
MORITA Kazutaka
1fce4135f3 sheepdog: fix savevm and loadvm
This patch sets data to be sent to Sheepdog correctly and fixes savevm
and loadvm operations on a Sheepdog image.

Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 1f7a48de44)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Hans de Goede
349b4e9070 ehci: Don't process too much frames in 1 timer tick (v2)
The Linux ehci isoc scheduling code fills the entire schedule ahead of
time minus 80 frames. If we make a large jump in where we are in the
schedule, ie 40 frames, then the scheduler all of a sudden will only have
40 frames left to work in, causing it to fail packet submissions
with error -27 (-EFBIG).

Changes in v2:
-Don't hardcode a maximum number of frames to process in one tick, instead:
 -Process a minimum number of frames to ensure we do eventually catch up
 -Stop (after the minimum number) when the guest has requested an irq

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8f74ed1e43)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Hans de Goede
29ecaa26a8 ehci: Fix interrupts stopping when Interrupt Threshold Control is 8
If Interrupt Threshold Control is 8 or a multiple of 8, then
s->usbsts_frindex can become exactly 0x4000, at which point
(s->usbsts_frindex > s->frindex) will never become true, as
s->usbsts_frindex will not be lowered / reset in this case.

This patch fixes this.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ffa1f2e088)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Gerd Hoffmann
1f97f6c9fe ehci: switch to new-style memory ops
Also register different memory regions for capabilities,
operational registers and port status registers.  Create
separate tracepoints for operational regs and port status
regs.  Ditch a bunch of sanity checks because the memory
core will do this for us now.

Offloading the byte, word and dword access handling to the
memory core also has the side effect of fixing ehci register
access on bigendian hosts.

Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3e4f910c8d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:16 -05:00
Uri Lublin
9a5a94de58 qxl: better cleanup for surface destroy
Add back a call to qxl_spice_destroy_surface_wait_complete() in qxl_spice_destroy_surface_wait(),
that was removed by commit c480bb7da4

It is needed to complete surface-removal cleanup, for non async.
For async, qxl_spice_destroy_surface_wait_complete is called upon operation completion.

Signed-off-by: Uri Lublin <uril@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 753b8b0d77)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Gerd Hoffmann
462ff6fda1 usb-host: allow emulated (non-async) control requests without USBPacket
xhci needs this for USB_REQ_SET_ADDRESS due to the way
usb addressing is handled by the xhci hardware.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 63587e3135)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Dunrong Huang
b26859ab18 qxl: dont update invalid area
This patch fixes the following error:

$ ~/usr/bin/qemu-system-x86_64 -enable-kvm -m 1024 -spice port=5900,disable-ticketing -vga qxl -cdrom ~/Images/linuxmint-13-mate-dvd-32bit.iso
(/home/mathslinux/usr/bin/qemu-system-x86_64:10068): SpiceWorker-CRITICAL **: red_worker.c:4599:red_update_area: condition `area->left >= 0 && area->top >= 0 && area->left < area->right && area->top < area->bottom' failed
Aborted

spice server terminates QEMU process if we pass invalid area to it,
so dont update those invalid areas.

Signed-off-by: Dunrong Huang <riegamaths@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ccc2960d65)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Gerd Hoffmann
03c0342cb8 xhci: allow bytewise capability register reads
Some guests need this according to
Alejandro Martinez Ruiz <alex@securiforest.com>

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 6ee021d410)

Conflicts:

	hw/usb/hcd-xhci.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Gerd Hoffmann
23ba9d4838 xhci: fix runtime write tracepoint
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8e9f18b6db)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Gerd Hoffmann
640bda8df5 xhci: drop buffering
This patch splits the xhci_xfer_data function into three.
The xhci_xfer_data function used to do does two things:

  (1) copy transfer data between guest memory and a temporary buffer.
  (2) report transfer results to the guest using events.

Now we three functions to handle this:

  (1) xhci_xfer_map creates a scatter list for the transfer and
      uses that (instead of the temporary buffer) to build a
      USBPacket.
  (2) xhci_xfer_unmap undoes the mapping.
  (3) xhci_xfer_report sends out events.

The patch also fixes reporting of transaction errors which must be
reported unconditinally, not only in case the guest asks for it
using the ISP flag.

[ v2: fix warning ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d5a15814b4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Gerd Hoffmann
64022e9d58 xhci: rip out background transfer code
original xhci code (the one which used libusb directly) used to use
'background transfers' for iso streams.  In upstream qemu the iso
stream buffering is handled by usb-host & usb-redir, so we will
never ever need this.  It has been left in as reference, but is dead
code anyway.  Rip it out.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 331e9406f1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Gerd Hoffmann
3e3750e10a usb-audio: fix usb version
usb-audio is a full speed (1.1) device,
but bcdUSB claims it is usb 2.0.  Fix it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2bbd086c41)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Samuel Thibault
fbcb89ddc9 Better name usb braille device
Windows users need to know that they have to use the Baum driver to make
the qemu braille device work.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2964cd9bfa)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Hans de Goede
22ba7a7488 usb-redir: Return babble when getting more bulk data then requested
Babble is the appropriate error in this case (rather then signalling a stall).

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2979a36183)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Hans de Goede
636071de92 usb-redir: Move to core packet id and queue handling
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit de550a6afb)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Hans de Goede
55a2f465b9 usb-redir: Get rid of unused async-struct dev member
This is a preparation patch for completely getting rid of the async-packet
struct in usb-redir, instead relying on the (new) per ep queues in the
qemu usb core.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 206e7f20fe)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Hans de Goede
aa57b628e0 usb-redir: Get rid of local shadow copy of packet headers
The shadow copy only serves as an extra check (besides the packet-id) to
ensure the packet we get back is a reply to the packet we think it is.

This check has never triggered in all the time usb-redir is in use now,
and since the verified data in the returned packet-header is not used
otherwise, removing the check does not open any possibilities for the
usbredirhost to confuse us.

This is a preparation patch for completely getting rid of the async-packet
struct in usb-redir, instead relying on the (new) per ep queues in the
qemu usb core.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 104981d52b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:15 -05:00
Hans de Goede
57eae744d4 usb-redir: Get rid of async-struct get member
This is a preparation patch for completely getting rid of the async-packet
struct in usb-redir, instead relying on the (new) per ep queues in the
qemu usb core.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit cb897117cd)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
14ecfb09fa usb-redir: Don't delay handling of open events to a bottom half
There is no need for this, and doing so means that a backend trying to
write immediately after an open event will see qemu_chr_be_can_write
returning 0, which not all backends handle well as there is no wakeup
mechanism to detect when the frontend does become writable.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ed9873bfbf)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
1d5ba9a6a8 usb-redir: Never return USB_RET_NAK for async handled packets
USB_RET_NAK is not a valid response for async handled packets (and will
trigger an assert as such).

Also drop the warning when receiving a status of cancelled for packets not
cancelled by qemu itself, this can happen when a device gets unredirected
by the usbredir-host while transfers are pending.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 181133404f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
77c3d59256 ehci: Correct a comment in fetchqtd packet processing
Since my previous comment said "Should never happen", I tried changing the
next line to an assert(0), which did not go well, which as the new comments
explains is logical if you think about it for a moment.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit cf1f81691d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
e2ab86fe2d ehci: Handle USB_RET_PROCERR in ehci_fill_queue
USB_RET_PROCERR can be triggered by the guest (by for example requesting more
then BUFFSIZE bytes), so don't assert on it.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit eff6dce79b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
2f7ba4731b ehci: Fix memory leak in handling of NAK-ed packets
Currently each time we try to execute a NAK-ed packet we redo
ehci_init_transfer, and usb_packet_map, re-allocing (without freeing) the
sg list every time.

This patch fixes this, it does this by introducing another async state, so
that we also properly cleanup a NAK-ed packet on cancel.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ef5b234477)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
0154d330c7 ehci: Add some additional ehci_trace_guest_bug() calls
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3a8ca08e01)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Gerd Hoffmann
d294ad6323 ehci: add doorbell trace events
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 1defcbd1e8)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Gerd Hoffmann
62fc5e6983 ehci: trace guest bugs
make qemu_queue_{cancel,reset} return the number of packets released,
so the caller can figure whenever there have been active packets even
though there shouldn't have been any.  Add tracepoint to log this.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 5c514681ab)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Gerd Hoffmann
4a6cdb4807 ehci: check for EHCI_ASYNC_FINISHED first in ehci_free_packet
Otherwise we'll see the packet free twice in the trace log even though
it actually happens only once.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 616789cde2)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
a8cf10d5d6 ehci: Properly report completed but not yet processed packets to the guest
Reported packets which have completed before being cancelled as such to the
host. Note that the new code path this patch adds is untested since it I've
been unable to actually trigger the race which needs this code path.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 4b63a0df3b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
307fea863a ehci: Properly cleanup packets on cancel
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 0e7953525f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
c15d61b252 ehci: Update copyright headers to reflect recent work
Update copyright headers to reflect all the work Gerd and I have been doing
on the EHCI emulation.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 522079dd44)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:14 -05:00
Hans de Goede
712fc762a6 ehci: Validate qh is not changed unexpectedly by the guest
-combine the qh check with the check for devaddr changes
-also ensure that p gets set to NULL when the queue gets cancelled on
 devaddr change, which was not done properly before this patch

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit dafe31fc2a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:13 -05:00
Hans de Goede
a37d5e521a Revert "ehci: don't flush cache on doorbell rings."
This reverts commit 9bc3a3a216, which got
added to fix an issue where the real, underlying cause was not stopping
the ep queue on an error.

Now that the underlying cause is fixed by the "usb: Halt ep queue and
cancel pending packets on a packet error" patch, the "don't flush" fix
is no longer needed.

Not only is it not needed, it causes us to see cancellations (unlinks)
done by the Linux EHCI driver too late, which in combination with the new
usb-core packet-id generation where qtd addresses are used as ids, causes
duplicate ids for in flight packets.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 66f092d256)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:13 -05:00
Hans de Goede
d6e508d3a5 usb-core: Allow the first packet of a pipelined ep to complete immediately
This can happen with usb-redir live-migration when the packet gets re-queued
after the migration and the original queuing from the migration source side
has already finished.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 9c1f67654a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:13 -05:00
Hans de Goede
d780116ab0 usb-core: Add a usb_ep_find_packet_by_id() helper function
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c13a9e6136)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:13 -05:00
Hans de Goede
597330bd8d usb-core: Don't set packet state to complete on a nak
This way the hcd can re-use the same packet to retry without needing
to re-init it.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit cc40997489)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:13 -05:00
Hans de Goede
4ebbf3229a usb: controllers do not need to check for babble themselves
If an (emulated) usb-device tries to write more data to a packet then
its iov len, this will trigger an assert in usb_packet_copy(), and if
a driver somehow circumvents that check and writes more data to the
iov then there is space, we have a much bigger problem then not correctly
reporting babble to the guest.

In practice babble will only happen with (real) redirected devices, and there
both the usb-host os and the qemu usb-device code already check for it.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 45b339b18c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:13 -05:00
Daniel P. Berrange
b5701820f8 Add ability to force enable/disable of tools build
The qemu-img, qemu-nbd and qemu-io tools are built conditionally
based on whether any softmmu target is enabled. These are useful
self-contained tools which can be used in many other scenarios.
Add new --enable-tools/--disable-tools args to configure to allow
the user to explicitly turn on / off their build. The default
behaviour is now to build these tools are all times, regardless
of whether any softmmu target is enabled

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 4b1c11fd20)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-11 21:44:13 -05:00
Anthony Liguori
5af6b07ab0 socket: don't attempt to reconnect a TCP socket in server mode
Commit c3767ed0eb introduced a possible SEGV when
using a socket chardev with server=on because it assumes that all TCP sockets
are in client mode.

This patch adds a check to only reconnect when in client mode.

Cc: Lei Li <lilei@linux.vnet.ibm.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 455aa1e081)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:30 -05:00
Michael Tokarev
82645b9e93 use --libexecdir instead of ignoring it first and reinventing it later
Commit 7b93fadf3a "Add basic version
of bridge helper" put the bridge helper executable into a fixed
${prefix}/libexec/ location, instead of using ${libexecdir} for
this.  At the same time, --libexecdir is being happily ignored
by ./configure.  Even more, the same patch sets unused $libexecdir
variable in the generated config-host.mak, and uses fixed string
(\${prefix}/libexecdir) for the bridge helper binary.

Fix this braindamage by introducing $libexecdir variable, using
it for the bridge helper binary, and recognizing --libexecdir.

This patch is applicable to stable-1.1.

Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: Corey Bryant <coreyb@linux.vnet.ibm.com>
Cc: Richa Marwaha <rmarwah@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 8bf188aa18)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:30 -05:00
Stefan Weil
b750d22466 hw/mcf5206: Fix buffer overflow for MBAR read / write
Report from smatch:

mcf5206.c:384 m5206_mbar_readb(7) error: buffer overflow 'm5206_mbar_width' 128 <= 128
mcf5206.c:403 m5206_mbar_readw(8) error: buffer overflow 'm5206_mbar_width' 128 <= 128
mcf5206.c:427 m5206_mbar_readl(8) error: buffer overflow 'm5206_mbar_width' 128 <= 128
mcf5206.c:451 m5206_mbar_writeb(9) error: buffer overflow 'm5206_mbar_width' 128 <= 128
mcf5206.c:475 m5206_mbar_writew(9) error: buffer overflow 'm5206_mbar_width' 128 <= 128
mcf5206.c:503 m5206_mbar_writel(9) error: buffer overflow 'm5206_mbar_width' 128 <= 128

m5206_mbar_width has 0x80 elements and supports 0 <= offset < 0x200.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit a32354e206)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:30 -05:00
Stefan Weil
bd4dba6658 hw/wm8750: Fix potential buffer overflow
Report from smatch:

hw/wm8750.c:369 wm8750_tx(12) error: buffer overflow 's->i2c_data' 2 <= 2

It looks like the preprocessor statements were simply misplaced.

Replace also __FUNCTION__ by __func__ to please checkpatch.pl.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 149eeb5fe5)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:30 -05:00
Christian Borntraeger
7566759fd5 qemu: Use valgrind annotations to mark kvm guest memory as defined
valgrind with kvm produces a big amount of false positives regarding
"Conditional jump or move depends on uninitialised value(s)". This
happens because the guest memory is allocated with qemu_vmalloc which
boils down posix_memalign etc. This function is (correctly) considered
by valgrind as returning undefined memory.

Since valgrind is based on jitting code, it will not be able to see
changes made by the guest to guest memory if this is done by KVM_RUN,
thus keeping most of the guest memory undefined.

Now lots of places in qemu will then use guest memory to change behaviour.
To avoid the flood of these messages, lets declare the whole guest
memory as defined. This will reduce the noise and allows us to see real
problems.

In the future we might want to make this conditional, since there
is actually something that we can use those false positives for:
These messages will point to code that depends on guest memory, so
we can use these backtraces to actually make an audit that is focussed
only at those code places. For normal development we dont want to
see those messages, though.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
(cherry picked from commit 62fe83318d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:30 -05:00
Jan Kiszka
47b11da1e9 musicpal: Fix flash mapping
The old arithmetic assumed 32 physical address bits which is no longer
true for ARM since 3cc0cd61f4.

Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 0c267217ca)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:30 -05:00
Fabien Chouteau
2ecd8831e0 Add MAINTAINERS entry for leon3
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit ce6c760c37)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:30 -05:00
Maciej W. Rozycki
27aae39fa0 MIPS/user: Fix reset CPU state initialization
This change updates the CPU reset sequence to use a common piece of code
that figures out CPU state flags, fixing the problem with MIPS_HFLAG_COP1X
not being set where applicable that causes floating-point MADD family
instructions (and other instructions from the MIPS IV FP subset) to trap.

 As compute_hflags is now shared between op_helper.c and translate.c, the
function is now moved to a common header.  There are no changes to this
function.

 The problem was seen with the 24Kf MIPS32r2 processor in user emulation.
The new approach prevents system and user emulation from diverging -- all
the hflags state is initialized in one place now.

Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 03e6e50177)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:30 -05:00
Aurelien Jarno
635cc81bf9 lan9118: fix multicast filtering
The lan9118 emulation tries to compute the multicast index by calling
directly the crc32() function from zlib, but fails to get the correct
result.

Use the common compute_mcast_idx() function instead, which gives the
correct result. This fixes IPv6 support.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 449bc90e1f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:30 -05:00
Henning Schild
4de6467cbc fix entry pointer for ELF kernels loaded with -kernel option
Find a hopefully proper patch attached. Take it or leave it.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Henning Schild <henning@hennsch.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 7e9c7ffe9f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Jason Baron
4382057785 pcie_aer: clear cmask for Advanced Error Interrupt Message Number
The Advanced Error Interrupt Message Number (bits 31:27 of the Root
Error Status Register) is updated when the number of msi messages assigned to a
device changes. Migration of windows 7 on q35 chipset failed because the check
in get_pci_config_device() fails due to cmask being set on these bits. Its valid
to update these bits and we must restore this state across migration.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0e180d9c8a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Jason Baron
6ac46e3216 pcie: drop version_id field for live migration
While testing q35 live migration, I found that the migration would abort with
the following error: "Unknown savevm section type 76".

The error is due to this check failing in 'vmstate_load_state()':

    while(field->name) {
        if ((field->field_exists &&
             field->field_exists(opaque, version_id)) ||
            (!field->field_exists &&
             field->version_id <= version_id)) {

The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However,
'version_id' in the above check is 1. And thus we fail to load the pcie device
field. Further the code returns to 'qemu_loadvm_state()' which produces the
error that I saw.

I'm proposing to fix this by simply dropping the 'version_id' field from
VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further
the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already
versioned. Thus, any versioning issues could be detected at the vmsd level.

Taking a step back, I think that the 'field->version_id' should be compared
against a saved version number for the field not the 'version_id'. Futhermore,
once vmstate_load_state() is called recursively on another vmsd, the check of:

    if (version_id > vmsd->version_id) {
        return -EINVAL;
    }

Will never fail since version_id is always equal to vmsd->version_id. So I'm
wondering why we aren't storing the vmsd version id of the source in the
migration stream?

This patch also renames the 'name' field of vmstate_pcie_device from:
PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1de5345927)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Stefan Weil
3aec24d195 json-parser: Fix potential NULL pointer segfault
Report from smatch:
json-parser.c:474 parse_object(62) error: potential null derefence 'dict'.
json-parser.c:553 parse_array(75) error: potential null derefence 'list'.

Label 'out' in json-parser.c can be called with list == NULL
which is passed to QDECREF.

Modify QDECREF to handle a NULL argument (inline function qobject_decref
already handles them, too).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 149474c934)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Stefan Weil
122b92d90a qapi: Fix potential NULL pointer segfault
Report from smatch:

qapi-visit.c:1640 visit_type_BlockdevAction(8) error:
 we previously assumed 'obj' could be null (see line 1639)
qapi-visit.c:2432 visit_type_NetClientOptions(8) error:
 we previously assumed 'obj' could be null (see line 2431)

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 227ccf6bff)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Amos Kong
23086b2054 fix doc of using raw values with sendkey
(qemu) sendkey a
(qemu) sendkey 0x1e
(qemu) sendkey #0x1e
 unknown key: '#0x1e'

The last command doesn't work, '#' is not requested before
raw values, and the raw value in decimal format is not supported.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 886cc706ce)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Alon Levy
ea4c86551d configure: print spice-protocol and spice-server versions
Signed-off-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2e0e3c399a)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Alon Levy
b75c710575 qxl: add QXL_IO_MONITORS_CONFIG_ASYNC
Revision bumped to 4 for new IO support, enabled for spice-server >=
0.11.1. New io enabled if revision is 4. Revision can be set to 4.

[ kraxel: 3 continues to be the default revision.  Once we have a new
          stable spice-server release and the qemu patches to enable
          the new bits merged we'll go flip the switch and make rev4
          the default ]

This io calls the corresponding new spice api
spice_qxl_monitors_config_async to let spice-server read a new guest set
monitors config and notify the client.

On migration reissue spice_qxl_monitors_config_async.

RHBZ: 770842

Signed-off-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

fixup

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 020af1c45f)

Conflicts:

	hw/qxl.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Alon Levy
5b7582af06 qxl/update_area_io: guest_bug on invalid parameters
Signed-off-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 511b13e2c9)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Yonit Halperin
615198836c spice: increase the verbosity of spice section in "qemu --help"
Added all spice options to the help string. This can be used by libvirt
to determine which spice related features are supported by qemu.

Signed-off-by: Yonit Halperin <yhalperi@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 27af778828)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Yonit Halperin
7908e4a388 spice: adding seamless-migration option to the command line
The seamless-migration flag is required in order to identify
whether libvirt supports the new QEVENT_SPICE_MIGRATE_COMPLETED or not
(by default the flag is off).
New libvirt versions that wait for QEVENT_SPICE_MIGRATE_COMPLETED should turn on this flag.
When this flag is off, spice fallbacks to its old migration method, which
can result in data loss.

Signed-off-by: Yonit Halperin <yhalperi@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8c9570530c)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Yonit Halperin
38a01d68c6 spice: add 'migrated' flag to spice info
The flag is 'true' when spice migration has completed on the src side.
It is needed for a case where libvirt dies before migration completes
and it misses the event QEVENT_SPICE_MIGRATE_COMPLETED.
When libvirt is restored and queries the migration status, it also needs
to query spice and check if its migration has completed.

Signed-off-by: Yonit Halperin <yhalperi@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 61c4efe2cb)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:29 -05:00
Yonit Halperin
986b9a1a2a spice migration: add QEVENT_SPICE_MIGRATE_COMPLETED
When migrating, libvirt queries the migration status, and upon migration
completions, it closes the migration src. On the other hand, when
migration is completed, spice transfers data from the src to destination
via the client. This data is required for keeping the spice session
after migration, without suffering from data loss and inconsistencies.
In order to allow this data transfer, we add QEVENT for signaling
libvirt that spice migration has completed, and libvirt needs to wait
for this event before quitting the src process.

Signed-off-by: Yonit Halperin <yhalperi@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2fdd16e239)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:28 -05:00
Yonit Halperin
25bc2251eb spice: notify on vm state change only via spice_server_vm_start/stop
QXLWorker->start/stop are deprecated since spice-server 0.11.2

Signed-off-by: Yonit Halperin <yhalperi@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 71d388d420)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:28 -05:00
Yonit Halperin
fc24f3bd2e spice: notify spice server on vm start/stop
Spice server needs to know about the vm state in order to prevent
attempts to write to devices when they are stopped, mainly during
the non-live stage of migration.
Instead, spice will take care of restoring this writes, on the migration
target side, after migration completes.

Signed-off-by: Yonit Halperin <yhalperi@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f5bb039c6d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:28 -05:00
Christophe Fergeau
3c82758f04 spice: abort on invalid streaming cmdline params
When parsing its command line parameters, spice aborts when it
finds unexpected values, except for the 'streaming-video' option.
This happens because the parsing of the parameters for this option
is done using the 'name2enum' helper, which does not error out
on unknown values. Using the 'parse_name' helper makes sure we
error out in this case. Looking at git history, the use of
'name2enum' instead of 'parse_name' seems to have been an oversight,
so let's change to that now.

Fixes rhbz#831708

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 835cab85ad)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:58:28 -05:00
Stefan Weil
7fd494086b tci: Fix for AREG0 free mode
Support for helper functions with 5 arguments was missing
in the code generator and in the interpreter.

There is no need to pass the constant TCG_AREG0 from the
code generator to the interpreter. Remove that code for
the INDEX_op_qemu_st* opcodes.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:11 -05:00
Stefan Weil
e46258fb50 tcg: Fix MAX_OPC_PARAM_IARGS
DEF_HELPER_FLAGS_5 was added some time ago without adjusting
MAX_OPC_PARAM_IARGS.

Fixing the definition becomes more important as QEMU is using
an increasing number of helper functions called with 5 arguments.

Add also a comment to avoid future problems when DEF_HELPER_FLAGS_6
will be added.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:11 -05:00
Aurelien Jarno
c90a8840b4 tcg/i386: fix build with -march < i686
The movcond_i32 op has to be protected with TCG_TARGET_HAS_movcond_i32
to fix the build with -march < i686.

Thanks to Richard Henderson for the hint.

Reported-by: Alex Barcelo <abarcelo@ac.upc.edu>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:11 -05:00
Richard Henderson
e1312991ed tcg: Adjust descriptions of *cond opcodes
The README file documented the operand ordering of the tcg_gen_*
functions.  Since we're documenting opcodes here, use the true
operand ordering.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Cc: malc <av1474@comtv.ru>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:11 -05:00
Aurelien Jarno
6c975d03a2 tcg/mips: fix MIPS32(R2) detection
Fix the MIPS32(R2) cpu detection so that it also works with
-march=octeon. Thanks to Andrew Pinski for the hint.

Cc: Andrew Pinski <apinski@cavium.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:11 -05:00
Richard Henderson
df6502fac5 target-alpha: Initialize env->cpu_model_str
Save the cpu_model_str so that we have a non-null value when
creating a new cpu during clone.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:11 -05:00
Richard Henderson
fef28a4336 tcg-sparc: Preserve branch destinations during retranslation
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:11 -05:00
Richard Henderson
38df9027ce tcg-sparc: Fix and enable direct TB chaining.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
ba8b16fac5 tcg-sparc: Add %g/%o registers to alloc_order
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
39cbcb4b45 tcg-sparc: Use defines for temporaries.
And change from %i4/%i5 to %g1/%o7 to remove a v8plus fixme.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
db0cfd14ba tcg-sparc: Mask shift immediates to avoid illegal insns.
The xtensa-test image generates a sra_i32 with count 0x40.
Whether this is accident of tcg constant propagation or
originating directly from the instruction stream is immaterial.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
5c7199f164 tcg-sparc: Clean up cruft stemming from attempts to use global registers.
Don't use -ffixed-gN.  Don't link statically.  Don't save/restore
AREG0 around calls.  Don't allocate space on the stack for AREG0 save.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
60db655378 tcg-sparc: Change AREG0 in generated code to %i0.
We can now move the TCG variable from %g[56] to a call-preserved
windowed register.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
757ee3f9e2 tcg-sparc: Support GUEST_BASE.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
e85ec822e9 tcg-sparc: Fix qemu_ld/st to handle 32-bit host.
At the same time, split out the tlb load logic to a new function.
Fixes the cases of two data registers and two address registers.
Fixes the signature of, and adds missing, qemu_ld/st opcodes.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
181dd0b27c tcg-sparc: Assume v9 cpu always, i.e. force v8plus in 32-bit mode.
Current code doesn't actually work in 32-bit mode at all.  Since
no one really noticed, drop the complication of v7 and v8 cpus.
Eliminate the --sparc_cpu configure option and standardize macro
testing on TCG_TARGET_REG_BITS / HOST_LONG_BITS

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
e1784a4cbd tcg-sparc: Don't MAP_FIXED on top of the program
The address we pick in sparc64.ld is also 0x60000000, so doing a fixed map
on top of that is guaranteed to blow up.  Choosing 0x40000000 is exactly
right for the max of code_gen_buffer_size set below.

No need to ever use MAP_FIXED.  While getting our desired address helps
optimize the generated code, we won't fail if we don't get it.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
e183eb99fa tcg-sparc: Fix ADDX opcode.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Richard Henderson
ee13de2853 tcg-sparc: Hack in qemu_ld/st64 for 32-bit.
Not actually implemented, but at least we avoid the tcg assert at startup.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
malc
6164c1274e tcg/ppc32: Implement movcond32
Thanks to Richard Henderson

Signed-off-by: malc <av1474@comtv.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Stefan Weil
a48028b8e9 w64: Fix TCG helper functions with 5 arguments
TCG uses 6 registers for function arguments on 64 bit Linux hosts,
but only 4 registers on W64 hosts.

Commit 2999a0b200 increased the number
of arguments for some important helper functions from 4 to 5
which triggered a bug for W64 hosts: QEMU aborts when executing
helper_lcall_real in the guest's BIOS because function
tcg_target_get_call_iarg_regs_count always returned 6.

As W64 has only 4 registers for arguments, the 5th argument must be
passed on the stack using a correct stack offset.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Max Filippov
5096a7c7d8 tcg/README: document tcg_gen_goto_tb restrictions
See
http://lists.nongnu.org/archive/html/qemu-devel/2012-09/msg03196.html
for the whole story.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Aurelien Jarno
2e1d09263b tcg/optimize: add constant folding for deposit
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Aurelien Jarno
ffc1c56815 tcg: remove #ifdef #endif around TCGOpcode tests
Commit 25c4d9cc changed all TCGOpcode enums to be available, so we don't
need to #ifdef #endif the one that are available only on some targets.
This makes the code easier to read.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Aurelien Jarno
cfca711212 tcg/optimize: prefer the "op a, a, b" form for commutative ops
The "op a, a, b" form is better handled on non-RISC host than the "op
a, b, a" form, so swap the arguments to this form when possible, and
when b is not a constant.

This reduces the number of generated instructions by a tiny bit.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Aurelien Jarno
26adaf07da tcg/optimize: further optimize brcond/movcond/setcond
When both argument of brcond/movcond/setcond are the same or when one
of the two values is a constant equal to zero, it's possible to do
further optimizations.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:10 -05:00
Aurelien Jarno
f4643451e9 tcg/optimize: optimize "op r, a, a => movi r, 0"
Now that it's possible to detect copies, we can optimize the case
the "op r, a, a => movi r, 0". This helps in the computation of
overflow flags when one of the two args is 0.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
d3b9fb75c6 tcg/optimize: optimize "op r, a, a => mov r, a"
Now that we can easily detect all copies, we can optimize the
"op r, a, a => mov r, a" case a bit more.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
6599e3faae tcg/optimize: do copy propagation for all operations
It is possible to due copy propagation for all operations, even the one
that have side effects or clobber arguments (it only concerns input
arguments). That said, the call operation should be handled differently
due to the variable number of arguments.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
a17f24e08d tcg/optimize: rework copy progagation
The copy propagation pass tries to keep track what is a copy of what
and what has copy of what, and in addition it keep a circular list of
of all the copies. Unfortunately this doesn't fully work: a mov from
a temp which has a state "COPY" changed it into a state "HAS_COPY".
Later when this temp is used again, it is considered has not having
copy and thus no propagation is done.

This patch fixes that by removing the hiearchy between copies, and thus
only keeping a "COPY" state both meaning "is a copy" and "has a copy".
The decision of which copy to use is deferred to the actual temp
replacement. At this stage there is not one best choice to do, but only
better choices than others. For doing the best choice the operation
would have to be parsed in reversed to know if a temp is going to be
used later or not. That what is done by the liveness analysis. At this
stage it is known that globals will be always live, that local temps
will be dead at the end of the translation block, and that the temps
will be dead at the end of the basic block. This means that this stage
should try to replace temps by local temps or globals and local temps
by globals.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
a4c701ee76 tcg/optimize: check types in copy propagation
The copy propagation doesn't check the types of the temps during copy
propagation. However TCG is using the mov_i32 for the i64 to i32
conversion and thus the two are not equivalent.

With this patch tcg_opt_gen_mov() doesn't consider two temps of
different type as copies anymore.

So far it seems the optimization was not aggressive enough to trigger
this bug, but it will be triggered later in this series once the copy
propagation is improved.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
ad797145ae tcg/optimize: remove TCG_TEMP_ANY
TCG_TEMP_ANY has no different meaning than TCG_TEMP_UNDEF, so use
the later instead.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
7ef9bee924 tcg/mips: implement movcond op on MIPS32R2
movcond operation can be implemented on MIPS32 Release 2 using the MOVN,
MOVZ, SLT and SLTU instructions.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
6d82e71b6a tcg/mips: implement deposit op on MIPS32R2
deposit operations can be optimized on MIPS32 Release 2 using the INS
instruction.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
a8067da5d1 tcg/mips: implement rotl/rotr ops on MIPS32R2
rotr operations can be optimized on MIPS32 Release 2 using the ROTR and
ROTRV instructions. Also implemented rotl operations by subtracting the
shift from 32.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
3b63392482 tcg/mips: optimize bswap{16,16s,32} on MIPS32R2
bswap operations can be optimized on MIPS32 Release 2 using the ROTR,
WSBH and SEH instructions. We can't use the non-R2 code to implement the
ops due to registers constraints, so don't define the corresponding
TCG_TARGET_HAS_bswap* values.

Also bswap16* operations are supposed to be called with the 16 high bits
zeroed. This is the case everywhere (including for TCG by definition)
except when called from the store helper. Remove the AND instructions from
bswap16* and move it there.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
86b4f7ca05 tcg/mips: optimize brcond arg, 0
MIPS has some conditional branch instructions when comparing with zero.
Use them.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
adf89e7483 tcg/mips: use stack for TCG temps
Use stack instead of temp_buf array in CPUState for TCG
temps.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
8193cdd0d0 tcg/mips: don't use global pointer
Don't use the global pointer in TCG, in case helpers try access global
variables.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
f25f0d7751 tcg/mips: use TCGArg or TCGReg instead of int
Instead of int, use the correct TCGArg and TCGReg type: TCGReg when
representing a TCG target register, TCGArg when representing the latter
or a constant.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
b7505cfd52 tcg/mips: kill warnings in user mode
Recent versions of GCC emit warnings when compiling user mode targets.
Kill them by reordering a bit the #ifdef.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
df3e097e21 tcg-mips: fix wrong usage of 'Z' constraint
The 'Z' constraint has been introduced to map the zero register. However
when the op also accept a constant, there is no point to accept the zero
register in addition.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Richard Henderson
707eee12fd tcg-hppa: Fix broken load/store helpers
The CONFIG_TCG_PASS_AREG0 code for calling ld/st helpers
was not respecting the ABI requirement for 64-bit values
being aligned in registers.

Mirror the ARM port in use of helper functions to marshal
arguments into the correct registers.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Richard Henderson
fb1d48edb6 tcg-hppa: Fix brcond2 and setcond2
Neither of these functions were performing double-word
compares properly.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Richard Henderson
4c8f62d388 tcg: Fix !USE_DIRECT_JUMP
Commit 6375e09e changed the type of TranslationBlock.tb_next,
but failed to change the type of TCGContext.tb_next.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Aurelien Jarno
15658ad01b gdbstub/sh4: fix build with USE_SOFTFLOAT_STRUCT_TYPES
We have to use different type to access float values when
USE_SOFTFLOAT_STRUCT_TYPES is defined.

Rework SH4 version of cpu_gdb_{read,write}_register() using
a single case, and fixing the coding style. Use ldll_p() and
stfl_p() to access float values.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Richard Henderson
1d465a94ad tcg: Optimize two-address commutative operations
While swapping constants to the second operand, swap
sources matching destinations to the first operand.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Richard Henderson
b5b6bb16b9 tcg: Optimize movcond for constant comparisons
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:09 -05:00
Richard Henderson
52080f688b tcg-i386: Implement movcond
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Richard Henderson
72bdbb7f46 target-alpha: Use movcond
For proper cmov insns, as well as the non-goto-tb case
of conditional branch.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Richard Henderson
6314cc0c73 tcg: Introduce movcond
Implemented with setcond if the target does not provide
the optional opcode.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Max Filippov
cc10061717 target-xtensa: don't emit extra tcg_gen_goto_tb
Unconditional gen_check_loop_end at the end of disas_xtensa_insn
can emit tcg_gen_goto_tb with slot id already used in the TB (e.g. when
TB ends at LEND with a branch).

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: malc <av1474@comtv.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Max Filippov
44219e7df2 target-xtensa: fix extui shift amount
extui opcode only uses lowermost op1 bit for sa4.

Reported-by: malc <av1474@comtv.ru>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: malc <av1474@comtv.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Aurelien Jarno
5b681fffe4 tcg/optimize: fix end of basic block detection
Commit e31b0a7c05 fixed copy propagation on
32-bit host by restricting the copy between different types. This was the
wrong fix.

The real problem is that the all temps states should be reset at the end
of a basic block. This was done by adding such operations in the switch,
but brcond2 was forgotten (that's why the crash was only observed on 32-bit
hosts).

Fix that by looking at the TCG_OPF_BB_END instead. We need to keep the case
for op_set_label as temps might be modified through another path.

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Richard Henderson
447f11b945 target-mips: Always evaluate debugging macro arguments
this will prevent some of the compilation errors with debugging
enabled from creeping back in.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Richard Henderson
fe07dcf262 target-mips: Fix MIPS_DEBUG.
The macro uses the DisasContext.  Pass it around as needed.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Richard Henderson
d25c2359de target-mips: Set opn in gen_ldst_multiple.
Used by MIPS_DEBUG, when enabled.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Aurelien Jarno
4f17bc1e89 revert "TCG: fix copy propagation"
Given the copy propagation breakage on 32-bit hosts has been fixed
commit e31b0a7c05 can be reverted.

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Aurelien Jarno
d39d648846 tcg: mark set_label with TCG_OPF_BB_END flag
set_label is effectively the end of a basic block, as no optimization
can be made accross it. It was treated as such in the liveness analysis
code, but as a special case.

Mark it with TCG_OPF_BB_END flag so that this information can be used
by other parts of the TCG code, and remove the special case in the liveness
analysis code.

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Aurelien Jarno
ea15fd7c1a tcg/i386: allow constants in load/store ops
On x86, it is possible to move a constant value to memory. Add code to
handle a constant argument to load/store ops.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Blue Swirl
a0969d7a46 Remove unused CONFIG_TCG_PASS_AREG0 and dead code
Now that CONFIG_TCG_PASS_AREG0 is enabled for all targets,
remove dead code and support for !CONFIG_TCG_PASS_AREG0 case.

Remove dyngen-exec.h and all references to it. Although included by
hw/spapr_hcall.c, it does not seem to use it.

Remove unused HELPER_CFLAGS.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Blue Swirl
0c4b3c0198 target-mips: switch to AREG0 free mode
Add an explicit CPUState parameter instead of relying on AREG0
and switch to AREG0 free mode.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Blue Swirl
ca75e56821 target-sh4: switch to AREG0 free mode
Add an explicit CPUState parameter instead of relying on AREG0
and switch to AREG0 free mode.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Aurelien Jarno
c383331e9d target-cris: Switch to AREG0 free mode
Add an explicit CPUCRISState parameter instead of relying on AREG0, and
use cpu_ld* in translation and interrupt handling. Remove AREG0 swapping
in tlb_fill(). Switch to AREG0 free mode

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:08 -05:00
Aurelien Jarno
d36ae16dd4 target-cris: Avoid AREG0 for helpers
Add an explicit CPUCRISState parameter instead of relying on AREG0.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Blue Swirl
c9b89e8224 target-microblaze: switch to AREG0 free mode
Add an explicit CPUState parameter instead of relying on AREG0
and switch to AREG0 free mode.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Blue Swirl
40b1f8a804 target-arm: final conversion to AREG0 free mode
Convert code load functions and switch to AREG0 free mode.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Blue Swirl
883864650e target-arm: convert remaining helpers
Convert remaining helpers to AREG0 free mode: add an explicit
CPUState parameter instead of relying on AREG0.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Blue Swirl
6178b83f35 target-arm: convert void helpers
Add an explicit CPUState parameter instead of relying on AREG0.

For easier review, convert only op helpers which don't return any value.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Blue Swirl
af411212ce target-unicore32: switch to AREG0 free mode
Add an explicit CPUState parameter instead of relying on AREG0
and switch to AREG0 free mode.

Tested-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Blue Swirl
e67d4921fd target-m68k: avoid using cpu_single_env
Pass around CPUState instead of using global cpu_single_env.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Blue Swirl
3654ef451e target-m68k: switch to AREG0 free mode
Add an explicit CPUState parameter instead of relying on AREG0
and switch to AREG0 free mode.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Blue Swirl
7a5311f64f target-lm32: switch to AREG0 free mode
Add an explicit CPUState parameter instead of relying on AREG0
and switch to AREG0 free mode.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Blue Swirl
1ada516f0e target-s390x: avoid cpu_single_env
Pass around CPUState instead of using global cpu_single_env.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Aurelien Jarno
578c758200 tcg/optimize: fix if/else/break coding style
optimizer.c contains some cases were the break is appearing in both the
if and the else parts. Fix that by moving it to the outer part. Also
move some common code there.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Aurelien Jarno
97c012c76a tcg/optimize: add constant folding for brcond
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Aurelien Jarno
2108a786ba tcg/optimize: add constant folding for setcond
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Aurelien Jarno
decddd9021 tcg/optimize: swap brcond/setcond arguments when possible
brcond and setcond ops are not commutative, but it's easy to compute the
new condition after swapping the arguments. Try to always put the constant
argument in second position like for commutative ops, to help backends to
generate better code.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Aurelien Jarno
52173babad tcg/optimize: simplify shift/rot r, 0, a => movi r, 0 cases
shift/rot r, 0, a is equivalent to movi r, 0.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Aurelien Jarno
f66f1dab29 tcg/optimize: simplify and r, a, 0 cases
and r, a, 0 is equivalent to a movi r, 0.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Aurelien Jarno
6c9785fb4b tcg/optimize: simplify or/xor r, a, 0 cases
or/xor r, a, 0 is equivalent to a mov r, a.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Aurelien Jarno
d969307261 tcg/optimize: split expression simplification
Split expression simplification in multiple parts so that a given op
can appear multiple times. This patch should not change anything.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:07 -05:00
Stefan Weil
62cdb0fdd3 target-arm: Fix potential buffer overflow
Report from smatch:

target-arm/helper.c:651 arm946_prbs_read(6) error:
 buffer overflow 'env->cp15.c6_region' 8 <= 8
target-arm/helper.c:661 arm946_prbs_write(6) error:
 buffer overflow 'env->cp15.c6_region' 8 <= 8

c7_region is an array with 8 elements, so the index must be less than 8.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Aurelien Jarno
3bc7da7cd7 tcg/s390: fix ld/st with CONFIG_TCG_PASS_AREG0
The load/store slow path has been broken in e141ab52d:
- We need to move 4 registers for store functions and 3 registers for
  load functions and not the reverse.
- According to the s390x calling convention the arguments of a function
  should be zero extended. This means that the register shift should be
  done with TCG_TYPE_I64 to ensure the higher word is correctly zero
  extended when needed.

I am aware that CONFIG_TCG_PASS_AREG0 is being removed and thus that
this patch can be improved, but doing so means it can also be applied to
the 1.1 and 1.2 stable branches.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
265434460e target-s390x: switch to AREG0 free mode
Add an explicit CPUState parameter instead of relying on AREG0.

Remove temporary wrappers and switch to AREG0 free mode.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
[agraf: fix conflicts]
Signed-off-by: Alexander Graf <agraf@suse.de>

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
71742e6c68 target-s390x: avoid AREG0 for misc helpers
Make misc helpers take a parameter for CPUState instead
of relying on global env.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
[agraf: fix conflict]
Signed-off-by: Alexander Graf <agraf@suse.de>

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
719a60ddb3 target-s390x: avoid AREG0 for condition code helpers
Make condition code helpers take a parameter for CPUState instead
of relying on global env.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
1f983b49f7 target-s390x: avoid AREG0 for integer helpers
Make integer helpers take a parameter for CPUState instead
of relying on global env.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
313e38b4a1 target-s390x: avoid AREG0 for FPU helpers
Make FPU helpers take a parameter for CPUState instead
of relying on global env.

Introduce temporary wrappers for FPU load and store ops.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
1a392c8957 target-s390x: rename op_helper.c to misc_helper.c
Now op_helper.c contains miscellaneous helpers, rename
it to misc_helper.c.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
[agraf: fix conflict]
Signed-off-by: Alexander Graf <agraf@suse.de>

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
84dde20eea target-s390x: split memory access helpers
Move memory access helpers to mem_helper.c.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
[agraf: fold softmmu include ifdefs together]
Signed-off-by: Alexander Graf <agraf@suse.de>

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
7dc581be44 target-s390x: split integer helpers
Move integer helpers to int_helper.c.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
f7c50726dd target-s390x: split condition code helpers
Move condition code helpers to cc_helper.c.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
4bae8a0a90 target-s390x: split FPU ops
Move floating point instructions to fpu_helper.c.

While exporting some condition code helpers,
avoid duplicate identifier conflict with translate.c.

Remove unused set_cc_nz_f64() in translate.c.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Blue Swirl
b05900a761 target-s390x: fix style
Before splitting op_helper.c and helper.c in the next patches,
fix style issues. No functional changes.

Replace also GCC specific __FUNCTION__ with
standard __func__.

Don't init static variable (cpu_s390x_init:inited) with 0.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Aurelien Jarno
b16148e1c6 target-sparc: fix fcmp{s,d,q} instructions wrt exception
fcmp{s,d,q} instructions are supposed to ignore quiet NaN (contrary to
the fcmpe{s,d,q} instructions), but the current code is wrongly setting
the NV exception in that case. Moreover the current code is duplicated:
first the arguments are checked for NaN to generate an exception, and
later in case the comparison is unordered (which can only happens if one
of the argument is a NaN), the same check is done to generate an
exception.

Fix that by calling clear_float_exceptions() followed by
check_ieee_exceptions() as for the other floating point instructions.
Use the _compare_quiet functions for fcmp{s,d,q} and the _compare ones
for fcmpe{s,d,q}. Simplify the flag setting by not clearing a flag that
is set the line just below.

This fix allows the math glibc testsuite to pass.

Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Max Filippov
dafd8866be target-xtensa: fix missing errno codes for mingw32
Put the following errno value mappings under #ifdef:

xtensa-semi.c: In function 'errno_h2g':
xtensa-semi.c:113: error: 'ENOTBLK' undeclared (first use in this function)
xtensa-semi.c:113: error: (Each undeclared identifier is reported only once
xtensa-semi.c:113: error: for each function it appears in.)
xtensa-semi.c:113: error: array index in initializer not of integer type
xtensa-semi.c:113: error: (near initialization for 'guest_errno')
xtensa-semi.c:124: error: 'ETXTBSY' undeclared (first use in this function)
xtensa-semi.c:124: error: array index in initializer not of integer type
xtensa-semi.c:124: error: (near initialization for 'guest_errno')
xtensa-semi.c:134: error: 'ELOOP' undeclared (first use in this function)
xtensa-semi.c:134: error: array index in initializer not of integer type
xtensa-semi.c:134: error: (near initialization for 'guest_errno')

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Stefan Weil
631287933e target-cris: Fix buffer overflow
Report from smatch:

target-cris/translate.c:3464 cpu_dump_state(32) error:
 buffer overflow 'env->sregs' 4 <= 255

sregs is declared 'uint32_t sregs[4][16]', so the first index must be
less than 4 or ARRAY_SIZE(env->sregs).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
Max Filippov
814395979e target-xtensa: convert host errno values to guest
Guest errno values are taken from the newlib. Convert only those errno
values that can be returned from used system calls.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2012-10-09 01:42:06 -05:00
4217 changed files with 380421 additions and 776319 deletions

7
.exrc
View File

@@ -1,7 +0,0 @@
"VIM settings to match QEMU coding style. They are activated by adding the
"following settings (without the " symbol) as last two lines in $HOME/.vimrc:
"set secure
"set exrc
set expandtab
set shiftwidth=4
set smarttab

144
.gitignore vendored
View File

@@ -1,69 +1,66 @@
/config-devices.*
/config-all-devices.*
/config-all-disas.*
/config-host.*
/config-target.*
/config.status
/config-temp
/trace/generated-tracers.h
/trace/generated-tracers.c
/trace/generated-tracers-dtrace.h
/trace/generated-tracers.dtrace
/trace/generated-events.h
/trace/generated-events.c
/trace/generated-helpers-wrappers.h
/trace/generated-helpers.h
/trace/generated-helpers.c
/trace/generated-tcg-tracers.h
/trace/generated-ust-provider.h
/trace/generated-ust.c
/libcacard/trace/generated-tracers.c
config-devices.*
config-all-devices.*
config-host.*
config-target.*
trace.h
trace.c
trace-dtrace.h
trace-dtrace.dtrace
*-timestamp
/*-softmmu
/*-darwin-user
/*-linux-user
/*-bsd-user
/libdis*
/libuser
/linux-headers/asm
/qga/qapi-generated
/qapi-generated
/qapi-types.[ch]
/qapi-visit.[ch]
/qapi-event.[ch]
/qmp-commands.h
/qmp-marshal.c
/qemu-doc.html
/qemu-tech.html
/qemu-doc.info
/qemu-tech.info
/qemu-img
/qemu-nbd
/qemu-options.def
/qemu-options.texi
/qemu-img-cmds.texi
/qemu-img-cmds.h
/qemu-io
/qemu-ga
/qemu-bridge-helper
/qemu-monitor.texi
/qmp-commands.txt
/vscclient
/fsdev/virtfs-proxy-helper
*.[1-9]
*-softmmu
*-darwin-user
*-linux-user
*-bsd-user
libdis*
libhw32
libhw64
libuser
linux-headers/asm
qapi-generated
qapi-types.[ch]
qapi-visit.[ch]
qmp-commands.h
qmp-marshal.c
qemu-doc.html
qemu-tech.html
qemu-doc.info
qemu-tech.info
qemu.1
qemu.pod
qemu-img.1
qemu-img.pod
qemu-img
qemu-nbd
qemu-nbd.8
qemu-nbd.pod
qemu-options.def
qemu-options.texi
qemu-img-cmds.texi
qemu-img-cmds.h
qemu-io
qemu-ga
qemu-bridge-helper
qemu-monitor.texi
vscclient
QMP/qmp-commands.txt
test-coroutine
test-qmp-input-visitor
test-qmp-output-visitor
test-string-input-visitor
test-string-output-visitor
test-visitor-serialization
fsdev/virtfs-proxy-helper.1
fsdev/virtfs-proxy-helper.pod
.gdbinit
*.a
*.aux
*.cp
*.dvi
*.exe
*.dll
*.so
*.mo
*.fn
*.ky
*.log
*.pdf
*.pod
*.cps
*.fns
*.kys
@@ -73,31 +70,26 @@
*.tp
*.vr
*.d
!/scripts/qemu-guest-agent/fsfreeze-hook.d
*.o
*.lo
*.la
*.pc
.libs
.sdk
*.gcda
*.gcno
/pc-bios/bios-pq/status
/pc-bios/vgabios-pq/status
/pc-bios/optionrom/linuxboot.asm
/pc-bios/optionrom/linuxboot.bin
/pc-bios/optionrom/linuxboot.raw
/pc-bios/optionrom/linuxboot.img
/pc-bios/optionrom/multiboot.asm
/pc-bios/optionrom/multiboot.bin
/pc-bios/optionrom/multiboot.raw
/pc-bios/optionrom/multiboot.img
/pc-bios/optionrom/kvmvapic.asm
/pc-bios/optionrom/kvmvapic.bin
/pc-bios/optionrom/kvmvapic.raw
/pc-bios/optionrom/kvmvapic.img
/pc-bios/s390-ccw/s390-ccw.elf
/pc-bios/s390-ccw/s390-ccw.img
*.swp
*.orig
.pc
patches
pc-bios/bios-pq/status
pc-bios/vgabios-pq/status
pc-bios/optionrom/linuxboot.bin
pc-bios/optionrom/linuxboot.raw
pc-bios/optionrom/linuxboot.img
pc-bios/optionrom/multiboot.bin
pc-bios/optionrom/multiboot.raw
pc-bios/optionrom/multiboot.img
pc-bios/optionrom/kvmvapic.bin
pc-bios/optionrom/kvmvapic.raw
pc-bios/optionrom/kvmvapic.img
.stgit-*
cscope.*
tags

26
.gitmodules vendored
View File

@@ -1,33 +1,21 @@
[submodule "roms/vgabios"]
path = roms/vgabios
url = git://git.qemu-project.org/vgabios.git/
url = git://git.qemu.org/vgabios.git/
[submodule "roms/seabios"]
path = roms/seabios
url = git://git.qemu-project.org/seabios.git/
url = git://git.qemu.org/seabios.git/
[submodule "roms/SLOF"]
path = roms/SLOF
url = git://git.qemu-project.org/SLOF.git
url = git://git.qemu.org/SLOF.git
[submodule "roms/ipxe"]
path = roms/ipxe
url = git://git.qemu-project.org/ipxe.git
url = git://git.qemu.org/ipxe.git
[submodule "roms/openbios"]
path = roms/openbios
url = git://git.qemu-project.org/openbios.git
[submodule "roms/openhackware"]
path = roms/openhackware
url = git://git.qemu-project.org/openhackware.git
url = git://git.qemu.org/openbios.git
[submodule "roms/qemu-palcode"]
path = roms/qemu-palcode
url = git://github.com/rth7680/qemu-palcode.git
url = git://repo.or.cz/qemu-palcode.git
[submodule "roms/sgabios"]
path = roms/sgabios
url = git://git.qemu-project.org/sgabios.git
[submodule "pixman"]
path = pixman
url = git://anongit.freedesktop.org/pixman
[submodule "dtc"]
path = dtc
url = git://git.qemu-project.org/dtc.git
[submodule "roms/u-boot"]
path = roms/u-boot
url = git://git.qemu-project.org/u-boot.git
url = git://git.qemu.org/sgabios.git

View File

@@ -2,8 +2,7 @@
# into proper addresses so that they are counted properly in git shortlog output.
#
Andrzej Zaborowski <balrogg@gmail.com> balrog <balrog@c046a42c-6fe2-441c-8c8c-71466251a162>
Anthony Liguori <anthony@codemonkey.ws> aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
Anthony Liguori <anthony@codemonkey.ws> Anthony Liguori <aliguori@us.ibm.com>
Anthony Liguori <aliguori@us.ibm.com> aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
Aurelien Jarno <aurelien@aurel32.net> aurel32 <aurel32@c046a42c-6fe2-441c-8c8c-71466251a162>
Blue Swirl <blauwirbel@gmail.com> blueswir1 <blueswir1@c046a42c-6fe2-441c-8c8c-71466251a162>
Edgar E. Iglesias <edgar.iglesias@gmail.com> edgar_igl <edgar_igl@c046a42c-6fe2-441c-8c8c-71466251a162>

View File

@@ -1,103 +0,0 @@
language: c
python:
- "2.4"
compiler:
- gcc
- clang
notifications:
irc:
channels:
- "irc.oftc.net#qemu"
on_success: change
on_failure: always
env:
global:
- TEST_CMD=""
- EXTRA_CONFIG=""
# Development packages, EXTRA_PKGS saved for additional builds
- CORE_PKGS="libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev"
- NET_PKGS="libseccomp-dev libgnutls-dev libssh2-1-dev libspice-server-dev libspice-protocol-dev libnss3-dev"
- GUI_PKGS="libgtk-3-dev libvte-2.90-dev libsdl1.2-dev libpng12-dev libpixman-1-dev"
- EXTRA_PKGS=""
matrix:
# Group major targets together with their linux-user counterparts
- TARGETS=alpha-softmmu,alpha-linux-user
- TARGETS=arm-softmmu,arm-linux-user,armeb-linux-user,aarch64-softmmu,aarch64-linux-user
- TARGETS=cris-softmmu,cris-linux-user
- TARGETS=i386-softmmu,i386-linux-user,x86_64-softmmu,x86_64-linux-user
- TARGETS=m68k-softmmu,m68k-linux-user
- TARGETS=microblaze-softmmu,microblazeel-softmmu,microblaze-linux-user,microblazeel-linux-user
- TARGETS=mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu
- TARGETS=mips-linux-user,mips64-linux-user,mips64el-linux-user,mipsel-linux-user,mipsn32-linux-user,mipsn32el-linux-user
- TARGETS=or32-softmmu,or32-linux-user
- TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,ppc-linux-user,ppc64-linux-user,ppc64abi32-linux-user,ppc64le-linux-user
- TARGETS=s390x-softmmu,s390x-linux-user
- TARGETS=sh4-softmmu,sh4eb-softmmu,sh4-linux-user sh4eb-linux-user
- TARGETS=sparc-softmmu,sparc64-softmmu,sparc-linux-user,sparc32plus-linux-user,sparc64-linux-user
- TARGETS=unicore32-softmmu,unicore32-linux-user
# Group remaining softmmu only targets into one build
- TARGETS=lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
git:
# we want to do this ourselves
submodules: false
before_install:
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
- sudo apt-get update -qq
- sudo apt-get install -qq ${CORE_PKGS} ${NET_PKGS} ${GUI_PKGS} ${EXTRA_PKGS}
before_script:
- ./configure --target-list=${TARGETS} --enable-debug-tcg ${EXTRA_CONFIG}
script:
- make -j2 && ${TEST_CMD}
matrix:
# We manually include a number of additional build for non-standard bits
include:
# Make check target (we only do this once)
- env:
- TARGETS=alpha-softmmu,arm-softmmu,aarch64-softmmu,cris-softmmu,
i386-softmmu,x86_64-softmmu,m68k-softmmu,microblaze-softmmu,
microblazeel-softmmu,mips-softmmu,mips64-softmmu,
mips64el-softmmu,mipsel-softmmu,or32-softmmu,ppc-softmmu,
ppc64-softmmu,ppcemb-softmmu,s390x-softmmu,sh4-softmmu,
sh4eb-softmmu,sparc-softmmu,sparc64-softmmu,
unicore32-softmmu,unicore32-linux-user,
lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,
xtensaeb-softmmu
TEST_CMD="make check"
compiler: gcc
# Debug related options
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-debug"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-debug --enable-tcg-interpreter"
compiler: gcc
# All the extra -dev packages
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="libaio-dev libcap-ng-dev libattr1-dev libbrlapi-dev uuid-dev libusb-1.0.0-dev"
compiler: gcc
# Currently configure doesn't force --disable-pie
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-gprof --enable-gcov --disable-pie"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="sparse"
EXTRA_CONFIG="--enable-sparse"
compiler: gcc
# All the trace backends (apart from dtrace)
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=stderr"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=simple"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=ftrace"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="liblttng-ust-dev liburcu-dev"
EXTRA_CONFIG="--enable-trace-backends=ust"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-modules"
compiler: gcc

View File

@@ -84,24 +84,3 @@ and clarity it comes on a line by itself:
Rationale: a consistent (except for functions...) bracing style reduces
ambiguity and avoids needless churn when lines are added or removed.
Furthermore, it is the QEMU coding style.
5. Declarations
Mixed declarations (interleaving statements and declarations within blocks)
are not allowed; declarations should be at the beginning of blocks. In other
words, the code should not generate warnings if using GCC's
-Wdeclaration-after-statement option.
6. Conditional statements
When comparing a variable for (in)equality with a constant, list the
constant on the right, as in:
if (a == 1) {
/* Reads like: "If a equals 1" */
do_something();
}
Rationale: Yoda conditions (as in 'if (1 == a)') are awkward to read.
Besides, good compilers already warn users when '==' is mis-typed as '=',
even when the constant is on the right.

View File

@@ -1,6 +1,6 @@
This file documents changes for QEMU releases 0.12 and earlier.
For changelog information for later releases, see
http://wiki.qemu-project.org/ChangeLog or look at the git history for
http://wiki.qemu.org/ChangeLog or look at the git history for
more detailed information.

57
HACKING
View File

@@ -32,7 +32,7 @@ mandatory for VMState fields.
Don't use Linux kernel internal types like u32, __u32 or __le32.
Use hwaddr for guest physical addresses except pcibus_t
Use target_phys_addr_t for guest physical addresses except pcibus_t
for PCI addresses. In addition, ram_addr_t is a QEMU internal address
space that maps guest RAM physical addresses into an intermediate
address space that can map to host virtual address spaces. Generally
@@ -40,23 +40,8 @@ speaking, the size of guest memory can always fit into ram_addr_t but
it would not be correct to store an actual guest physical address in a
ram_addr_t.
For CPU virtual addresses there are several possible types.
vaddr is the best type to use to hold a CPU virtual address in
target-independent code. It is guaranteed to be large enough to hold a
virtual address for any target, and it does not change size from target
to target. It is always unsigned.
target_ulong is a type the size of a virtual address on the CPU; this means
it may be 32 or 64 bits depending on which target is being built. It should
therefore be used only in target-specific code, and in some
performance-critical built-per-target core code such as the TLB code.
There is also a signed version, target_long.
abi_ulong is for the *-user targets, and represents a type the size of
'void *' in that target's ABI. (This may not be the same as the size of a
full CPU virtual address in the case of target ABIs which use 32 bit pointers
on 64 bit CPUs, like sparc32plus.) Definitions of structures that must match
the target's ABI must use this type for anything that on the target is defined
to be an 'unsigned long' or a pointer type.
There is also a signed version, abi_long.
Use target_ulong (or abi_ulong) for CPU virtual addresses, however
devices should not need to use target_ulong.
Of course, take all of the above with a grain of salt. If you're about
to use some system interface that requires a type like size_t, pid_t or
@@ -93,23 +78,23 @@ avoided.
Use of the malloc/free/realloc/calloc/valloc/memalign/posix_memalign
APIs is not allowed in the QEMU codebase. Instead of these routines,
use the GLib memory allocation routines g_malloc/g_malloc0/g_new/
g_new0/g_realloc/g_free or QEMU's qemu_memalign/qemu_blockalign/qemu_vfree
g_new0/g_realloc/g_free or QEMU's qemu_vmalloc/qemu_memalign/qemu_vfree
APIs.
Please note that g_malloc will exit on allocation failure, so there
is no need to test for failure (as you would have to with malloc).
Calling g_malloc with a zero size is valid and will return NULL.
Memory allocated by qemu_memalign or qemu_blockalign must be freed with
qemu_vfree, since breaking this will cause problems on Win32.
Memory allocated by qemu_vmalloc or qemu_memalign must be freed with
qemu_vfree, since breaking this will cause problems on Win32 and user
emulators.
4. String manipulation
Do not use the strncpy function. As mentioned in the man page, it does *not*
guarantee a NULL-terminated buffer, which makes it extremely dangerous to use.
It also zeros trailing destination bytes out to the specified length. Instead,
use this similar function when possible, but note its different signature:
void pstrcpy(char *dest, int dest_buf_size, const char *src)
Do not use the strncpy function. According to the man page, it does
*not* guarantee a NULL-terminated buffer, which makes it extremely dangerous
to use. Instead, use functionally equivalent function:
void pstrcpy(char *buf, int buf_size, const char *str)
Don't use strcat because it can't check for buffer overflows, but:
char *pstrcat(char *buf, int buf_size, const char *s)
@@ -137,23 +122,3 @@ gcc's printf attribute directive in the prototype.
This makes it so gcc's -Wformat and -Wformat-security options can do
their jobs and cross-check format strings with the number and types
of arguments.
6. C standard, implementation defined and undefined behaviors
C code in QEMU should be written to the C99 language specification. A copy
of the final version of the C99 standard with corrigenda TC1, TC2, and TC3
included, formatted as a draft, can be downloaded from:
http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf
The C language specification defines regions of undefined behavior and
implementation defined behavior (to give compiler authors enough leeway to
produce better code). In general, code in QEMU should follow the language
specification and avoid both undefined and implementation defined
constructs. ("It works fine on the gcc I tested it with" is not a valid
argument...) However there are a few areas where we allow ourselves to
assume certain behaviors because in practice all the platforms we care about
behave in the same way and writing strictly conformant code would be
painful. These are:
* you may assume that integers are 2s complement representation
* you may assume that right shift of a signed integer duplicates
the sign bit (ie it is an arithmetic shift, not a logical shift)

15
LICENSE
View File

@@ -1,21 +1,16 @@
The following points clarify the QEMU license:
1) QEMU as a whole is released under the GNU General Public License,
version 2.
1) QEMU as a whole is released under the GNU General Public License
2) Parts of QEMU have specific licenses which are compatible with the
GNU General Public License, version 2. Hence each source file contains
its own licensing information. Source files with no licensing information
are released under the GNU General Public License, version 2 or (at your
option) any later version.
GNU General Public License. Hence each source file contains its own
licensing information.
As of July 2013, contributions under version 2 of the GNU General Public
License (and no later version) are only accepted for the following files
or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*.
Many hardware device emulation sources are released under the BSD license.
3) The Tiny Code Generator (TCG) is released under the BSD license
(see license headers in files).
4) QEMU is a trademark of Fabrice Bellard.
Fabrice Bellard and the QEMU team
Fabrice Bellard.

File diff suppressed because it is too large Load Diff

396
Makefile
View File

@@ -8,65 +8,22 @@ ifneq ($(wildcard config-host.mak),)
# Put the all: rule here so that config-host.mak can contain dependencies.
all:
include config-host.mak
# Check that we're not trying to do an out-of-tree build from
# a tree that's been used for an in-tree build.
ifneq ($(realpath $(SRC_PATH)),$(realpath .))
ifneq ($(wildcard $(SRC_PATH)/config-host.mak),)
$(error This is an out of tree build but your source tree ($(SRC_PATH)) \
seems to have been used for an in-tree build. You can fix this by running \
"make distclean && rm -rf *-linux-user *-softmmu" in your source tree)
endif
endif
CONFIG_SOFTMMU := $(if $(filter %-softmmu,$(TARGET_DIRS)),y)
CONFIG_USER_ONLY := $(if $(filter %-user,$(TARGET_DIRS)),y)
CONFIG_ALL=y
-include config-all-devices.mak
-include config-all-disas.mak
include $(SRC_PATH)/rules.mak
config-host.mak: $(SRC_PATH)/configure
@echo $@ is out-of-date, running configure
@# TODO: The next lines include code which supports a smooth
@# transition from old configurations without config.status.
@# This code can be removed after QEMU 1.7.
@if test -x config.status; then \
./config.status; \
else \
sed -n "/.*Configured with/s/[^:]*: //p" $@ | sh; \
fi
@sed -n "/.*Configured with/s/[^:]*: //p" $@ | sh
else
config-host.mak:
ifneq ($(filter-out %clean,$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
@echo "Please call configure before running make!"
@exit 1
endif
GENERATED_HEADERS = config-host.h trace.h qemu-options.def
ifeq ($(TRACE_BACKEND),dtrace)
GENERATED_HEADERS += trace-dtrace.h
endif
GENERATED_HEADERS = config-host.h qemu-options.def
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_HEADERS += trace/generated-events.h
GENERATED_SOURCES += trace/generated-events.c
GENERATED_HEADERS += trace/generated-tracers.h
ifeq ($(findstring dtrace,$(TRACE_BACKENDS)),dtrace)
GENERATED_HEADERS += trace/generated-tracers-dtrace.h
endif
GENERATED_SOURCES += trace/generated-tracers.c
GENERATED_HEADERS += trace/generated-tcg-tracers.h
GENERATED_HEADERS += trace/generated-helpers-wrappers.h
GENERATED_HEADERS += trace/generated-helpers.h
GENERATED_SOURCES += trace/generated-helpers.c
ifeq ($(findstring ust,$(TRACE_BACKENDS)),ust)
GENERATED_HEADERS += trace/generated-ust-provider.h
GENERATED_SOURCES += trace/generated-ust.c
endif
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c trace.c
# Don't try to regenerate Makefile or configure
# We don't generate any of them
@@ -83,10 +40,7 @@ LIBS+=-lz $(LIBS_TOOLS)
HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
ifdef BUILD_DOCS
DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 qmp-commands.txt
ifdef CONFIG_LINUX
DOCS+=kvm_stat.1
endif
DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 QMP/qmp-commands.txt
ifdef CONFIG_VIRTFS
DOCS+=fsdev/virtfs-proxy-helper.1
endif
@@ -96,25 +50,21 @@ endif
SUBDIR_MAKEFLAGS=$(if $(V),,--no-print-directory) BUILD_DIR=$(BUILD_DIR)
SUBDIR_DEVICES_MAK=$(patsubst %, %/config-devices.mak, $(TARGET_DIRS))
SUBDIR_DEVICES_MAK_DEP=$(patsubst %, %-config-devices.mak.d, $(TARGET_DIRS))
SUBDIR_DEVICES_MAK_DEP=$(patsubst %, %/config-devices.mak.d, $(TARGET_DIRS))
ifeq ($(SUBDIR_DEVICES_MAK),)
config-all-devices.mak:
$(call quiet-command,echo '# no devices' > $@," GEN $@")
else
config-all-devices.mak: $(SUBDIR_DEVICES_MAK)
$(call quiet-command, sed -n \
's|^\([^=]*\)=\(.*\)$$|\1:=$$(findstring y,$$(\1)\2)|p' \
$(SUBDIR_DEVICES_MAK) | sort -u > $@, \
" GEN $@")
$(call quiet-command,cat $(SUBDIR_DEVICES_MAK) | grep =y | sort -u > $@," GEN $@")
endif
-include $(SUBDIR_DEVICES_MAK_DEP)
%/config-devices.mak: default-configs/%.mak
$(call quiet-command, \
$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $< $*-config-devices.mak.d $@ > $@.tmp, " GEN $@.tmp")
$(call quiet-command, if test -f $@; then \
$(call quiet-command,$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $@ $<, " GEN $@")
@if test -f $@; then \
if cmp -s $@.old $@; then \
mv $@.tmp $@; \
cp -p $@ $@.old; \
@@ -130,33 +80,14 @@ endif
else \
mv $@.tmp $@; \
cp -p $@ $@.old; \
fi, " GEN $@");
fi
defconfig:
rm -f config-all-devices.mak $(SUBDIR_DEVICES_MAK)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/Makefile.objs
endif
-include config-all-devices.mak
dummy := $(call unnest-vars,, \
stub-obj-y \
util-obj-y \
qga-obj-y \
qga-vss-dll-obj-y \
block-obj-y \
block-obj-m \
common-obj-y \
common-obj-m)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/tests/Makefile
endif
ifeq ($(CONFIG_SMARTCARD_NSS),y)
include $(SRC_PATH)/libcacard/Makefile
endif
all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all
config-host.h: config-host.h-timestamp
config-host.h-timestamp: config-host.mak
@@ -164,34 +95,19 @@ qemu-options.def: $(SRC_PATH)/qemu-options.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@")
SUBDIR_RULES=$(patsubst %,subdir-%, $(TARGET_DIRS))
SOFTMMU_SUBDIR_RULES=$(filter %-softmmu,$(SUBDIR_RULES))
$(SOFTMMU_SUBDIR_RULES): $(block-obj-y)
$(SOFTMMU_SUBDIR_RULES): config-all-devices.mak
subdir-%:
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C $* V="$(V)" TARGET_DIR="$*/" all,)
subdir-pixman: pixman/Makefile
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C pixman V="$(V)" all,)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/Makefile.objs
endif
pixman/Makefile: $(SRC_PATH)/pixman/configure
(cd pixman; CFLAGS="$(CFLAGS) -fPIC $(extra_cflags) $(extra_ldflags)" $(SRC_PATH)/pixman/configure $(AUTOCONF_HOST) --disable-gtk --disable-shared --enable-static)
subdir-libcacard: $(oslib-obj-y) $(trace-obj-y) qemu-timer-common.o
$(SRC_PATH)/pixman/configure:
(cd $(SRC_PATH)/pixman; autoreconf -v --install)
$(filter %-softmmu,$(SUBDIR_RULES)): $(universal-obj-y) $(trace-obj-y) $(common-obj-y) $(extra-obj-y) subdir-libdis
DTC_MAKE_ARGS=-I$(SRC_PATH)/dtc VPATH=$(SRC_PATH)/dtc -C dtc V="$(V)" LIBFDT_srcdir=$(SRC_PATH)/dtc/libfdt
DTC_CFLAGS=$(CFLAGS) $(QEMU_CFLAGS)
DTC_CPPFLAGS=-I$(BUILD_DIR)/dtc -I$(SRC_PATH)/dtc -I$(SRC_PATH)/dtc/libfdt
subdir-dtc:dtc/libfdt dtc/tests
$(call quiet-command,$(MAKE) $(DTC_MAKE_ARGS) CPPFLAGS="$(DTC_CPPFLAGS)" CFLAGS="$(DTC_CFLAGS)" LDFLAGS="$(LDFLAGS)" ARFLAGS="$(ARFLAGS)" CC="$(CC)" AR="$(AR)" LD="$(LD)" $(SUBDIR_MAKEFLAGS) libfdt/libfdt.a,)
dtc/%:
mkdir -p $@
$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y)
$(filter %-user,$(SUBDIR_RULES)): $(universal-obj-y) $(trace-obj-y) subdir-libdis-user subdir-libuser
ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
romsubdir-%:
@@ -201,33 +117,61 @@ ALL_SUBDIRS=$(TARGET_DIRS) $(patsubst %,pc-bios/%, $(ROMS))
recurse-all: $(SUBDIR_RULES) $(ROMSUBDIR_RULES)
$(BUILD_DIR)/version.o: $(SRC_PATH)/version.rc config-host.h | $(BUILD_DIR)/version.lo
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.o")
$(BUILD_DIR)/version.lo: $(SRC_PATH)/version.rc config-host.h
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.lo")
audio/audio.o audio/fmodaudio.o: QEMU_CFLAGS += $(FMOD_CFLAGS)
Makefile: $(version-obj-y) $(version-lobj-y)
QEMU_CFLAGS+=$(CURL_CFLAGS)
QEMU_CFLAGS += -I$(SRC_PATH)/include
ui/cocoa.o: ui/cocoa.m
ui/sdl.o audio/sdlaudio.o ui/sdl_zoom.o hw/baum.o: QEMU_CFLAGS += $(SDL_CFLAGS)
ui/vnc.o: QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
version.o: $(SRC_PATH)/version.rc config-host.h
$(call quiet-command,$(WINDRES) -I. -o $@ $<," RC $(TARGET_DIR)$@")
version-obj-$(CONFIG_WIN32) += version.o
######################################################################
# Build libraries
# Support building shared library libcacard
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y)
.PHONY: libcacard.la install-libcacard
ifeq ($(LIBTOOL),)
libcacard.la:
@echo "libtool is missing, please install and rerun configure"; exit 1
block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL
util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)'
install-libcacard:
@echo "libtool is missing, please install and rerun configure"; exit 1
else
libcacard.la: $(oslib-obj-y) qemu-timer-common.o $(addsuffix .lo, $(basename $(trace-obj-y)))
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C libcacard V="$(V)" TARGET_DIR="$*/" libcacard.la,)
install-libcacard: libcacard.la
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C libcacard V="$(V)" TARGET_DIR="$*/" install-libcacard,)
endif
######################################################################
qemu-img.o: qemu-img-cmds.h
qemu-img$(EXESUF): qemu-img.o $(block-obj-y) libqemuutil.a libqemustub.a
qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) libqemuutil.a libqemustub.a
qemu-io$(EXESUF): qemu-io.o $(block-obj-y) libqemuutil.a libqemustub.a
tools-obj-y = $(oslib-obj-y) $(trace-obj-y) qemu-tool.o qemu-timer.o \
qemu-timer-common.o main-loop.o notify.o \
iohandler.o cutils.o iov.o async.o
tools-obj-$(CONFIG_POSIX) += compatfd.o
qemu-img$(EXESUF): qemu-img.o $(tools-obj-y) $(block-obj-y)
qemu-nbd$(EXESUF): qemu-nbd.o $(tools-obj-y) $(block-obj-y)
qemu-io$(EXESUF): qemu-io.o cmd.o $(tools-obj-y) $(block-obj-y)
qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o libqemuutil.a libqemustub.a
vscclient$(EXESUF): $(libcacard-y) $(oslib-obj-y) $(trace-obj-y) $(tools-obj-y) qemu-timer-common.o libcacard/vscclient.o
$(call quiet-command,$(CC) $(LDFLAGS) -o $@ $^ $(libcacard_libs) $(LIBS)," LINK $@")
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o oslib-posix.o $(trace-obj-y)
fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
@@ -238,73 +182,56 @@ qemu-ga$(EXESUF): QEMU_CFLAGS += -I qga/qapi-generated
gen-out-type = $(subst .,-,$(suffix $@))
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/tests/Makefile
endif
qapi-py = $(SRC_PATH)/scripts/qapi.py $(SRC_PATH)/scripts/ordereddict.py
qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py $(gen-out-type) -o qga/qapi-generated -p "qga-" < $<, " GEN $@")
qga/qapi-generated/qga-qapi-visit.c qga/qapi-generated/qga-qapi-visit.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py $(gen-out-type) -o qga/qapi-generated -p "qga-" < $<, " GEN $@")
qga/qapi-generated/qga-qmp-commands.h qga/qapi-generated/qga-qmp-marshal.c :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
" GEN $@")
qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
$(SRC_PATH)/qapi/event.json
$(SRC_PATH)/qapi-schema-guest.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -o qga/qapi-generated -p "qga-" < $<, " GEN $@")
qapi-types.c qapi-types.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o "." -b $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py $(gen-out-type) -o "." < $<, " GEN $@")
qapi-visit.c qapi-visit.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o "." -b $<, \
" GEN $@")
qapi-event.c qapi-event.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-event.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-event.py \
$(gen-out-type) -o "." $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py $(gen-out-type) -o "." < $<, " GEN $@")
qmp-commands.h qmp-marshal.c :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o "." -m $<, \
" GEN $@")
$(SRC_PATH)/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py $(gen-out-type) -m -o "." < $<, " GEN $@")
QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qmp-commands.h)
$(qga-obj-y) qemu-ga.o: $(QGALIB_GEN)
qemu-ga$(EXESUF): $(qga-obj-y) libqemuutil.a libqemustub.a
$(call LINK, $^)
qemu-ga$(EXESUF): qemu-ga.o $(qga-obj-y) $(tools-obj-y) $(qapi-obj-y) $(qobject-obj-y) $(version-obj-y)
QEMULIBS=libhw32 libhw64 libuser libdis libdis-user
clean:
# avoid old build problems by removing potentially incorrect old files
rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h
rm -f qemu-options.def
find . \( -name '*.l[oa]' -o -name '*.so' -o -name '*.dll' -o -name '*.mo' -o -name '*.[oda]' \) -type f -exec rm {} +
rm -f $(filter-out %.tlb,$(TOOLS)) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -f fsdev/*.pod
rm -rf .libs */.libs
find . -name '*.[od]' -exec rm -f {} +
rm -f *.a *.lo $(TOOLS) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -Rf .libs
rm -f qemu-img-cmds.h
rm -f ui/shader/*-vert.h ui/shader/*-frag.h
rm -f trace-dtrace.dtrace trace-dtrace.dtrace-timestamp
@# May not be present in GENERATED_HEADERS
rm -f trace/generated-tracers-dtrace.dtrace*
rm -f trace/generated-tracers-dtrace.h*
rm -f trace-dtrace.h trace-dtrace.h-timestamp
rm -f $(foreach f,$(GENERATED_HEADERS),$(f) $(f)-timestamp)
rm -f $(foreach f,$(GENERATED_SOURCES),$(f) $(f)-timestamp)
rm -rf qapi-generated
rm -rf qga/qapi-generated
for d in $(ALL_SUBDIRS); do \
$(MAKE) -C tests/tcg clean
for d in $(ALL_SUBDIRS) $(QEMULIBS) libcacard; do \
if test -d $$d; then $(MAKE) -C $$d $@ || exit 1; fi; \
rm -f $$d/qemu-options.def; \
done
@@ -318,8 +245,7 @@ qemu-%.tar.bz2:
distclean: clean
rm -f config-host.mak config-host.h* config-host.ld $(DOCS) qemu-options.texi qemu-img-cmds.texi qemu-monitor.texi
rm -f config-all-devices.mak config-all-disas.mak config.status
rm -f po/*.mo tests/qemu-iotests/common.env
rm -f config-all-devices.mak
rm -f roms/seabios/config.mak roms/vgabios/config.mak
rm -f qemu-doc.info qemu-doc.aux qemu-doc.cp qemu-doc.cps qemu-doc.dvi
rm -f qemu-doc.fn qemu-doc.fns qemu-doc.info qemu-doc.ky qemu-doc.kys
@@ -328,35 +254,27 @@ distclean: clean
rm -f config.log
rm -f linux-headers/asm
rm -f qemu-tech.info qemu-tech.aux qemu-tech.cp qemu-tech.dvi qemu-tech.fn qemu-tech.info qemu-tech.ky qemu-tech.log qemu-tech.pdf qemu-tech.pg qemu-tech.toc qemu-tech.tp qemu-tech.vr
for d in $(TARGET_DIRS); do \
for d in $(TARGET_DIRS) $(QEMULIBS); do \
rm -rf $$d || exit 1 ; \
done
rm -Rf .sdk
if test -f pixman/config.log; then $(MAKE) -C pixman distclean; fi
if test -f dtc/version_gen.h; then $(MAKE) $(DTC_MAKE_ARGS) clean; fi
KEYMAPS=da en-gb et fr fr-ch is lt modifiers no pt-br sv \
ar de en-us fi fr-be hr it lv nl pl ru th \
common de-ch es fo fr-ca hu ja mk nl-be pt sl tr \
bepo cz
bepo
ifdef INSTALL_BLOBS
BLOBS=bios.bin bios-256k.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
BLOBS=bios.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
vgabios-stdvga.bin vgabios-vmware.bin vgabios-qxl.bin \
acpi-dsdt.aml q35-acpi-dsdt.aml \
ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin QEMU,cgthree.bin \
ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc \
pxe-e1000.rom pxe-eepro100.rom pxe-ne2k_pci.rom \
pxe-pcnet.rom pxe-rtl8139.rom pxe-virtio.rom \
efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
efi-pcnet.rom efi-rtl8139.rom efi-virtio.rom \
qemu-icon.bmp qemu_logo_no_text.svg \
qemu-icon.bmp \
bamboo.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
multiboot.bin linuxboot.bin kvmvapic.bin \
s390-zipl.rom \
s390-ccw.img \
spapr-rtas.bin slof.bin \
palcode-clipper \
u-boot.e500
palcode-clipper
else
BLOBS=
endif
@@ -364,16 +282,13 @@ endif
install-doc: $(DOCS)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.html qemu-tech.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qmp-commands.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) QMP/qmp-commands.txt "$(DESTDIR)$(qemu_docdir)"
ifdef CONFIG_POSIX
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
ifneq ($(TOOLS),)
$(INSTALL_DATA) qemu-img.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) qemu.1 qemu-img.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man8"
$(INSTALL_DATA) qemu-nbd.8 "$(DESTDIR)$(mandir)/man8"
endif
endif
ifdef CONFIG_VIRTFS
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) fsdev/virtfs-proxy-helper.1 "$(DESTDIR)$(mandir)/man1"
@@ -382,50 +297,33 @@ endif
install-datadir:
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)"
install-localstatedir:
ifdef CONFIG_POSIX
ifneq (,$(findstring qemu-ga,$(TOOLS)))
$(INSTALL_DIR) "$(DESTDIR)$(qemu_localstatedir)"/run
endif
endif
install-confdir:
$(INSTALL_DIR) "$(DESTDIR)$(qemu_confdir)"
install-sysconfig: install-datadir install-confdir
$(INSTALL_DATA) $(SRC_PATH)/sysconfigs/target/target-x86_64.conf "$(DESTDIR)$(qemu_confdir)"
$(INSTALL_DATA) $(SRC_PATH)/sysconfigs/target/cpus-x86_64.conf "$(DESTDIR)$(qemu_datadir)"
install: all $(if $(BUILD_DOCS),install-doc) install-sysconfig \
install-datadir install-localstatedir
install: all $(if $(BUILD_DOCS),install-doc) install-sysconfig install-datadir
$(INSTALL_DIR) "$(DESTDIR)$(bindir)"
ifneq ($(TOOLS),)
$(call install-prog,$(TOOLS),$(DESTDIR)$(bindir))
endif
ifneq ($(CONFIG_MODULES),)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_moddir)"
for s in $(modules-m:.mo=$(DSOSUF)); do \
t="$(DESTDIR)$(qemu_moddir)/$$(echo $$s | tr / -)"; \
$(INSTALL_LIB) $$s "$$t"; \
test -z "$(STRIP)" || $(STRIP) "$$t"; \
done
$(INSTALL_PROG) $(STRIP_OPT) $(TOOLS) "$(DESTDIR)$(bindir)"
endif
ifneq ($(HELPERS-y),)
$(call install-prog,$(HELPERS-y),$(DESTDIR)$(libexecdir))
$(INSTALL_DIR) "$(DESTDIR)$(libexecdir)"
$(INSTALL_PROG) $(STRIP_OPT) $(HELPERS-y) "$(DESTDIR)$(libexecdir)"
endif
ifneq ($(BLOBS),)
set -e; for x in $(BLOBS); do \
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/$$x "$(DESTDIR)$(qemu_datadir)"; \
done
endif
ifeq ($(CONFIG_GTK),y)
$(MAKE) -C po $@
endif
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/keymaps"
set -e; for x in $(KEYMAPS); do \
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/keymaps/$$x "$(DESTDIR)$(qemu_datadir)/keymaps"; \
done
$(INSTALL_DATA) $(SRC_PATH)/trace-events "$(DESTDIR)$(qemu_datadir)/trace-events"
for d in $(TARGET_DIRS); do \
$(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \
$(MAKE) -C $$d $@ || exit 1 ; \
done
# various test targets
@@ -434,30 +332,13 @@ test speed: all
.PHONY: TAGS
TAGS:
rm -f $@
find "$(SRC_PATH)" -name '*.[hc]' -exec etags --append {} +
find "$(SRC_PATH)" -name '*.[hc]' -print0 | xargs -0 etags
cscope:
rm -f ./cscope.*
find "$(SRC_PATH)" -name "*.[chsS]" -print | sed 's,^\./,,' > ./cscope.files
cscope -b
# opengl shader programs
ui/shader/%-vert.h: $(SRC_PATH)/ui/shader/%.vert $(SRC_PATH)/scripts/shaderinclude.pl
@mkdir -p $(dir $@)
$(call quiet-command,\
perl $(SRC_PATH)/scripts/shaderinclude.pl $< > $@,\
" VERT $@")
ui/shader/%-frag.h: $(SRC_PATH)/ui/shader/%.frag $(SRC_PATH)/scripts/shaderinclude.pl
@mkdir -p $(dir $@)
$(call quiet-command,\
perl $(SRC_PATH)/scripts/shaderinclude.pl $< > $@,\
" FRAG $@")
ui/console-gl.o: $(SRC_PATH)/ui/console-gl.c \
ui/shader/texture-blit-vert.h ui/shader/texture-blit-frag.h
# documentation
MAKEINFO=makeinfo
MAKEINFOFLAGS=--no-headers --no-split --number-sections
@@ -481,7 +362,7 @@ qemu-options.texi: $(SRC_PATH)/qemu-options.hx
qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
QMP/qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -q < $< > $@," GEN $@")
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx
@@ -511,12 +392,6 @@ qemu-nbd.8: qemu-nbd.texi
$(POD2MAN) --section=8 --center=" " --release=" " qemu-nbd.pod > $@, \
" GEN $@")
kvm_stat.1: scripts/kvm/kvm_stat.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< kvm_stat.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " kvm_stat.pod > $@, \
" GEN $@")
dvi: qemu-doc.dvi qemu-tech.dvi
html: qemu-doc.html qemu-tech.html
info: qemu-doc.info qemu-tech.info
@@ -526,66 +401,9 @@ qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
qemu-img.texi qemu-nbd.texi qemu-options.texi \
qemu-monitor.texi qemu-img-cmds.texi
ifdef CONFIG_WIN32
INSTALLER = qemu-setup-$(VERSION)$(EXESUF)
nsisflags = -V2 -NOCD
ifneq ($(wildcard $(SRC_PATH)/dll),)
ifeq ($(ARCH),x86_64)
# 64 bit executables
DLL_PATH = $(SRC_PATH)/dll/w64
nsisflags += -DW64
else
# 32 bit executables
DLL_PATH = $(SRC_PATH)/dll/w32
endif
endif
.PHONY: installer
installer: $(INSTALLER)
INSTDIR=/tmp/qemu-nsis
$(INSTALLER): $(SRC_PATH)/qemu.nsi
$(MAKE) install prefix=${INSTDIR}
ifdef SIGNCODE
(cd ${INSTDIR}; \
for i in *.exe; do \
$(SIGNCODE) $${i}; \
done \
)
endif # SIGNCODE
(cd ${INSTDIR}; \
for i in qemu-system-*.exe; do \
arch=$${i%.exe}; \
arch=$${arch#qemu-system-}; \
echo Section \"$$arch\" Section_$$arch; \
echo SetOutPath \"\$$INSTDIR\"; \
echo File \"\$${BINDIR}\\$$i\"; \
echo SectionEnd; \
done \
) >${INSTDIR}/system-emulations.nsh
makensis $(nsisflags) \
$(if $(BUILD_DOCS),-DCONFIG_DOCUMENTATION="y") \
$(if $(CONFIG_GTK),-DCONFIG_GTK="y") \
-DBINDIR="${INSTDIR}" \
$(if $(DLL_PATH),-DDLLDIR="$(DLL_PATH)") \
-DSRCDIR="$(SRC_PATH)" \
-DOUTFILE="$(INSTALLER)" \
$(SRC_PATH)/qemu.nsi
rm -r ${INSTDIR}
ifdef SIGNCODE
$(SIGNCODE) $(INSTALLER)
endif # SIGNCODE
endif # CONFIG_WIN
# Add a dependency on the generated files, so that they are always
# rebuilt before other object files
ifneq ($(filter-out %clean,$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
Makefile: $(GENERATED_HEADERS)
endif
# Include automatically generated dependency files
# Dependencies in Makefile.objs files come from our recursive subdir rules

20
Makefile.dis Normal file
View File

@@ -0,0 +1,20 @@
# Makefile for disassemblers.
include ../config-host.mak
include config.mak
include $(SRC_PATH)/rules.mak
.PHONY: all
$(call set-vpath, $(SRC_PATH))
QEMU_CFLAGS+=-I..
include $(SRC_PATH)/Makefile.objs
all: $(libdis-y)
# Dummy command so that make thinks it has done something
@true
clean:
rm -f *.o *.d *.a *~

23
Makefile.hw Normal file
View File

@@ -0,0 +1,23 @@
# Makefile for qemu target independent devices.
include ../config-host.mak
include ../config-all-devices.mak
include config.mak
include $(SRC_PATH)/rules.mak
.PHONY: all
$(call set-vpath, $(SRC_PATH))
QEMU_CFLAGS+=-I..
QEMU_CFLAGS += -I$(SRC_PATH)/include
include $(SRC_PATH)/Makefile.objs
all: $(hw-obj-y)
# Dummy command so that make thinks it has done something
@true
clean:
rm -f $(addsuffix *.o, $(sort $(dir $(hw-obj-y))))
rm -f $(addsuffix *.d, $(sort $(dir $(hw-obj-y))))

View File

@@ -1,25 +1,207 @@
#######################################################################
# Common libraries for tools and emulators
stub-obj-y = stubs/
util-obj-y = util/ qobject/ qapi/ qapi-types.o qapi-visit.o qapi-event.o
# Target-independent parts used in system and user emulation
universal-obj-y =
universal-obj-y += qemu-log.o
#######################################################################
# QObject
qobject-obj-y = qint.o qstring.o qdict.o qlist.o qfloat.o qbool.o
qobject-obj-y += qjson.o json-lexer.o json-streamer.o json-parser.o
qobject-obj-y += qerror.o error.o qemu-error.o
universal-obj-y += $(qobject-obj-y)
#######################################################################
# QOM
qom-obj-y = qom/
universal-obj-y += $(qom-obj-y)
#######################################################################
# oslib-obj-y is code depending on the OS (win32 vs posix)
oslib-obj-y = osdep.o
oslib-obj-$(CONFIG_WIN32) += oslib-win32.o qemu-thread-win32.o
oslib-obj-$(CONFIG_POSIX) += oslib-posix.o qemu-thread-posix.o
#######################################################################
# coroutines
coroutine-obj-y = qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
coroutine-obj-y += qemu-coroutine-sleep.o
ifeq ($(CONFIG_UCONTEXT_COROUTINE),y)
coroutine-obj-$(CONFIG_POSIX) += coroutine-ucontext.o
else
ifeq ($(CONFIG_SIGALTSTACK_COROUTINE),y)
coroutine-obj-$(CONFIG_POSIX) += coroutine-sigaltstack.o
else
coroutine-obj-$(CONFIG_POSIX) += coroutine-gthread.o
endif
endif
coroutine-obj-$(CONFIG_WIN32) += coroutine-win32.o
#######################################################################
# block-obj-y is code used by both qemu system emulation and qemu-img
block-obj-y = async.o thread-pool.o
block-obj-y += nbd.o block.o blockjob.o
block-obj-y += main-loop.o iohandler.o qemu-timer.o
block-obj-$(CONFIG_POSIX) += aio-posix.o
block-obj-$(CONFIG_WIN32) += aio-win32.o
block-obj-y = cutils.o iov.o cache-utils.o qemu-option.o module.o async.o
block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o qemu-progress.o qemu-sockets.o
block-obj-y += $(coroutine-obj-y) $(qobject-obj-y) $(version-obj-y)
block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
block-obj-y += block/
block-obj-y += qemu-io-cmds.o
block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
block-obj-y += qemu-coroutine-sleep.o
block-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
# Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.
# only pull in the actual virtio-9p device if we also enabled virtio.
CONFIG_REALLY_VIRTFS=y
endif
block-obj-m = block/
######################################################################
# Target independent part of system emulation. The long term path is to
# suppress *all* target specific code in case of system emulation, i.e. a
# single QEMU executable should support all CPUs and machines.
common-obj-y = $(block-obj-y) blockdev.o
common-obj-y += net.o net/
common-obj-y += qom/
common-obj-y += readline.o console.o cursor.o
common-obj-y += $(oslib-obj-y)
common-obj-$(CONFIG_WIN32) += os-win32.o
common-obj-$(CONFIG_POSIX) += os-posix.o
common-obj-$(CONFIG_LINUX) += fsdev/
extra-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += tcg-runtime.o host-utils.o main-loop.o
common-obj-y += input.o
common-obj-y += buffered_file.o migration.o migration-tcp.o
common-obj-y += qemu-char.o #aio.o
common-obj-y += block-migration.o iohandler.o
common-obj-y += pflib.o
common-obj-y += bitmap.o bitops.o
common-obj-y += page_cache.o
common-obj-$(CONFIG_POSIX) += migration-exec.o migration-unix.o migration-fd.o
common-obj-$(CONFIG_WIN32) += version.o
common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-y += hw/
common-obj-y += ui/
common-obj-y += bt-host.o bt-vhci.o
common-obj-y += iov.o acl.o
common-obj-$(CONFIG_POSIX) += compatfd.o
common-obj-y += notify.o event_notifier.o
common-obj-y += qemu-timer.o qemu-timer-common.o
common-obj-$(CONFIG_SLIRP) += slirp/
######################################################################
# libseccomp
ifeq ($(CONFIG_SECCOMP),y)
common-obj-y += qemu-seccomp.o
endif
######################################################################
# libuser
user-obj-y =
user-obj-y += envlist.o path.o
user-obj-y += tcg-runtime.o host-utils.o
user-obj-y += cutils.o iov.o cache-utils.o
user-obj-y += module.o
user-obj-y += qemu-user.o
user-obj-y += $(trace-obj-y)
user-obj-y += qom/
######################################################################
# libhw
hw-obj-y = vl.o dma-helpers.o qtest.o hw/
######################################################################
# libdis
# NOTE: the disassembler code is only needed for debugging
libdis-y =
libdis-$(CONFIG_ALPHA_DIS) += alpha-dis.o
libdis-$(CONFIG_ARM_DIS) += arm-dis.o
libdis-$(CONFIG_CRIS_DIS) += cris-dis.o
libdis-$(CONFIG_HPPA_DIS) += hppa-dis.o
libdis-$(CONFIG_I386_DIS) += i386-dis.o
libdis-$(CONFIG_IA64_DIS) += ia64-dis.o
libdis-$(CONFIG_M68K_DIS) += m68k-dis.o
libdis-$(CONFIG_MICROBLAZE_DIS) += microblaze-dis.o
libdis-$(CONFIG_MIPS_DIS) += mips-dis.o
libdis-$(CONFIG_PPC_DIS) += ppc-dis.o
libdis-$(CONFIG_S390_DIS) += s390-dis.o
libdis-$(CONFIG_SH4_DIS) += sh4-dis.o
libdis-$(CONFIG_SPARC_DIS) += sparc-dis.o
libdis-$(CONFIG_LM32_DIS) += lm32-dis.o
######################################################################
# trace
ifeq ($(TRACE_BACKEND),dtrace)
TRACE_H_EXTRA_DEPS=trace-dtrace.h
endif
trace.h: trace.h-timestamp $(TRACE_H_EXTRA_DEPS)
trace.h-timestamp: $(SRC_PATH)/trace-events $(BUILD_DIR)/config-host.mak
$(call quiet-command,$(TRACETOOL) \
--format=h \
--backend=$(TRACE_BACKEND) \
< $< > $@," GEN trace.h")
@cmp -s $@ trace.h || cp $@ trace.h
trace.c: trace.c-timestamp
trace.c-timestamp: $(SRC_PATH)/trace-events $(BUILD_DIR)/config-host.mak
$(call quiet-command,$(TRACETOOL) \
--format=c \
--backend=$(TRACE_BACKEND) \
< $< > $@," GEN trace.c")
@cmp -s $@ trace.c || cp $@ trace.c
trace.o: trace.c $(GENERATED_HEADERS)
trace-dtrace.h: trace-dtrace.dtrace
$(call quiet-command,dtrace -o $@ -h -s $<, " GEN trace-dtrace.h")
# Normal practice is to name DTrace probe file with a '.d' extension
# but that gets picked up by QEMU's Makefile as an external dependency
# rule file. So we use '.dtrace' instead
trace-dtrace.dtrace: trace-dtrace.dtrace-timestamp
trace-dtrace.dtrace-timestamp: $(SRC_PATH)/trace-events $(BUILD_DIR)/config-host.mak
$(call quiet-command,$(TRACETOOL) \
--format=d \
--backend=$(TRACE_BACKEND) \
< $< > $@," GEN trace-dtrace.dtrace")
@cmp -s $@ trace-dtrace.dtrace || cp $@ trace-dtrace.dtrace
trace-dtrace.o: trace-dtrace.dtrace $(GENERATED_HEADERS)
$(call quiet-command,dtrace -o $@ -G -s $<, " GEN trace-dtrace.o")
ifeq ($(LIBTOOL),)
trace-dtrace.lo: trace-dtrace.dtrace
@echo "missing libtool. please install and rerun configure."; exit 1
else
trace-dtrace.lo: trace-dtrace.dtrace
$(call quiet-command,$(LIBTOOL) --mode=compile --tag=CC dtrace -o $@ -G -s $<, " lt GEN trace-dtrace.o")
endif
trace/simple.o: trace/simple.c $(GENERATED_HEADERS)
trace-obj-$(CONFIG_TRACE_DTRACE) += trace-dtrace.o
ifneq ($(TRACE_BACKEND),dtrace)
trace-obj-y = trace.o
endif
trace-obj-$(CONFIG_TRACE_DEFAULT) += trace/default.o
trace-obj-$(CONFIG_TRACE_SIMPLE) += trace/simple.o
trace-obj-$(CONFIG_TRACE_SIMPLE) += qemu-timer-common.o
trace-obj-$(CONFIG_TRACE_STDERR) += trace/stderr.o
trace-obj-y += trace/control.o
$(trace-obj-y): $(GENERATED_HEADERS)
######################################################################
# smartcard
@@ -29,82 +211,40 @@ libcacard-y += libcacard/vcard.o libcacard/vreader.o
libcacard-y += libcacard/vcard_emul_nss.o
libcacard-y += libcacard/vcard_emul_type.o
libcacard-y += libcacard/card_7816.o
libcacard-y += libcacard/vcardt.o
libcacard/vcard_emul_nss.o-cflags := $(NSS_CFLAGS)
libcacard/vcard_emul_nss.o-libs := $(NSS_LIBS)
######################################################################
# Target independent part of system emulation. The long term path is to
# suppress *all* target specific code in case of system emulation, i.e. a
# single QEMU executable should support all CPUs and machines.
ifeq ($(CONFIG_SOFTMMU),y)
common-obj-y = blockdev.o blockdev-nbd.o block/
common-obj-y += iothread.o
common-obj-y += net/
common-obj-y += qdev-monitor.o device-hotplug.o
common-obj-$(CONFIG_WIN32) += os-win32.o
common-obj-$(CONFIG_POSIX) += os-posix.o
common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration/
common-obj-y += qemu-char.o #aio.o
common-obj-y += page_cache.o
common-obj-y += qjson.o
common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-y += hw/
common-obj-y += accel.o
common-obj-y += ui/
common-obj-y += bt-host.o bt-vhci.o
bt-host.o-cflags := $(BLUEZ_CFLAGS)
common-obj-y += dma-helpers.o
common-obj-y += vl.o
vl.o-cflags := $(GPROF_CFLAGS) $(SDL_CFLAGS)
common-obj-y += tpm.o
common-obj-$(CONFIG_SLIRP) += slirp/
common-obj-y += backends/
common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
common-obj-$(CONFIG_SMARTCARD_NSS) += $(libcacard-y)
######################################################################
# qapi
common-obj-y += qmp-marshal.o
qapi-obj-y = qapi/
qapi-obj-y += qapi-types.o qapi-visit.o
common-obj-y += qmp-marshal.o qapi-visit.o qapi-types.o
common-obj-y += qmp.o hmp.o
endif
#######################################################################
# Target-independent parts used in system and user emulation
common-obj-y += qemu-log.o
common-obj-y += tcg-runtime.o
common-obj-y += hw/
common-obj-y += qom/
common-obj-y += disas/
######################################################################
# Resource file for Windows executables
version-obj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.o
version-lobj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.lo
######################################################################
# tracing
util-obj-y += trace/
target-obj-y += trace/
universal-obj-y += $(qapi-obj-y)
######################################################################
# guest agent
# FIXME: a few definitions from qapi-types.o/qapi-visit.o are needed
# by libqemuutil.a. These should be moved to a separate .json schema.
qga-obj-y = qga/
qga-vss-dll-obj-y = qga/
qga-obj-y = qga/ qemu-ga.o module.o
qga-obj-$(CONFIG_WIN32) += oslib-win32.o
qga-obj-$(CONFIG_POSIX) += oslib-posix.o qemu-sockets.o qemu-option.o
vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
vl.o: QEMU_CFLAGS+=$(SDL_CFLAGS)
QEMU_CFLAGS+=$(GLIB_CFLAGS)
nested-vars += \
hw-obj-y \
qga-obj-y \
block-obj-y \
qom-obj-y \
qapi-obj-y \
user-obj-y \
common-obj-y \
extra-obj-y
dummy := $(call unnest-vars)

View File

@@ -1,9 +1,12 @@
# -*- Mode: makefile -*-
include ../config-host.mak
include config-target.mak
include config-devices.mak
include config-target.mak
include $(SRC_PATH)/rules.mak
ifneq ($(HWDIR),)
include $(HWDIR)/config.mak
endif
$(call set-vpath, $(SRC_PATH))
ifdef CONFIG_LINUX
@@ -15,30 +18,31 @@ QEMU_CFLAGS+=-I$(SRC_PATH)/include
ifdef CONFIG_USER_ONLY
# user emulator name
QEMU_PROG=qemu-$(TARGET_NAME)
QEMU_PROG_BUILD = $(QEMU_PROG)
QEMU_PROG=qemu-$(TARGET_ARCH2)
else
# system emulator name
QEMU_PROG=qemu-system-$(TARGET_NAME)$(EXESUF)
ifneq (,$(findstring -mwindows,$(libs_softmmu)))
ifneq (,$(findstring -mwindows,$(LIBS)))
# Terminate program name with a 'w' because the linker builds a windows executable.
QEMU_PROGW=qemu-system-$(TARGET_NAME)w$(EXESUF)
$(QEMU_PROG): $(QEMU_PROGW)
$(call quiet-command,$(OBJCOPY) --subsystem console $(QEMU_PROGW) $(QEMU_PROG)," GEN $(TARGET_DIR)$(QEMU_PROG)")
QEMU_PROG_BUILD = $(QEMU_PROGW)
else
QEMU_PROG_BUILD = $(QEMU_PROG)
endif
QEMU_PROGW=qemu-system-$(TARGET_ARCH2)w$(EXESUF)
endif # windows executable
QEMU_PROG=qemu-system-$(TARGET_ARCH2)$(EXESUF)
endif
PROGS=$(QEMU_PROG) $(QEMU_PROGW)
PROGS=$(QEMU_PROG)
ifdef QEMU_PROGW
PROGS+=$(QEMU_PROGW)
endif
STPFILES=
ifndef CONFIG_HAIKU
LIBS+=-lm
endif
config-target.h: config-target.h-timestamp
config-target.h-timestamp: config-target.mak
ifdef CONFIG_TRACE_SYSTEMTAP
stap: $(QEMU_PROG).stp-installed $(QEMU_PROG).stp $(QEMU_PROG)-simpletrace.stp
stap: $(QEMU_PROG).stp
ifdef CONFIG_USER_ONLY
TARGET_TYPE=user
@@ -46,31 +50,14 @@ else
TARGET_TYPE=system
endif
$(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backends=$(TRACE_BACKENDS) \
--binary=$(bindir)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp-installed")
$(QEMU_PROG).stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backends=$(TRACE_BACKENDS) \
--binary=$(realpath .)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--backend=$(TRACE_BACKEND) \
--binary=$(bindir)/$(QEMU_PROG) \
--target-arch=$(TARGET_ARCH) \
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp")
$(QEMU_PROG)-simpletrace.stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--format=simpletrace-stap \
--backends=$(TRACE_BACKENDS) \
--probe-prefix=qemu.$(TARGET_TYPE).$(TARGET_NAME) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG)-simpletrace.stp")
< $< > $@," GEN $(QEMU_PROG).stp")
else
stap:
endif
@@ -83,20 +70,15 @@ all: $(PROGS) stap
#########################################################
# cpu emulator library
obj-y = exec.o translate-all.o cpu-exec.o
obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
obj-y += tcg/tcg.o tcg/optimize.o
obj-$(CONFIG_TCG_INTERPRETER) += tci.o
obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
obj-y += fpu/softfloat.o
obj-y += target-$(TARGET_BASE_ARCH)/
obj-y += disas.o
obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
obj-$(CONFIG_TCI_DIS) += tci-dis.o
obj-y += target-$(TARGET_BASE_ARCH)/
obj-$(CONFIG_GDBSTUB_XML) += gdbstub-xml.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decContext.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decNumber.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal32.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal64.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal128.o
tci-dis.o: QEMU_CFLAGS += -I$(SRC_PATH)/tcg -I$(SRC_PATH)/tcg/tci
#########################################################
# Linux user emulator target
@@ -106,7 +88,7 @@ ifdef CONFIG_LINUX_USER
QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) -I$(SRC_PATH)/linux-user
obj-y += linux-user/
obj-y += gdbstub.o thunk.o user-exec.o
obj-y += gdbstub.o thunk.o user-exec.o $(oslib-obj-y)
endif #CONFIG_LINUX_USER
@@ -115,74 +97,82 @@ endif #CONFIG_LINUX_USER
ifdef CONFIG_BSD_USER
QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ABI_DIR) \
-I$(SRC_PATH)/bsd-user/$(HOST_VARIANT_DIR)
QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ARCH)
obj-y += bsd-user/
obj-y += gdbstub.o user-exec.o
obj-y += gdbstub.o user-exec.o $(oslib-obj-y)
endif #CONFIG_BSD_USER
#########################################################
# System emulator target
ifdef CONFIG_SOFTMMU
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
obj-y += qtest.o bootdevice.o
CONFIG_NO_PCI = $(if $(subst n,,$(CONFIG_PCI)),n,y)
CONFIG_NO_KVM = $(if $(subst n,,$(CONFIG_KVM)),n,y)
CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y)
CONFIG_NO_GET_MEMORY_MAPPING = $(if $(subst n,,$(CONFIG_HAVE_GET_MEMORY_MAPPING)),n,y)
CONFIG_NO_CORE_DUMP = $(if $(subst n,,$(CONFIG_HAVE_CORE_DUMP)),n,y)
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
obj-y += hw/
obj-$(CONFIG_FDT) += device_tree.o
obj-$(CONFIG_KVM) += kvm-all.o
obj-$(CONFIG_NO_KVM) += kvm-stub.o
obj-y += memory.o savevm.o cputlb.o
obj-y += memory_mapping.o
obj-y += dump.o
LIBS := $(libs_softmmu) $(LIBS)
obj-$(CONFIG_HAVE_GET_MEMORY_MAPPING) += memory_mapping.o
obj-$(CONFIG_HAVE_CORE_DUMP) += dump.o
obj-$(CONFIG_NO_GET_MEMORY_MAPPING) += memory_mapping-stub.o
obj-$(CONFIG_NO_CORE_DUMP) += dump-stub.o
LIBS+=-lz
QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
QEMU_CFLAGS += $(VNC_JPEG_CFLAGS)
QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
# xen support
obj-$(CONFIG_XEN) += xen-common.o
obj-$(CONFIG_XEN_I386) += xen-hvm.o xen-mapcache.o
obj-$(call lnot,$(CONFIG_XEN)) += xen-common-stub.o
obj-$(call lnot,$(CONFIG_XEN_I386)) += xen-hvm-stub.o
obj-$(CONFIG_XEN) += xen-all.o xen-mapcache.o
obj-$(CONFIG_NO_XEN) += xen-stub.o
# Hardware support
ifeq ($(TARGET_NAME), sparc64)
ifeq ($(TARGET_ARCH), sparc64)
obj-y += hw/sparc64/
else
obj-y += hw/$(TARGET_BASE_ARCH)/
endif
main.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
GENERATED_HEADERS += hmp-commands.h qmp-commands-old.h
endif # CONFIG_SOFTMMU
# Workaround for http://gcc.gnu.org/PR55489, see configure.
%/translate.o: QEMU_CFLAGS += $(TRANSLATE_OPT_CFLAGS)
nested-vars += obj-y
dummy := $(call unnest-vars,,obj-y)
all-obj-y := $(obj-y)
target-obj-y :=
block-obj-y :=
common-obj-y :=
# This resolves all nested paths, so it must come last
include $(SRC_PATH)/Makefile.objs
dummy := $(call unnest-vars,,target-obj-y)
target-obj-y-save := $(target-obj-y)
dummy := $(call unnest-vars,.., \
block-obj-y \
block-obj-m \
common-obj-y \
common-obj-m)
target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
$(QEMU_PROG_BUILD): config-devices.mak
all-obj-y = $(obj-y)
all-obj-y += $(addprefix ../, $(universal-obj-y))
# build either PROG or PROGW
$(QEMU_PROG_BUILD): $(all-obj-y) ../libqemuutil.a ../libqemustub.a
$(call LINK, $(filter-out %.mak, $^))
ifdef CONFIG_DARWIN
$(call quiet-command,Rez -append $(SRC_PATH)/pc-bios/qemu.rsrc -o $@," REZ $(TARGET_DIR)$@")
$(call quiet-command,SetFile -a C $@," SETFILE $(TARGET_DIR)$@")
ifdef CONFIG_SOFTMMU
all-obj-y += $(addprefix ../, $(common-obj-y))
all-obj-y += $(addprefix ../libdis/, $(libdis-y))
all-obj-y += $(addprefix $(HWDIR)/, $(hw-obj-y))
all-obj-y += $(addprefix ../, $(trace-obj-y))
else
all-obj-y += $(addprefix ../libuser/, $(user-obj-y))
all-obj-y += $(addprefix ../libdis-user/, $(libdis-y))
endif #CONFIG_LINUX_USER
ifdef QEMU_PROGW
# The linker builds a windows executable. Make also a console executable.
$(QEMU_PROGW): $(all-obj-y)
$(call LINK,$^)
$(QEMU_PROG): $(QEMU_PROGW)
$(call quiet-command,$(OBJCOPY) --subsystem console $(QEMU_PROGW) $(QEMU_PROG)," GEN $(TARGET_DIR)$(QEMU_PROG)")
else
$(QEMU_PROG): $(all-obj-y)
$(call LINK,$^)
endif
gdbstub-xml.c: $(TARGET_XML_FILES) $(SRC_PATH)/scripts/feature_to_c.sh
@@ -204,12 +194,14 @@ endif
install: all
ifneq ($(PROGS),)
$(call install-prog,$(PROGS),$(DESTDIR)$(bindir))
$(INSTALL) -m 755 $(PROGS) "$(DESTDIR)$(bindir)"
ifneq ($(STRIP),)
$(STRIP) $(patsubst %,"$(DESTDIR)$(bindir)/%",$(PROGS))
endif
endif
ifdef CONFIG_TRACE_SYSTEMTAP
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset"
$(INSTALL_DATA) $(QEMU_PROG).stp-installed "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG).stp"
$(INSTALL_DATA) $(QEMU_PROG)-simpletrace.stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG)-simpletrace.stp"
$(INSTALL_DATA) $(QEMU_PROG).stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset"
endif
GENERATED_HEADERS += config-target.h

24
Makefile.user Normal file
View File

@@ -0,0 +1,24 @@
# Makefile for qemu target independent user files.
include ../config-host.mak
include $(SRC_PATH)/rules.mak
-include config.mak
.PHONY: all
$(call set-vpath, $(SRC_PATH))
QEMU_CFLAGS+=-I..
QEMU_CFLAGS += -I$(SRC_PATH)/include
QEMU_CFLAGS += -DCONFIG_USER_ONLY
include $(SRC_PATH)/Makefile.objs
all: $(user-obj-y)
# Dummy command so that make thinks it has done something
@true
clean:
for d in . trace; do \
rm -f $$d/*.o $$d/*.d $$d/*.a $$d/*~; \
done

88
QMP/README Normal file
View File

@@ -0,0 +1,88 @@
QEMU Monitor Protocol
=====================
Introduction
-------------
The QEMU Monitor Protocol (QMP) allows applications to communicate with
QEMU's Monitor.
QMP is JSON[1] based and currently has the following features:
- Lightweight, text-based, easy to parse data format
- Asynchronous messages support (ie. events)
- Capabilities Negotiation
For detailed information on QMP's usage, please, refer to the following files:
o qmp-spec.txt QEMU Monitor Protocol current specification
o qmp-commands.txt QMP supported commands (auto-generated at build-time)
o qmp-events.txt List of available asynchronous events
There is also a simple Python script called 'qmp-shell' available.
IMPORTANT: It's strongly recommended to read the 'Stability Considerations'
section in the qmp-commands.txt file before making any serious use of QMP.
[1] http://www.json.org
Usage
-----
To enable QMP, you need a QEMU monitor instance in "control mode". There are
two ways of doing this.
The simplest one is using the '-qmp' command-line option. The following
example makes QMP available on localhost port 4444:
$ qemu [...] -qmp tcp:localhost:4444,server
However, in order to have more complex combinations, like multiple monitors,
the '-mon' command-line option should be used along with the '-chardev' one.
For instance, the following example creates one user monitor on stdio and one
QMP monitor on localhost port 4444.
$ qemu [...] -chardev stdio,id=mon0 -mon chardev=mon0,mode=readline \
-chardev socket,id=mon1,host=localhost,port=4444,server \
-mon chardev=mon1,mode=control
Please, refer to QEMU's manpage for more information.
Simple Testing
--------------
To manually test QMP one can connect with telnet and issue commands by hand:
$ telnet localhost 4444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 50, "minor": 13, "major": 0}, "package": ""}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{ "execute": "query-version" }
{"return": {"qemu": {"micro": 50, "minor": 13, "major": 0}, "package": ""}}
Development Process
-------------------
When changing QMP's interface (by adding new commands, events or modifying
existing ones) it's mandatory to update the relevant documentation, which is
one (or more) of the files listed in the 'Introduction' section*.
Also, it's strongly recommended to send the documentation patch first, before
doing any code change. This is so because:
1. Avoids the code dictating the interface
2. Review can improve your interface. Letting that happen before
you implement it can save you work.
* The qmp-commands.txt file is generated from the qmp-commands.hx one, which
is the file that should be edited.
Homepage
--------
http://wiki.qemu.org/QMP

388
QMP/qmp-events.txt Normal file
View File

@@ -0,0 +1,388 @@
QEMU Monitor Protocol Events
============================
BALLOON_CHANGE
--------------
Emitted when the guest changes the actual BALLOON level. This
value is equivalent to the 'actual' field return by the
'query-balloon' command
Data:
- "actual": actual level of the guest memory balloon in bytes (json-number)
Example:
{ "event": "BALLOON_CHANGE",
"data": { "actual": 944766976 },
"timestamp": { "seconds": 1267020223, "microseconds": 435656 } }
BLOCK_IO_ERROR
--------------
Emitted when a disk I/O error occurs.
Data:
- "device": device name (json-string)
- "operation": I/O operation (json-string, "read" or "write")
- "action": action that has been taken, it's one of the following (json-string):
"ignore": error has been ignored
"report": error has been reported to the device
"stop": error caused VM to be stopped
Example:
{ "event": "BLOCK_IO_ERROR",
"data": { "device": "ide0-hd1",
"operation": "write",
"action": "stop" },
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
Note: If action is "stop", a STOP event will eventually follow the
BLOCK_IO_ERROR event.
BLOCK_JOB_CANCELLED
-------------------
Emitted when a block job has been cancelled.
Data:
- "type": Job type ("stream" for image streaming, json-string)
- "device": Device name (json-string)
- "len": Maximum progress value (json-int)
- "offset": Current progress value (json-int)
On success this is equal to len.
On failure this is less than len.
- "speed": Rate limit, bytes per second (json-int)
Example:
{ "event": "BLOCK_JOB_CANCELLED",
"data": { "type": "stream", "device": "virtio-disk0",
"len": 10737418240, "offset": 134217728,
"speed": 0 },
"timestamp": { "seconds": 1267061043, "microseconds": 959568 } }
BLOCK_JOB_COMPLETED
-------------------
Emitted when a block job has completed.
Data:
- "type": Job type ("stream" for image streaming, json-string)
- "device": Device name (json-string)
- "len": Maximum progress value (json-int)
- "offset": Current progress value (json-int)
On success this is equal to len.
On failure this is less than len.
- "speed": Rate limit, bytes per second (json-int)
- "error": Error message (json-string, optional)
Only present on failure. This field contains a human-readable
error message. There are no semantics other than that streaming
has failed and clients should not try to interpret the error
string.
Example:
{ "event": "BLOCK_JOB_COMPLETED",
"data": { "type": "stream", "device": "virtio-disk0",
"len": 10737418240, "offset": 10737418240,
"speed": 0 },
"timestamp": { "seconds": 1267061043, "microseconds": 959568 } }
DEVICE_TRAY_MOVED
-----------------
It's emitted whenever the tray of a removable device is moved by the guest
or by HMP/QMP commands.
Data:
- "device": device name (json-string)
- "tray-open": true if the tray has been opened or false if it has been closed
(json-bool)
{ "event": "DEVICE_TRAY_MOVED",
"data": { "device": "ide1-cd0",
"tray-open": true
},
"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
RESET
-----
Emitted when the Virtual Machine is reseted.
Data: None.
Example:
{ "event": "RESET",
"timestamp": { "seconds": 1267041653, "microseconds": 9518 } }
RESUME
------
Emitted when the Virtual Machine resumes execution.
Data: None.
Example:
{ "event": "RESUME",
"timestamp": { "seconds": 1271770767, "microseconds": 582542 } }
RTC_CHANGE
----------
Emitted when the guest changes the RTC time.
Data:
- "offset": delta against the host UTC in seconds (json-number)
Example:
{ "event": "RTC_CHANGE",
"data": { "offset": 78 },
"timestamp": { "seconds": 1267020223, "microseconds": 435656 } }
SHUTDOWN
--------
Emitted when the Virtual Machine is powered down.
Data: None.
Example:
{ "event": "SHUTDOWN",
"timestamp": { "seconds": 1267040730, "microseconds": 682951 } }
Note: If the command-line option "-no-shutdown" has been specified, a STOP
event will eventually follow the SHUTDOWN event.
SPICE_CONNECTED, SPICE_DISCONNECTED
-----------------------------------
Emitted when a SPICE client connects or disconnects.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "client": Client information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
Example:
{ "timestamp": {"seconds": 1290688046, "microseconds": 388707},
"event": "SPICE_CONNECTED",
"data": {
"server": { "port": "5920", "family": "ipv4", "host": "127.0.0.1"},
"client": {"port": "52873", "family": "ipv4", "host": "127.0.0.1"}
}}
SPICE_INITIALIZED
-----------------
Emitted after initial handshake and authentication takes place (if any)
and the SPICE channel is up'n'running
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "auth": authentication method (json-string, optional)
- "client": Client information (json-object)
- "host": IP address (json-string)
- "port": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "connection-id": spice connection id. All channels with the same id
belong to the same spice session (json-int)
- "channel-type": channel type. "1" is the main control channel, filter for
this one if you want track spice sessions only (json-int)
- "channel-id": channel id. Usually "0", might be different needed when
multiple channels of the same type exist, such as multiple
display channels in a multihead setup (json-int)
- "tls": whevener the channel is encrypted (json-bool)
Example:
{ "timestamp": {"seconds": 1290688046, "microseconds": 417172},
"event": "SPICE_INITIALIZED",
"data": {"server": {"auth": "spice", "port": "5921",
"family": "ipv4", "host": "127.0.0.1"},
"client": {"port": "49004", "family": "ipv4", "channel-type": 3,
"connection-id": 1804289383, "host": "127.0.0.1",
"channel-id": 0, "tls": true}
}}
STOP
----
Emitted when the Virtual Machine is stopped.
Data: None.
Example:
{ "event": "STOP",
"timestamp": { "seconds": 1267041730, "microseconds": 281295 } }
SUSPEND
-------
Emitted when guest enters S3 state.
Data: None.
Example:
{ "event": "SUSPEND",
"timestamp": { "seconds": 1344456160, "microseconds": 309119 } }
SUSPEND_DISK
------------
Emitted when the guest makes a request to enter S4 state.
Data: None.
Example:
{ "event": "SUSPEND_DISK",
"timestamp": { "seconds": 1344456160, "microseconds": 309119 } }
Note: QEMU shuts down when entering S4 state.
VNC_CONNECTED
-------------
Emitted when a VNC client establishes a connection.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "auth": authentication method (json-string, optional)
- "client": Client information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
Example:
{ "event": "VNC_CONNECTED",
"data": {
"server": { "auth": "sasl", "family": "ipv4",
"service": "5901", "host": "0.0.0.0" },
"client": { "family": "ipv4", "service": "58425",
"host": "127.0.0.1" } },
"timestamp": { "seconds": 1262976601, "microseconds": 975795 } }
Note: This event is emitted before any authentication takes place, thus
the authentication ID is not provided.
VNC_DISCONNECTED
----------------
Emitted when the connection is closed.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "auth": authentication method (json-string, optional)
- "client": Client information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "x509_dname": TLS dname (json-string, optional)
- "sasl_username": SASL username (json-string, optional)
Example:
{ "event": "VNC_DISCONNECTED",
"data": {
"server": { "auth": "sasl", "family": "ipv4",
"service": "5901", "host": "0.0.0.0" },
"client": { "family": "ipv4", "service": "58425",
"host": "127.0.0.1", "sasl_username": "luiz" } },
"timestamp": { "seconds": 1262976601, "microseconds": 975795 } }
VNC_INITIALIZED
---------------
Emitted after authentication takes place (if any) and the VNC session is
made active.
Data:
- "server": Server information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "auth": authentication method (json-string, optional)
- "client": Client information (json-object)
- "host": IP address (json-string)
- "service": port number (json-string)
- "family": address family (json-string, "ipv4" or "ipv6")
- "x509_dname": TLS dname (json-string, optional)
- "sasl_username": SASL username (json-string, optional)
Example:
{ "event": "VNC_INITIALIZED",
"data": {
"server": { "auth": "sasl", "family": "ipv4",
"service": "5901", "host": "0.0.0.0"},
"client": { "family": "ipv4", "service": "46089",
"host": "127.0.0.1", "sasl_username": "luiz" } },
"timestamp": { "seconds": 1263475302, "microseconds": 150772 } }
WAKEUP
------
Emitted when the guest has woken up from S3 and is running.
Data: None.
Example:
{ "event": "WATCHDOG",
"timestamp": { "seconds": 1344522075, "microseconds": 745528 } }
WATCHDOG
--------
Emitted when the watchdog device's timer is expired.
Data:
- "action": Action that has been taken, it's one of the following (json-string):
"reset", "shutdown", "poweroff", "pause", "debug", or "none"
Example:
{ "event": "WATCHDOG",
"data": { "action": "reset" },
"timestamp": { "seconds": 1267061043, "microseconds": 959568 } }
Note: If action is "reset", "shutdown", or "pause" the WATCHDOG event is
followed respectively by the RESET, SHUTDOWN, or STOP events.

259
QMP/qmp-shell Executable file
View File

@@ -0,0 +1,259 @@
#!/usr/bin/python
#
# Low-level QEMU shell on top of QMP.
#
# Copyright (C) 2009, 2010 Red Hat Inc.
#
# Authors:
# Luiz Capitulino <lcapitulino@redhat.com>
#
# This work is licensed under the terms of the GNU GPL, version 2. See
# the COPYING file in the top-level directory.
#
# Usage:
#
# Start QEMU with:
#
# # qemu [...] -qmp unix:./qmp-sock,server
#
# Run the shell:
#
# $ qmp-shell ./qmp-sock
#
# Commands have the following format:
#
# < command-name > [ arg-name1=arg1 ] ... [ arg-nameN=argN ]
#
# For example:
#
# (QEMU) device_add driver=e1000 id=net1
# {u'return': {}}
# (QEMU)
import qmp
import readline
import sys
class QMPCompleter(list):
def complete(self, text, state):
for cmd in self:
if cmd.startswith(text):
if not state:
return cmd
else:
state -= 1
class QMPShellError(Exception):
pass
class QMPShellBadPort(QMPShellError):
pass
# TODO: QMPShell's interface is a bit ugly (eg. _fill_completion() and
# _execute_cmd()). Let's design a better one.
class QMPShell(qmp.QEMUMonitorProtocol):
def __init__(self, address):
qmp.QEMUMonitorProtocol.__init__(self, self.__get_address(address))
self._greeting = None
self._completer = None
def __get_address(self, arg):
"""
Figure out if the argument is in the port:host form, if it's not it's
probably a file path.
"""
addr = arg.split(':')
if len(addr) == 2:
try:
port = int(addr[1])
except ValueError:
raise QMPShellBadPort
return ( addr[0], port )
# socket path
return arg
def _fill_completion(self):
for cmd in self.cmd('query-commands')['return']:
self._completer.append(cmd['name'])
def __completer_setup(self):
self._completer = QMPCompleter()
self._fill_completion()
readline.set_completer(self._completer.complete)
readline.parse_and_bind("tab: complete")
# XXX: default delimiters conflict with some command names (eg. query-),
# clearing everything as it doesn't seem to matter
readline.set_completer_delims('')
def __build_cmd(self, cmdline):
"""
Build a QMP input object from a user provided command-line in the
following format:
< command-name > [ arg-name1=arg1 ] ... [ arg-nameN=argN ]
"""
cmdargs = cmdline.split()
qmpcmd = { 'execute': cmdargs[0], 'arguments': {} }
for arg in cmdargs[1:]:
opt = arg.split('=')
try:
value = int(opt[1])
except ValueError:
value = opt[1]
qmpcmd['arguments'][opt[0]] = value
return qmpcmd
def _execute_cmd(self, cmdline):
try:
qmpcmd = self.__build_cmd(cmdline)
except:
print 'command format: <command-name> ',
print '[arg-name1=arg1] ... [arg-nameN=argN]'
return True
resp = self.cmd_obj(qmpcmd)
if resp is None:
print 'Disconnected'
return False
print resp
return True
def connect(self):
self._greeting = qmp.QEMUMonitorProtocol.connect(self)
self.__completer_setup()
def show_banner(self, msg='Welcome to the QMP low-level shell!'):
print msg
version = self._greeting['QMP']['version']['qemu']
print 'Connected to QEMU %d.%d.%d\n' % (version['major'],version['minor'],version['micro'])
def read_exec_command(self, prompt):
"""
Read and execute a command.
@return True if execution was ok, return False if disconnected.
"""
try:
cmdline = raw_input(prompt)
except EOFError:
print
return False
if cmdline == '':
for ev in self.get_events():
print ev
self.clear_events()
return True
else:
return self._execute_cmd(cmdline)
class HMPShell(QMPShell):
def __init__(self, address):
QMPShell.__init__(self, address)
self.__cpu_index = 0
def __cmd_completion(self):
for cmd in self.__cmd_passthrough('help')['return'].split('\r\n'):
if cmd and cmd[0] != '[' and cmd[0] != '\t':
name = cmd.split()[0] # drop help text
if name == 'info':
continue
if name.find('|') != -1:
# Command in the form 'foobar|f' or 'f|foobar', take the
# full name
opt = name.split('|')
if len(opt[0]) == 1:
name = opt[1]
else:
name = opt[0]
self._completer.append(name)
self._completer.append('help ' + name) # help completion
def __info_completion(self):
for cmd in self.__cmd_passthrough('info')['return'].split('\r\n'):
if cmd:
self._completer.append('info ' + cmd.split()[1])
def __other_completion(self):
# special cases
self._completer.append('help info')
def _fill_completion(self):
self.__cmd_completion()
self.__info_completion()
self.__other_completion()
def __cmd_passthrough(self, cmdline, cpu_index = 0):
return self.cmd_obj({ 'execute': 'human-monitor-command', 'arguments':
{ 'command-line': cmdline,
'cpu-index': cpu_index } })
def _execute_cmd(self, cmdline):
if cmdline.split()[0] == "cpu":
# trap the cpu command, it requires special setting
try:
idx = int(cmdline.split()[1])
if not 'return' in self.__cmd_passthrough('info version', idx):
print 'bad CPU index'
return True
self.__cpu_index = idx
except ValueError:
print 'cpu command takes an integer argument'
return True
resp = self.__cmd_passthrough(cmdline, self.__cpu_index)
if resp is None:
print 'Disconnected'
return False
assert 'return' in resp or 'error' in resp
if 'return' in resp:
# Success
if len(resp['return']) > 0:
print resp['return'],
else:
# Error
print '%s: %s' % (resp['error']['class'], resp['error']['desc'])
return True
def show_banner(self):
QMPShell.show_banner(self, msg='Welcome to the HMP shell!')
def die(msg):
sys.stderr.write('ERROR: %s\n' % msg)
sys.exit(1)
def fail_cmdline(option=None):
if option:
sys.stderr.write('ERROR: bad command-line option \'%s\'\n' % option)
sys.stderr.write('qemu-shell [ -H ] < UNIX socket path> | < TCP address:port >\n')
sys.exit(1)
def main():
addr = ''
try:
if len(sys.argv) == 2:
qemu = QMPShell(sys.argv[1])
addr = sys.argv[1]
elif len(sys.argv) == 3:
if sys.argv[1] != '-H':
fail_cmdline(sys.argv[1])
qemu = HMPShell(sys.argv[2])
addr = sys.argv[2]
else:
fail_cmdline()
except QMPShellBadPort:
die('bad port number in command-line')
try:
qemu.connect()
except qmp.QMPConnectError:
die('Didn\'t get QMP greeting message')
except qmp.QMPCapabilitiesError:
die('Could not negotiate capabilities')
except qemu.error:
die('Could not connect to %s' % addr)
qemu.show_banner()
while qemu.read_exec_command('(QEMU) '):
pass
qemu.close()
if __name__ == '__main__':
main()

282
QMP/qmp-spec.txt Normal file
View File

@@ -0,0 +1,282 @@
QEMU Monitor Protocol Specification - Version 0.1
1. Introduction
===============
This document specifies the QEMU Monitor Protocol (QMP), a JSON-based protocol
which is available for applications to control QEMU at the machine-level.
To enable QMP support, QEMU has to be run in "control mode". This is done by
starting QEMU with the appropriate command-line options. Please, refer to the
QEMU manual page for more information.
2. Protocol Specification
=========================
This section details the protocol format. For the purpose of this document
"Client" is any application which is communicating with QEMU in control mode,
and "Server" is QEMU itself.
JSON data structures, when mentioned in this document, are always in the
following format:
json-DATA-STRUCTURE-NAME
Where DATA-STRUCTURE-NAME is any valid JSON data structure, as defined by
the JSON standard:
http://www.ietf.org/rfc/rfc4627.txt
For convenience, json-object members and json-array elements mentioned in
this document will be in a certain order. However, in real protocol usage
they can be in ANY order, thus no particular order should be assumed.
2.1 General Definitions
-----------------------
2.1.1 All interactions transmitted by the Server are json-objects, always
terminating with CRLF
2.1.2 All json-objects members are mandatory when not specified otherwise
2.2 Server Greeting
-------------------
Right when connected the Server will issue a greeting message, which signals
that the connection has been successfully established and that the Server is
ready for capabilities negotiation (for more information refer to section
'4. Capabilities Negotiation').
The format is:
{ "QMP": { "version": json-object, "capabilities": json-array } }
Where,
- The "version" member contains the Server's version information (the format
is the same of the 'query-version' command)
- The "capabilities" member specify the availability of features beyond the
baseline specification
2.3 Issuing Commands
--------------------
The format for command execution is:
{ "execute": json-string, "arguments": json-object, "id": json-value }
Where,
- The "execute" member identifies the command to be executed by the Server
- The "arguments" member is used to pass any arguments required for the
execution of the command, it is optional when no arguments are required
- The "id" member is a transaction identification associated with the
command execution, it is optional and will be part of the response if
provided
2.4 Commands Responses
----------------------
There are two possible responses which the Server will issue as the result
of a command execution: success or error.
2.4.1 success
-------------
The success response is issued when the command execution has finished
without errors.
The format is:
{ "return": json-object, "id": json-value }
Where,
- The "return" member contains the command returned data, which is defined
in a per-command basis or an empty json-object if the command does not
return data
- The "id" member contains the transaction identification associated
with the command execution (if issued by the Client)
2.4.2 error
-----------
The error response is issued when the command execution could not be
completed because of an error condition.
The format is:
{ "error": { "class": json-string, "desc": json-string }, "id": json-value }
Where,
- The "class" member contains the error class name (eg. "GenericError")
- The "desc" member is a human-readable error message. Clients should
not attempt to parse this message.
- The "id" member contains the transaction identification associated with
the command execution (if issued by the Client)
NOTE: Some errors can occur before the Server is able to read the "id" member,
in these cases the "id" member will not be part of the error response, even
if provided by the client.
2.5 Asynchronous events
-----------------------
As a result of state changes, the Server may send messages unilaterally
to the Client at any time. They are called 'asynchronous events'.
The format is:
{ "event": json-string, "data": json-object,
"timestamp": { "seconds": json-number, "microseconds": json-number } }
Where,
- The "event" member contains the event's name
- The "data" member contains event specific data, which is defined in a
per-event basis, it is optional
- The "timestamp" member contains the exact time of when the event occurred
in the Server. It is a fixed json-object with time in seconds and
microseconds
For a listing of supported asynchronous events, please, refer to the
qmp-events.txt file.
3. QMP Examples
===============
This section provides some examples of real QMP usage, in all of them
'C' stands for 'Client' and 'S' stands for 'Server'.
3.1 Server greeting
-------------------
S: {"QMP": {"version": {"qemu": "0.12.50", "package": ""}, "capabilities": []}}
3.2 Simple 'stop' execution
---------------------------
C: { "execute": "stop" }
S: {"return": {}}
3.3 KVM information
-------------------
C: { "execute": "query-kvm", "id": "example" }
S: {"return": {"enabled": true, "present": true}, "id": "example"}
3.4 Parsing error
------------------
C: { "execute": }
S: {"error": {"class": "GenericError", "desc": "Invalid JSON syntax" } }
3.5 Powerdown event
-------------------
S: {"timestamp": {"seconds": 1258551470, "microseconds": 802384}, "event":
"POWERDOWN"}
4. Capabilities Negotiation
----------------------------
When a Client successfully establishes a connection, the Server is in
Capabilities Negotiation mode.
In this mode only the 'qmp_capabilities' command is allowed to run, all
other commands will return the CommandNotFound error. Asynchronous messages
are not delivered either.
Clients should use the 'qmp_capabilities' command to enable capabilities
advertised in the Server's greeting (section '2.2 Server Greeting') they
support.
When the 'qmp_capabilities' command is issued, and if it does not return an
error, the Server enters in Command mode where capabilities changes take
effect, all commands (except 'qmp_capabilities') are allowed and asynchronous
messages are delivered.
5 Compatibility Considerations
------------------------------
All protocol changes or new features which modify the protocol format in an
incompatible way are disabled by default and will be advertised by the
capabilities array (section '2.2 Server Greeting'). Thus, Clients can check
that array and enable the capabilities they support.
The QMP Server performs a type check on the arguments to a command. It
generates an error if a value does not have the expected type for its
key, or if it does not understand a key that the Client included. The
strictness of the Server catches wrong assumptions of Clients about
the Server's schema. Clients can assume that, when such validation
errors occur, they will be reported before the command generated any
side effect.
However, Clients must not assume any particular:
- Length of json-arrays
- Size of json-objects; in particular, future versions of QEMU may add
new keys and Clients should be able to ignore them.
- Order of json-object members or json-array elements
- Amount of errors generated by a command, that is, new errors can be added
to any existing command in newer versions of the Server
Of course, the Server does guarantee to send valid JSON. But apart from
this, a Client should be "conservative in what they send, and liberal in
what they accept".
6. Downstream extension of QMP
------------------------------
We recommend that downstream consumers of QEMU do *not* modify QMP.
Management tools should be able to support both upstream and downstream
versions of QMP without special logic, and downstream extensions are
inherently at odds with that.
However, we recognize that it is sometimes impossible for downstreams to
avoid modifying QMP. Both upstream and downstream need to take care to
preserve long-term compatibility and interoperability.
To help with that, QMP reserves JSON object member names beginning with
'__' (double underscore) for downstream use ("downstream names"). This
means upstream will never use any downstream names for its commands,
arguments, errors, asynchronous events, and so forth.
Any new names downstream wishes to add must begin with '__'. To
ensure compatibility with other downstreams, it is strongly
recommended that you prefix your downstram names with '__RFQDN_' where
RFQDN is a valid, reverse fully qualified domain name which you
control. For example, a qemu-kvm specific monitor command would be:
(qemu) __org.linux-kvm_enable_irqchip
Downstream must not change the server greeting (section 2.2) other than
to offer additional capabilities. But see below for why even that is
discouraged.
Section '5 Compatibility Considerations' applies to downstream as well
as to upstream, obviously. It follows that downstream must behave
exactly like upstream for any input not containing members with
downstream names ("downstream members"), except it may add members
with downstream names to its output.
Thus, a client should not be able to distinguish downstream from
upstream as long as it doesn't send input with downstream members, and
properly ignores any downstream members in the output it receives.
Advice on downstream modifications:
1. Introducing new commands is okay. If you want to extend an existing
command, consider introducing a new one with the new behaviour
instead.
2. Introducing new asynchronous messages is okay. If you want to extend
an existing message, consider adding a new one instead.
3. Introducing new errors for use in new commands is okay. Adding new
errors to existing commands counts as extension, so 1. applies.
4. New capabilities are strongly discouraged. Capabilities are for
evolving the basic protocol, and multiple diverging basic protocol
dialects are most undesirable.

163
QMP/qmp.py Normal file
View File

@@ -0,0 +1,163 @@
# QEMU Monitor Protocol Python class
#
# Copyright (C) 2009, 2010 Red Hat Inc.
#
# Authors:
# Luiz Capitulino <lcapitulino@redhat.com>
#
# This work is licensed under the terms of the GNU GPL, version 2. See
# the COPYING file in the top-level directory.
import json
import errno
import socket
class QMPError(Exception):
pass
class QMPConnectError(QMPError):
pass
class QMPCapabilitiesError(QMPError):
pass
class QEMUMonitorProtocol:
def __init__(self, address, server=False):
"""
Create a QEMUMonitorProtocol class.
@param address: QEMU address, can be either a unix socket path (string)
or a tuple in the form ( address, port ) for a TCP
connection
@param server: server mode listens on the socket (bool)
@raise socket.error on socket connection errors
@note No connection is established, this is done by the connect() or
accept() methods
"""
self.__events = []
self.__address = address
self.__sock = self.__get_sock()
if server:
self.__sock.bind(self.__address)
self.__sock.listen(1)
def __get_sock(self):
if isinstance(self.__address, tuple):
family = socket.AF_INET
else:
family = socket.AF_UNIX
return socket.socket(family, socket.SOCK_STREAM)
def __negotiate_capabilities(self):
self.__sockfile = self.__sock.makefile()
greeting = self.__json_read()
if greeting is None or not greeting.has_key('QMP'):
raise QMPConnectError
# Greeting seems ok, negotiate capabilities
resp = self.cmd('qmp_capabilities')
if "return" in resp:
return greeting
raise QMPCapabilitiesError
def __json_read(self, only_event=False):
while True:
data = self.__sockfile.readline()
if not data:
return
resp = json.loads(data)
if 'event' in resp:
self.__events.append(resp)
if not only_event:
continue
return resp
error = socket.error
def connect(self):
"""
Connect to the QMP Monitor and perform capabilities negotiation.
@return QMP greeting dict
@raise socket.error on socket connection errors
@raise QMPConnectError if the greeting is not received
@raise QMPCapabilitiesError if fails to negotiate capabilities
"""
self.__sock.connect(self.__address)
return self.__negotiate_capabilities()
def accept(self):
"""
Await connection from QMP Monitor and perform capabilities negotiation.
@return QMP greeting dict
@raise socket.error on socket connection errors
@raise QMPConnectError if the greeting is not received
@raise QMPCapabilitiesError if fails to negotiate capabilities
"""
self.__sock, _ = self.__sock.accept()
return self.__negotiate_capabilities()
def cmd_obj(self, qmp_cmd):
"""
Send a QMP command to the QMP Monitor.
@param qmp_cmd: QMP command to be sent as a Python dict
@return QMP response as a Python dict or None if the connection has
been closed
"""
try:
self.__sock.sendall(json.dumps(qmp_cmd))
except socket.error, err:
if err[0] == errno.EPIPE:
return
raise socket.error(err)
return self.__json_read()
def cmd(self, name, args=None, id=None):
"""
Build a QMP command and send it to the QMP Monitor.
@param name: command name (string)
@param args: command arguments (dict)
@param id: command id (dict, list, string or int)
"""
qmp_cmd = { 'execute': name }
if args:
qmp_cmd['arguments'] = args
if id:
qmp_cmd['id'] = id
return self.cmd_obj(qmp_cmd)
def command(self, cmd, **kwds):
ret = self.cmd(cmd, kwds)
if ret.has_key('error'):
raise Exception(ret['error']['desc'])
return ret['return']
def get_events(self, wait=False):
"""
Get a list of available QMP events.
@param wait: block until an event is available (bool)
"""
self.__sock.setblocking(0)
try:
self.__json_read()
except socket.error, err:
if err[0] == errno.EAGAIN:
# No data available
pass
self.__sock.setblocking(1)
if not self.__events and wait:
self.__json_read(only_event=True)
return self.__events
def clear_events(self):
"""
Clear current list of pending events.
"""
self.__events = []
def close(self):
self.__sock.close()
self.__sockfile.close()

2
README
View File

@@ -1,3 +1,3 @@
Read the documentation in qemu-doc.html or on http://wiki.qemu-project.org
Read the documentation in qemu-doc.html or on http://wiki.qemu.org
- QEMU team

37
TODO Normal file
View File

@@ -0,0 +1,37 @@
General:
-------
- cycle counter for all archs
- cpu_interrupt() win32/SMP fix
- merge PIC spurious interrupt patch
- warning for OS/2: must not use 128 MB memory (merge bochs cmos patch ?)
- config file (at least for windows/Mac OS X)
- update doc: PCI infos.
- basic VGA optimizations
- better code fetch
- do not resize vga if invalid size.
- TLB code protection support for PPC
- disable SMC handling for ARM/SPARC/PPC (not finished)
- see undefined flags for BTx insn
- keyboard output buffer filling timing emulation
- tests for each target CPU
- fix all remaining thread lock issues (must put TBs in a specific invalid
state, find a solution for tb_flush()).
ppc specific:
------------
- TLB invalidate not needed if msr_pr changes
- enable shift optimizations ?
linux-user specific:
-------------------
- remove threading support as it cannot work at this point
- improve IPC syscalls
- more syscalls (in particular all 64 bit ones, IPCs, fix 64 bit
issues, fix 16 bit uid issues)
- use kernel traps for unaligned accesses on ARM ?
lower priority:
--------------
- int15 ah=86: use better timing
- use -msoft-float on ARM

View File

@@ -1 +1 @@
2.3.50
1.2.1

430
a.out.h Normal file
View File

@@ -0,0 +1,430 @@
/* a.out.h
Copyright 1997, 1998, 1999, 2001 Red Hat, Inc.
This file is part of Cygwin.
This software is a copyrighted work licensed under the terms of the
Cygwin license. Please consult the file "CYGWIN_LICENSE" for
details. */
#ifndef _A_OUT_H_
#define _A_OUT_H_
#ifdef __cplusplus
extern "C" {
#endif
#define COFF_IMAGE_WITH_PE
#define COFF_LONG_SECTION_NAMES
/*** coff information for Intel 386/486. */
/********************** FILE HEADER **********************/
struct external_filehdr {
short f_magic; /* magic number */
short f_nscns; /* number of sections */
host_ulong f_timdat; /* time & date stamp */
host_ulong f_symptr; /* file pointer to symtab */
host_ulong f_nsyms; /* number of symtab entries */
short f_opthdr; /* sizeof(optional hdr) */
short f_flags; /* flags */
};
/* Bits for f_flags:
* F_RELFLG relocation info stripped from file
* F_EXEC file is executable (no unresolved external references)
* F_LNNO line numbers stripped from file
* F_LSYMS local symbols stripped from file
* F_AR32WR file has byte ordering of an AR32WR machine (e.g. vax)
*/
#define F_RELFLG (0x0001)
#define F_EXEC (0x0002)
#define F_LNNO (0x0004)
#define F_LSYMS (0x0008)
#define I386MAGIC 0x14c
#define I386PTXMAGIC 0x154
#define I386AIXMAGIC 0x175
/* This is Lynx's all-platform magic number for executables. */
#define LYNXCOFFMAGIC 0415
#define I386BADMAG(x) (((x).f_magic != I386MAGIC) \
&& (x).f_magic != I386AIXMAGIC \
&& (x).f_magic != I386PTXMAGIC \
&& (x).f_magic != LYNXCOFFMAGIC)
#define FILHDR struct external_filehdr
#define FILHSZ 20
/********************** AOUT "OPTIONAL HEADER"=
**********************/
typedef struct
{
unsigned short magic; /* type of file */
unsigned short vstamp; /* version stamp */
host_ulong tsize; /* text size in bytes, padded to FW bdry*/
host_ulong dsize; /* initialized data " " */
host_ulong bsize; /* uninitialized data " " */
host_ulong entry; /* entry pt. */
host_ulong text_start; /* base of text used for this file */
host_ulong data_start; /* base of data used for this file=
*/
}
AOUTHDR;
#define AOUTSZ 28
#define AOUTHDRSZ 28
#define OMAGIC 0404 /* object files, eg as output */
#define ZMAGIC 0413 /* demand load format, eg normal ld output */
#define STMAGIC 0401 /* target shlib */
#define SHMAGIC 0443 /* host shlib */
/* define some NT default values */
/* #define NT_IMAGE_BASE 0x400000 moved to internal.h */
#define NT_SECTION_ALIGNMENT 0x1000
#define NT_FILE_ALIGNMENT 0x200
#define NT_DEF_RESERVE 0x100000
#define NT_DEF_COMMIT 0x1000
/********************** SECTION HEADER **********************/
struct external_scnhdr {
char s_name[8]; /* section name */
host_ulong s_paddr; /* physical address, offset
of last addr in scn */
host_ulong s_vaddr; /* virtual address */
host_ulong s_size; /* section size */
host_ulong s_scnptr; /* file ptr to raw data for section */
host_ulong s_relptr; /* file ptr to relocation */
host_ulong s_lnnoptr; /* file ptr to line numbers */
unsigned short s_nreloc; /* number of relocation entries */
unsigned short s_nlnno; /* number of line number entries*/
host_ulong s_flags; /* flags */
};
#define SCNHDR struct external_scnhdr
#define SCNHSZ 40
/*
* names of "special" sections
*/
#define _TEXT ".text"
#define _DATA ".data"
#define _BSS ".bss"
#define _COMMENT ".comment"
#define _LIB ".lib"
/********************** LINE NUMBERS **********************/
/* 1 line number entry for every "breakpointable" source line in a section.
* Line numbers are grouped on a per function basis; first entry in a function
* grouping will have l_lnno = 0 and in place of physical address will be the
* symbol table index of the function name.
*/
struct external_lineno {
union {
host_ulong l_symndx; /* function name symbol index, iff l_lnno 0 */
host_ulong l_paddr; /* (physical) address of line number */
} l_addr;
unsigned short l_lnno; /* line number */
};
#define LINENO struct external_lineno
#define LINESZ 6
/********************** SYMBOLS **********************/
#define E_SYMNMLEN 8 /* # characters in a symbol name */
#define E_FILNMLEN 14 /* # characters in a file name */
#define E_DIMNUM 4 /* # array dimensions in auxiliary entry */
struct QEMU_PACKED external_syment
{
union {
char e_name[E_SYMNMLEN];
struct {
host_ulong e_zeroes;
host_ulong e_offset;
} e;
} e;
host_ulong e_value;
unsigned short e_scnum;
unsigned short e_type;
char e_sclass[1];
char e_numaux[1];
};
#define N_BTMASK (0xf)
#define N_TMASK (0x30)
#define N_BTSHFT (4)
#define N_TSHIFT (2)
union external_auxent {
struct {
host_ulong x_tagndx; /* str, un, or enum tag indx */
union {
struct {
unsigned short x_lnno; /* declaration line number */
unsigned short x_size; /* str/union/array size */
} x_lnsz;
host_ulong x_fsize; /* size of function */
} x_misc;
union {
struct { /* if ISFCN, tag, or .bb */
host_ulong x_lnnoptr;/* ptr to fcn line # */
host_ulong x_endndx; /* entry ndx past block end */
} x_fcn;
struct { /* if ISARY, up to 4 dimen. */
char x_dimen[E_DIMNUM][2];
} x_ary;
} x_fcnary;
unsigned short x_tvndx; /* tv index */
} x_sym;
union {
char x_fname[E_FILNMLEN];
struct {
host_ulong x_zeroes;
host_ulong x_offset;
} x_n;
} x_file;
struct {
host_ulong x_scnlen; /* section length */
unsigned short x_nreloc; /* # relocation entries */
unsigned short x_nlinno; /* # line numbers */
host_ulong x_checksum; /* section COMDAT checksum */
unsigned short x_associated;/* COMDAT associated section index */
char x_comdat[1]; /* COMDAT selection number */
} x_scn;
struct {
host_ulong x_tvfill; /* tv fill value */
unsigned short x_tvlen; /* length of .tv */
char x_tvran[2][2]; /* tv range */
} x_tv; /* info about .tv section (in auxent of symbol .tv)) */
};
#define SYMENT struct external_syment
#define SYMESZ 18
#define AUXENT union external_auxent
#define AUXESZ 18
#define _ETEXT "etext"
/********************** RELOCATION DIRECTIVES **********************/
struct external_reloc {
char r_vaddr[4];
char r_symndx[4];
char r_type[2];
};
#define RELOC struct external_reloc
#define RELSZ 10
/* end of coff/i386.h */
/* PE COFF header information */
#ifndef _PE_H
#define _PE_H
/* NT specific file attributes */
#define IMAGE_FILE_RELOCS_STRIPPED 0x0001
#define IMAGE_FILE_EXECUTABLE_IMAGE 0x0002
#define IMAGE_FILE_LINE_NUMS_STRIPPED 0x0004
#define IMAGE_FILE_LOCAL_SYMS_STRIPPED 0x0008
#define IMAGE_FILE_BYTES_REVERSED_LO 0x0080
#define IMAGE_FILE_32BIT_MACHINE 0x0100
#define IMAGE_FILE_DEBUG_STRIPPED 0x0200
#define IMAGE_FILE_SYSTEM 0x1000
#define IMAGE_FILE_DLL 0x2000
#define IMAGE_FILE_BYTES_REVERSED_HI 0x8000
/* additional flags to be set for section headers to allow the NT loader to
read and write to the section data (to replace the addresses of data in
dlls for one thing); also to execute the section in .text's case=
*/
#define IMAGE_SCN_MEM_DISCARDABLE 0x02000000
#define IMAGE_SCN_MEM_EXECUTE 0x20000000
#define IMAGE_SCN_MEM_READ 0x40000000
#define IMAGE_SCN_MEM_WRITE 0x80000000
/*
* Section characteristics added for ppc-nt
*/
#define IMAGE_SCN_TYPE_NO_PAD 0x00000008 /* Reserved. */
#define IMAGE_SCN_CNT_CODE 0x00000020 /* Section contains code. */
#define IMAGE_SCN_CNT_INITIALIZED_DATA 0x00000040 /* Section contains initialized data. */
#define IMAGE_SCN_CNT_UNINITIALIZED_DATA 0x00000080 /* Section contains uninitialized data. */
#define IMAGE_SCN_LNK_OTHER 0x00000100 /* Reserved. */
#define IMAGE_SCN_LNK_INFO 0x00000200 /* Section contains comments or some other type of information. */
#define IMAGE_SCN_LNK_REMOVE 0x00000800 /* Section contents will not become part of image. */
#define IMAGE_SCN_LNK_COMDAT 0x00001000 /* Section contents comdat. */
#define IMAGE_SCN_MEM_FARDATA 0x00008000
#define IMAGE_SCN_MEM_PURGEABLE 0x00020000
#define IMAGE_SCN_MEM_16BIT 0x00020000
#define IMAGE_SCN_MEM_LOCKED 0x00040000
#define IMAGE_SCN_MEM_PRELOAD 0x00080000
#define IMAGE_SCN_ALIGN_1BYTES 0x00100000
#define IMAGE_SCN_ALIGN_2BYTES 0x00200000
#define IMAGE_SCN_ALIGN_4BYTES 0x00300000
#define IMAGE_SCN_ALIGN_8BYTES 0x00400000
#define IMAGE_SCN_ALIGN_16BYTES 0x00500000 /* Default alignment if no others are specified. */
#define IMAGE_SCN_ALIGN_32BYTES 0x00600000
#define IMAGE_SCN_ALIGN_64BYTES 0x00700000
#define IMAGE_SCN_LNK_NRELOC_OVFL 0x01000000 /* Section contains extended relocations. */
#define IMAGE_SCN_MEM_NOT_CACHED 0x04000000 /* Section is not cachable. */
#define IMAGE_SCN_MEM_NOT_PAGED 0x08000000 /* Section is not pageable. */
#define IMAGE_SCN_MEM_SHARED 0x10000000 /* Section is shareable. */
/* COMDAT selection codes. */
#define IMAGE_COMDAT_SELECT_NODUPLICATES (1) /* Warn if duplicates. */
#define IMAGE_COMDAT_SELECT_ANY (2) /* No warning. */
#define IMAGE_COMDAT_SELECT_SAME_SIZE (3) /* Warn if different size. */
#define IMAGE_COMDAT_SELECT_EXACT_MATCH (4) /* Warn if different. */
#define IMAGE_COMDAT_SELECT_ASSOCIATIVE (5) /* Base on other section. */
/* Magic values that are true for all dos/nt implementations */
#define DOSMAGIC 0x5a4d
#define NT_SIGNATURE 0x00004550
/* NT allows long filenames, we want to accommodate this. This may break
some of the bfd functions */
#undef FILNMLEN
#define FILNMLEN 18 /* # characters in a file name */
#ifdef COFF_IMAGE_WITH_PE
/* The filehdr is only weired in images */
#undef FILHDR
struct external_PE_filehdr
{
/* DOS header fields */
unsigned short e_magic; /* Magic number, 0x5a4d */
unsigned short e_cblp; /* Bytes on last page of file, 0x90 */
unsigned short e_cp; /* Pages in file, 0x3 */
unsigned short e_crlc; /* Relocations, 0x0 */
unsigned short e_cparhdr; /* Size of header in paragraphs, 0x4 */
unsigned short e_minalloc; /* Minimum extra paragraphs needed, 0x0 */
unsigned short e_maxalloc; /* Maximum extra paragraphs needed, 0xFFFF */
unsigned short e_ss; /* Initial (relative) SS value, 0x0 */
unsigned short e_sp; /* Initial SP value, 0xb8 */
unsigned short e_csum; /* Checksum, 0x0 */
unsigned short e_ip; /* Initial IP value, 0x0 */
unsigned short e_cs; /* Initial (relative) CS value, 0x0 */
unsigned short e_lfarlc; /* File address of relocation table, 0x40 */
unsigned short e_ovno; /* Overlay number, 0x0 */
char e_res[4][2]; /* Reserved words, all 0x0 */
unsigned short e_oemid; /* OEM identifier (for e_oeminfo), 0x0 */
unsigned short e_oeminfo; /* OEM information; e_oemid specific, 0x0 */
char e_res2[10][2]; /* Reserved words, all 0x0 */
host_ulong e_lfanew; /* File address of new exe header, 0x80 */
char dos_message[16][4]; /* other stuff, always follow DOS header */
unsigned int nt_signature; /* required NT signature, 0x4550 */
/* From standard header */
unsigned short f_magic; /* magic number */
unsigned short f_nscns; /* number of sections */
host_ulong f_timdat; /* time & date stamp */
host_ulong f_symptr; /* file pointer to symtab */
host_ulong f_nsyms; /* number of symtab entries */
unsigned short f_opthdr; /* sizeof(optional hdr) */
unsigned short f_flags; /* flags */
};
#define FILHDR struct external_PE_filehdr
#undef FILHSZ
#define FILHSZ 152
#endif
typedef struct
{
unsigned short magic; /* type of file */
unsigned short vstamp; /* version stamp */
host_ulong tsize; /* text size in bytes, padded to FW bdry*/
host_ulong dsize; /* initialized data " " */
host_ulong bsize; /* uninitialized data " " */
host_ulong entry; /* entry pt. */
host_ulong text_start; /* base of text used for this file */
host_ulong data_start; /* base of all data used for this file */
/* NT extra fields; see internal.h for descriptions */
host_ulong ImageBase;
host_ulong SectionAlignment;
host_ulong FileAlignment;
unsigned short MajorOperatingSystemVersion;
unsigned short MinorOperatingSystemVersion;
unsigned short MajorImageVersion;
unsigned short MinorImageVersion;
unsigned short MajorSubsystemVersion;
unsigned short MinorSubsystemVersion;
char Reserved1[4];
host_ulong SizeOfImage;
host_ulong SizeOfHeaders;
host_ulong CheckSum;
unsigned short Subsystem;
unsigned short DllCharacteristics;
host_ulong SizeOfStackReserve;
host_ulong SizeOfStackCommit;
host_ulong SizeOfHeapReserve;
host_ulong SizeOfHeapCommit;
host_ulong LoaderFlags;
host_ulong NumberOfRvaAndSizes;
/* IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES]; */
char DataDirectory[16][2][4]; /* 16 entries, 2 elements/entry, 4 chars */
} PEAOUTHDR;
#undef AOUTSZ
#define AOUTSZ (AOUTHDRSZ + 196)
#undef E_FILNMLEN
#define E_FILNMLEN 18 /* # characters in a file name */
#endif
/* end of coff/pe.h */
#define DT_NON (0) /* no derived type */
#define DT_PTR (1) /* pointer */
#define DT_FCN (2) /* function */
#define DT_ARY (3) /* array */
#define ISPTR(x) (((x) & N_TMASK) == (DT_PTR << N_BTSHFT))
#define ISFCN(x) (((x) & N_TMASK) == (DT_FCN << N_BTSHFT))
#define ISARY(x) (((x) & N_TMASK) == (DT_ARY << N_BTSHFT))
#ifdef __cplusplus
}
#endif
#endif /* _A_OUT_H_ */

157
accel.c
View File

@@ -1,157 +0,0 @@
/*
* QEMU System Emulator, accelerator interfaces
*
* Copyright (c) 2003-2008 Fabrice Bellard
* Copyright (c) 2014 Red Hat Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "sysemu/accel.h"
#include "hw/boards.h"
#include "qemu-common.h"
#include "sysemu/arch_init.h"
#include "sysemu/sysemu.h"
#include "sysemu/kvm.h"
#include "sysemu/qtest.h"
#include "hw/xen/xen.h"
#include "qom/object.h"
#include "hw/boards.h"
int tcg_tb_size;
static bool tcg_allowed = true;
static int tcg_init(MachineState *ms)
{
tcg_exec_init(tcg_tb_size * 1024 * 1024);
return 0;
}
static const TypeInfo accel_type = {
.name = TYPE_ACCEL,
.parent = TYPE_OBJECT,
.class_size = sizeof(AccelClass),
.instance_size = sizeof(AccelState),
};
/* Lookup AccelClass from opt_name. Returns NULL if not found */
static AccelClass *accel_find(const char *opt_name)
{
char *class_name = g_strdup_printf(ACCEL_CLASS_NAME("%s"), opt_name);
AccelClass *ac = ACCEL_CLASS(object_class_by_name(class_name));
g_free(class_name);
return ac;
}
static int accel_init_machine(AccelClass *acc, MachineState *ms)
{
ObjectClass *oc = OBJECT_CLASS(acc);
const char *cname = object_class_get_name(oc);
AccelState *accel = ACCEL(object_new(cname));
int ret;
ms->accelerator = accel;
*(acc->allowed) = true;
ret = acc->init_machine(ms);
if (ret < 0) {
ms->accelerator = NULL;
*(acc->allowed) = false;
object_unref(OBJECT(accel));
}
return ret;
}
int configure_accelerator(MachineState *ms)
{
const char *p;
char buf[10];
int ret;
bool accel_initialised = false;
bool init_failed = false;
AccelClass *acc = NULL;
p = qemu_opt_get(qemu_get_machine_opts(), "accel");
if (p == NULL) {
/* Use the default "accelerator", tcg */
p = "tcg";
}
while (!accel_initialised && *p != '\0') {
if (*p == ':') {
p++;
}
p = get_opt_name(buf, sizeof(buf), p, ':');
acc = accel_find(buf);
if (!acc) {
fprintf(stderr, "\"%s\" accelerator not found.\n", buf);
continue;
}
if (acc->available && !acc->available()) {
printf("%s not supported for this target\n",
acc->name);
continue;
}
ret = accel_init_machine(acc, ms);
if (ret < 0) {
init_failed = true;
fprintf(stderr, "failed to initialize %s: %s\n",
acc->name,
strerror(-ret));
} else {
accel_initialised = true;
}
}
if (!accel_initialised) {
if (!init_failed) {
fprintf(stderr, "No accelerator found!\n");
}
exit(1);
}
if (init_failed) {
fprintf(stderr, "Back to %s accelerator.\n", acc->name);
}
return !accel_initialised;
}
static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
ac->init_machine = tcg_init;
ac->allowed = &tcg_allowed;
}
#define TYPE_TCG_ACCEL ACCEL_CLASS_NAME("tcg")
static const TypeInfo tcg_accel_type = {
.name = TYPE_TCG_ACCEL,
.parent = TYPE_ACCEL,
.class_init = tcg_accel_class_init,
};
static void register_accel_types(void)
{
type_register_static(&accel_type);
type_register_static(&tcg_accel_type);
}
type_init(register_accel_types);

View File

@@ -24,7 +24,7 @@
#include "qemu-common.h"
#include "qemu/acl.h"
#include "acl.h"
#ifdef CONFIG_FNMATCH
#include <fnmatch.h>
@@ -103,8 +103,8 @@ void qemu_acl_reset(qemu_acl *acl)
acl->defaultDeny = 1;
QTAILQ_FOREACH_SAFE(entry, &acl->entries, next, next_entry) {
QTAILQ_REMOVE(&acl->entries, entry, next);
g_free(entry->match);
g_free(entry);
free(entry->match);
free(entry);
}
acl->nentries = 0;
}
@@ -132,23 +132,23 @@ int qemu_acl_insert(qemu_acl *acl,
const char *match,
int index)
{
qemu_acl_entry *entry;
qemu_acl_entry *tmp;
int i = 0;
if (index <= 0)
return -1;
if (index > acl->nentries) {
if (index >= acl->nentries)
return qemu_acl_append(acl, deny, match);
}
entry = g_malloc(sizeof(*entry));
entry->match = g_strdup(match);
entry->deny = deny;
QTAILQ_FOREACH(tmp, &acl->entries, next) {
i++;
if (i == index) {
qemu_acl_entry *entry;
entry = g_malloc(sizeof(*entry));
entry->match = g_strdup(match);
entry->deny = deny;
QTAILQ_INSERT_BEFORE(tmp, entry, next);
acl->nentries++;
break;
@@ -168,9 +168,6 @@ int qemu_acl_remove(qemu_acl *acl,
i++;
if (strcmp(entry->match, match) == 0) {
QTAILQ_REMOVE(&acl->entries, entry, next);
acl->nentries--;
g_free(entry->match);
g_free(entry);
return i;
}
}

View File

@@ -25,7 +25,7 @@
#ifndef __QEMU_ACL_H__
#define __QEMU_ACL_H__
#include "qemu/queue.h"
#include "qemu-queue.h"
typedef struct qemu_acl_entry qemu_acl_entry;
typedef struct qemu_acl qemu_acl;

1314
aes.c Normal file

File diff suppressed because it is too large Load Diff

26
aes.h Normal file
View File

@@ -0,0 +1,26 @@
#ifndef QEMU_AES_H
#define QEMU_AES_H
#define AES_MAXNR 14
#define AES_BLOCK_SIZE 16
struct aes_key_st {
uint32_t rd_key[4 *(AES_MAXNR + 1)];
int rounds;
};
typedef struct aes_key_st AES_KEY;
int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
AES_KEY *key);
int AES_set_decrypt_key(const unsigned char *userKey, const int bits,
AES_KEY *key);
void AES_encrypt(const unsigned char *in, unsigned char *out,
const AES_KEY *key);
void AES_decrypt(const unsigned char *in, unsigned char *out,
const AES_KEY *key);
void AES_cbc_encrypt(const unsigned char *in, unsigned char *out,
const unsigned long length, const AES_KEY *key,
unsigned char *ivec, const int enc);
#endif

View File

@@ -1,299 +0,0 @@
/*
* QEMU aio implementation
*
* Copyright IBM, Corp. 2008
*
* Authors:
* Anthony Liguori <aliguori@us.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
* Contributions after 2012-01-13 are licensed under the terms of the
* GNU GPL, version 2 or (at your option) any later version.
*/
#include "qemu-common.h"
#include "block/block.h"
#include "qemu/queue.h"
#include "qemu/sockets.h"
struct AioHandler
{
GPollFD pfd;
IOHandler *io_read;
IOHandler *io_write;
int deleted;
void *opaque;
QLIST_ENTRY(AioHandler) node;
};
static AioHandler *find_aio_handler(AioContext *ctx, int fd)
{
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.fd == fd)
if (!node->deleted)
return node;
}
return NULL;
}
void aio_set_fd_handler(AioContext *ctx,
int fd,
IOHandler *io_read,
IOHandler *io_write,
void *opaque)
{
AioHandler *node;
node = find_aio_handler(ctx, fd);
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
g_source_remove_poll(&ctx->source, &node->pfd);
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
}
}
} else {
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
g_source_add_poll(&ctx->source, &node->pfd);
}
/* Update handler with latest information */
node->io_read = io_read;
node->io_write = io_write;
node->opaque = opaque;
node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0);
node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
}
aio_notify(ctx);
}
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *notifier,
EventNotifierHandler *io_read)
{
aio_set_fd_handler(ctx, event_notifier_get_fd(notifier),
(IOHandler *)io_read, NULL, notifier);
}
bool aio_prepare(AioContext *ctx)
{
return false;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
int revents;
revents = node->pfd.revents & node->pfd.events;
if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR) && node->io_read) {
return true;
}
if (revents & (G_IO_OUT | G_IO_ERR) && node->io_write) {
return true;
}
}
return false;
}
bool aio_dispatch(AioContext *ctx)
{
AioHandler *node;
bool progress = false;
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
progress = true;
}
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
int revents;
ctx->walking_handlers++;
revents = node->pfd.revents & node->pfd.events;
node->pfd.revents = 0;
if (!node->deleted &&
(revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) &&
node->io_read) {
node->io_read(node->opaque);
/* aio_notify() does not count as progress */
if (node->opaque != &ctx->notifier) {
progress = true;
}
}
if (!node->deleted &&
(revents & (G_IO_OUT | G_IO_ERR)) &&
node->io_write) {
node->io_write(node->opaque);
progress = true;
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
/* Run our timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
/* These thread-local variables are used only in a small part of aio_poll
* around the call to the poll() system call. In particular they are not
* used while aio_poll is performing callbacks, which makes it much easier
* to think about reentrancy!
*
* Stack-allocated arrays would be perfect but they have size limitations;
* heap allocation is expensive enough that we want to reuse arrays across
* calls to aio_poll(). And because poll() has to be called without holding
* any lock, the arrays cannot be stored in AioContext. Thread-local data
* has none of the disadvantages of these three options.
*/
static __thread GPollFD *pollfds;
static __thread AioHandler **nodes;
static __thread unsigned npfd, nalloc;
static __thread Notifier pollfds_cleanup_notifier;
static void pollfds_cleanup(Notifier *n, void *unused)
{
g_assert(npfd == 0);
g_free(pollfds);
g_free(nodes);
nalloc = 0;
}
static void add_pollfd(AioHandler *node)
{
if (npfd == nalloc) {
if (nalloc == 0) {
pollfds_cleanup_notifier.notify = pollfds_cleanup;
qemu_thread_atexit_add(&pollfds_cleanup_notifier);
nalloc = 8;
} else {
g_assert(nalloc <= INT_MAX);
nalloc *= 2;
}
pollfds = g_renew(GPollFD, pollfds, nalloc);
nodes = g_renew(AioHandler *, nodes, nalloc);
}
nodes[npfd] = node;
pollfds[npfd] = (GPollFD) {
.fd = node->pfd.fd,
.events = node->pfd.events,
};
npfd++;
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
bool was_dispatching;
int i, ret;
bool progress;
int64_t timeout;
aio_context_acquire(ctx);
was_dispatching = ctx->dispatching;
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
* have to clear it now.
*/
aio_set_dispatching(ctx, !blocking);
ctx->walking_handlers++;
assert(npfd == 0);
/* fill pollfds */
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (!node->deleted && node->pfd.events) {
add_pollfd(node);
}
}
timeout = blocking ? aio_compute_timeout(ctx) : 0;
/* wait until next event */
if (timeout) {
aio_context_release(ctx);
}
ret = qemu_poll_ns((GPollFD *)pollfds, npfd, timeout);
if (timeout) {
aio_context_acquire(ctx);
}
/* if we have any readable fds, dispatch event */
if (ret > 0) {
for (i = 0; i < npfd; i++) {
nodes[i]->pfd.revents = pollfds[i].revents;
}
}
npfd = 0;
ctx->walking_handlers--;
/* Run dispatch even if there were no readable fds to run timers */
aio_set_dispatching(ctx, true);
if (aio_dispatch(ctx)) {
progress = true;
}
aio_set_dispatching(ctx, was_dispatching);
aio_context_release(ctx);
return progress;
}

View File

@@ -1,361 +0,0 @@
/*
* QEMU aio implementation
*
* Copyright IBM Corp., 2008
* Copyright Red Hat Inc., 2012
*
* Authors:
* Anthony Liguori <aliguori@us.ibm.com>
* Paolo Bonzini <pbonzini@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
* Contributions after 2012-01-13 are licensed under the terms of the
* GNU GPL, version 2 or (at your option) any later version.
*/
#include "qemu-common.h"
#include "block/block.h"
#include "qemu/queue.h"
#include "qemu/sockets.h"
struct AioHandler {
EventNotifier *e;
IOHandler *io_read;
IOHandler *io_write;
EventNotifierHandler *io_notify;
GPollFD pfd;
int deleted;
void *opaque;
QLIST_ENTRY(AioHandler) node;
};
void aio_set_fd_handler(AioContext *ctx,
int fd,
IOHandler *io_read,
IOHandler *io_write,
void *opaque)
{
/* fd is a SOCKET in our case */
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.fd == fd && !node->deleted) {
break;
}
}
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
}
}
} else {
HANDLE event;
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node->pfd.fd = fd;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
}
node->pfd.events = 0;
if (node->io_read) {
node->pfd.events |= G_IO_IN;
}
if (node->io_write) {
node->pfd.events |= G_IO_OUT;
}
node->e = &ctx->notifier;
/* Update handler with latest information */
node->opaque = opaque;
node->io_read = io_read;
node->io_write = io_write;
event = event_notifier_get_handle(&ctx->notifier);
WSAEventSelect(node->pfd.fd, event,
FD_READ | FD_ACCEPT | FD_CLOSE |
FD_CONNECT | FD_WRITE | FD_OOB);
}
aio_notify(ctx);
}
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *e,
EventNotifierHandler *io_notify)
{
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->e == e && !node->deleted) {
break;
}
}
/* Are we deleting the fd handler? */
if (!io_notify) {
if (node) {
g_source_remove_poll(&ctx->source, &node->pfd);
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
}
}
} else {
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node->e = e;
node->pfd.fd = (uintptr_t)event_notifier_get_handle(e);
node->pfd.events = G_IO_IN;
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
g_source_add_poll(&ctx->source, &node->pfd);
}
/* Update handler with latest information */
node->io_notify = io_notify;
}
aio_notify(ctx);
}
bool aio_prepare(AioContext *ctx)
{
static struct timeval tv0;
AioHandler *node;
bool have_select_revents = false;
fd_set rfds, wfds;
/* fill fd sets */
FD_ZERO(&rfds);
FD_ZERO(&wfds);
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->io_read) {
FD_SET ((SOCKET)node->pfd.fd, &rfds);
}
if (node->io_write) {
FD_SET ((SOCKET)node->pfd.fd, &wfds);
}
}
if (select(0, &rfds, &wfds, NULL, &tv0) > 0) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
node->pfd.revents = 0;
if (FD_ISSET(node->pfd.fd, &rfds)) {
node->pfd.revents |= G_IO_IN;
have_select_revents = true;
}
if (FD_ISSET(node->pfd.fd, &wfds)) {
node->pfd.revents |= G_IO_OUT;
have_select_revents = true;
}
}
}
return have_select_revents;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.revents && node->io_notify) {
return true;
}
if ((node->pfd.revents & G_IO_IN) && node->io_read) {
return true;
}
if ((node->pfd.revents & G_IO_OUT) && node->io_write) {
return true;
}
}
return false;
}
static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
{
AioHandler *node;
bool progress = false;
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
int revents = node->pfd.revents;
ctx->walking_handlers++;
if (!node->deleted &&
(revents || event_notifier_get_handle(node->e) == event) &&
node->io_notify) {
node->pfd.revents = 0;
node->io_notify(node->e);
/* aio_notify() does not count as progress */
if (node->e != &ctx->notifier) {
progress = true;
}
}
if (!node->deleted &&
(node->io_read || node->io_write)) {
node->pfd.revents = 0;
if ((revents & G_IO_IN) && node->io_read) {
node->io_read(node->opaque);
progress = true;
}
if ((revents & G_IO_OUT) && node->io_write) {
node->io_write(node->opaque);
progress = true;
}
/* if the next select() will return an event, we have progressed */
if (event == event_notifier_get_handle(&ctx->notifier)) {
WSANETWORKEVENTS ev;
WSAEnumNetworkEvents(node->pfd.fd, event, &ev);
if (ev.lNetworkEvents) {
progress = true;
}
}
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
return progress;
}
bool aio_dispatch(AioContext *ctx)
{
bool progress;
progress = aio_bh_poll(ctx);
progress |= aio_dispatch_handlers(ctx, INVALID_HANDLE_VALUE);
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
bool was_dispatching, progress, have_select_revents, first;
int count;
int timeout;
aio_context_acquire(ctx);
have_select_revents = aio_prepare(ctx);
if (have_select_revents) {
blocking = false;
}
was_dispatching = ctx->dispatching;
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
* be re-evaluated before the next blocking poll(). This is
* already true when aio_poll is called with blocking == false;
* if blocking == true, it is only true after poll() returns.
*
* If we're in a nested event loop, ctx->dispatching might be true.
* In that case we can restore it just before returning, but we
* have to clear it now.
*/
aio_set_dispatching(ctx, !blocking);
ctx->walking_handlers++;
/* fill fd sets */
count = 0;
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (!node->deleted && node->io_notify) {
events[count++] = event_notifier_get_handle(node->e);
}
}
ctx->walking_handlers--;
first = true;
/* wait until next event */
while (count > 0) {
HANDLE event;
int ret;
timeout = blocking
? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
if (timeout) {
aio_context_release(ctx);
}
ret = WaitForMultipleObjects(count, events, FALSE, timeout);
if (timeout) {
aio_context_acquire(ctx);
}
aio_set_dispatching(ctx, true);
if (first && aio_bh_poll(ctx)) {
progress = true;
}
first = false;
/* if we have any signaled events, dispatch event */
event = NULL;
if ((DWORD) (ret - WAIT_OBJECT_0) < count) {
event = events[ret - WAIT_OBJECT_0];
events[ret - WAIT_OBJECT_0] = events[--count];
} else if (!have_select_revents) {
break;
}
have_select_revents = false;
blocking = false;
progress |= aio_dispatch_handlers(ctx, event);
}
progress |= timerlistgroup_run_timers(&ctx->tlg);
aio_set_dispatching(ctx, was_dispatching);
aio_context_release(ctx);
return progress;
}

194
aio.c Normal file
View File

@@ -0,0 +1,194 @@
/*
* QEMU aio implementation
*
* Copyright IBM, Corp. 2008
*
* Authors:
* Anthony Liguori <aliguori@us.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
* Contributions after 2012-01-13 are licensed under the terms of the
* GNU GPL, version 2 or (at your option) any later version.
*/
#include "qemu-common.h"
#include "block.h"
#include "qemu-queue.h"
#include "qemu_socket.h"
typedef struct AioHandler AioHandler;
/* The list of registered AIO handlers */
static QLIST_HEAD(, AioHandler) aio_handlers;
/* This is a simple lock used to protect the aio_handlers list. Specifically,
* it's used to ensure that no callbacks are removed while we're walking and
* dispatching callbacks.
*/
static int walking_handlers;
struct AioHandler
{
int fd;
IOHandler *io_read;
IOHandler *io_write;
AioFlushHandler *io_flush;
int deleted;
void *opaque;
QLIST_ENTRY(AioHandler) node;
};
static AioHandler *find_aio_handler(int fd)
{
AioHandler *node;
QLIST_FOREACH(node, &aio_handlers, node) {
if (node->fd == fd)
if (!node->deleted)
return node;
}
return NULL;
}
int qemu_aio_set_fd_handler(int fd,
IOHandler *io_read,
IOHandler *io_write,
AioFlushHandler *io_flush,
void *opaque)
{
AioHandler *node;
node = find_aio_handler(fd);
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
/* If the lock is held, just mark the node as deleted */
if (walking_handlers)
node->deleted = 1;
else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
}
}
} else {
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_malloc0(sizeof(AioHandler));
node->fd = fd;
QLIST_INSERT_HEAD(&aio_handlers, node, node);
}
/* Update handler with latest information */
node->io_read = io_read;
node->io_write = io_write;
node->io_flush = io_flush;
node->opaque = opaque;
}
qemu_set_fd_handler2(fd, NULL, io_read, io_write, opaque);
return 0;
}
void qemu_aio_flush(void)
{
while (qemu_aio_wait());
}
bool qemu_aio_wait(void)
{
AioHandler *node;
fd_set rdfds, wrfds;
int max_fd = -1;
int ret;
bool busy;
/*
* If there are callbacks left that have been queued, we need to call then.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for qemu_aio_wait loops).
*/
if (qemu_bh_poll()) {
return true;
}
walking_handlers = 1;
FD_ZERO(&rdfds);
FD_ZERO(&wrfds);
/* fill fd sets */
busy = false;
QLIST_FOREACH(node, &aio_handlers, node) {
/* If there aren't pending AIO operations, don't invoke callbacks.
* Otherwise, if there are no AIO requests, qemu_aio_wait() would
* wait indefinitely.
*/
if (node->io_flush) {
if (node->io_flush(node->opaque) == 0) {
continue;
}
busy = true;
}
if (!node->deleted && node->io_read) {
FD_SET(node->fd, &rdfds);
max_fd = MAX(max_fd, node->fd + 1);
}
if (!node->deleted && node->io_write) {
FD_SET(node->fd, &wrfds);
max_fd = MAX(max_fd, node->fd + 1);
}
}
walking_handlers = 0;
/* No AIO operations? Get us out of here */
if (!busy) {
return false;
}
/* wait until next event */
ret = select(max_fd, &rdfds, &wrfds, NULL, NULL);
/* if we have any readable fds, dispatch event */
if (ret > 0) {
walking_handlers = 1;
/* we have to walk very carefully in case
* qemu_aio_set_fd_handler is called while we're walking */
node = QLIST_FIRST(&aio_handlers);
while (node) {
AioHandler *tmp;
if (!node->deleted &&
FD_ISSET(node->fd, &rdfds) &&
node->io_read) {
node->io_read(node->opaque);
}
if (!node->deleted &&
FD_ISSET(node->fd, &wrfds) &&
node->io_write) {
node->io_write(node->opaque);
}
tmp = node;
node = QLIST_NEXT(node, node);
if (tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
walking_handlers = 0;
}
return true;
}

1916
alpha-dis.c Normal file

File diff suppressed because it is too large Load Diff

127
alpha.ld Normal file
View File

@@ -0,0 +1,127 @@
OUTPUT_FORMAT("elf64-alpha", "elf64-alpha",
"elf64-alpha")
OUTPUT_ARCH(alpha)
ENTRY(__start)
SECTIONS
{
/* Read-only sections, merged into text segment: */
. = 0x60000000 + SIZEOF_HEADERS;
.interp : { *(.interp) }
.hash : { *(.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
.rel.text :
{ *(.rel.text) *(.rel.gnu.linkonce.t*) }
.rela.text :
{ *(.rela.text) *(.rela.gnu.linkonce.t*) }
.rel.data :
{ *(.rel.data) *(.rel.gnu.linkonce.d*) }
.rela.data :
{ *(.rela.data) *(.rela.gnu.linkonce.d*) }
.rel.rodata :
{ *(.rel.rodata) *(.rel.gnu.linkonce.r*) }
.rela.rodata :
{ *(.rela.rodata) *(.rela.gnu.linkonce.r*) }
.rel.got : { *(.rel.got) }
.rela.got : { *(.rela.got) }
.rel.ctors : { *(.rel.ctors) }
.rela.ctors : { *(.rela.ctors) }
.rel.dtors : { *(.rel.dtors) }
.rela.dtors : { *(.rela.dtors) }
.rel.init : { *(.rel.init) }
.rela.init : { *(.rela.init) }
.rel.fini : { *(.rel.fini) }
.rela.fini : { *(.rela.fini) }
.rel.bss : { *(.rel.bss) }
.rela.bss : { *(.rela.bss) }
.rel.plt : { *(.rel.plt) }
.rela.plt : { *(.rela.plt) }
.init : { *(.init) } =0x47ff041f
.text :
{
*(.text)
/* .gnu.warning sections are handled specially by elf32.em. */
*(.gnu.warning)
*(.gnu.linkonce.t*)
} =0x47ff041f
_etext = .;
PROVIDE (etext = .);
.fini : { *(.fini) } =0x47ff041f
.rodata : { *(.rodata) *(.gnu.linkonce.r*) }
.rodata1 : { *(.rodata1) }
.reginfo : { *(.reginfo) }
/* Adjust the address for the data segment. We want to adjust up to
the same address within the page on the next page up. */
. = ALIGN(0x100000) + (. & (0x100000 - 1));
.data :
{
*(.data)
*(.gnu.linkonce.d*)
CONSTRUCTORS
}
.data1 : { *(.data1) }
.ctors :
{
*(.ctors)
}
.dtors :
{
*(.dtors)
}
.plt : { *(.plt) }
.got : { *(.got.plt) *(.got) }
.dynamic : { *(.dynamic) }
/* We want the small data sections together, so single-instruction offsets
can access them all, and initialized data all before uninitialized, so
we can shorten the on-disk segment size. */
.sdata : { *(.sdata) }
_edata = .;
PROVIDE (edata = .);
__bss_start = .;
.sbss : { *(.sbss) *(.scommon) }
.bss :
{
*(.dynbss)
*(.bss)
*(COMMON)
}
_end = . ;
PROVIDE (end = .);
/* Stabs debugging sections. */
.stab 0 : { *(.stab) }
.stabstr 0 : { *(.stabstr) }
.stab.excl 0 : { *(.stab.excl) }
.stab.exclstr 0 : { *(.stab.exclstr) }
.stab.index 0 : { *(.stab.index) }
.stab.indexstr 0 : { *(.stab.indexstr) }
.comment 0 : { *(.comment) }
/* DWARF debug sections.
Symbols in the DWARF debugging sections are relative to the beginning
of the section so we begin them at 0. */
/* DWARF 1 */
.debug 0 : { *(.debug) }
.line 0 : { *(.line) }
/* GNU DWARF 1 extensions */
.debug_srcinfo 0 : { *(.debug_srcinfo) }
.debug_sfnames 0 : { *(.debug_sfnames) }
/* DWARF 1.1 and DWARF 2 */
.debug_aranges 0 : { *(.debug_aranges) }
.debug_pubnames 0 : { *(.debug_pubnames) }
/* DWARF 2 */
.debug_info 0 : { *(.debug_info) }
.debug_abbrev 0 : { *(.debug_abbrev) }
.debug_line 0 : { *(.debug_line) }
.debug_frame 0 : { *(.debug_frame) }
.debug_str 0 : { *(.debug_str) }
.debug_loc 0 : { *(.debug_loc) }
.debug_macinfo 0 : { *(.debug_macinfo) }
/* SGI/MIPS DWARF 2 extensions */
.debug_weaknames 0 : { *(.debug_weaknames) }
.debug_funcnames 0 : { *(.debug_funcnames) }
.debug_typenames 0 : { *(.debug_typenames) }
.debug_varnames 0 : { *(.debug_varnames) }
/* These must appear regardless of . */
}

File diff suppressed because it is too large Load Diff

39
arch_init.h Normal file
View File

@@ -0,0 +1,39 @@
#ifndef QEMU_ARCH_INIT_H
#define QEMU_ARCH_INIT_H
#include "qmp-commands.h"
enum {
QEMU_ARCH_ALL = -1,
QEMU_ARCH_ALPHA = 1,
QEMU_ARCH_ARM = 2,
QEMU_ARCH_CRIS = 4,
QEMU_ARCH_I386 = 8,
QEMU_ARCH_M68K = 16,
QEMU_ARCH_LM32 = 32,
QEMU_ARCH_MICROBLAZE = 64,
QEMU_ARCH_MIPS = 128,
QEMU_ARCH_PPC = 256,
QEMU_ARCH_S390X = 512,
QEMU_ARCH_SH4 = 1024,
QEMU_ARCH_SPARC = 2048,
QEMU_ARCH_XTENSA = 4096,
QEMU_ARCH_OPENRISC = 8192,
QEMU_ARCH_UNICORE32 = 0x4000,
};
extern const uint32_t arch_type;
void select_soundhw(const char *optarg);
void do_acpitable_option(const char *optarg);
void do_smbios_option(const char *optarg);
void cpudef_init(void);
int audio_available(void);
void audio_init(ISABus *isa_bus, PCIBus *pci_bus);
int tcg_available(void);
int kvm_available(void);
int xen_available(void);
CpuDefinitionInfoList GCC_WEAK_DECL *arch_query_cpu_definitions(Error **errp);
#endif

4136
arm-dis.c Normal file

File diff suppressed because it is too large Load Diff

153
arm.ld Normal file
View File

@@ -0,0 +1,153 @@
OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm",
"elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(_start)
SECTIONS
{
/* Read-only sections, merged into text segment: */
. = 0x60000000 + SIZEOF_HEADERS;
.interp : { *(.interp) }
.hash : { *(.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
.rel.text :
{ *(.rel.text) *(.rel.gnu.linkonce.t*) }
.rela.text :
{ *(.rela.text) *(.rela.gnu.linkonce.t*) }
.rel.data :
{ *(.rel.data) *(.rel.gnu.linkonce.d*) }
.rela.data :
{ *(.rela.data) *(.rela.gnu.linkonce.d*) }
.rel.rodata :
{ *(.rel.rodata) *(.rel.gnu.linkonce.r*) }
.rela.rodata :
{ *(.rela.rodata) *(.rela.gnu.linkonce.r*) }
.rel.got : { *(.rel.got) }
.rela.got : { *(.rela.got) }
.rel.ctors : { *(.rel.ctors) }
.rela.ctors : { *(.rela.ctors) }
.rel.dtors : { *(.rel.dtors) }
.rela.dtors : { *(.rela.dtors) }
.rel.init : { *(.rel.init) }
.rela.init : { *(.rela.init) }
.rel.fini : { *(.rel.fini) }
.rela.fini : { *(.rela.fini) }
.rel.bss : { *(.rel.bss) }
.rela.bss : { *(.rela.bss) }
.rel.plt : { *(.rel.plt) }
.rela.plt : { *(.rela.plt) }
.init : { *(.init) } =0x47ff041f
.text :
{
*(.text)
/* .gnu.warning sections are handled specially by elf32.em. */
*(.gnu.warning)
*(.gnu.linkonce.t*)
} =0x47ff041f
_etext = .;
PROVIDE (etext = .);
.fini : { *(.fini) } =0x47ff041f
.rodata : { *(.rodata) *(.gnu.linkonce.r*) }
.rodata1 : { *(.rodata1) }
.ARM.extab : { *(.ARM.extab* .gnu.linkonce.armextab.*) }
__exidx_start = .;
.ARM.exidx : { *(.ARM.exidx* .gnu.linkonce.armexidx.*) }
__exidx_end = .;
.reginfo : { *(.reginfo) }
/* Adjust the address for the data segment. We want to adjust up to
the same address within the page on the next page up. */
. = ALIGN(0x100000) + (. & (0x100000 - 1));
.data :
{
*(.gen_code)
*(.data)
*(.gnu.linkonce.d*)
CONSTRUCTORS
}
.tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) }
.data1 : { *(.data1) }
.preinit_array :
{
PROVIDE (__preinit_array_start = .);
KEEP (*(.preinit_array))
PROVIDE (__preinit_array_end = .);
}
.init_array :
{
PROVIDE (__init_array_start = .);
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array))
PROVIDE (__init_array_end = .);
}
.fini_array :
{
PROVIDE (__fini_array_start = .);
KEEP (*(.fini_array))
KEEP (*(SORT(.fini_array.*)))
PROVIDE (__fini_array_end = .);
}
.ctors :
{
*(.ctors)
}
.dtors :
{
*(.dtors)
}
.plt : { *(.plt) }
.got : { *(.got.plt) *(.got) }
.dynamic : { *(.dynamic) }
/* We want the small data sections together, so single-instruction offsets
can access them all, and initialized data all before uninitialized, so
we can shorten the on-disk segment size. */
.sdata : { *(.sdata) }
_edata = .;
PROVIDE (edata = .);
__bss_start = .;
.sbss : { *(.sbss) *(.scommon) }
.bss :
{
*(.dynbss)
*(.bss)
*(COMMON)
}
_end = . ;
PROVIDE (end = .);
/* Stabs debugging sections. */
.stab 0 : { *(.stab) }
.stabstr 0 : { *(.stabstr) }
.stab.excl 0 : { *(.stab.excl) }
.stab.exclstr 0 : { *(.stab.exclstr) }
.stab.index 0 : { *(.stab.index) }
.stab.indexstr 0 : { *(.stab.indexstr) }
.comment 0 : { *(.comment) }
/* DWARF debug sections.
Symbols in the DWARF debugging sections are relative to the beginning
of the section so we begin them at 0. */
/* DWARF 1 */
.debug 0 : { *(.debug) }
.line 0 : { *(.line) }
/* GNU DWARF 1 extensions */
.debug_srcinfo 0 : { *(.debug_srcinfo) }
.debug_sfnames 0 : { *(.debug_sfnames) }
/* DWARF 1.1 and DWARF 2 */
.debug_aranges 0 : { *(.debug_aranges) }
.debug_pubnames 0 : { *(.debug_pubnames) }
/* DWARF 2 */
.debug_info 0 : { *(.debug_info) }
.debug_abbrev 0 : { *(.debug_abbrev) }
.debug_line 0 : { *(.debug_line) }
.debug_frame 0 : { *(.debug_frame) }
.debug_str 0 : { *(.debug_str) }
.debug_loc 0 : { *(.debug_loc) }
.debug_macinfo 0 : { *(.debug_macinfo) }
/* SGI/MIPS DWARF 2 extensions */
.debug_weaknames 0 : { *(.debug_weaknames) }
.debug_funcnames 0 : { *(.debug_funcnames) }
.debug_typenames 0 : { *(.debug_typenames) }
.debug_varnames 0 : { *(.debug_varnames) }
/* These must appear regardless of . */
}

248
async.c
View File

@@ -23,16 +23,16 @@
*/
#include "qemu-common.h"
#include "block/aio.h"
#include "block/thread-pool.h"
#include "qemu/main-loop.h"
#include "qemu/atomic.h"
#include "qemu-aio.h"
#include "main-loop.h"
/* Anchor of the list of Bottom Halves belonging to the context */
static struct QEMUBH *first_bh;
/***********************************************************/
/* bottom halves (can be seen as timers which expire ASAP) */
struct QEMUBH {
AioContext *ctx;
QEMUBHFunc *cb;
void *opaque;
QEMUBH *next;
@@ -41,44 +41,30 @@ struct QEMUBH {
bool deleted;
};
QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
QEMUBH *qemu_bh_new(QEMUBHFunc *cb, void *opaque)
{
QEMUBH *bh;
bh = g_new(QEMUBH, 1);
*bh = (QEMUBH){
.ctx = ctx,
.cb = cb,
.opaque = opaque,
};
qemu_mutex_lock(&ctx->bh_lock);
bh->next = ctx->first_bh;
/* Make sure that the members are ready before putting bh into list */
smp_wmb();
ctx->first_bh = bh;
qemu_mutex_unlock(&ctx->bh_lock);
bh = g_malloc0(sizeof(QEMUBH));
bh->cb = cb;
bh->opaque = opaque;
bh->next = first_bh;
first_bh = bh;
return bh;
}
/* Multiple occurrences of aio_bh_poll cannot be called concurrently */
int aio_bh_poll(AioContext *ctx)
int qemu_bh_poll(void)
{
QEMUBH *bh, **bhp, *next;
int ret;
static int nesting = 0;
ctx->walking_bh++;
nesting++;
ret = 0;
for (bh = ctx->first_bh; bh; bh = next) {
/* Make sure that fetching bh happens before accessing its members */
smp_read_barrier_depends();
for (bh = first_bh; bh; bh = next) {
next = bh->next;
/* The atomic_xchg is paired with the one in qemu_bh_schedule. The
* implicit memory barrier ensures that the callback sees all writes
* done by the scheduling thread. It also ensures that the scheduling
* thread sees the zero before bh->cb has run, and thus will call
* aio_notify again if necessary.
*/
if (!bh->deleted && atomic_xchg(&bh->scheduled, 0)) {
if (!bh->deleted && bh->scheduled) {
bh->scheduled = 0;
if (!bh->idle)
ret = 1;
bh->idle = 0;
@@ -86,12 +72,11 @@ int aio_bh_poll(AioContext *ctx)
}
}
ctx->walking_bh--;
nesting--;
/* remove deleted bhs */
if (!ctx->walking_bh) {
qemu_mutex_lock(&ctx->bh_lock);
bhp = &ctx->first_bh;
if (!nesting) {
bhp = &first_bh;
while (*bhp) {
bh = *bhp;
if (bh->deleted) {
@@ -101,7 +86,6 @@ int aio_bh_poll(AioContext *ctx)
bhp = &bh->next;
}
}
qemu_mutex_unlock(&ctx->bh_lock);
}
return ret;
@@ -109,216 +93,50 @@ int aio_bh_poll(AioContext *ctx)
void qemu_bh_schedule_idle(QEMUBH *bh)
{
if (bh->scheduled)
return;
bh->scheduled = 1;
bh->idle = 1;
/* Make sure that idle & any writes needed by the callback are done
* before the locations are read in the aio_bh_poll.
*/
atomic_mb_set(&bh->scheduled, 1);
}
void qemu_bh_schedule(QEMUBH *bh)
{
AioContext *ctx;
ctx = bh->ctx;
if (bh->scheduled)
return;
bh->scheduled = 1;
bh->idle = 0;
/* The memory barrier implicit in atomic_xchg makes sure that:
* 1. idle & any writes needed by the callback are done before the
* locations are read in the aio_bh_poll.
* 2. ctx is loaded before scheduled is set and the callback has a chance
* to execute.
*/
if (atomic_xchg(&bh->scheduled, 1) == 0) {
aio_notify(ctx);
}
/* stop the currently executing CPU to execute the BH ASAP */
qemu_notify_event();
}
/* This func is async.
*/
void qemu_bh_cancel(QEMUBH *bh)
{
bh->scheduled = 0;
}
/* This func is async.The bottom half will do the delete action at the finial
* end.
*/
void qemu_bh_delete(QEMUBH *bh)
{
bh->scheduled = 0;
bh->deleted = 1;
}
int64_t
aio_compute_timeout(AioContext *ctx)
void qemu_bh_update_timeout(uint32_t *timeout)
{
int64_t deadline;
int timeout = -1;
QEMUBH *bh;
for (bh = ctx->first_bh; bh; bh = bh->next) {
for (bh = first_bh; bh; bh = bh->next) {
if (!bh->deleted && bh->scheduled) {
if (bh->idle) {
/* idle bottom halves will be polled at least
* every 10ms */
timeout = 10000000;
*timeout = MIN(10, *timeout);
} else {
/* non-idle bottom halves will be executed
* immediately */
return 0;
*timeout = 0;
break;
}
}
}
deadline = timerlistgroup_deadline_ns(&ctx->tlg);
if (deadline == 0) {
return 0;
} else {
return qemu_soonest_timeout(timeout, deadline);
}
}
static gboolean
aio_ctx_prepare(GSource *source, gint *timeout)
{
AioContext *ctx = (AioContext *) source;
/* We assume there is no timeout already supplied */
*timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
if (aio_prepare(ctx)) {
*timeout = 0;
}
return *timeout == 0;
}
static gboolean
aio_ctx_check(GSource *source)
{
AioContext *ctx = (AioContext *) source;
QEMUBH *bh;
for (bh = ctx->first_bh; bh; bh = bh->next) {
if (!bh->deleted && bh->scheduled) {
return true;
}
}
return aio_pending(ctx) || (timerlistgroup_deadline_ns(&ctx->tlg) == 0);
}
static gboolean
aio_ctx_dispatch(GSource *source,
GSourceFunc callback,
gpointer user_data)
{
AioContext *ctx = (AioContext *) source;
assert(callback == NULL);
aio_dispatch(ctx);
return true;
}
static void
aio_ctx_finalize(GSource *source)
{
AioContext *ctx = (AioContext *) source;
thread_pool_free(ctx->thread_pool);
aio_set_event_notifier(ctx, &ctx->notifier, NULL);
event_notifier_cleanup(&ctx->notifier);
rfifolock_destroy(&ctx->lock);
qemu_mutex_destroy(&ctx->bh_lock);
timerlistgroup_deinit(&ctx->tlg);
}
static GSourceFuncs aio_source_funcs = {
aio_ctx_prepare,
aio_ctx_check,
aio_ctx_dispatch,
aio_ctx_finalize
};
GSource *aio_get_g_source(AioContext *ctx)
{
g_source_ref(&ctx->source);
return &ctx->source;
}
ThreadPool *aio_get_thread_pool(AioContext *ctx)
{
if (!ctx->thread_pool) {
ctx->thread_pool = thread_pool_new(ctx);
}
return ctx->thread_pool;
}
void aio_set_dispatching(AioContext *ctx, bool dispatching)
{
ctx->dispatching = dispatching;
if (!dispatching) {
/* Write ctx->dispatching before reading e.g. bh->scheduled.
* Optimization: this is only needed when we're entering the "unsafe"
* phase where other threads must call event_notifier_set.
*/
smp_mb();
}
}
void aio_notify(AioContext *ctx)
{
/* Write e.g. bh->scheduled before reading ctx->dispatching. */
smp_mb();
if (!ctx->dispatching) {
event_notifier_set(&ctx->notifier);
}
}
static void aio_timerlist_notify(void *opaque)
{
aio_notify(opaque);
}
AioContext *aio_context_new(Error **errp)
{
int ret;
AioContext *ctx;
ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
ret = event_notifier_init(&ctx->notifier, false);
if (ret < 0) {
g_source_destroy(&ctx->source);
error_setg_errno(errp, -ret, "Failed to initialize event notifier");
return NULL;
}
g_source_set_can_recurse(&ctx->source, true);
aio_set_event_notifier(ctx, &ctx->notifier,
(EventNotifierHandler *)
event_notifier_test_and_clear);
ctx->thread_pool = NULL;
qemu_mutex_init(&ctx->bh_lock);
rfifolock_init(&ctx->lock, NULL, NULL);
timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
return ctx;
}
void aio_context_ref(AioContext *ctx)
{
g_source_ref(&ctx->source);
}
void aio_context_unref(AioContext *ctx)
{
g_source_unref(&ctx->source);
}
void aio_context_acquire(AioContext *ctx)
{
rfifolock_lock(&ctx->lock);
}
void aio_context_release(AioContext *ctx)
{
rfifolock_unlock(&ctx->lock);
}

View File

@@ -12,6 +12,3 @@ common-obj-$(CONFIG_WINWAVE) += winwaveaudio.o
common-obj-$(CONFIG_AUDIO_PT_INT) += audio_pt_int.o
common-obj-$(CONFIG_AUDIO_WIN_INT) += audio_win_int.o
common-obj-y += wavcapture.o
$(obj)/audio.o $(obj)/fmodaudio.o: QEMU_CFLAGS += $(FMOD_CFLAGS)
sdlaudio.o-cflags := $(SDL_CFLAGS)

View File

@@ -23,7 +23,7 @@
*/
#include <alsa/asoundlib.h>
#include "qemu-common.h"
#include "qemu/main-loop.h"
#include "qemu-char.h"
#include "audio.h"
#if QEMU_GNUC_PREREQ(4, 3)
@@ -815,8 +815,10 @@ static void alsa_fini_out (HWVoiceOut *hw)
ldebug ("alsa_fini\n");
alsa_anal_close (&alsa->handle, &alsa->pollhlp);
g_free(alsa->pcm_buf);
alsa->pcm_buf = NULL;
if (alsa->pcm_buf) {
g_free (alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
}
static int alsa_init_out (HWVoiceOut *hw, struct audsettings *as)
@@ -976,8 +978,10 @@ static void alsa_fini_in (HWVoiceIn *hw)
alsa_anal_close (&alsa->handle, &alsa->pollhlp);
g_free(alsa->pcm_buf);
alsa->pcm_buf = NULL;
if (alsa->pcm_buf) {
g_free (alsa->pcm_buf);
alsa->pcm_buf = NULL;
}
}
static int alsa_run_in (HWVoiceIn *hw)

View File

@@ -23,9 +23,9 @@
*/
#include "hw/hw.h"
#include "audio.h"
#include "monitor/monitor.h"
#include "qemu/timer.h"
#include "sysemu/sysemu.h"
#include "monitor.h"
#include "qemu-timer.h"
#include "sysemu.h"
#define AUDIO_CAP "audio"
#include "audio_int.h"
@@ -95,7 +95,7 @@ static struct {
}
},
.period = { .hertz = 100 },
.period = { .hertz = 250 },
.plive = 0,
.log_to_monitor = 0,
.try_poll_in = 1,
@@ -828,9 +828,8 @@ static int audio_attach_capture (HWVoiceOut *hw)
QLIST_INSERT_HEAD (&hw_cap->sw_head, sw, entries);
QLIST_INSERT_HEAD (&hw->cap_head, sc, entries);
#ifdef DEBUG_CAPTURE
sw->name = g_strdup_printf ("for %p %d,%d,%d",
hw, sw->info.freq, sw->info.bits,
sw->info.nchannels);
asprintf (&sw->name, "for %p %d,%d,%d",
hw, sw->info.freq, sw->info.bits, sw->info.nchannels);
dolog ("Added %s active = %d\n", sw->name, sw->active);
#endif
if (sw->active) {
@@ -1124,11 +1123,10 @@ static int audio_is_timer_needed (void)
static void audio_reset_timer (AudioState *s)
{
if (audio_is_timer_needed ()) {
timer_mod (s->ts,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + conf.period.ticks);
qemu_mod_timer (s->ts, qemu_get_clock_ns (vm_clock) + 1);
}
else {
timer_del (s->ts);
qemu_del_timer (s->ts);
}
}
@@ -1812,7 +1810,8 @@ static const VMStateDescription vmstate_audio = {
.name = "audio",
.version_id = 1,
.minimum_version_id = 1,
.fields = (VMStateField[]) {
.minimum_version_id_old = 1,
.fields = (VMStateField []) {
VMSTATE_END_OF_LIST()
}
};
@@ -1834,7 +1833,7 @@ static void audio_init (void)
QLIST_INIT (&s->cap_head);
atexit (audio_atexit);
s->ts = timer_new_ns(QEMU_CLOCK_VIRTUAL, audio_timer, s);
s->ts = qemu_new_timer_ns (vm_clock, audio_timer, s);
if (!s->ts) {
hw_error("Could not create audio timer\n");
}

View File

@@ -25,7 +25,7 @@
#define QEMU_AUDIO_H
#include "config-host.h"
#include "qemu/queue.h"
#include "qemu-queue.h"
typedef void (*audio_callback_fn) (void *opaque, int avail);

View File

@@ -243,13 +243,38 @@ static inline int audio_ring_dist (int dst, int src, int len)
return (dst >= src) ? (dst - src) : (len - src + dst);
}
#define dolog(fmt, ...) AUD_log(AUDIO_CAP, fmt, ## __VA_ARGS__)
static void GCC_ATTR dolog (const char *fmt, ...)
{
va_list ap;
va_start (ap, fmt);
AUD_vlog (AUDIO_CAP, fmt, ap);
va_end (ap);
}
#ifdef DEBUG
#define ldebug(fmt, ...) AUD_log(AUDIO_CAP, fmt, ## __VA_ARGS__)
static void GCC_ATTR ldebug (const char *fmt, ...)
{
va_list ap;
va_start (ap, fmt);
AUD_vlog (AUDIO_CAP, fmt, ap);
va_end (ap);
}
#else
#define ldebug(fmt, ...) (void)0
#if defined NDEBUG && defined __GNUC__
#define ldebug(...)
#elif defined NDEBUG && defined _MSC_VER
#define ldebug __noop
#else
static void GCC_ATTR ldebug (const char *fmt, ...)
{
(void) fmt;
}
#endif
#endif
#undef GCC_ATTR
#define AUDIO_STRINGIFY_(n) #n
#define AUDIO_STRINGIFY(n) AUDIO_STRINGIFY_(n)

View File

@@ -71,7 +71,10 @@ static void glue (audio_init_nb_voices_, TYPE) (struct audio_driver *drv)
static void glue (audio_pcm_hw_free_resources_, TYPE) (HW *hw)
{
g_free (HWBUF);
if (HWBUF) {
g_free (HWBUF);
}
HWBUF = NULL;
}
@@ -89,7 +92,9 @@ static int glue (audio_pcm_hw_alloc_resources_, TYPE) (HW *hw)
static void glue (audio_pcm_sw_free_resources_, TYPE) (SW *sw)
{
g_free (sw->buf);
if (sw->buf) {
g_free (sw->buf);
}
if (sw->rate) {
st_rate_stop (sw->rate);
@@ -167,8 +172,10 @@ static int glue (audio_pcm_sw_init_, TYPE) (
static void glue (audio_pcm_sw_fini_, TYPE) (SW *sw)
{
glue (audio_pcm_sw_free_resources_, TYPE) (sw);
g_free (sw->name);
sw->name = NULL;
if (sw->name) {
g_free (sw->name);
sw->name = NULL;
}
}
static void glue (audio_pcm_hw_add_sw_, TYPE) (HW *hw, SW *sw)
@@ -191,9 +198,9 @@ static void glue (audio_pcm_hw_gc_, TYPE) (HW **hwp)
audio_detach_capture (hw);
#endif
QLIST_REMOVE (hw, entries);
glue (hw->pcm_ops->fini_, TYPE) (hw);
glue (s->nb_hw_voices_, TYPE) += 1;
glue (audio_pcm_hw_free_resources_ ,TYPE) (hw);
glue (hw->pcm_ops->fini_, TYPE) (hw);
g_free (hw);
*hwp = NULL;
}

View File

@@ -1,6 +1,7 @@
/* public domain */
#include "qemu-common.h"
#include "audio.h"
#define AUDIO_CAP "win-int"
#include <windows.h>

View File

@@ -348,6 +348,7 @@ void mixeng_clear (struct st_sample *buf, int len)
void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol)
{
#ifdef CONFIG_MIXEMU
if (vol->mute) {
mixeng_clear (buf, len);
return;
@@ -363,4 +364,9 @@ void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol)
#endif
buf += 1;
}
#else
(void) buf;
(void) len;
(void) vol;
#endif
}

View File

@@ -35,7 +35,7 @@
#define IN_T glue (glue (ITYPE, BSIZE), _t)
#ifdef FLOAT_MIXENG
static inline mixeng_real glue (conv_, ET) (IN_T v)
static mixeng_real inline glue (conv_, ET) (IN_T v)
{
IN_T nv = ENDIAN_CONVERT (v);
@@ -54,7 +54,7 @@ static inline mixeng_real glue (conv_, ET) (IN_T v)
#endif
}
static inline IN_T glue (clip_, ET) (mixeng_real v)
static IN_T inline glue (clip_, ET) (mixeng_real v)
{
if (v >= 0.5) {
return IN_MAX;

View File

@@ -23,7 +23,7 @@
*/
#include "qemu-common.h"
#include "audio.h"
#include "qemu/timer.h"
#include "qemu-timer.h"
#define AUDIO_CAP "noaudio"
#include "audio_int.h"
@@ -46,7 +46,7 @@ static int no_run_out (HWVoiceOut *hw, int live)
int64_t ticks;
int64_t bytes;
now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
now = qemu_get_clock_ns (vm_clock);
ticks = now - no->old_ticks;
bytes = muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());
bytes = audio_MIN (bytes, INT_MAX);
@@ -102,7 +102,7 @@ static int no_run_in (HWVoiceIn *hw)
int samples = 0;
if (dead) {
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
int64_t now = qemu_get_clock_ns (vm_clock);
int64_t ticks = now - no->old_ticks;
int64_t bytes =
muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());

View File

@@ -25,10 +25,14 @@
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#ifdef __OpenBSD__
#include <soundcard.h>
#else
#include <sys/soundcard.h>
#endif
#include "qemu-common.h"
#include "qemu/main-loop.h"
#include "qemu/host-utils.h"
#include "host-utils.h"
#include "qemu-char.h"
#include "audio.h"
#define AUDIO_CAP "oss"
@@ -736,8 +740,10 @@ static void oss_fini_in (HWVoiceIn *hw)
oss_anal_close (&oss->fd);
g_free(oss->pcm_buf);
oss->pcm_buf = NULL;
if (oss->pcm_buf) {
g_free (oss->pcm_buf);
oss->pcm_buf = NULL;
}
}
static int oss_run_in (HWVoiceIn *hw)
@@ -847,10 +853,6 @@ static int oss_ctl_in (HWVoiceIn *hw, int cmd, ...)
static void *oss_audio_init (void)
{
if (access(conf.devpath_in, R_OK | W_OK) < 0 ||
access(conf.devpath_out, R_OK | W_OK) < 0) {
return NULL;
}
return &conf;
}

View File

@@ -547,11 +547,11 @@ static int qpa_init_out (HWVoiceOut *hw, struct audsettings *as)
ss.rate = as->freq;
/*
* qemu audio tick runs at 100 Hz (by default), so processing
* data chunks worth 10 ms of sound should be a good fit.
* qemu audio tick runs at 250 Hz (by default), so processing
* data chunks worth 4 ms of sound should be a good fit.
*/
ba.tlength = pa_usec_to_bytes (10 * 1000, &ss);
ba.minreq = pa_usec_to_bytes (5 * 1000, &ss);
ba.tlength = pa_usec_to_bytes (4 * 1000, &ss);
ba.minreq = pa_usec_to_bytes (2 * 1000, &ss);
ba.maxlength = -1;
ba.prebuf = -1;

View File

@@ -18,24 +18,15 @@
*/
#include "hw/hw.h"
#include "qemu/timer.h"
#include "qemu-timer.h"
#include "ui/qemu-spice.h"
#define AUDIO_CAP "spice"
#include "audio.h"
#include "audio_int.h"
#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
#define LINE_OUT_SAMPLES (480 * 4)
#else
#define LINE_OUT_SAMPLES (256 * 4)
#endif
#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
#define LINE_IN_SAMPLES (480 * 4)
#else
#define LINE_IN_SAMPLES (256 * 4)
#endif
#define LINE_IN_SAMPLES 1024
#define LINE_OUT_SAMPLES 1024
typedef struct SpiceRateCtl {
int64_t start_ticks;
@@ -90,7 +81,7 @@ static void spice_audio_fini (void *opaque)
static void rate_start (SpiceRateCtl *rate)
{
memset (rate, 0, sizeof (*rate));
rate->start_ticks = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
rate->start_ticks = qemu_get_clock_ns (vm_clock);
}
static int rate_get_samples (struct audio_pcm_info *info, SpiceRateCtl *rate)
@@ -100,12 +91,12 @@ static int rate_get_samples (struct audio_pcm_info *info, SpiceRateCtl *rate)
int64_t bytes;
int64_t samples;
now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
now = qemu_get_clock_ns (vm_clock);
ticks = now - rate->start_ticks;
bytes = muldiv64 (ticks, info->bytes_per_second, get_ticks_per_sec ());
samples = (bytes - rate->bytes_sent) >> info->shift;
if (samples < 0 || samples > 65536) {
error_report("Resetting rate control (%" PRId64 " samples)", samples);
fprintf (stderr, "Resetting rate control (%" PRId64 " samples)\n", samples);
rate_start (rate);
samples = 0;
}
@@ -120,11 +111,7 @@ static int line_out_init (HWVoiceOut *hw, struct audsettings *as)
SpiceVoiceOut *out = container_of (hw, SpiceVoiceOut, hw);
struct audsettings settings;
#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
settings.freq = spice_server_get_best_playback_rate(NULL);
#else
settings.freq = SPICE_INTERFACE_PLAYBACK_FREQ;
#endif
settings.nchannels = SPICE_INTERFACE_PLAYBACK_CHAN;
settings.fmt = AUD_FMT_S16;
settings.endianness = AUDIO_HOST_ENDIANNESS;
@@ -135,9 +122,6 @@ static int line_out_init (HWVoiceOut *hw, struct audsettings *as)
out->sin.base.sif = &playback_sif.base;
qemu_spice_add_interface (&out->sin.base);
#if SPICE_INTERFACE_PLAYBACK_MAJOR > 1 || SPICE_INTERFACE_PLAYBACK_MINOR >= 3
spice_server_set_playback_rate(&out->sin, settings.freq);
#endif
return 0;
}
@@ -248,11 +232,7 @@ static int line_in_init (HWVoiceIn *hw, struct audsettings *as)
SpiceVoiceIn *in = container_of (hw, SpiceVoiceIn, hw);
struct audsettings settings;
#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
settings.freq = spice_server_get_best_record_rate(NULL);
#else
settings.freq = SPICE_INTERFACE_RECORD_FREQ;
#endif
settings.nchannels = SPICE_INTERFACE_RECORD_CHAN;
settings.fmt = AUD_FMT_S16;
settings.endianness = AUDIO_HOST_ENDIANNESS;
@@ -263,9 +243,6 @@ static int line_in_init (HWVoiceIn *hw, struct audsettings *as)
in->sin.base.sif = &record_sif.base;
qemu_spice_add_interface (&in->sin.base);
#if SPICE_INTERFACE_RECORD_MAJOR > 2 || SPICE_INTERFACE_RECORD_MINOR >= 3
spice_server_set_record_rate(&in->sin, settings.freq);
#endif
return 0;
}

View File

@@ -22,7 +22,7 @@
* THE SOFTWARE.
*/
#include "hw/hw.h"
#include "qemu/timer.h"
#include "qemu-timer.h"
#include "audio.h"
#define AUDIO_CAP "wav"
@@ -52,7 +52,7 @@ static int wav_run_out (HWVoiceOut *hw, int live)
int rpos, decr, samples;
uint8_t *dst;
struct st_sample *src;
int64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
int64_t now = qemu_get_clock_ns (vm_clock);
int64_t ticks = now - wav->old_ticks;
int64_t bytes =
muldiv64 (ticks, hw->info.bytes_per_second, get_ticks_per_sec ());

View File

@@ -1,5 +1,5 @@
#include "hw/hw.h"
#include "monitor/monitor.h"
#include "monitor.h"
#include "audio.h"
typedef struct {
@@ -63,7 +63,8 @@ static void wav_destroy (void *opaque)
}
doclose:
if (fclose (wav->f)) {
error_report("wav_destroy: fclose failed: %s", strerror(errno));
fprintf (stderr, "wav_destroy: fclose failed: %s",
strerror (errno));
}
}

View File

@@ -1,7 +1,7 @@
/* public domain */
#include "qemu-common.h"
#include "sysemu/sysemu.h"
#include "sysemu.h"
#include "audio.h"
#define AUDIO_CAP "winwave"

View File

@@ -1,11 +0,0 @@
common-obj-y += rng.o rng-egd.o
common-obj-$(CONFIG_POSIX) += rng-random.o
common-obj-y += msmouse.o testdev.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)
common-obj-$(CONFIG_TPM) += tpm.o
common-obj-y += hostmem.o hostmem-ram.o
common-obj-$(CONFIG_LINUX) += hostmem-file.o

View File

@@ -1,134 +0,0 @@
/*
* QEMU Host Memory Backend for hugetlbfs
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Paolo Bonzini <pbonzini@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu-common.h"
#include "sysemu/hostmem.h"
#include "sysemu/sysemu.h"
#include "qom/object_interfaces.h"
/* hostmem-file.c */
/**
* @TYPE_MEMORY_BACKEND_FILE:
* name of backend that uses mmap on a file descriptor
*/
#define TYPE_MEMORY_BACKEND_FILE "memory-backend-file"
#define MEMORY_BACKEND_FILE(obj) \
OBJECT_CHECK(HostMemoryBackendFile, (obj), TYPE_MEMORY_BACKEND_FILE)
typedef struct HostMemoryBackendFile HostMemoryBackendFile;
struct HostMemoryBackendFile {
HostMemoryBackend parent_obj;
bool share;
char *mem_path;
};
static void
file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
if (!fb->mem_path) {
error_setg(errp, "mem-path property not set");
return;
}
#ifndef CONFIG_LINUX
error_setg(errp, "-mem-path not supported on this host");
#else
if (!memory_region_size(&backend->mr)) {
backend->force_prealloc = mem_prealloc;
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
object_get_canonical_path(OBJECT(backend)),
backend->size, fb->share,
fb->mem_path, errp);
}
#endif
}
static void
file_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = file_backend_memory_alloc;
}
static char *get_mem_path(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
return g_strdup(fb->mem_path);
}
static void set_mem_path(Object *o, const char *str, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
if (fb->mem_path) {
g_free(fb->mem_path);
}
fb->mem_path = g_strdup(str);
}
static bool file_memory_backend_get_share(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
return fb->share;
}
static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
fb->share = value;
}
static void
file_backend_instance_init(Object *o)
{
object_property_add_bool(o, "share",
file_memory_backend_get_share,
file_memory_backend_set_share, NULL);
object_property_add_str(o, "mem-path", get_mem_path,
set_mem_path, NULL);
}
static const TypeInfo file_backend_info = {
.name = TYPE_MEMORY_BACKEND_FILE,
.parent = TYPE_MEMORY_BACKEND,
.class_init = file_backend_class_init,
.instance_init = file_backend_instance_init,
.instance_size = sizeof(HostMemoryBackendFile),
};
static void register_types(void)
{
type_register_static(&file_backend_info);
}
type_init(register_types);

View File

@@ -1,53 +0,0 @@
/*
* QEMU Host Memory Backend
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Igor Mammedov <imammedo@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/hostmem.h"
#include "qom/object_interfaces.h"
#define TYPE_MEMORY_BACKEND_RAM "memory-backend-ram"
static void
ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
char *path;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
path = object_get_canonical_path_component(OBJECT(backend));
memory_region_init_ram(&backend->mr, OBJECT(backend), path,
backend->size, errp);
g_free(path);
}
static void
ram_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = ram_backend_memory_alloc;
}
static const TypeInfo ram_backend_info = {
.name = TYPE_MEMORY_BACKEND_RAM,
.parent = TYPE_MEMORY_BACKEND,
.class_init = ram_backend_class_init,
};
static void register_types(void)
{
type_register_static(&ram_backend_info);
}
type_init(register_types);

View File

@@ -1,379 +0,0 @@
/*
* QEMU Host Memory Backend
*
* Copyright (C) 2013-2014 Red Hat Inc
*
* Authors:
* Igor Mammedov <imammedo@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/hostmem.h"
#include "qapi/visitor.h"
#include "qapi-types.h"
#include "qapi-visit.h"
#include "qapi/qmp/qerror.h"
#include "qemu/config-file.h"
#include "qom/object_interfaces.h"
#ifdef CONFIG_NUMA
#include <numaif.h>
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_DEFAULT != MPOL_DEFAULT);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_PREFERRED != MPOL_PREFERRED);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_BIND != MPOL_BIND);
QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != MPOL_INTERLEAVE);
#endif
static void
host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint64_t value = backend->size;
visit_type_size(v, &value, name, errp);
}
static void
host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
Error *local_err = NULL;
uint64_t value;
if (memory_region_size(&backend->mr)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
visit_type_size(v, &value, name, &local_err);
if (local_err) {
goto out;
}
if (!value) {
error_setg(&local_err, "Property '%s.%s' doesn't take value '%"
PRIu64 "'", object_get_typename(obj), name, value);
goto out;
}
backend->size = value;
out:
error_propagate(errp, local_err);
}
static void
host_memory_backend_get_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *host_nodes = NULL;
uint16List **node = &host_nodes;
unsigned long value;
value = find_first_bit(backend->host_nodes, MAX_NODES);
if (value == MAX_NODES) {
return;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
do {
value = find_next_bit(backend->host_nodes, MAX_NODES, value + 1);
if (value == MAX_NODES) {
break;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
} while (true);
visit_type_uint16List(v, &host_nodes, name, errp);
}
static void
host_memory_backend_set_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
#ifdef CONFIG_NUMA
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *l = NULL;
visit_type_uint16List(v, &l, name, errp);
while (l) {
bitmap_set(backend->host_nodes, l->value, 1);
l = l->next;
}
#else
error_setg(errp, "NUMA node binding are not supported by this QEMU");
#endif
}
static void
host_memory_backend_get_policy(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
int policy = backend->policy;
visit_type_enum(v, &policy, HostMemPolicy_lookup, NULL, name, errp);
}
static void
host_memory_backend_set_policy(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
int policy;
visit_type_enum(v, &policy, HostMemPolicy_lookup, NULL, name, errp);
backend->policy = policy;
#ifndef CONFIG_NUMA
if (policy != HOST_MEM_POLICY_DEFAULT) {
error_setg(errp, "NUMA policies are not supported by this QEMU");
}
#endif
}
static bool host_memory_backend_get_merge(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->merge;
}
static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
backend->merge = value;
return;
}
if (value != backend->merge) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz,
value ? QEMU_MADV_MERGEABLE : QEMU_MADV_UNMERGEABLE);
backend->merge = value;
}
}
static bool host_memory_backend_get_dump(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->dump;
}
static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
backend->dump = value;
return;
}
if (value != backend->dump) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz,
value ? QEMU_MADV_DODUMP : QEMU_MADV_DONTDUMP);
backend->dump = value;
}
}
static bool host_memory_backend_get_prealloc(Object *obj, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
return backend->prealloc || backend->force_prealloc;
}
static void host_memory_backend_set_prealloc(Object *obj, bool value,
Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (backend->force_prealloc) {
if (value) {
error_setg(errp,
"remove -mem-prealloc to use the prealloc property");
return;
}
}
if (!memory_region_size(&backend->mr)) {
backend->prealloc = value;
return;
}
if (value && !backend->prealloc) {
int fd = memory_region_get_fd(&backend->mr);
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
os_mem_prealloc(fd, ptr, sz);
backend->prealloc = true;
}
}
static void host_memory_backend_init(Object *obj)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
backend->merge = qemu_opt_get_bool(qemu_get_machine_opts(),
"mem-merge", true);
backend->dump = qemu_opt_get_bool(qemu_get_machine_opts(),
"dump-guest-core", true);
backend->prealloc = mem_prealloc;
object_property_add_bool(obj, "merge",
host_memory_backend_get_merge,
host_memory_backend_set_merge, NULL);
object_property_add_bool(obj, "dump",
host_memory_backend_get_dump,
host_memory_backend_set_dump, NULL);
object_property_add_bool(obj, "prealloc",
host_memory_backend_get_prealloc,
host_memory_backend_set_prealloc, NULL);
object_property_add(obj, "size", "int",
host_memory_backend_get_size,
host_memory_backend_set_size, NULL, NULL, NULL);
object_property_add(obj, "host-nodes", "int",
host_memory_backend_get_host_nodes,
host_memory_backend_set_host_nodes, NULL, NULL, NULL);
object_property_add(obj, "policy", "str",
host_memory_backend_get_policy,
host_memory_backend_set_policy, NULL, NULL, NULL);
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
return memory_region_size(&backend->mr) ? &backend->mr : NULL;
}
static void
host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(uc);
HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
Error *local_err = NULL;
void *ptr;
uint64_t sz;
if (bc->alloc) {
bc->alloc(backend, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
ptr = memory_region_get_ram_ptr(&backend->mr);
sz = memory_region_size(&backend->mr);
if (backend->merge) {
qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
}
if (!backend->dump) {
qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
}
#ifdef CONFIG_NUMA
unsigned long lastbit = find_last_bit(backend->host_nodes, MAX_NODES);
/* lastbit == MAX_NODES means maxnode = 0 */
unsigned long maxnode = (lastbit + 1) % (MAX_NODES + 1);
/* ensure policy won't be ignored in case memory is preallocated
* before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
* this doesn't catch hugepage case. */
unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
/* check for invalid host-nodes and policies and give more verbose
* error messages than mbind(). */
if (maxnode && backend->policy == MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be empty for policy default,"
" or you should explicitly specify a policy other"
" than default");
return;
} else if (maxnode == 0 && backend->policy != MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be set for policy %s",
HostMemPolicy_lookup[backend->policy]);
return;
}
/* We can have up to MAX_NODES nodes, but we need to pass maxnode+1
* as argument to mbind() due to an old Linux bug (feature?) which
* cuts off the last specified node. This means backend->host_nodes
* must have MAX_NODES+1 bits available.
*/
assert(sizeof(backend->host_nodes) >=
BITS_TO_LONGS(MAX_NODES + 1) * sizeof(unsigned long));
assert(maxnode <= MAX_NODES);
if (mbind(ptr, sz, backend->policy,
maxnode ? backend->host_nodes : NULL, maxnode + 1, flags)) {
error_setg_errno(errp, errno,
"cannot bind memory to host NUMA nodes");
return;
}
#endif
/* Preallocate memory after the NUMA policy has been instantiated.
* This is necessary to guarantee memory is allocated with
* specified NUMA policy in place.
*/
if (backend->prealloc) {
os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz);
}
}
}
static bool
host_memory_backend_can_be_deleted(UserCreatable *uc, Error **errp)
{
MemoryRegion *mr;
mr = host_memory_backend_get_memory(MEMORY_BACKEND(uc), errp);
if (memory_region_is_mapped(mr)) {
return false;
} else {
return true;
}
}
static void
host_memory_backend_class_init(ObjectClass *oc, void *data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
ucc->complete = host_memory_backend_memory_complete;
ucc->can_be_deleted = host_memory_backend_can_be_deleted;
}
static const TypeInfo host_memory_backend_info = {
.name = TYPE_MEMORY_BACKEND,
.parent = TYPE_OBJECT,
.abstract = true,
.class_size = sizeof(HostMemoryBackendClass),
.class_init = host_memory_backend_class_init,
.instance_size = sizeof(HostMemoryBackend),
.instance_init = host_memory_backend_init,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }
}
};
static void register_types(void)
{
type_register_static(&host_memory_backend_info);
}
type_init(register_types);

View File

@@ -1,232 +0,0 @@
/*
* QEMU Random Number Generator Backend
*
* Copyright IBM, Corp. 2012
*
* Authors:
* Anthony Liguori <aliguori@us.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/rng.h"
#include "sysemu/char.h"
#include "qapi/qmp/qerror.h"
#include "hw/qdev.h" /* just for DEFINE_PROP_CHR */
#define TYPE_RNG_EGD "rng-egd"
#define RNG_EGD(obj) OBJECT_CHECK(RngEgd, (obj), TYPE_RNG_EGD)
typedef struct RngEgd
{
RngBackend parent;
CharDriverState *chr;
char *chr_name;
GSList *requests;
} RngEgd;
typedef struct RngRequest
{
EntropyReceiveFunc *receive_entropy;
uint8_t *data;
void *opaque;
size_t offset;
size_t size;
} RngRequest;
static void rng_egd_request_entropy(RngBackend *b, size_t size,
EntropyReceiveFunc *receive_entropy,
void *opaque)
{
RngEgd *s = RNG_EGD(b);
RngRequest *req;
req = g_malloc(sizeof(*req));
req->offset = 0;
req->size = size;
req->receive_entropy = receive_entropy;
req->opaque = opaque;
req->data = g_malloc(req->size);
while (size > 0) {
uint8_t header[2];
uint8_t len = MIN(size, 255);
/* synchronous entropy request */
header[0] = 0x02;
header[1] = len;
qemu_chr_fe_write(s->chr, header, sizeof(header));
size -= len;
}
s->requests = g_slist_append(s->requests, req);
}
static void rng_egd_free_request(RngRequest *req)
{
g_free(req->data);
g_free(req);
}
static int rng_egd_chr_can_read(void *opaque)
{
RngEgd *s = RNG_EGD(opaque);
GSList *i;
int size = 0;
for (i = s->requests; i; i = i->next) {
RngRequest *req = i->data;
size += req->size - req->offset;
}
return size;
}
static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
{
RngEgd *s = RNG_EGD(opaque);
size_t buf_offset = 0;
while (size > 0 && s->requests) {
RngRequest *req = s->requests->data;
int len = MIN(size, req->size - req->offset);
memcpy(req->data + req->offset, buf + buf_offset, len);
buf_offset += len;
req->offset += len;
size -= len;
if (req->offset == req->size) {
s->requests = g_slist_remove_link(s->requests, s->requests);
req->receive_entropy(req->opaque, req->data, req->size);
rng_egd_free_request(req);
}
}
}
static void rng_egd_free_requests(RngEgd *s)
{
GSList *i;
for (i = s->requests; i; i = i->next) {
rng_egd_free_request(i->data);
}
g_slist_free(s->requests);
s->requests = NULL;
}
static void rng_egd_cancel_requests(RngBackend *b)
{
RngEgd *s = RNG_EGD(b);
/* We simply delete the list of pending requests. If there is data in the
* queue waiting to be read, this is okay, because there will always be
* more data than we requested originally
*/
rng_egd_free_requests(s);
}
static void rng_egd_opened(RngBackend *b, Error **errp)
{
RngEgd *s = RNG_EGD(b);
if (s->chr_name == NULL) {
error_set(errp, QERR_INVALID_PARAMETER_VALUE,
"chardev", "a valid character device");
return;
}
s->chr = qemu_chr_find(s->chr_name);
if (s->chr == NULL) {
error_set(errp, QERR_DEVICE_NOT_FOUND, s->chr_name);
return;
}
if (qemu_chr_fe_claim(s->chr) != 0) {
error_set(errp, QERR_DEVICE_IN_USE, s->chr_name);
return;
}
/* FIXME we should resubmit pending requests when the CDS reconnects. */
qemu_chr_add_handlers(s->chr, rng_egd_chr_can_read, rng_egd_chr_read,
NULL, s);
}
static void rng_egd_set_chardev(Object *obj, const char *value, Error **errp)
{
RngBackend *b = RNG_BACKEND(obj);
RngEgd *s = RNG_EGD(b);
if (b->opened) {
error_set(errp, QERR_PERMISSION_DENIED);
} else {
g_free(s->chr_name);
s->chr_name = g_strdup(value);
}
}
static char *rng_egd_get_chardev(Object *obj, Error **errp)
{
RngEgd *s = RNG_EGD(obj);
if (s->chr && s->chr->label) {
return g_strdup(s->chr->label);
}
return NULL;
}
static void rng_egd_init(Object *obj)
{
object_property_add_str(obj, "chardev",
rng_egd_get_chardev, rng_egd_set_chardev,
NULL);
}
static void rng_egd_finalize(Object *obj)
{
RngEgd *s = RNG_EGD(obj);
if (s->chr) {
qemu_chr_add_handlers(s->chr, NULL, NULL, NULL, NULL);
qemu_chr_fe_release(s->chr);
}
g_free(s->chr_name);
rng_egd_free_requests(s);
}
static void rng_egd_class_init(ObjectClass *klass, void *data)
{
RngBackendClass *rbc = RNG_BACKEND_CLASS(klass);
rbc->request_entropy = rng_egd_request_entropy;
rbc->cancel_requests = rng_egd_cancel_requests;
rbc->opened = rng_egd_opened;
}
static const TypeInfo rng_egd_info = {
.name = TYPE_RNG_EGD,
.parent = TYPE_RNG_BACKEND,
.instance_size = sizeof(RngEgd),
.class_init = rng_egd_class_init,
.instance_init = rng_egd_init,
.instance_finalize = rng_egd_finalize,
};
static void register_types(void)
{
type_register_static(&rng_egd_info);
}
type_init(register_types);

View File

@@ -1,156 +0,0 @@
/*
* QEMU Random Number Generator Backend
*
* Copyright IBM, Corp. 2012
*
* Authors:
* Anthony Liguori <aliguori@us.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/rng-random.h"
#include "sysemu/rng.h"
#include "qapi/qmp/qerror.h"
#include "qemu/main-loop.h"
struct RndRandom
{
RngBackend parent;
int fd;
char *filename;
EntropyReceiveFunc *receive_func;
void *opaque;
size_t size;
};
/**
* A simple and incomplete backend to request entropy from /dev/random.
*
* This backend exposes an additional "filename" property that can be used to
* set the filename to use to open the backend.
*/
static void entropy_available(void *opaque)
{
RndRandom *s = RNG_RANDOM(opaque);
uint8_t buffer[s->size];
ssize_t len;
len = read(s->fd, buffer, s->size);
if (len < 0 && errno == EAGAIN) {
return;
}
g_assert(len != -1);
s->receive_func(s->opaque, buffer, len);
s->receive_func = NULL;
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
}
static void rng_random_request_entropy(RngBackend *b, size_t size,
EntropyReceiveFunc *receive_entropy,
void *opaque)
{
RndRandom *s = RNG_RANDOM(b);
if (s->receive_func) {
s->receive_func(s->opaque, NULL, 0);
}
s->receive_func = receive_entropy;
s->opaque = opaque;
s->size = size;
qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
}
static void rng_random_opened(RngBackend *b, Error **errp)
{
RndRandom *s = RNG_RANDOM(b);
if (s->filename == NULL) {
error_set(errp, QERR_INVALID_PARAMETER_VALUE,
"filename", "a valid filename");
} else {
s->fd = qemu_open(s->filename, O_RDONLY | O_NONBLOCK);
if (s->fd == -1) {
error_setg_file_open(errp, errno, s->filename);
}
}
}
static char *rng_random_get_filename(Object *obj, Error **errp)
{
RndRandom *s = RNG_RANDOM(obj);
return g_strdup(s->filename);
}
static void rng_random_set_filename(Object *obj, const char *filename,
Error **errp)
{
RngBackend *b = RNG_BACKEND(obj);
RndRandom *s = RNG_RANDOM(obj);
if (b->opened) {
error_set(errp, QERR_PERMISSION_DENIED);
return;
}
g_free(s->filename);
s->filename = g_strdup(filename);
}
static void rng_random_init(Object *obj)
{
RndRandom *s = RNG_RANDOM(obj);
object_property_add_str(obj, "filename",
rng_random_get_filename,
rng_random_set_filename,
NULL);
s->filename = g_strdup("/dev/random");
s->fd = -1;
}
static void rng_random_finalize(Object *obj)
{
RndRandom *s = RNG_RANDOM(obj);
if (s->fd != -1) {
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
qemu_close(s->fd);
}
g_free(s->filename);
}
static void rng_random_class_init(ObjectClass *klass, void *data)
{
RngBackendClass *rbc = RNG_BACKEND_CLASS(klass);
rbc->request_entropy = rng_random_request_entropy;
rbc->opened = rng_random_opened;
}
static const TypeInfo rng_random_info = {
.name = TYPE_RNG_RANDOM,
.parent = TYPE_RNG_BACKEND,
.instance_size = sizeof(RndRandom),
.class_init = rng_random_class_init,
.instance_init = rng_random_init,
.instance_finalize = rng_random_finalize,
};
static void register_types(void)
{
type_register_static(&rng_random_info);
}
type_init(register_types);

View File

@@ -1,109 +0,0 @@
/*
* QEMU Random Number Generator Backend
*
* Copyright IBM, Corp. 2012
*
* Authors:
* Anthony Liguori <aliguori@us.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "sysemu/rng.h"
#include "qapi/qmp/qerror.h"
#include "qom/object_interfaces.h"
void rng_backend_request_entropy(RngBackend *s, size_t size,
EntropyReceiveFunc *receive_entropy,
void *opaque)
{
RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
if (k->request_entropy) {
k->request_entropy(s, size, receive_entropy, opaque);
}
}
void rng_backend_cancel_requests(RngBackend *s)
{
RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
if (k->cancel_requests) {
k->cancel_requests(s);
}
}
static bool rng_backend_prop_get_opened(Object *obj, Error **errp)
{
RngBackend *s = RNG_BACKEND(obj);
return s->opened;
}
static void rng_backend_complete(UserCreatable *uc, Error **errp)
{
object_property_set_bool(OBJECT(uc), true, "opened", errp);
}
static void rng_backend_prop_set_opened(Object *obj, bool value, Error **errp)
{
RngBackend *s = RNG_BACKEND(obj);
RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
Error *local_err = NULL;
if (value == s->opened) {
return;
}
if (!value && s->opened) {
error_set(errp, QERR_PERMISSION_DENIED);
return;
}
if (k->opened) {
k->opened(s, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
s->opened = true;
}
static void rng_backend_init(Object *obj)
{
object_property_add_bool(obj, "opened",
rng_backend_prop_get_opened,
rng_backend_prop_set_opened,
NULL);
}
static void rng_backend_class_init(ObjectClass *oc, void *data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
ucc->complete = rng_backend_complete;
}
static const TypeInfo rng_backend_info = {
.name = TYPE_RNG_BACKEND,
.parent = TYPE_OBJECT,
.instance_size = sizeof(RngBackend),
.instance_init = rng_backend_init,
.class_size = sizeof(RngBackendClass),
.class_init = rng_backend_class_init,
.abstract = true,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }
}
};
static void register_types(void)
{
type_register_static(&rng_backend_info);
}
type_init(register_types);

View File

@@ -1,131 +0,0 @@
/*
* QEMU Char Device for testsuite control
*
* Copyright (c) 2014 Red Hat, Inc.
*
* Author: Paolo Bonzini <pbonzini@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "sysemu/char.h"
#define BUF_SIZE 32
typedef struct {
CharDriverState *chr;
uint8_t in_buf[32];
int in_buf_used;
} TestdevCharState;
/* Try to interpret a whole incoming packet */
static int testdev_eat_packet(TestdevCharState *testdev)
{
const uint8_t *cur = testdev->in_buf;
int len = testdev->in_buf_used;
uint8_t c;
int arg;
#define EAT(c) do { \
if (!len--) { \
return 0; \
} \
c = *cur++; \
} while (0)
EAT(c);
while (isspace(c)) {
EAT(c);
}
arg = 0;
while (isdigit(c)) {
arg = arg * 10 + c - '0';
EAT(c);
}
while (isspace(c)) {
EAT(c);
}
switch (c) {
case 'q':
exit((arg << 1) | 1);
break;
default:
break;
}
return cur - testdev->in_buf;
}
/* The other end is writing some data. Store it and try to interpret */
static int testdev_write(CharDriverState *chr, const uint8_t *buf, int len)
{
TestdevCharState *testdev = chr->opaque;
int tocopy, eaten, orig_len = len;
while (len) {
/* Complete our buffer as much as possible */
tocopy = MIN(len, BUF_SIZE - testdev->in_buf_used);
memcpy(testdev->in_buf + testdev->in_buf_used, buf, tocopy);
testdev->in_buf_used += tocopy;
buf += tocopy;
len -= tocopy;
/* Interpret it as much as possible */
while (testdev->in_buf_used > 0 &&
(eaten = testdev_eat_packet(testdev)) > 0) {
memmove(testdev->in_buf, testdev->in_buf + eaten,
testdev->in_buf_used - eaten);
testdev->in_buf_used -= eaten;
}
}
return orig_len;
}
static void testdev_close(struct CharDriverState *chr)
{
TestdevCharState *testdev = chr->opaque;
g_free(testdev);
}
CharDriverState *chr_testdev_init(void)
{
TestdevCharState *testdev;
CharDriverState *chr;
testdev = g_malloc0(sizeof(TestdevCharState));
testdev->chr = chr = g_malloc0(sizeof(CharDriverState));
chr->opaque = testdev;
chr->chr_write = testdev_write;
chr->chr_close = testdev_close;
return chr;
}
static void register_types(void)
{
register_char_driver("testdev", CHARDEV_BACKEND_KIND_TESTDEV, NULL);
}
type_init(register_types);

View File

@@ -1,182 +0,0 @@
/*
* QEMU TPM Backend
*
* Copyright IBM, Corp. 2013
*
* Authors:
* Stefan Berger <stefanb@us.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
* Based on backends/rng.c by Anthony Liguori
*/
#include "sysemu/tpm_backend.h"
#include "qapi/qmp/qerror.h"
#include "sysemu/tpm.h"
#include "qemu/thread.h"
#include "sysemu/tpm_backend_int.h"
enum TpmType tpm_backend_get_type(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->type;
}
const char *tpm_backend_get_desc(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->desc();
}
void tpm_backend_destroy(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->destroy(s);
}
int tpm_backend_init(TPMBackend *s, TPMState *state,
TPMRecvDataCB *datacb)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->init(s, state, datacb);
}
int tpm_backend_startup_tpm(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->startup_tpm(s);
}
bool tpm_backend_had_startup_error(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->had_startup_error(s);
}
size_t tpm_backend_realloc_buffer(TPMBackend *s, TPMSizedBuffer *sb)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->realloc_buffer(sb);
}
void tpm_backend_deliver_request(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->deliver_request(s);
}
void tpm_backend_reset(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->reset(s);
}
void tpm_backend_cancel_cmd(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->cancel_cmd(s);
}
bool tpm_backend_get_tpm_established_flag(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->get_tpm_established_flag(s);
}
static bool tpm_backend_prop_get_opened(Object *obj, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
return s->opened;
}
void tpm_backend_open(TPMBackend *s, Error **errp)
{
object_property_set_bool(OBJECT(s), true, "opened", errp);
}
static void tpm_backend_prop_set_opened(Object *obj, bool value, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
Error *local_err = NULL;
if (value == s->opened) {
return;
}
if (!value && s->opened) {
error_set(errp, QERR_PERMISSION_DENIED);
return;
}
if (k->opened) {
k->opened(s, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
s->opened = true;
}
static void tpm_backend_instance_init(Object *obj)
{
object_property_add_bool(obj, "opened",
tpm_backend_prop_get_opened,
tpm_backend_prop_set_opened,
NULL);
}
void tpm_backend_thread_deliver_request(TPMBackendThread *tbt)
{
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_PROCESS_CMD, NULL);
}
void tpm_backend_thread_create(TPMBackendThread *tbt,
GFunc func, gpointer user_data)
{
if (!tbt->pool) {
tbt->pool = g_thread_pool_new(func, user_data, 1, TRUE, NULL);
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_INIT, NULL);
}
}
void tpm_backend_thread_end(TPMBackendThread *tbt)
{
if (tbt->pool) {
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_END, NULL);
g_thread_pool_free(tbt->pool, FALSE, TRUE);
tbt->pool = NULL;
}
}
static const TypeInfo tpm_backend_info = {
.name = TYPE_TPM_BACKEND,
.parent = TYPE_OBJECT,
.instance_size = sizeof(TPMBackend),
.instance_init = tpm_backend_instance_init,
.class_size = sizeof(TPMBackendClass),
.abstract = true,
};
static void register_types(void)
{
type_register_static(&tpm_backend_info);
}
type_init(register_types);

View File

@@ -24,33 +24,18 @@
* THE SOFTWARE.
*/
#include "monitor/monitor.h"
#include "exec/cpu-common.h"
#include "sysemu/kvm.h"
#include "sysemu/balloon.h"
#include "monitor.h"
#include "cpu-common.h"
#include "kvm.h"
#include "balloon.h"
#include "trace.h"
#include "qmp-commands.h"
#include "qapi/qmp/qjson.h"
#include "qjson.h"
static QEMUBalloonEvent *balloon_event_fn;
static QEMUBalloonStatus *balloon_stat_fn;
static void *balloon_opaque;
static bool have_balloon(Error **errp)
{
if (kvm_enabled() && !kvm_has_sync_mmu()) {
error_set(errp, ERROR_CLASS_KVM_MISSING_CAP,
"Using KVM without synchronous MMU, balloon unavailable");
return false;
}
if (!balloon_event_fn) {
error_set(errp, ERROR_CLASS_DEVICE_NOT_ACTIVE,
"No balloon device has been activated");
return false;
}
return true;
}
int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
QEMUBalloonStatus *stat_func, void *opaque)
{
@@ -58,6 +43,7 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
/* We're already registered one balloon handler. How many can
* a guest really have?
*/
error_report("Another balloon device already registered");
return -1;
}
balloon_event_fn = event_func;
@@ -76,30 +62,71 @@ void qemu_remove_balloon_handler(void *opaque)
balloon_opaque = NULL;
}
static int qemu_balloon(ram_addr_t target)
{
if (!balloon_event_fn) {
return 0;
}
trace_balloon_event(balloon_opaque, target);
balloon_event_fn(balloon_opaque, target);
return 1;
}
static int qemu_balloon_status(BalloonInfo *info)
{
if (!balloon_stat_fn) {
return 0;
}
balloon_stat_fn(balloon_opaque, info);
return 1;
}
void qemu_balloon_changed(int64_t actual)
{
QObject *data;
data = qobject_from_jsonf("{ 'actual': %" PRId64 " }",
actual);
monitor_protocol_event(QEVENT_BALLOON_CHANGE, data);
qobject_decref(data);
}
BalloonInfo *qmp_query_balloon(Error **errp)
{
BalloonInfo *info;
if (!have_balloon(errp)) {
if (kvm_enabled() && !kvm_has_sync_mmu()) {
error_set(errp, QERR_KVM_MISSING_CAP, "synchronous MMU", "balloon");
return NULL;
}
info = g_malloc0(sizeof(*info));
balloon_stat_fn(balloon_opaque, info);
if (qemu_balloon_status(info) == 0) {
error_set(errp, QERR_DEVICE_NOT_ACTIVE, "balloon");
qapi_free_BalloonInfo(info);
return NULL;
}
return info;
}
void qmp_balloon(int64_t target, Error **errp)
void qmp_balloon(int64_t value, Error **errp)
{
if (!have_balloon(errp)) {
if (kvm_enabled() && !kvm_has_sync_mmu()) {
error_set(errp, QERR_KVM_MISSING_CAP, "synchronous MMU", "balloon");
return;
}
if (target <= 0) {
if (value <= 0) {
error_set(errp, QERR_INVALID_PARAMETER_VALUE, "target", "a size");
return;
}
trace_balloon_event(balloon_opaque, target);
balloon_event_fn(balloon_opaque, target);
if (qemu_balloon(value) == 0) {
error_set(errp, QERR_DEVICE_NOT_ACTIVE, "balloon");
}
}

View File

@@ -14,7 +14,7 @@
#ifndef _QEMU_BALLOON_H
#define _QEMU_BALLOON_H
#include "monitor/monitor.h"
#include "monitor.h"
#include "qapi-types.h"
typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
@@ -24,4 +24,6 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
QEMUBalloonStatus *stat_func, void *opaque);
void qemu_remove_balloon_handler(void *opaque);
void qemu_balloon_changed(int64_t actual);
#endif

View File

@@ -9,8 +9,8 @@
* Version 2.
*/
#include "qemu/bitops.h"
#include "qemu/bitmap.h"
#include "bitops.h"
#include "bitmap.h"
/*
* bitmaps provide an array of bits, implemented using an an
@@ -36,9 +36,9 @@
* endian architectures.
*/
int slow_bitmap_empty(const unsigned long *bitmap, long bits)
int slow_bitmap_empty(const unsigned long *bitmap, int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
if (bitmap[k]) {
@@ -54,9 +54,9 @@ int slow_bitmap_empty(const unsigned long *bitmap, long bits)
return 1;
}
int slow_bitmap_full(const unsigned long *bitmap, long bits)
int slow_bitmap_full(const unsigned long *bitmap, int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
if (~bitmap[k]) {
@@ -74,9 +74,9 @@ int slow_bitmap_full(const unsigned long *bitmap, long bits)
}
int slow_bitmap_equal(const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
if (bitmap1[k] != bitmap2[k]) {
@@ -94,9 +94,9 @@ int slow_bitmap_equal(const unsigned long *bitmap1,
}
void slow_bitmap_complement(unsigned long *dst, const unsigned long *src,
long bits)
int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
dst[k] = ~src[k];
@@ -108,10 +108,10 @@ void slow_bitmap_complement(unsigned long *dst, const unsigned long *src,
}
int slow_bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k;
long nr = BITS_TO_LONGS(bits);
int k;
int nr = BITS_TO_LONGS(bits);
unsigned long result = 0;
for (k = 0; k < nr; k++) {
@@ -121,10 +121,10 @@ int slow_bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
}
void slow_bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k;
long nr = BITS_TO_LONGS(bits);
int k;
int nr = BITS_TO_LONGS(bits);
for (k = 0; k < nr; k++) {
dst[k] = bitmap1[k] | bitmap2[k];
@@ -132,10 +132,10 @@ void slow_bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
}
void slow_bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k;
long nr = BITS_TO_LONGS(bits);
int k;
int nr = BITS_TO_LONGS(bits);
for (k = 0; k < nr; k++) {
dst[k] = bitmap1[k] ^ bitmap2[k];
@@ -143,10 +143,10 @@ void slow_bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
}
int slow_bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k;
long nr = BITS_TO_LONGS(bits);
int k;
int nr = BITS_TO_LONGS(bits);
unsigned long result = 0;
for (k = 0; k < nr; k++) {
@@ -157,10 +157,10 @@ int slow_bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))
void bitmap_set(unsigned long *map, long start, long nr)
void bitmap_set(unsigned long *map, int start, int nr)
{
unsigned long *p = map + BIT_WORD(start);
const long size = start + nr;
const int size = start + nr;
int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
@@ -177,10 +177,10 @@ void bitmap_set(unsigned long *map, long start, long nr)
}
}
void bitmap_clear(unsigned long *map, long start, long nr)
void bitmap_clear(unsigned long *map, int start, int nr)
{
unsigned long *p = map + BIT_WORD(start);
const long size = start + nr;
const int size = start + nr;
int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
@@ -212,10 +212,10 @@ void bitmap_clear(unsigned long *map, long start, long nr)
* power of 2. A @align_mask of 0 means no alignment is required.
*/
unsigned long bitmap_find_next_zero_area(unsigned long *map,
unsigned long size,
unsigned long start,
unsigned long nr,
unsigned long align_mask)
unsigned long size,
unsigned long start,
unsigned int nr,
unsigned long align_mask)
{
unsigned long index, end, i;
again:
@@ -237,9 +237,9 @@ again:
}
int slow_bitmap_intersects(const unsigned long *bitmap1,
const unsigned long *bitmap2, long bits)
const unsigned long *bitmap2, int bits)
{
long k, lim = bits/BITS_PER_LONG;
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k) {
if (bitmap1[k] & bitmap2[k]) {

222
bitmap.h Normal file
View File

@@ -0,0 +1,222 @@
/*
* Bitmap Module
*
* Copyright (C) 2010 Corentin Chary <corentin.chary@gmail.com>
*
* Mostly inspired by (stolen from) linux/bitmap.h and linux/bitops.h
*
* This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
* See the COPYING.LIB file in the top-level directory.
*/
#ifndef BITMAP_H
#define BITMAP_H
#include "qemu-common.h"
#include "bitops.h"
/*
* The available bitmap operations and their rough meaning in the
* case that the bitmap is a single unsigned long are thus:
*
* Note that nbits should be always a compile time evaluable constant.
* Otherwise many inlines will generate horrible code.
*
* bitmap_zero(dst, nbits) *dst = 0UL
* bitmap_fill(dst, nbits) *dst = ~0UL
* bitmap_copy(dst, src, nbits) *dst = *src
* bitmap_and(dst, src1, src2, nbits) *dst = *src1 & *src2
* bitmap_or(dst, src1, src2, nbits) *dst = *src1 | *src2
* bitmap_xor(dst, src1, src2, nbits) *dst = *src1 ^ *src2
* bitmap_andnot(dst, src1, src2, nbits) *dst = *src1 & ~(*src2)
* bitmap_complement(dst, src, nbits) *dst = ~(*src)
* bitmap_equal(src1, src2, nbits) Are *src1 and *src2 equal?
* bitmap_intersects(src1, src2, nbits) Do *src1 and *src2 overlap?
* bitmap_empty(src, nbits) Are all bits zero in *src?
* bitmap_full(src, nbits) Are all bits set in *src?
* bitmap_set(dst, pos, nbits) Set specified bit area
* bitmap_clear(dst, pos, nbits) Clear specified bit area
* bitmap_find_next_zero_area(buf, len, pos, n, mask) Find bit free area
*/
/*
* Also the following operations apply to bitmaps.
*
* set_bit(bit, addr) *addr |= bit
* clear_bit(bit, addr) *addr &= ~bit
* change_bit(bit, addr) *addr ^= bit
* test_bit(bit, addr) Is bit set in *addr?
* test_and_set_bit(bit, addr) Set bit and return old value
* test_and_clear_bit(bit, addr) Clear bit and return old value
* test_and_change_bit(bit, addr) Change bit and return old value
* find_first_zero_bit(addr, nbits) Position first zero bit in *addr
* find_first_bit(addr, nbits) Position first set bit in *addr
* find_next_zero_bit(addr, nbits, bit) Position next zero bit in *addr >= bit
* find_next_bit(addr, nbits, bit) Position next set bit in *addr >= bit
*/
#define BITMAP_LAST_WORD_MASK(nbits) \
( \
((nbits) % BITS_PER_LONG) ? \
(1UL<<((nbits) % BITS_PER_LONG))-1 : ~0UL \
)
#define DECLARE_BITMAP(name,bits) \
unsigned long name[BITS_TO_LONGS(bits)]
#define small_nbits(nbits) \
((nbits) <= BITS_PER_LONG)
int slow_bitmap_empty(const unsigned long *bitmap, int bits);
int slow_bitmap_full(const unsigned long *bitmap, int bits);
int slow_bitmap_equal(const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits);
void slow_bitmap_complement(unsigned long *dst, const unsigned long *src,
int bits);
void slow_bitmap_shift_right(unsigned long *dst,
const unsigned long *src, int shift, int bits);
void slow_bitmap_shift_left(unsigned long *dst,
const unsigned long *src, int shift, int bits);
int slow_bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits);
void slow_bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits);
void slow_bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits);
int slow_bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits);
int slow_bitmap_intersects(const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits);
static inline unsigned long *bitmap_new(int nbits)
{
int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
return g_malloc0(len);
}
static inline void bitmap_zero(unsigned long *dst, int nbits)
{
if (small_nbits(nbits)) {
*dst = 0UL;
} else {
int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
memset(dst, 0, len);
}
}
static inline void bitmap_fill(unsigned long *dst, int nbits)
{
size_t nlongs = BITS_TO_LONGS(nbits);
if (!small_nbits(nbits)) {
int len = (nlongs - 1) * sizeof(unsigned long);
memset(dst, 0xff, len);
}
dst[nlongs - 1] = BITMAP_LAST_WORD_MASK(nbits);
}
static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
int nbits)
{
if (small_nbits(nbits)) {
*dst = *src;
} else {
int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
memcpy(dst, src, len);
}
}
static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
return (*dst = *src1 & *src2) != 0;
}
return slow_bitmap_and(dst, src1, src2, nbits);
}
static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
*dst = *src1 | *src2;
} else {
slow_bitmap_or(dst, src1, src2, nbits);
}
}
static inline void bitmap_xor(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
*dst = *src1 ^ *src2;
} else {
slow_bitmap_xor(dst, src1, src2, nbits);
}
}
static inline int bitmap_andnot(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
return (*dst = *src1 & ~(*src2)) != 0;
}
return slow_bitmap_andnot(dst, src1, src2, nbits);
}
static inline void bitmap_complement(unsigned long *dst, const unsigned long *src,
int nbits)
{
if (small_nbits(nbits)) {
*dst = ~(*src) & BITMAP_LAST_WORD_MASK(nbits);
} else {
slow_bitmap_complement(dst, src, nbits);
}
}
static inline int bitmap_equal(const unsigned long *src1,
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
} else {
return slow_bitmap_equal(src1, src2, nbits);
}
}
static inline int bitmap_empty(const unsigned long *src, int nbits)
{
if (small_nbits(nbits)) {
return ! (*src & BITMAP_LAST_WORD_MASK(nbits));
} else {
return slow_bitmap_empty(src, nbits);
}
}
static inline int bitmap_full(const unsigned long *src, int nbits)
{
if (small_nbits(nbits)) {
return ! (~(*src) & BITMAP_LAST_WORD_MASK(nbits));
} else {
return slow_bitmap_full(src, nbits);
}
}
static inline int bitmap_intersects(const unsigned long *src1,
const unsigned long *src2, int nbits)
{
if (small_nbits(nbits)) {
return ((*src1 & *src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0;
} else {
return slow_bitmap_intersects(src1, src2, nbits);
}
}
void bitmap_set(unsigned long *map, int i, int len);
void bitmap_clear(unsigned long *map, int start, int nr);
unsigned long bitmap_find_next_zero_area(unsigned long *map,
unsigned long size,
unsigned long start,
unsigned int nr,
unsigned long align_mask);
#endif /* BITMAP_H */

View File

@@ -11,7 +11,7 @@
* 2 of the License, or (at your option) any later version.
*/
#include "qemu/bitops.h"
#include "bitops.h"
#define BITOP_WORD(nr) ((nr) / BITS_PER_LONG)
@@ -42,23 +42,7 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
size -= BITS_PER_LONG;
result += BITS_PER_LONG;
}
while (size >= 4*BITS_PER_LONG) {
unsigned long d1, d2, d3;
tmp = *p;
d1 = *(p+1);
d2 = *(p+2);
d3 = *(p+3);
if (tmp) {
goto found_middle;
}
if (d1 | d2 | d3) {
break;
}
p += 4;
result += 4*BITS_PER_LONG;
size -= 4*BITS_PER_LONG;
}
while (size >= BITS_PER_LONG) {
while (size & ~(BITS_PER_LONG-1)) {
if ((tmp = *(p++))) {
goto found_middle;
}
@@ -76,7 +60,7 @@ found_first:
return result + size; /* Nope. */
}
found_middle:
return result + ctzl(tmp);
return result + bitops_ffsl(tmp);
}
/*
@@ -125,7 +109,7 @@ found_first:
return result + size; /* Nope. */
}
found_middle:
return result + ctzl(~tmp);
return result + ffz(tmp);
}
unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
@@ -149,7 +133,7 @@ unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
tmp = addr[--words];
if (tmp) {
found:
return words * BITS_PER_LONG + BITS_PER_LONG - 1 - clzl(tmp);
return words * BITS_PER_LONG + bitops_flsl(tmp);
}
}

362
bitops.h Normal file
View File

@@ -0,0 +1,362 @@
/*
* Bitops Module
*
* Copyright (C) 2010 Corentin Chary <corentin.chary@gmail.com>
*
* Mostly inspired by (stolen from) linux/bitmap.h and linux/bitops.h
*
* This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
* See the COPYING.LIB file in the top-level directory.
*/
#ifndef BITOPS_H
#define BITOPS_H
#include "qemu-common.h"
#define BITS_PER_BYTE CHAR_BIT
#define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)
#define BIT(nr) (1UL << (nr))
#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG))
#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
#define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
/**
* bitops_ffs - find first bit in word.
* @word: The word to search
*
* Undefined if no bit exists, so code should check against 0 first.
*/
static unsigned long bitops_ffsl(unsigned long word)
{
int num = 0;
#if LONG_MAX > 0x7FFFFFFF
if ((word & 0xffffffff) == 0) {
num += 32;
word >>= 32;
}
#endif
if ((word & 0xffff) == 0) {
num += 16;
word >>= 16;
}
if ((word & 0xff) == 0) {
num += 8;
word >>= 8;
}
if ((word & 0xf) == 0) {
num += 4;
word >>= 4;
}
if ((word & 0x3) == 0) {
num += 2;
word >>= 2;
}
if ((word & 0x1) == 0) {
num += 1;
}
return num;
}
/**
* bitops_fls - find last (most-significant) set bit in a long word
* @word: the word to search
*
* Undefined if no set bit exists, so code should check against 0 first.
*/
static inline unsigned long bitops_flsl(unsigned long word)
{
int num = BITS_PER_LONG - 1;
#if LONG_MAX > 0x7FFFFFFF
if (!(word & (~0ul << 32))) {
num -= 32;
word <<= 32;
}
#endif
if (!(word & (~0ul << (BITS_PER_LONG-16)))) {
num -= 16;
word <<= 16;
}
if (!(word & (~0ul << (BITS_PER_LONG-8)))) {
num -= 8;
word <<= 8;
}
if (!(word & (~0ul << (BITS_PER_LONG-4)))) {
num -= 4;
word <<= 4;
}
if (!(word & (~0ul << (BITS_PER_LONG-2)))) {
num -= 2;
word <<= 2;
}
if (!(word & (~0ul << (BITS_PER_LONG-1))))
num -= 1;
return num;
}
/**
* ffz - find first zero in word.
* @word: The word to search
*
* Undefined if no zero exists, so code should check against ~0UL first.
*/
static inline unsigned long ffz(unsigned long word)
{
return bitops_ffsl(~word);
}
/**
* set_bit - Set a bit in memory
* @nr: the bit to set
* @addr: the address to start counting from
*/
static inline void set_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
*p |= mask;
}
/**
* clear_bit - Clears a bit in memory
* @nr: Bit to clear
* @addr: Address to start counting from
*/
static inline void clear_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
*p &= ~mask;
}
/**
* change_bit - Toggle a bit in memory
* @nr: Bit to change
* @addr: Address to start counting from
*/
static inline void change_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
*p ^= mask;
}
/**
* test_and_set_bit - Set a bit and return its old value
* @nr: Bit to set
* @addr: Address to count from
*/
static inline int test_and_set_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long old = *p;
*p = old | mask;
return (old & mask) != 0;
}
/**
* test_and_clear_bit - Clear a bit and return its old value
* @nr: Bit to clear
* @addr: Address to count from
*/
static inline int test_and_clear_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long old = *p;
*p = old & ~mask;
return (old & mask) != 0;
}
/**
* test_and_change_bit - Change a bit and return its old value
* @nr: Bit to change
* @addr: Address to count from
*/
static inline int test_and_change_bit(int nr, unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = addr + BIT_WORD(nr);
unsigned long old = *p;
*p = old ^ mask;
return (old & mask) != 0;
}
/**
* test_bit - Determine whether a bit is set
* @nr: bit number to test
* @addr: Address to start counting from
*/
static inline int test_bit(int nr, const unsigned long *addr)
{
return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
}
/**
* find_last_bit - find the last set bit in a memory region
* @addr: The address to start the search at
* @size: The maximum size to search
*
* Returns the bit number of the first set bit, or size.
*/
unsigned long find_last_bit(const unsigned long *addr,
unsigned long size);
/**
* find_next_bit - find the next set bit in a memory region
* @addr: The address to base the search on
* @offset: The bitnumber to start searching at
* @size: The bitmap size in bits
*/
unsigned long find_next_bit(const unsigned long *addr,
unsigned long size, unsigned long offset);
/**
* find_next_zero_bit - find the next cleared bit in a memory region
* @addr: The address to base the search on
* @offset: The bitnumber to start searching at
* @size: The bitmap size in bits
*/
unsigned long find_next_zero_bit(const unsigned long *addr,
unsigned long size,
unsigned long offset);
/**
* find_first_bit - find the first set bit in a memory region
* @addr: The address to start the search at
* @size: The maximum size to search
*
* Returns the bit number of the first set bit.
*/
static inline unsigned long find_first_bit(const unsigned long *addr,
unsigned long size)
{
return find_next_bit(addr, size, 0);
}
/**
* find_first_zero_bit - find the first cleared bit in a memory region
* @addr: The address to start the search at
* @size: The maximum size to search
*
* Returns the bit number of the first cleared bit.
*/
static inline unsigned long find_first_zero_bit(const unsigned long *addr,
unsigned long size)
{
return find_next_zero_bit(addr, size, 0);
}
static inline unsigned long hweight_long(unsigned long w)
{
unsigned long count;
for (count = 0; w; w >>= 1) {
count += w & 1;
}
return count;
}
/**
* extract32:
* @value: the value to extract the bit field from
* @start: the lowest bit in the bit field (numbered from 0)
* @length: the length of the bit field
*
* Extract from the 32 bit input @value the bit field specified by the
* @start and @length parameters, and return it. The bit field must
* lie entirely within the 32 bit word. It is valid to request that
* all 32 bits are returned (ie @length 32 and @start 0).
*
* Returns: the value of the bit field extracted from the input value.
*/
static inline uint32_t extract32(uint32_t value, int start, int length)
{
assert(start >= 0 && length > 0 && length <= 32 - start);
return (value >> start) & (~0U >> (32 - length));
}
/**
* extract64:
* @value: the value to extract the bit field from
* @start: the lowest bit in the bit field (numbered from 0)
* @length: the length of the bit field
*
* Extract from the 64 bit input @value the bit field specified by the
* @start and @length parameters, and return it. The bit field must
* lie entirely within the 64 bit word. It is valid to request that
* all 64 bits are returned (ie @length 64 and @start 0).
*
* Returns: the value of the bit field extracted from the input value.
*/
static inline uint64_t extract64(uint64_t value, int start, int length)
{
assert(start >= 0 && length > 0 && length <= 64 - start);
return (value >> start) & (~0ULL >> (64 - length));
}
/**
* deposit32:
* @value: initial value to insert bit field into
* @start: the lowest bit in the bit field (numbered from 0)
* @length: the length of the bit field
* @fieldval: the value to insert into the bit field
*
* Deposit @fieldval into the 32 bit @value at the bit field specified
* by the @start and @length parameters, and return the modified
* @value. Bits of @value outside the bit field are not modified.
* Bits of @fieldval above the least significant @length bits are
* ignored. The bit field must lie entirely within the 32 bit word.
* It is valid to request that all 32 bits are modified (ie @length
* 32 and @start 0).
*
* Returns: the modified @value.
*/
static inline uint32_t deposit32(uint32_t value, int start, int length,
uint32_t fieldval)
{
uint32_t mask;
assert(start >= 0 && length > 0 && length <= 32 - start);
mask = (~0U >> (32 - length)) << start;
return (value & ~mask) | ((fieldval << start) & mask);
}
/**
* deposit64:
* @value: initial value to insert bit field into
* @start: the lowest bit in the bit field (numbered from 0)
* @length: the length of the bit field
* @fieldval: the value to insert into the bit field
*
* Deposit @fieldval into the 64 bit @value at the bit field specified
* by the @start and @length parameters, and return the modified
* @value. Bits of @value outside the bit field are not modified.
* Bits of @fieldval above the least significant @length bits are
* ignored. The bit field must lie entirely within the 64 bit word.
* It is valid to request that all 64 bits are modified (ie @length
* 64 and @start 0).
*
* Returns: the modified @value.
*/
static inline uint64_t deposit64(uint64_t value, int start, int length,
uint64_t fieldval)
{
uint64_t mask;
assert(start >= 0 && length > 0 && length <= 64 - start);
mask = (~0ULL >> (64 - length)) << start;
return (value & ~mask) | ((fieldval << start) & mask);
}
#endif

768
block-migration.c Normal file
View File

@@ -0,0 +1,768 @@
/*
* QEMU live block migration
*
* Copyright IBM, Corp. 2009
*
* Authors:
* Liran Schour <lirans@il.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
* Contributions after 2012-01-13 are licensed under the terms of the
* GNU GPL, version 2 or (at your option) any later version.
*/
#include "qemu-common.h"
#include "block_int.h"
#include "hw/hw.h"
#include "qemu-queue.h"
#include "qemu-timer.h"
#include "block-migration.h"
#include "migration.h"
#include "blockdev.h"
#include <assert.h>
#define BLOCK_SIZE (BDRV_SECTORS_PER_DIRTY_CHUNK << BDRV_SECTOR_BITS)
#define BLK_MIG_FLAG_DEVICE_BLOCK 0x01
#define BLK_MIG_FLAG_EOS 0x02
#define BLK_MIG_FLAG_PROGRESS 0x04
#define MAX_IS_ALLOCATED_SEARCH 65536
//#define DEBUG_BLK_MIGRATION
#ifdef DEBUG_BLK_MIGRATION
#define DPRINTF(fmt, ...) \
do { printf("blk_migration: " fmt, ## __VA_ARGS__); } while (0)
#else
#define DPRINTF(fmt, ...) \
do { } while (0)
#endif
typedef struct BlkMigDevState {
BlockDriverState *bs;
int bulk_completed;
int shared_base;
int64_t cur_sector;
int64_t cur_dirty;
int64_t completed_sectors;
int64_t total_sectors;
int64_t dirty;
QSIMPLEQ_ENTRY(BlkMigDevState) entry;
unsigned long *aio_bitmap;
} BlkMigDevState;
typedef struct BlkMigBlock {
uint8_t *buf;
BlkMigDevState *bmds;
int64_t sector;
int nr_sectors;
struct iovec iov;
QEMUIOVector qiov;
BlockDriverAIOCB *aiocb;
int ret;
QSIMPLEQ_ENTRY(BlkMigBlock) entry;
} BlkMigBlock;
typedef struct BlkMigState {
int blk_enable;
int shared_base;
QSIMPLEQ_HEAD(bmds_list, BlkMigDevState) bmds_list;
QSIMPLEQ_HEAD(blk_list, BlkMigBlock) blk_list;
int submitted;
int read_done;
int transferred;
int64_t total_sector_sum;
int prev_progress;
int bulk_completed;
long double total_time;
long double prev_time_offset;
int reads;
} BlkMigState;
static BlkMigState block_mig_state;
static void blk_send(QEMUFile *f, BlkMigBlock * blk)
{
int len;
/* sector number and flags */
qemu_put_be64(f, (blk->sector << BDRV_SECTOR_BITS)
| BLK_MIG_FLAG_DEVICE_BLOCK);
/* device name */
len = strlen(blk->bmds->bs->device_name);
qemu_put_byte(f, len);
qemu_put_buffer(f, (uint8_t *)blk->bmds->bs->device_name, len);
qemu_put_buffer(f, blk->buf, BLOCK_SIZE);
}
int blk_mig_active(void)
{
return !QSIMPLEQ_EMPTY(&block_mig_state.bmds_list);
}
uint64_t blk_mig_bytes_transferred(void)
{
BlkMigDevState *bmds;
uint64_t sum = 0;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
sum += bmds->completed_sectors;
}
return sum << BDRV_SECTOR_BITS;
}
uint64_t blk_mig_bytes_remaining(void)
{
return blk_mig_bytes_total() - blk_mig_bytes_transferred();
}
uint64_t blk_mig_bytes_total(void)
{
BlkMigDevState *bmds;
uint64_t sum = 0;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
sum += bmds->total_sectors;
}
return sum << BDRV_SECTOR_BITS;
}
static inline long double compute_read_bwidth(void)
{
assert(block_mig_state.total_time != 0);
return (block_mig_state.reads / block_mig_state.total_time) * BLOCK_SIZE;
}
static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
{
int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
if ((sector << BDRV_SECTOR_BITS) < bdrv_getlength(bmds->bs)) {
return !!(bmds->aio_bitmap[chunk / (sizeof(unsigned long) * 8)] &
(1UL << (chunk % (sizeof(unsigned long) * 8))));
} else {
return 0;
}
}
static void bmds_set_aio_inflight(BlkMigDevState *bmds, int64_t sector_num,
int nb_sectors, int set)
{
int64_t start, end;
unsigned long val, idx, bit;
start = sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK;
end = (sector_num + nb_sectors - 1) / BDRV_SECTORS_PER_DIRTY_CHUNK;
for (; start <= end; start++) {
idx = start / (sizeof(unsigned long) * 8);
bit = start % (sizeof(unsigned long) * 8);
val = bmds->aio_bitmap[idx];
if (set) {
val |= 1UL << bit;
} else {
val &= ~(1UL << bit);
}
bmds->aio_bitmap[idx] = val;
}
}
static void alloc_aio_bitmap(BlkMigDevState *bmds)
{
BlockDriverState *bs = bmds->bs;
int64_t bitmap_size;
bitmap_size = (bdrv_getlength(bs) >> BDRV_SECTOR_BITS) +
BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
bmds->aio_bitmap = g_malloc0(bitmap_size);
}
static void blk_mig_read_cb(void *opaque, int ret)
{
long double curr_time = qemu_get_clock_ns(rt_clock);
BlkMigBlock *blk = opaque;
blk->ret = ret;
block_mig_state.reads++;
block_mig_state.total_time += (curr_time - block_mig_state.prev_time_offset);
block_mig_state.prev_time_offset = curr_time;
QSIMPLEQ_INSERT_TAIL(&block_mig_state.blk_list, blk, entry);
bmds_set_aio_inflight(blk->bmds, blk->sector, blk->nr_sectors, 0);
block_mig_state.submitted--;
block_mig_state.read_done++;
assert(block_mig_state.submitted >= 0);
}
static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
{
int64_t total_sectors = bmds->total_sectors;
int64_t cur_sector = bmds->cur_sector;
BlockDriverState *bs = bmds->bs;
BlkMigBlock *blk;
int nr_sectors;
if (bmds->shared_base) {
while (cur_sector < total_sectors &&
!bdrv_is_allocated(bs, cur_sector, MAX_IS_ALLOCATED_SEARCH,
&nr_sectors)) {
cur_sector += nr_sectors;
}
}
if (cur_sector >= total_sectors) {
bmds->cur_sector = bmds->completed_sectors = total_sectors;
return 1;
}
bmds->completed_sectors = cur_sector;
cur_sector &= ~((int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK - 1);
/* we are going to transfer a full block even if it is not allocated */
nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
if (total_sectors - cur_sector < BDRV_SECTORS_PER_DIRTY_CHUNK) {
nr_sectors = total_sectors - cur_sector;
}
blk = g_malloc(sizeof(BlkMigBlock));
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = cur_sector;
blk->nr_sectors = nr_sectors;
blk->iov.iov_base = blk->buf;
blk->iov.iov_len = nr_sectors * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&blk->qiov, &blk->iov, 1);
if (block_mig_state.submitted == 0) {
block_mig_state.prev_time_offset = qemu_get_clock_ns(rt_clock);
}
blk->aiocb = bdrv_aio_readv(bs, cur_sector, &blk->qiov,
nr_sectors, blk_mig_read_cb, blk);
block_mig_state.submitted++;
bdrv_reset_dirty(bs, cur_sector, nr_sectors);
bmds->cur_sector = cur_sector + nr_sectors;
return (bmds->cur_sector >= total_sectors);
}
static void set_dirty_tracking(int enable)
{
BlkMigDevState *bmds;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
bdrv_set_dirty_tracking(bmds->bs, enable);
}
}
static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
{
BlkMigDevState *bmds;
int64_t sectors;
if (!bdrv_is_read_only(bs)) {
sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
if (sectors <= 0) {
return;
}
bmds = g_malloc0(sizeof(BlkMigDevState));
bmds->bs = bs;
bmds->bulk_completed = 0;
bmds->total_sectors = sectors;
bmds->completed_sectors = 0;
bmds->shared_base = block_mig_state.shared_base;
alloc_aio_bitmap(bmds);
drive_get_ref(drive_get_by_blockdev(bs));
bdrv_set_in_use(bs, 1);
block_mig_state.total_sector_sum += sectors;
if (bmds->shared_base) {
DPRINTF("Start migration for %s with shared base image\n",
bs->device_name);
} else {
DPRINTF("Start full migration for %s\n", bs->device_name);
}
QSIMPLEQ_INSERT_TAIL(&block_mig_state.bmds_list, bmds, entry);
}
}
static void init_blk_migration(QEMUFile *f)
{
block_mig_state.submitted = 0;
block_mig_state.read_done = 0;
block_mig_state.transferred = 0;
block_mig_state.total_sector_sum = 0;
block_mig_state.prev_progress = -1;
block_mig_state.bulk_completed = 0;
block_mig_state.total_time = 0;
block_mig_state.reads = 0;
bdrv_iterate(init_blk_migration_it, NULL);
}
static int blk_mig_save_bulked_block(QEMUFile *f)
{
int64_t completed_sector_sum = 0;
BlkMigDevState *bmds;
int progress;
int ret = 0;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
if (bmds->bulk_completed == 0) {
if (mig_save_device_bulk(f, bmds) == 1) {
/* completed bulk section for this device */
bmds->bulk_completed = 1;
}
completed_sector_sum += bmds->completed_sectors;
ret = 1;
break;
} else {
completed_sector_sum += bmds->completed_sectors;
}
}
if (block_mig_state.total_sector_sum != 0) {
progress = completed_sector_sum * 100 /
block_mig_state.total_sector_sum;
} else {
progress = 100;
}
if (progress != block_mig_state.prev_progress) {
block_mig_state.prev_progress = progress;
qemu_put_be64(f, (progress << BDRV_SECTOR_BITS)
| BLK_MIG_FLAG_PROGRESS);
DPRINTF("Completed %d %%\r", progress);
}
return ret;
}
static void blk_mig_reset_dirty_cursor(void)
{
BlkMigDevState *bmds;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
bmds->cur_dirty = 0;
}
}
static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
int is_async)
{
BlkMigBlock *blk;
int64_t total_sectors = bmds->total_sectors;
int64_t sector;
int nr_sectors;
int ret = -EIO;
for (sector = bmds->cur_dirty; sector < bmds->total_sectors;) {
if (bmds_aio_inflight(bmds, sector)) {
bdrv_drain_all();
}
if (bdrv_get_dirty(bmds->bs, sector)) {
if (total_sectors - sector < BDRV_SECTORS_PER_DIRTY_CHUNK) {
nr_sectors = total_sectors - sector;
} else {
nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
}
blk = g_malloc(sizeof(BlkMigBlock));
blk->buf = g_malloc(BLOCK_SIZE);
blk->bmds = bmds;
blk->sector = sector;
blk->nr_sectors = nr_sectors;
if (is_async) {
blk->iov.iov_base = blk->buf;
blk->iov.iov_len = nr_sectors * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&blk->qiov, &blk->iov, 1);
if (block_mig_state.submitted == 0) {
block_mig_state.prev_time_offset = qemu_get_clock_ns(rt_clock);
}
blk->aiocb = bdrv_aio_readv(bmds->bs, sector, &blk->qiov,
nr_sectors, blk_mig_read_cb, blk);
block_mig_state.submitted++;
bmds_set_aio_inflight(bmds, sector, nr_sectors, 1);
} else {
ret = bdrv_read(bmds->bs, sector, blk->buf, nr_sectors);
if (ret < 0) {
goto error;
}
blk_send(f, blk);
g_free(blk->buf);
g_free(blk);
}
bdrv_reset_dirty(bmds->bs, sector, nr_sectors);
break;
}
sector += BDRV_SECTORS_PER_DIRTY_CHUNK;
bmds->cur_dirty = sector;
}
return (bmds->cur_dirty >= bmds->total_sectors);
error:
DPRINTF("Error reading sector %" PRId64 "\n", sector);
qemu_file_set_error(f, ret);
g_free(blk->buf);
g_free(blk);
return 0;
}
static int blk_mig_save_dirty_block(QEMUFile *f, int is_async)
{
BlkMigDevState *bmds;
int ret = 0;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
if (mig_save_device_dirty(f, bmds, is_async) == 0) {
ret = 1;
break;
}
}
return ret;
}
static void flush_blks(QEMUFile* f)
{
BlkMigBlock *blk;
DPRINTF("%s Enter submitted %d read_done %d transferred %d\n",
__FUNCTION__, block_mig_state.submitted, block_mig_state.read_done,
block_mig_state.transferred);
while ((blk = QSIMPLEQ_FIRST(&block_mig_state.blk_list)) != NULL) {
if (qemu_file_rate_limit(f)) {
break;
}
if (blk->ret < 0) {
qemu_file_set_error(f, blk->ret);
break;
}
blk_send(f, blk);
QSIMPLEQ_REMOVE_HEAD(&block_mig_state.blk_list, entry);
g_free(blk->buf);
g_free(blk);
block_mig_state.read_done--;
block_mig_state.transferred++;
assert(block_mig_state.read_done >= 0);
}
DPRINTF("%s Exit submitted %d read_done %d transferred %d\n", __FUNCTION__,
block_mig_state.submitted, block_mig_state.read_done,
block_mig_state.transferred);
}
static int64_t get_remaining_dirty(void)
{
BlkMigDevState *bmds;
int64_t dirty = 0;
QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
dirty += bdrv_get_dirty_count(bmds->bs);
}
return dirty * BLOCK_SIZE;
}
static int is_stage2_completed(void)
{
int64_t remaining_dirty;
long double bwidth;
if (block_mig_state.bulk_completed == 1) {
remaining_dirty = get_remaining_dirty();
if (remaining_dirty == 0) {
return 1;
}
bwidth = compute_read_bwidth();
if ((remaining_dirty / bwidth) <=
migrate_max_downtime()) {
/* finish stage2 because we think that we can finish remaining work
below max_downtime */
return 1;
}
}
return 0;
}
static void blk_mig_cleanup(void)
{
BlkMigDevState *bmds;
BlkMigBlock *blk;
set_dirty_tracking(0);
while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
bdrv_set_in_use(bmds->bs, 0);
drive_put_ref(drive_get_by_blockdev(bmds->bs));
g_free(bmds->aio_bitmap);
g_free(bmds);
}
while ((blk = QSIMPLEQ_FIRST(&block_mig_state.blk_list)) != NULL) {
QSIMPLEQ_REMOVE_HEAD(&block_mig_state.blk_list, entry);
g_free(blk->buf);
g_free(blk);
}
}
static void block_migration_cancel(void *opaque)
{
blk_mig_cleanup();
}
static int block_save_setup(QEMUFile *f, void *opaque)
{
int ret;
DPRINTF("Enter save live setup submitted %d transferred %d\n",
block_mig_state.submitted, block_mig_state.transferred);
init_blk_migration(f);
/* start track dirty blocks */
set_dirty_tracking(1);
flush_blks(f);
ret = qemu_file_get_error(f);
if (ret) {
blk_mig_cleanup();
return ret;
}
blk_mig_reset_dirty_cursor();
qemu_put_be64(f, BLK_MIG_FLAG_EOS);
return 0;
}
static int block_save_iterate(QEMUFile *f, void *opaque)
{
int ret;
DPRINTF("Enter save live iterate submitted %d transferred %d\n",
block_mig_state.submitted, block_mig_state.transferred);
flush_blks(f);
ret = qemu_file_get_error(f);
if (ret) {
blk_mig_cleanup();
return ret;
}
blk_mig_reset_dirty_cursor();
/* control the rate of transfer */
while ((block_mig_state.submitted +
block_mig_state.read_done) * BLOCK_SIZE <
qemu_file_get_rate_limit(f)) {
if (block_mig_state.bulk_completed == 0) {
/* first finish the bulk phase */
if (blk_mig_save_bulked_block(f) == 0) {
/* finished saving bulk on all devices */
block_mig_state.bulk_completed = 1;
}
} else {
if (blk_mig_save_dirty_block(f, 1) == 0) {
/* no more dirty blocks */
break;
}
}
}
flush_blks(f);
ret = qemu_file_get_error(f);
if (ret) {
blk_mig_cleanup();
return ret;
}
qemu_put_be64(f, BLK_MIG_FLAG_EOS);
return is_stage2_completed();
}
static int block_save_complete(QEMUFile *f, void *opaque)
{
int ret;
DPRINTF("Enter save live complete submitted %d transferred %d\n",
block_mig_state.submitted, block_mig_state.transferred);
flush_blks(f);
ret = qemu_file_get_error(f);
if (ret) {
blk_mig_cleanup();
return ret;
}
blk_mig_reset_dirty_cursor();
/* we know for sure that save bulk is completed and
all async read completed */
assert(block_mig_state.submitted == 0);
while (blk_mig_save_dirty_block(f, 0) != 0) {
/* Do nothing */
}
blk_mig_cleanup();
/* report completion */
qemu_put_be64(f, (100 << BDRV_SECTOR_BITS) | BLK_MIG_FLAG_PROGRESS);
ret = qemu_file_get_error(f);
if (ret) {
return ret;
}
DPRINTF("Block migration completed\n");
qemu_put_be64(f, BLK_MIG_FLAG_EOS);
return 0;
}
static int block_load(QEMUFile *f, void *opaque, int version_id)
{
static int banner_printed;
int len, flags;
char device_name[256];
int64_t addr;
BlockDriverState *bs, *bs_prev = NULL;
uint8_t *buf;
int64_t total_sectors = 0;
int nr_sectors;
int ret;
do {
addr = qemu_get_be64(f);
flags = addr & ~BDRV_SECTOR_MASK;
addr >>= BDRV_SECTOR_BITS;
if (flags & BLK_MIG_FLAG_DEVICE_BLOCK) {
/* get device name */
len = qemu_get_byte(f);
qemu_get_buffer(f, (uint8_t *)device_name, len);
device_name[len] = '\0';
bs = bdrv_find(device_name);
if (!bs) {
fprintf(stderr, "Error unknown block device %s\n",
device_name);
return -EINVAL;
}
if (bs != bs_prev) {
bs_prev = bs;
total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
if (total_sectors <= 0) {
error_report("Error getting length of block device %s",
device_name);
return -EINVAL;
}
}
if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
nr_sectors = total_sectors - addr;
} else {
nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
}
buf = g_malloc(BLOCK_SIZE);
qemu_get_buffer(f, buf, BLOCK_SIZE);
ret = bdrv_write(bs, addr, buf, nr_sectors);
g_free(buf);
if (ret < 0) {
return ret;
}
} else if (flags & BLK_MIG_FLAG_PROGRESS) {
if (!banner_printed) {
printf("Receiving block device images\n");
banner_printed = 1;
}
printf("Completed %d %%%c", (int)addr,
(addr == 100) ? '\n' : '\r');
fflush(stdout);
} else if (!(flags & BLK_MIG_FLAG_EOS)) {
fprintf(stderr, "Unknown flags\n");
return -EINVAL;
}
ret = qemu_file_get_error(f);
if (ret != 0) {
return ret;
}
} while (!(flags & BLK_MIG_FLAG_EOS));
return 0;
}
static void block_set_params(const MigrationParams *params, void *opaque)
{
block_mig_state.blk_enable = params->blk;
block_mig_state.shared_base = params->shared;
/* shared base means that blk_enable = 1 */
block_mig_state.blk_enable |= params->shared;
}
static bool block_is_active(void *opaque)
{
return block_mig_state.blk_enable == 1;
}
SaveVMHandlers savevm_block_handlers = {
.set_params = block_set_params,
.save_live_setup = block_save_setup,
.save_live_iterate = block_save_iterate,
.save_live_complete = block_save_complete,
.load_state = block_load,
.cancel = block_migration_cancel,
.is_active = block_is_active,
};
void blk_mig_init(void)
{
QSIMPLEQ_INIT(&block_mig_state.bmds_list);
QSIMPLEQ_INIT(&block_mig_state.blk_list);
register_savevm_live(NULL, "block", 0, 1, &savevm_block_handlers,
&block_mig_state);
}

5576
block.c

File diff suppressed because it is too large Load Diff

409
block.h Normal file
View File

@@ -0,0 +1,409 @@
#ifndef BLOCK_H
#define BLOCK_H
#include "qemu-aio.h"
#include "qemu-common.h"
#include "qemu-option.h"
#include "qemu-coroutine.h"
#include "qobject.h"
/* block.c */
typedef struct BlockDriver BlockDriver;
typedef struct BlockDriverInfo {
/* in bytes, 0 if irrelevant */
int cluster_size;
/* offset at which the VM state can be saved (0 if not possible) */
int64_t vm_state_offset;
bool is_dirty;
} BlockDriverInfo;
typedef struct BlockFragInfo {
uint64_t allocated_clusters;
uint64_t total_clusters;
uint64_t fragmented_clusters;
} BlockFragInfo;
typedef struct QEMUSnapshotInfo {
char id_str[128]; /* unique snapshot id */
/* the following fields are informative. They are not needed for
the consistency of the snapshot */
char name[256]; /* user chosen name */
uint64_t vm_state_size; /* VM state info size */
uint32_t date_sec; /* UTC date of the snapshot */
uint32_t date_nsec;
uint64_t vm_clock_nsec; /* VM clock relative to boot */
} QEMUSnapshotInfo;
/* Callbacks for block device models */
typedef struct BlockDevOps {
/*
* Runs when virtual media changed (monitor commands eject, change)
* Argument load is true on load and false on eject.
* Beware: doesn't run when a host device's physical media
* changes. Sure would be useful if it did.
* Device models with removable media must implement this callback.
*/
void (*change_media_cb)(void *opaque, bool load);
/*
* Runs when an eject request is issued from the monitor, the tray
* is closed, and the medium is locked.
* Device models that do not implement is_medium_locked will not need
* this callback. Device models that can lock the medium or tray might
* want to implement the callback and unlock the tray when "force" is
* true, even if they do not support eject requests.
*/
void (*eject_request_cb)(void *opaque, bool force);
/*
* Is the virtual tray open?
* Device models implement this only when the device has a tray.
*/
bool (*is_tray_open)(void *opaque);
/*
* Is the virtual medium locked into the device?
* Device models implement this only when device has such a lock.
*/
bool (*is_medium_locked)(void *opaque);
/*
* Runs when the size changed (e.g. monitor command block_resize)
*/
void (*resize_cb)(void *opaque);
} BlockDevOps;
#define BDRV_O_RDWR 0x0002
#define BDRV_O_SNAPSHOT 0x0008 /* open the file read only and save writes in a snapshot */
#define BDRV_O_NOCACHE 0x0020 /* do not use the host page cache */
#define BDRV_O_CACHE_WB 0x0040 /* use write-back caching */
#define BDRV_O_NATIVE_AIO 0x0080 /* use native AIO instead of the thread pool */
#define BDRV_O_NO_BACKING 0x0100 /* don't open the backing file */
#define BDRV_O_NO_FLUSH 0x0200 /* disable flushing on this disk */
#define BDRV_O_COPY_ON_READ 0x0400 /* copy read backing sectors into image */
#define BDRV_O_INCOMING 0x0800 /* consistency hint for incoming migration */
#define BDRV_O_CHECK 0x1000 /* open solely for consistency check */
#define BDRV_O_ALLOW_RDWR 0x2000 /* allow reopen to change from r/o to r/w */
#define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)
#define BDRV_SECTOR_BITS 9
#define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS)
#define BDRV_SECTOR_MASK ~(BDRV_SECTOR_SIZE - 1)
typedef enum {
BLOCK_ERR_REPORT, BLOCK_ERR_IGNORE, BLOCK_ERR_STOP_ENOSPC,
BLOCK_ERR_STOP_ANY
} BlockErrorAction;
typedef enum {
BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
} BlockQMPEventAction;
void bdrv_iostatus_enable(BlockDriverState *bs);
void bdrv_iostatus_reset(BlockDriverState *bs);
void bdrv_iostatus_disable(BlockDriverState *bs);
bool bdrv_iostatus_is_enabled(const BlockDriverState *bs);
void bdrv_iostatus_set_err(BlockDriverState *bs, int error);
void bdrv_emit_qmp_error_event(const BlockDriverState *bdrv,
BlockQMPEventAction action, int is_read);
void bdrv_info_print(Monitor *mon, const QObject *data);
void bdrv_info(Monitor *mon, QObject **ret_data);
void bdrv_stats_print(Monitor *mon, const QObject *data);
void bdrv_info_stats(Monitor *mon, QObject **ret_data);
/* disk I/O throttling */
void bdrv_io_limits_enable(BlockDriverState *bs);
void bdrv_io_limits_disable(BlockDriverState *bs);
bool bdrv_io_limits_enabled(BlockDriverState *bs);
void bdrv_init(void);
void bdrv_init_with_whitelist(void);
BlockDriver *bdrv_find_protocol(const char *filename);
BlockDriver *bdrv_find_format(const char *format_name);
BlockDriver *bdrv_find_whitelisted_format(const char *format_name);
int bdrv_create(BlockDriver *drv, const char* filename,
QEMUOptionParameter *options);
int bdrv_create_file(const char* filename, QEMUOptionParameter *options);
BlockDriverState *bdrv_new(const char *device_name);
void bdrv_make_anon(BlockDriverState *bs);
void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old);
void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top);
void bdrv_delete(BlockDriverState *bs);
int bdrv_parse_cache_flags(const char *mode, int *flags);
int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags);
int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
BlockDriver *drv);
void bdrv_close(BlockDriverState *bs);
int bdrv_attach_dev(BlockDriverState *bs, void *dev);
void bdrv_attach_dev_nofail(BlockDriverState *bs, void *dev);
void bdrv_detach_dev(BlockDriverState *bs, void *dev);
void *bdrv_get_attached_dev(BlockDriverState *bs);
void bdrv_set_dev_ops(BlockDriverState *bs, const BlockDevOps *ops,
void *opaque);
void bdrv_dev_eject_request(BlockDriverState *bs, bool force);
bool bdrv_dev_has_removable_media(BlockDriverState *bs);
bool bdrv_dev_is_tray_open(BlockDriverState *bs);
bool bdrv_dev_is_medium_locked(BlockDriverState *bs);
int bdrv_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors);
int bdrv_read_unthrottled(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors);
int bdrv_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors);
int bdrv_pread(BlockDriverState *bs, int64_t offset,
void *buf, int count);
int bdrv_pwrite(BlockDriverState *bs, int64_t offset,
const void *buf, int count);
int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
const void *buf, int count);
int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
/*
* Efficiently zero a region of the disk image. Note that this is a regular
* I/O request like read or write and should have a reasonable size. This
* function is not suitable for zeroing the entire image in a single request
* because it may allocate memory for the entire region.
*/
int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
int nb_sectors);
int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, int *pnum);
int coroutine_fn bdrv_co_is_allocated_above(BlockDriverState *top,
BlockDriverState *base,
int64_t sector_num,
int nb_sectors, int *pnum);
BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
const char *backing_file);
int bdrv_get_backing_file_depth(BlockDriverState *bs);
int bdrv_truncate(BlockDriverState *bs, int64_t offset);
int64_t bdrv_getlength(BlockDriverState *bs);
int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
int bdrv_commit(BlockDriverState *bs);
int bdrv_commit_all(void);
int bdrv_change_backing_file(BlockDriverState *bs,
const char *backing_file, const char *backing_fmt);
void bdrv_register(BlockDriver *bdrv);
typedef struct BdrvCheckResult {
int corruptions;
int leaks;
int check_errors;
int corruptions_fixed;
int leaks_fixed;
BlockFragInfo bfi;
} BdrvCheckResult;
typedef enum {
BDRV_FIX_LEAKS = 1,
BDRV_FIX_ERRORS = 2,
} BdrvCheckMode;
int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);
/* async block I/O */
typedef void BlockDriverDirtyHandler(BlockDriverState *bs, int64_t sector,
int sector_num);
BlockDriverAIOCB *bdrv_aio_readv(BlockDriverState *bs, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque);
BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque);
BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque);
BlockDriverAIOCB *bdrv_aio_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque);
void bdrv_aio_cancel(BlockDriverAIOCB *acb);
typedef struct BlockRequest {
/* Fields to be filled by multiwrite caller */
int64_t sector;
int nb_sectors;
QEMUIOVector *qiov;
BlockDriverCompletionFunc *cb;
void *opaque;
/* Filled by multiwrite implementation */
int error;
} BlockRequest;
int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs,
int num_reqs);
/* sg packet commands */
int bdrv_ioctl(BlockDriverState *bs, unsigned long int req, void *buf);
BlockDriverAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
unsigned long int req, void *buf,
BlockDriverCompletionFunc *cb, void *opaque);
/* Invalidate any cached metadata used by image formats */
void bdrv_invalidate_cache(BlockDriverState *bs);
void bdrv_invalidate_cache_all(void);
void bdrv_clear_incoming_migration_all(void);
/* Ensure contents are flushed to disk. */
int bdrv_flush(BlockDriverState *bs);
int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
void bdrv_flush_all(void);
void bdrv_close_all(void);
void bdrv_drain_all(void);
int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
int bdrv_has_zero_init(BlockDriverState *bs);
int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
int *pnum);
void bdrv_set_on_error(BlockDriverState *bs, BlockErrorAction on_read_error,
BlockErrorAction on_write_error);
BlockErrorAction bdrv_get_on_error(BlockDriverState *bs, int is_read);
int bdrv_is_read_only(BlockDriverState *bs);
int bdrv_is_sg(BlockDriverState *bs);
int bdrv_enable_write_cache(BlockDriverState *bs);
void bdrv_set_enable_write_cache(BlockDriverState *bs, bool wce);
int bdrv_is_inserted(BlockDriverState *bs);
int bdrv_media_changed(BlockDriverState *bs);
void bdrv_lock_medium(BlockDriverState *bs, bool locked);
void bdrv_eject(BlockDriverState *bs, bool eject_flag);
const char *bdrv_get_format_name(BlockDriverState *bs);
BlockDriverState *bdrv_find(const char *name);
BlockDriverState *bdrv_next(BlockDriverState *bs);
void bdrv_iterate(void (*it)(void *opaque, BlockDriverState *bs),
void *opaque);
int bdrv_is_encrypted(BlockDriverState *bs);
int bdrv_key_required(BlockDriverState *bs);
int bdrv_set_key(BlockDriverState *bs, const char *key);
int bdrv_query_missing_keys(void);
void bdrv_iterate_format(void (*it)(void *opaque, const char *name),
void *opaque);
const char *bdrv_get_device_name(BlockDriverState *bs);
int bdrv_get_flags(BlockDriverState *bs);
int bdrv_write_compressed(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors);
int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);
const char *bdrv_get_encrypted_filename(BlockDriverState *bs);
void bdrv_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);
void bdrv_get_full_backing_filename(BlockDriverState *bs,
char *dest, size_t sz);
int bdrv_can_snapshot(BlockDriverState *bs);
int bdrv_is_snapshot(BlockDriverState *bs);
BlockDriverState *bdrv_snapshots(void);
int bdrv_snapshot_create(BlockDriverState *bs,
QEMUSnapshotInfo *sn_info);
int bdrv_snapshot_goto(BlockDriverState *bs,
const char *snapshot_id);
int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id);
int bdrv_snapshot_list(BlockDriverState *bs,
QEMUSnapshotInfo **psn_info);
int bdrv_snapshot_load_tmp(BlockDriverState *bs,
const char *snapshot_name);
char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn);
char *get_human_readable_size(char *buf, int buf_size, int64_t size);
int path_is_absolute(const char *path);
void path_combine(char *dest, int dest_size,
const char *base_path,
const char *filename);
int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
int64_t pos, int size);
int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
int64_t pos, int size);
int bdrv_img_create(const char *filename, const char *fmt,
const char *base_filename, const char *base_fmt,
char *options, uint64_t img_size, int flags);
void bdrv_set_buffer_alignment(BlockDriverState *bs, int align);
void *qemu_blockalign(BlockDriverState *bs, size_t size);
#define BDRV_SECTORS_PER_DIRTY_CHUNK 2048
void bdrv_set_dirty_tracking(BlockDriverState *bs, int enable);
int bdrv_get_dirty(BlockDriverState *bs, int64_t sector);
void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
int nr_sectors);
int64_t bdrv_get_dirty_count(BlockDriverState *bs);
void bdrv_enable_copy_on_read(BlockDriverState *bs);
void bdrv_disable_copy_on_read(BlockDriverState *bs);
void bdrv_set_in_use(BlockDriverState *bs, int in_use);
int bdrv_in_use(BlockDriverState *bs);
enum BlockAcctType {
BDRV_ACCT_READ,
BDRV_ACCT_WRITE,
BDRV_ACCT_FLUSH,
BDRV_MAX_IOTYPE,
};
typedef struct BlockAcctCookie {
int64_t bytes;
int64_t start_time_ns;
enum BlockAcctType type;
} BlockAcctCookie;
void bdrv_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie,
int64_t bytes, enum BlockAcctType type);
void bdrv_acct_done(BlockDriverState *bs, BlockAcctCookie *cookie);
typedef enum {
BLKDBG_L1_UPDATE,
BLKDBG_L1_GROW_ALLOC_TABLE,
BLKDBG_L1_GROW_WRITE_TABLE,
BLKDBG_L1_GROW_ACTIVATE_TABLE,
BLKDBG_L2_LOAD,
BLKDBG_L2_UPDATE,
BLKDBG_L2_UPDATE_COMPRESSED,
BLKDBG_L2_ALLOC_COW_READ,
BLKDBG_L2_ALLOC_WRITE,
BLKDBG_READ_AIO,
BLKDBG_READ_BACKING_AIO,
BLKDBG_READ_COMPRESSED,
BLKDBG_WRITE_AIO,
BLKDBG_WRITE_COMPRESSED,
BLKDBG_VMSTATE_LOAD,
BLKDBG_VMSTATE_SAVE,
BLKDBG_COW_READ,
BLKDBG_COW_WRITE,
BLKDBG_REFTABLE_LOAD,
BLKDBG_REFTABLE_GROW,
BLKDBG_REFBLOCK_LOAD,
BLKDBG_REFBLOCK_UPDATE,
BLKDBG_REFBLOCK_UPDATE_PART,
BLKDBG_REFBLOCK_ALLOC,
BLKDBG_REFBLOCK_ALLOC_HOOKUP,
BLKDBG_REFBLOCK_ALLOC_WRITE,
BLKDBG_REFBLOCK_ALLOC_WRITE_BLOCKS,
BLKDBG_REFBLOCK_ALLOC_WRITE_TABLE,
BLKDBG_REFBLOCK_ALLOC_SWITCH_TABLE,
BLKDBG_CLUSTER_ALLOC,
BLKDBG_CLUSTER_ALLOC_BYTES,
BLKDBG_CLUSTER_FREE,
BLKDBG_EVENT_MAX,
} BlkDebugEvent;
#define BLKDBG_EVENT(bs, evt) bdrv_debug_event(bs, evt)
void bdrv_debug_event(BlockDriverState *bs, BlkDebugEvent event);
#endif

View File

@@ -1,43 +1,11 @@
block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o
block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
block-obj-$(CONFIG_QUORUM) += quorum.o
block-obj-y += parallels.o blkdebug.o blkverify.o
block-obj-y += block-backend.o snapshot.o qapi.o
block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
block-obj-y += stream.o
block-obj-$(CONFIG_WIN32) += raw-win32.o
block-obj-$(CONFIG_POSIX) += raw-posix.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
block-obj-y += null.o mirror.o io.o
block-obj-y += nbd.o nbd-client.o sheepdog.o
block-obj-$(CONFIG_LIBISCSI) += iscsi.o
block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
block-obj-y += accounting.o
block-obj-y += write-threshold.o
common-obj-y += stream.o
common-obj-y += commit.o
common-obj-y += backup.o
iscsi.o-cflags := $(LIBISCSI_CFLAGS)
iscsi.o-libs := $(LIBISCSI_LIBS)
curl.o-cflags := $(CURL_CFLAGS)
curl.o-libs := $(CURL_LIBS)
rbd.o-cflags := $(RBD_CFLAGS)
rbd.o-libs := $(RBD_LIBS)
gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
archipelago.o-libs := $(ARCHIPELAGO_LIBS)
block-obj-m += dmg.o
dmg.o-libs := $(BZIP2_LIBS)
qcow.o-libs := -lz
linux-aio.o-libs := -laio

View File

@@ -1,63 +0,0 @@
/*
* QEMU System Emulator block accounting
*
* Copyright (c) 2011 Christoph Hellwig
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "block/accounting.h"
#include "block/block_int.h"
#include "qemu/timer.h"
void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
int64_t bytes, enum BlockAcctType type)
{
assert(type < BLOCK_MAX_IOTYPE);
cookie->bytes = bytes;
cookie->start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
cookie->type = type;
}
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
assert(cookie->type < BLOCK_MAX_IOTYPE);
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
stats->total_time_ns[cookie->type] +=
qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - cookie->start_time_ns;
}
void block_acct_highest_sector(BlockAcctStats *stats, int64_t sector_num,
unsigned int nb_sectors)
{
if (stats->wr_highest_sector < sector_num + nb_sectors - 1) {
stats->wr_highest_sector = sector_num + nb_sectors - 1;
}
}
void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
int num_requests)
{
assert(type < BLOCK_MAX_IOTYPE);
stats->merged[type] += num_requests;
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,548 +0,0 @@
/*
* QEMU backup
*
* Copyright (C) 2013 Proxmox Server Solutions
*
* Authors:
* Dietmar Maurer (dietmar@proxmox.com)
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include "trace.h"
#include "block/block.h"
#include "block/block_int.h"
#include "block/blockjob.h"
#include "qemu/ratelimit.h"
#define BACKUP_CLUSTER_BITS 16
#define BACKUP_CLUSTER_SIZE (1 << BACKUP_CLUSTER_BITS)
#define BACKUP_SECTORS_PER_CLUSTER (BACKUP_CLUSTER_SIZE / BDRV_SECTOR_SIZE)
#define SLICE_TIME 100000000ULL /* ns */
typedef struct CowRequest {
int64_t start;
int64_t end;
QLIST_ENTRY(CowRequest) list;
CoQueue wait_queue; /* coroutines blocked on this request */
} CowRequest;
typedef struct BackupBlockJob {
BlockJob common;
BlockDriverState *target;
/* bitmap for sync=dirty-bitmap */
BdrvDirtyBitmap *sync_bitmap;
MirrorSyncMode sync_mode;
RateLimit limit;
BlockdevOnError on_source_error;
BlockdevOnError on_target_error;
CoRwlock flush_rwlock;
uint64_t sectors_read;
HBitmap *bitmap;
QLIST_HEAD(, CowRequest) inflight_reqs;
} BackupBlockJob;
/* See if in-flight requests overlap and wait for them to complete */
static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
int64_t start,
int64_t end)
{
CowRequest *req;
bool retry;
do {
retry = false;
QLIST_FOREACH(req, &job->inflight_reqs, list) {
if (end > req->start && start < req->end) {
qemu_co_queue_wait(&req->wait_queue);
retry = true;
break;
}
}
} while (retry);
}
/* Keep track of an in-flight request */
static void cow_request_begin(CowRequest *req, BackupBlockJob *job,
int64_t start, int64_t end)
{
req->start = start;
req->end = end;
qemu_co_queue_init(&req->wait_queue);
QLIST_INSERT_HEAD(&job->inflight_reqs, req, list);
}
/* Forget about a completed request */
static void cow_request_end(CowRequest *req)
{
QLIST_REMOVE(req, list);
qemu_co_queue_restart_all(&req->wait_queue);
}
static int coroutine_fn backup_do_cow(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
bool *error_is_read)
{
BackupBlockJob *job = (BackupBlockJob *)bs->job;
CowRequest cow_request;
struct iovec iov;
QEMUIOVector bounce_qiov;
void *bounce_buffer = NULL;
int ret = 0;
int64_t start, end;
int n;
qemu_co_rwlock_rdlock(&job->flush_rwlock);
start = sector_num / BACKUP_SECTORS_PER_CLUSTER;
end = DIV_ROUND_UP(sector_num + nb_sectors, BACKUP_SECTORS_PER_CLUSTER);
trace_backup_do_cow_enter(job, start, sector_num, nb_sectors);
wait_for_overlapping_requests(job, start, end);
cow_request_begin(&cow_request, job, start, end);
for (; start < end; start++) {
if (hbitmap_get(job->bitmap, start)) {
trace_backup_do_cow_skip(job, start);
continue; /* already copied */
}
trace_backup_do_cow_process(job, start);
n = MIN(BACKUP_SECTORS_PER_CLUSTER,
job->common.len / BDRV_SECTOR_SIZE -
start * BACKUP_SECTORS_PER_CLUSTER);
if (!bounce_buffer) {
bounce_buffer = qemu_blockalign(bs, BACKUP_CLUSTER_SIZE);
}
iov.iov_base = bounce_buffer;
iov.iov_len = n * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&bounce_qiov, &iov, 1);
ret = bdrv_co_readv(bs, start * BACKUP_SECTORS_PER_CLUSTER, n,
&bounce_qiov);
if (ret < 0) {
trace_backup_do_cow_read_fail(job, start, ret);
if (error_is_read) {
*error_is_read = true;
}
goto out;
}
if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
ret = bdrv_co_write_zeroes(job->target,
start * BACKUP_SECTORS_PER_CLUSTER,
n, BDRV_REQ_MAY_UNMAP);
} else {
ret = bdrv_co_writev(job->target,
start * BACKUP_SECTORS_PER_CLUSTER, n,
&bounce_qiov);
}
if (ret < 0) {
trace_backup_do_cow_write_fail(job, start, ret);
if (error_is_read) {
*error_is_read = false;
}
goto out;
}
hbitmap_set(job->bitmap, start, 1);
/* Publish progress, guest I/O counts as progress too. Note that the
* offset field is an opaque progress value, it is not a disk offset.
*/
job->sectors_read += n;
job->common.offset += n * BDRV_SECTOR_SIZE;
}
out:
if (bounce_buffer) {
qemu_vfree(bounce_buffer);
}
cow_request_end(&cow_request);
trace_backup_do_cow_return(job, sector_num, nb_sectors, ret);
qemu_co_rwlock_unlock(&job->flush_rwlock);
return ret;
}
static int coroutine_fn backup_before_write_notify(
NotifierWithReturn *notifier,
void *opaque)
{
BdrvTrackedRequest *req = opaque;
int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
return backup_do_cow(req->bs, sector_num, nb_sectors, NULL);
}
static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
if (speed < 0) {
error_set(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static void backup_iostatus_reset(BlockJob *job)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
bdrv_iostatus_reset(s->target);
}
static const BlockJobDriver backup_job_driver = {
.instance_size = sizeof(BackupBlockJob),
.job_type = BLOCK_JOB_TYPE_BACKUP,
.set_speed = backup_set_speed,
.iostatus_reset = backup_iostatus_reset,
};
static BlockErrorAction backup_error_action(BackupBlockJob *job,
bool read, int error)
{
if (read) {
return block_job_error_action(&job->common, job->common.bs,
job->on_source_error, true, error);
} else {
return block_job_error_action(&job->common, job->target,
job->on_target_error, false, error);
}
}
typedef struct {
int ret;
} BackupCompleteData;
static void backup_complete(BlockJob *job, void *opaque)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
BackupCompleteData *data = opaque;
bdrv_unref(s->target);
block_job_completed(job, data->ret);
g_free(data);
}
static bool coroutine_fn yield_and_check(BackupBlockJob *job)
{
if (block_job_is_cancelled(&job->common)) {
return true;
}
/* we need to yield so that bdrv_drain_all() returns.
* (without, VM does not reboot)
*/
if (job->common.speed) {
uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
job->sectors_read);
job->sectors_read = 0;
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
} else {
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
}
if (block_job_is_cancelled(&job->common)) {
return true;
}
return false;
}
static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
{
bool error_is_read;
int ret = 0;
int clusters_per_iter;
uint32_t granularity;
int64_t sector;
int64_t cluster;
int64_t end;
int64_t last_cluster = -1;
BlockDriverState *bs = job->common.bs;
HBitmapIter hbi;
granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
/* Find the next dirty sector(s) */
while ((sector = hbitmap_iter_next(&hbi)) != -1) {
cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
/* Fake progress updates for any clusters we skipped */
if (cluster != last_cluster + 1) {
job->common.offset += ((cluster - last_cluster - 1) *
BACKUP_CLUSTER_SIZE);
}
for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
do {
if (yield_and_check(job)) {
return ret;
}
ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
if ((ret < 0) &&
backup_error_action(job, error_is_read, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
return ret;
}
} while (ret < 0);
}
/* If the bitmap granularity is smaller than the backup granularity,
* we need to advance the iterator pointer to the next cluster. */
if (granularity < BACKUP_CLUSTER_SIZE) {
bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
}
last_cluster = cluster - 1;
}
/* Play some final catchup with the progress meter */
end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
if (last_cluster + 1 < end) {
job->common.offset += ((end - last_cluster - 1) * BACKUP_CLUSTER_SIZE);
}
return ret;
}
static void coroutine_fn backup_run(void *opaque)
{
BackupBlockJob *job = opaque;
BackupCompleteData *data;
BlockDriverState *bs = job->common.bs;
BlockDriverState *target = job->target;
BlockdevOnError on_target_error = job->on_target_error;
NotifierWithReturn before_write = {
.notify = backup_before_write_notify,
};
int64_t start, end;
int ret = 0;
QLIST_INIT(&job->inflight_reqs);
qemu_co_rwlock_init(&job->flush_rwlock);
start = 0;
end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
job->bitmap = hbitmap_alloc(end, 0);
bdrv_set_enable_write_cache(target, true);
bdrv_set_on_error(target, on_target_error, on_target_error);
bdrv_iostatus_enable(target);
bdrv_add_before_write_notifier(bs, &before_write);
if (job->sync_mode == MIRROR_SYNC_MODE_NONE) {
while (!block_job_is_cancelled(&job->common)) {
/* Yield until the job is cancelled. We just let our before_write
* notify callback service CoW requests. */
job->common.busy = false;
qemu_coroutine_yield();
job->common.busy = true;
}
} else if (job->sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
ret = backup_run_incremental(job);
} else {
/* Both FULL and TOP SYNC_MODE's require copying.. */
for (; start < end; start++) {
bool error_is_read;
if (yield_and_check(job)) {
break;
}
if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
int i, n;
int alloced = 0;
/* Check to see if these blocks are already in the
* backing file. */
for (i = 0; i < BACKUP_SECTORS_PER_CLUSTER;) {
/* bdrv_is_allocated() only returns true/false based
* on the first set of sectors it comes across that
* are are all in the same state.
* For that reason we must verify each sector in the
* backup cluster length. We end up copying more than
* needed but at some point that is always the case. */
alloced =
bdrv_is_allocated(bs,
start * BACKUP_SECTORS_PER_CLUSTER + i,
BACKUP_SECTORS_PER_CLUSTER - i, &n);
i += n;
if (alloced == 1 || n == 0) {
break;
}
}
/* If the above loop never found any sectors that are in
* the topmost image, skip this backup. */
if (alloced == 0) {
continue;
}
}
/* FULL sync mode we copy the whole drive. */
ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER,
BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
if (ret < 0) {
/* Depending on error action, fail now or retry cluster */
BlockErrorAction action =
backup_error_action(job, error_is_read, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT) {
break;
} else {
start--;
continue;
}
}
}
}
notifier_with_return_remove(&before_write);
/* wait until pending backup_do_cow() calls have completed */
qemu_co_rwlock_wrlock(&job->flush_rwlock);
qemu_co_rwlock_unlock(&job->flush_rwlock);
if (job->sync_bitmap) {
BdrvDirtyBitmap *bm;
if (ret < 0) {
/* Merge the successor back into the parent, delete nothing. */
bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
assert(bm);
} else {
/* Everything is fine, delete this bitmap and install the backup. */
bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
assert(bm);
}
}
hbitmap_free(job->bitmap);
bdrv_iostatus_disable(target);
bdrv_op_unblock_all(target, job->common.blocker);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&job->common, backup_complete, data);
}
void backup_start(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, MirrorSyncMode sync_mode,
BdrvDirtyBitmap *sync_bitmap,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockCompletionFunc *cb, void *opaque,
Error **errp)
{
int64_t len;
assert(bs);
assert(target);
assert(cb);
if (bs == target) {
error_setg(errp, "Source and target cannot be the same");
return;
}
if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
error_set(errp, QERR_INVALID_PARAMETER, "on-source-error");
return;
}
if (!bdrv_is_inserted(bs)) {
error_setg(errp, "Device is not inserted: %s",
bdrv_get_device_name(bs));
return;
}
if (!bdrv_is_inserted(target)) {
error_setg(errp, "Device is not inserted: %s",
bdrv_get_device_name(target));
return;
}
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
return;
}
if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) {
return;
}
if (sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
if (!sync_bitmap) {
error_setg(errp, "must provide a valid bitmap name for "
"\"dirty-bitmap\" sync mode");
return;
}
/* Create a new bitmap, and freeze/disable this one. */
if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
return;
}
} else if (sync_bitmap) {
error_setg(errp,
"a sync_bitmap was provided to backup_run, "
"but received an incompatible sync_mode (%s)",
MirrorSyncMode_lookup[sync_mode]);
return;
}
len = bdrv_getlength(bs);
if (len < 0) {
error_setg_errno(errp, -len, "unable to get length for '%s'",
bdrv_get_device_name(bs));
goto error;
}
BackupBlockJob *job = block_job_create(&backup_job_driver, bs, speed,
cb, opaque, errp);
if (!job) {
goto error;
}
bdrv_op_block_all(target, job->common.blocker);
job->on_source_error = on_source_error;
job->on_target_error = on_target_error;
job->target = target;
job->sync_mode = sync_mode;
job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
sync_bitmap : NULL;
job->common.len = len;
job->common.co = qemu_coroutine_create(backup_run);
qemu_coroutine_enter(job->common.co, job);
return;
error:
if (sync_bitmap) {
bdrv_reclaim_dirty_bitmap(bs, sync_bitmap, NULL);
}
}

View File

@@ -23,43 +23,31 @@
*/
#include "qemu-common.h"
#include "qemu/config-file.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include "block_int.h"
#include "module.h"
typedef struct BDRVBlkdebugState {
int state;
int new_state;
QLIST_HEAD(, BlkdebugRule) rules[BLKDBG_EVENT_MAX];
QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
QLIST_HEAD(, BlkdebugSuspendedReq) suspended_reqs;
} BDRVBlkdebugState;
typedef struct BlkdebugAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
int ret;
} BlkdebugAIOCB;
typedef struct BlkdebugSuspendedReq {
Coroutine *co;
char *tag;
QLIST_ENTRY(BlkdebugSuspendedReq) next;
} BlkdebugSuspendedReq;
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb);
static const AIOCBInfo blkdebug_aiocb_info = {
.aiocb_size = sizeof(BlkdebugAIOCB),
static AIOPool blkdebug_aio_pool = {
.aiocb_size = sizeof(BlkdebugAIOCB),
.cancel = blkdebug_aio_cancel,
};
enum {
ACTION_INJECT_ERROR,
ACTION_SET_STATE,
ACTION_SUSPEND,
};
typedef struct BlkdebugRule {
@@ -76,9 +64,6 @@ typedef struct BlkdebugRule {
struct {
int new_state;
} set_state;
struct {
char *tag;
} suspend;
} options;
QLIST_ENTRY(BlkdebugRule) next;
QSIMPLEQ_ENTRY(BlkdebugRule) active_next;
@@ -169,7 +154,6 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
[BLKDBG_REFTABLE_LOAD] = "reftable_load",
[BLKDBG_REFTABLE_GROW] = "reftable_grow",
[BLKDBG_REFTABLE_UPDATE] = "reftable_update",
[BLKDBG_REFBLOCK_LOAD] = "refblock_load",
[BLKDBG_REFBLOCK_UPDATE] = "refblock_update",
@@ -184,19 +168,6 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
[BLKDBG_CLUSTER_ALLOC] = "cluster_alloc",
[BLKDBG_CLUSTER_ALLOC_BYTES] = "cluster_alloc_bytes",
[BLKDBG_CLUSTER_FREE] = "cluster_free",
[BLKDBG_FLUSH_TO_OS] = "flush_to_os",
[BLKDBG_FLUSH_TO_DISK] = "flush_to_disk",
[BLKDBG_PWRITEV_RMW_HEAD] = "pwritev_rmw.head",
[BLKDBG_PWRITEV_RMW_AFTER_HEAD] = "pwritev_rmw.after_head",
[BLKDBG_PWRITEV_RMW_TAIL] = "pwritev_rmw.tail",
[BLKDBG_PWRITEV_RMW_AFTER_TAIL] = "pwritev_rmw.after_tail",
[BLKDBG_PWRITEV] = "pwritev",
[BLKDBG_PWRITEV_ZERO] = "pwritev_zero",
[BLKDBG_PWRITEV_DONE] = "pwritev_done",
[BLKDBG_EMPTY_IMAGE_PREPARE] = "empty_image_prepare",
};
static int get_event_by_name(const char *name, BlkDebugEvent *event)
@@ -216,7 +187,6 @@ static int get_event_by_name(const char *name, BlkDebugEvent *event)
struct add_rule_data {
BDRVBlkdebugState *s;
int action;
Error **errp;
};
static int add_rule(QemuOpts *opts, void *opaque)
@@ -229,11 +199,7 @@ static int add_rule(QemuOpts *opts, void *opaque)
/* Find the right event for the rule */
event_name = qemu_opt_get(opts, "event");
if (!event_name) {
error_setg(d->errp, "Missing event name for rule");
return -1;
} else if (get_event_by_name(event_name, &event) < 0) {
error_setg(d->errp, "Invalid event name \"%s\"", event_name);
if (!event_name || get_event_by_name(event_name, &event) < 0) {
return -1;
}
@@ -259,11 +225,6 @@ static int add_rule(QemuOpts *opts, void *opaque)
rule->options.set_state.new_state =
qemu_opt_get_number(opts, "new_state", 0);
break;
case ACTION_SUSPEND:
rule->options.suspend.tag =
g_strdup(qemu_opt_get(opts, "tag"));
break;
};
/* Add the rule */
@@ -272,189 +233,75 @@ static int add_rule(QemuOpts *opts, void *opaque)
return 0;
}
static void remove_rule(BlkdebugRule *rule)
static int read_config(BDRVBlkdebugState *s, const char *filename)
{
switch (rule->action) {
case ACTION_INJECT_ERROR:
case ACTION_SET_STATE:
break;
case ACTION_SUSPEND:
g_free(rule->options.suspend.tag);
break;
}
QLIST_REMOVE(rule, next);
g_free(rule);
}
static int read_config(BDRVBlkdebugState *s, const char *filename,
QDict *options, Error **errp)
{
FILE *f = NULL;
FILE *f;
int ret;
struct add_rule_data d;
Error *local_err = NULL;
if (filename) {
f = fopen(filename, "r");
if (f == NULL) {
error_setg_errno(errp, errno, "Could not read blkdebug config file");
return -errno;
}
ret = qemu_config_parse(f, config_groups, filename);
if (ret < 0) {
error_setg(errp, "Could not parse blkdebug config file");
ret = -EINVAL;
goto fail;
}
f = fopen(filename, "r");
if (f == NULL) {
return -errno;
}
qemu_config_parse_qdict(options, config_groups, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
ret = qemu_config_parse(f, config_groups, filename);
if (ret < 0) {
goto fail;
}
d.s = s;
d.action = ACTION_INJECT_ERROR;
d.errp = &local_err;
qemu_opts_foreach(&inject_error_opts, add_rule, &d, 1);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
qemu_opts_foreach(&inject_error_opts, add_rule, &d, 0);
d.action = ACTION_SET_STATE;
qemu_opts_foreach(&set_state_opts, add_rule, &d, 1);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
qemu_opts_foreach(&set_state_opts, add_rule, &d, 0);
ret = 0;
fail:
qemu_opts_reset(&inject_error_opts);
qemu_opts_reset(&set_state_opts);
if (f) {
fclose(f);
}
fclose(f);
return ret;
}
/* Valid blkdebug filenames look like blkdebug:path/to/config:path/to/image */
static void blkdebug_parse_filename(const char *filename, QDict *options,
Error **errp)
{
const char *c;
/* Parse the blkdebug: prefix */
if (!strstart(filename, "blkdebug:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put(options, "x-image", qstring_from_str(filename));
return;
}
/* Parse config file path */
c = strchr(filename, ':');
if (c == NULL) {
error_setg(errp, "blkdebug requires both config file and image path");
return;
}
if (c != filename) {
QString *config_path;
config_path = qstring_from_substr(filename, 0, c - filename - 1);
qdict_put(options, "config", config_path);
}
/* TODO Allow multi-level nesting and set file.filename here */
filename = c + 1;
qdict_put(options, "x-image", qstring_from_str(filename));
}
static QemuOptsList runtime_opts = {
.name = "blkdebug",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "config",
.type = QEMU_OPT_STRING,
.help = "Path to the configuration file",
},
{
.name = "x-image",
.type = QEMU_OPT_STRING,
.help = "[internal use only, will be removed]",
},
{
.name = "align",
.type = QEMU_OPT_SIZE,
.help = "Required alignment in bytes",
},
{ /* end of list */ }
},
};
static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static int blkdebug_open(BlockDriverState *bs, const char *filename, int flags)
{
BDRVBlkdebugState *s = bs->opaque;
QemuOpts *opts;
Error *local_err = NULL;
const char *config;
uint64_t align;
int ret;
char *config, *c;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
/* Parse the blkdebug: prefix */
if (strncmp(filename, "blkdebug:", strlen("blkdebug:"))) {
return -EINVAL;
}
filename += strlen("blkdebug:");
/* Read rules from config file */
c = strchr(filename, ':');
if (c == NULL) {
return -EINVAL;
}
/* Read rules from config file or command line options */
config = qemu_opt_get(opts, "config");
ret = read_config(s, config, options, errp);
if (ret) {
goto out;
config = g_strdup(filename);
config[c - filename] = '\0';
ret = read_config(s, config);
g_free(config);
if (ret < 0) {
return ret;
}
filename = c + 1;
/* Set initial state */
s->state = 1;
/* Open the backing file */
assert(bs->file == NULL);
ret = bdrv_open_image(&bs->file, qemu_opt_get(opts, "x-image"), options, "image",
flags | BDRV_O_PROTOCOL, false, &local_err);
ret = bdrv_file_open(&bs->file, filename, flags);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
return ret;
}
/* Set request alignment */
align = qemu_opt_get_size(opts, "align", bs->request_alignment);
if (align > 0 && align < INT_MAX && !(align & (align - 1))) {
bs->request_alignment = align;
} else {
error_setg(errp, "Invalid alignment");
ret = -EINVAL;
goto fail_unref;
}
ret = 0;
goto out;
fail_unref:
bdrv_unref(bs->file);
out:
qemu_opts_del(opts);
return ret;
return 0;
}
static void error_callback_bh(void *opaque)
@@ -462,40 +309,44 @@ static void error_callback_bh(void *opaque)
struct BlkdebugAIOCB *acb = opaque;
qemu_bh_delete(acb->bh);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
qemu_aio_release(acb);
}
static BlockAIOCB *inject_error(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
static void blkdebug_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkdebugAIOCB *acb = container_of(blockacb, BlkdebugAIOCB, common);
qemu_aio_release(acb);
}
static BlockDriverAIOCB *inject_error(BlockDriverState *bs,
BlockDriverCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
{
BDRVBlkdebugState *s = bs->opaque;
int error = rule->options.inject.error;
struct BlkdebugAIOCB *acb;
QEMUBH *bh;
bool immediately = rule->options.inject.immediately;
if (rule->options.inject.once) {
QSIMPLEQ_REMOVE(&s->active_rules, rule, BlkdebugRule, active_next);
remove_rule(rule);
QSIMPLEQ_INIT(&s->active_rules);
}
if (immediately) {
if (rule->options.inject.immediately) {
return NULL;
}
acb = qemu_aio_get(&blkdebug_aiocb_info, bs, cb, opaque);
acb = qemu_aio_get(&blkdebug_aio_pool, bs, cb, opaque);
acb->ret = -error;
bh = aio_bh_new(bdrv_get_aio_context(bs), error_callback_bh, acb);
bh = qemu_bh_new(error_callback_bh, acb);
acb->bh = bh;
qemu_bh_schedule(bh);
return &acb->common;
}
static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
@@ -515,9 +366,9 @@ static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
return bdrv_aio_readv(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
static BlockDriverAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
@@ -537,26 +388,6 @@ static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
}
static BlockAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.sector == -1) {
break;
}
}
if (rule && rule->options.inject.error) {
return inject_error(bs, cb, opaque, rule);
}
return bdrv_aio_flush(bs->file, cb, opaque);
}
static void blkdebug_close(BlockDriverState *bs)
{
BDRVBlkdebugState *s = bs->opaque;
@@ -565,39 +396,19 @@ static void blkdebug_close(BlockDriverState *bs)
for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
remove_rule(rule);
QLIST_REMOVE(rule, next);
g_free(rule);
}
}
}
static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq r;
r = (BlkdebugSuspendedReq) {
.co = qemu_coroutine_self(),
.tag = g_strdup(rule->options.suspend.tag),
};
remove_rule(rule);
QLIST_INSERT_HEAD(&s->suspended_reqs, &r, next);
printf("blkdebug: Suspended request '%s'\n", r.tag);
qemu_coroutine_yield();
printf("blkdebug: Resuming request '%s'\n", r.tag);
QLIST_REMOVE(&r, next);
g_free(r.tag);
}
static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
bool injected)
int old_state, bool injected)
{
BDRVBlkdebugState *s = bs->opaque;
/* Only process rules for the current state */
if (rule->state && rule->state != s->state) {
if (rule->state && rule->state != old_state) {
return injected;
}
@@ -612,11 +423,7 @@ static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
break;
case ACTION_SET_STATE:
s->new_state = rule->options.set_state.new_state;
break;
case ACTION_SUSPEND:
suspend_request(bs, rule);
s->state = rule->options.set_state.new_state;
break;
}
return injected;
@@ -625,95 +432,16 @@ static bool process_rule(BlockDriverState *bs, struct BlkdebugRule *rule,
static void blkdebug_debug_event(BlockDriverState *bs, BlkDebugEvent event)
{
BDRVBlkdebugState *s = bs->opaque;
struct BlkdebugRule *rule, *next;
struct BlkdebugRule *rule;
int old_state = s->state;
bool injected;
assert((int)event >= 0 && event < BLKDBG_EVENT_MAX);
injected = false;
s->new_state = s->state;
QLIST_FOREACH_SAFE(rule, &s->rules[event], next, next) {
injected = process_rule(bs, rule, injected);
QLIST_FOREACH(rule, &s->rules[event], next) {
injected = process_rule(bs, rule, old_state, injected);
}
s->state = s->new_state;
}
static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
struct BlkdebugRule *rule;
BlkDebugEvent blkdebug_event;
if (get_event_by_name(event, &blkdebug_event) < 0) {
return -ENOENT;
}
rule = g_malloc(sizeof(*rule));
*rule = (struct BlkdebugRule) {
.event = blkdebug_event,
.action = ACTION_SUSPEND,
.state = 0,
.options.suspend.tag = g_strdup(tag),
};
QLIST_INSERT_HEAD(&s->rules[blkdebug_event], rule, next);
return 0;
}
static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq *r, *next;
QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, next) {
if (!strcmp(r->tag, tag)) {
qemu_coroutine_enter(r->co, NULL);
return 0;
}
}
return -ENOENT;
}
static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq *r, *r_next;
BlkdebugRule *rule, *next;
int i, ret = -ENOENT;
for (i = 0; i < BLKDBG_EVENT_MAX; i++) {
QLIST_FOREACH_SAFE(rule, &s->rules[i], next, next) {
if (rule->action == ACTION_SUSPEND &&
!strcmp(rule->options.suspend.tag, tag)) {
remove_rule(rule);
ret = 0;
}
}
}
QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, r_next) {
if (!strcmp(r->tag, tag)) {
qemu_coroutine_enter(r->co, NULL);
ret = 0;
}
}
return ret;
}
static bool blkdebug_debug_is_suspended(BlockDriverState *bs, const char *tag)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugSuspendedReq *r;
QLIST_FOREACH(r, &s->suspended_reqs, next) {
if (!strcmp(r->tag, tag)) {
return true;
}
}
return false;
}
static int64_t blkdebug_getlength(BlockDriverState *bs)
@@ -721,82 +449,20 @@ static int64_t blkdebug_getlength(BlockDriverState *bs)
return bdrv_getlength(bs->file);
}
static int blkdebug_truncate(BlockDriverState *bs, int64_t offset)
{
return bdrv_truncate(bs->file, offset);
}
static void blkdebug_refresh_filename(BlockDriverState *bs)
{
QDict *opts;
const QDictEntry *e;
bool force_json = false;
for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
if (strcmp(qdict_entry_key(e), "config") &&
strcmp(qdict_entry_key(e), "x-image") &&
strcmp(qdict_entry_key(e), "image") &&
strncmp(qdict_entry_key(e), "image.", strlen("image.")))
{
force_json = true;
break;
}
}
if (force_json && !bs->file->full_open_options) {
/* The config file cannot be recreated, so creating a plain filename
* is impossible */
return;
}
if (!force_json && bs->file->exact_filename[0]) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkdebug:%s:%s",
qdict_get_try_str(bs->options, "config") ?: "",
bs->file->exact_filename);
}
opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "image", QOBJECT(bs->file->full_open_options));
for (e = qdict_first(bs->options); e; e = qdict_next(bs->options, e)) {
if (strcmp(qdict_entry_key(e), "x-image") &&
strcmp(qdict_entry_key(e), "image") &&
strncmp(qdict_entry_key(e), "image.", strlen("image.")))
{
qobject_incref(qdict_entry_value(e));
qdict_put_obj(opts, qdict_entry_key(e), qdict_entry_value(e));
}
}
bs->full_open_options = opts;
}
static BlockDriver bdrv_blkdebug = {
.format_name = "blkdebug",
.protocol_name = "blkdebug",
.instance_size = sizeof(BDRVBlkdebugState),
.format_name = "blkdebug",
.protocol_name = "blkdebug",
.bdrv_parse_filename = blkdebug_parse_filename,
.bdrv_file_open = blkdebug_open,
.bdrv_close = blkdebug_close,
.bdrv_getlength = blkdebug_getlength,
.bdrv_truncate = blkdebug_truncate,
.bdrv_refresh_filename = blkdebug_refresh_filename,
.instance_size = sizeof(BDRVBlkdebugState),
.bdrv_aio_readv = blkdebug_aio_readv,
.bdrv_aio_writev = blkdebug_aio_writev,
.bdrv_aio_flush = blkdebug_aio_flush,
.bdrv_file_open = blkdebug_open,
.bdrv_close = blkdebug_close,
.bdrv_getlength = blkdebug_getlength,
.bdrv_debug_event = blkdebug_debug_event,
.bdrv_debug_breakpoint = blkdebug_debug_breakpoint,
.bdrv_debug_remove_breakpoint
= blkdebug_debug_remove_breakpoint,
.bdrv_debug_resume = blkdebug_debug_resume,
.bdrv_debug_is_suspended = blkdebug_debug_is_suspended,
.bdrv_aio_readv = blkdebug_aio_readv,
.bdrv_aio_writev = blkdebug_aio_writev,
.bdrv_debug_event = blkdebug_debug_event,
};
static void bdrv_blkdebug_init(void)

View File

@@ -8,10 +8,8 @@
*/
#include <stdarg.h>
#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
#include "block/block_int.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "qemu_socket.h" /* for EINPROGRESS on Windows */
#include "block_int.h"
typedef struct {
BlockDriverState *test_file;
@@ -19,7 +17,7 @@ typedef struct {
typedef struct BlkverifyAIOCB BlkverifyAIOCB;
struct BlkverifyAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
/* Request metadata */
@@ -29,6 +27,7 @@ struct BlkverifyAIOCB {
int ret; /* first completed request's result */
unsigned int done; /* completion counter */
bool *finished; /* completion signal for cancel */
QEMUIOVector *qiov; /* user I/O vector */
QEMUIOVector raw_qiov; /* cloned I/O vector for raw file */
@@ -37,8 +36,21 @@ struct BlkverifyAIOCB {
void (*verify)(BlkverifyAIOCB *acb);
};
static const AIOCBInfo blkverify_aiocb_info = {
static void blkverify_aio_cancel(BlockDriverAIOCB *blockacb)
{
BlkverifyAIOCB *acb = (BlkverifyAIOCB *)blockacb;
bool finished = false;
/* Wait until request completes, invokes its callback, and frees itself */
acb->finished = &finished;
while (!finished) {
qemu_aio_wait();
}
}
static AIOPool blkverify_aio_pool = {
.aiocb_size = sizeof(BlkverifyAIOCB),
.cancel = blkverify_aio_cancel,
};
static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
@@ -57,101 +69,50 @@ static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
}
/* Valid blkverify filenames look like blkverify:path/to/raw_image:path/to/image */
static void blkverify_parse_filename(const char *filename, QDict *options,
Error **errp)
static int blkverify_open(BlockDriverState *bs, const char *filename, int flags)
{
const char *c;
QString *raw_path;
BDRVBlkverifyState *s = bs->opaque;
int ret;
char *raw, *c;
/* Parse the blkverify: prefix */
if (!strstart(filename, "blkverify:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put(options, "x-image", qstring_from_str(filename));
return;
if (strncmp(filename, "blkverify:", strlen("blkverify:"))) {
return -EINVAL;
}
filename += strlen("blkverify:");
/* Parse the raw image filename */
c = strchr(filename, ':');
if (c == NULL) {
error_setg(errp, "blkverify requires raw copy and original image path");
return;
return -EINVAL;
}
/* TODO Implement option pass-through and set raw.filename here */
raw_path = qstring_from_substr(filename, 0, c - filename - 1);
qdict_put(options, "x-raw", raw_path);
/* TODO Allow multi-level nesting and set file.filename here */
filename = c + 1;
qdict_put(options, "x-image", qstring_from_str(filename));
}
static QemuOptsList runtime_opts = {
.name = "blkverify",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "x-raw",
.type = QEMU_OPT_STRING,
.help = "[internal use only, will be removed]",
},
{
.name = "x-image",
.type = QEMU_OPT_STRING,
.help = "[internal use only, will be removed]",
},
{ /* end of list */ }
},
};
static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVBlkverifyState *s = bs->opaque;
QemuOpts *opts;
Error *local_err = NULL;
int ret;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
/* Open the raw file */
assert(bs->file == NULL);
ret = bdrv_open_image(&bs->file, qemu_opt_get(opts, "x-raw"), options,
"raw", flags | BDRV_O_PROTOCOL, false, &local_err);
raw = g_strdup(filename);
raw[c - filename] = '\0';
ret = bdrv_file_open(&bs->file, raw, flags);
g_free(raw);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
return ret;
}
filename = c + 1;
/* Open the test file */
assert(s->test_file == NULL);
ret = bdrv_open_image(&s->test_file, qemu_opt_get(opts, "x-image"), options,
"test", flags, false, &local_err);
s->test_file = bdrv_new("");
ret = bdrv_open(s->test_file, filename, flags, NULL);
if (ret < 0) {
error_propagate(errp, local_err);
bdrv_delete(s->test_file);
s->test_file = NULL;
goto fail;
return ret;
}
ret = 0;
fail:
qemu_opts_del(opts);
return ret;
return 0;
}
static void blkverify_close(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_unref(s->test_file);
bdrv_delete(s->test_file);
s->test_file = NULL;
}
@@ -162,13 +123,117 @@ static int64_t blkverify_getlength(BlockDriverState *bs)
return bdrv_getlength(s->test_file);
}
/**
* Check that I/O vector contents are identical
*
* @a: I/O vector
* @b: I/O vector
* @ret: Offset to first mismatching byte or -1 if match
*/
static ssize_t blkverify_iovec_compare(QEMUIOVector *a, QEMUIOVector *b)
{
int i;
ssize_t offset = 0;
assert(a->niov == b->niov);
for (i = 0; i < a->niov; i++) {
size_t len = 0;
uint8_t *p = (uint8_t *)a->iov[i].iov_base;
uint8_t *q = (uint8_t *)b->iov[i].iov_base;
assert(a->iov[i].iov_len == b->iov[i].iov_len);
while (len < a->iov[i].iov_len && *p++ == *q++) {
len++;
}
offset += len;
if (len != a->iov[i].iov_len) {
return offset;
}
}
return -1;
}
typedef struct {
int src_index;
struct iovec *src_iov;
void *dest_base;
} IOVectorSortElem;
static int sortelem_cmp_src_base(const void *a, const void *b)
{
const IOVectorSortElem *elem_a = a;
const IOVectorSortElem *elem_b = b;
/* Don't overflow */
if (elem_a->src_iov->iov_base < elem_b->src_iov->iov_base) {
return -1;
} else if (elem_a->src_iov->iov_base > elem_b->src_iov->iov_base) {
return 1;
} else {
return 0;
}
}
static int sortelem_cmp_src_index(const void *a, const void *b)
{
const IOVectorSortElem *elem_a = a;
const IOVectorSortElem *elem_b = b;
return elem_a->src_index - elem_b->src_index;
}
/**
* Copy contents of I/O vector
*
* The relative relationships of overlapping iovecs are preserved. This is
* necessary to ensure identical semantics in the cloned I/O vector.
*/
static void blkverify_iovec_clone(QEMUIOVector *dest, const QEMUIOVector *src,
void *buf)
{
IOVectorSortElem sortelems[src->niov];
void *last_end;
int i;
/* Sort by source iovecs by base address */
for (i = 0; i < src->niov; i++) {
sortelems[i].src_index = i;
sortelems[i].src_iov = &src->iov[i];
}
qsort(sortelems, src->niov, sizeof(sortelems[0]), sortelem_cmp_src_base);
/* Allocate buffer space taking into account overlapping iovecs */
last_end = NULL;
for (i = 0; i < src->niov; i++) {
struct iovec *cur = sortelems[i].src_iov;
ptrdiff_t rewind = 0;
/* Detect overlap */
if (last_end && last_end > cur->iov_base) {
rewind = last_end - cur->iov_base;
}
sortelems[i].dest_base = buf - rewind;
buf += cur->iov_len - MIN(rewind, cur->iov_len);
last_end = MAX(cur->iov_base + cur->iov_len, last_end);
}
/* Sort by source iovec index and build destination iovec */
qsort(sortelems, src->niov, sizeof(sortelems[0]), sortelem_cmp_src_index);
for (i = 0; i < src->niov; i++) {
qemu_iovec_add(dest, sortelems[i].dest_base, src->iov[i].iov_len);
}
}
static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
BlockDriverCompletionFunc *cb,
void *opaque)
{
BlkverifyAIOCB *acb = qemu_aio_get(&blkverify_aiocb_info, bs, cb, opaque);
BlkverifyAIOCB *acb = qemu_aio_get(&blkverify_aio_pool, bs, cb, opaque);
acb->bh = NULL;
acb->is_write = is_write;
@@ -179,6 +244,7 @@ static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
acb->qiov = qiov;
acb->buf = NULL;
acb->verify = NULL;
acb->finished = NULL;
return acb;
}
@@ -192,7 +258,10 @@ static void blkverify_aio_bh(void *opaque)
qemu_vfree(acb->buf);
}
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
if (acb->finished) {
*acb->finished = true;
}
qemu_aio_release(acb);
}
static void blkverify_aio_cb(void *opaque, int ret)
@@ -213,8 +282,7 @@ static void blkverify_aio_cb(void *opaque, int ret)
acb->verify(acb);
}
acb->bh = aio_bh_new(bdrv_get_aio_context(acb->common.bs),
blkverify_aio_bh, acb);
acb->bh = qemu_bh_new(blkverify_aio_bh, acb);
qemu_bh_schedule(acb->bh);
break;
}
@@ -222,16 +290,16 @@ static void blkverify_aio_cb(void *opaque, int ret)
static void blkverify_verify_readv(BlkverifyAIOCB *acb)
{
ssize_t offset = qemu_iovec_compare(acb->qiov, &acb->raw_qiov);
ssize_t offset = blkverify_iovec_compare(acb->qiov, &acb->raw_qiov);
if (offset != -1) {
blkverify_err(acb, "contents mismatch in sector %" PRId64,
acb->sector_num + (int64_t)(offset / BDRV_SECTOR_SIZE));
}
}
static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *blkverify_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, false, sector_num, qiov,
@@ -240,7 +308,7 @@ static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
acb->verify = blkverify_verify_readv;
acb->buf = qemu_blockalign(bs->file, qiov->size);
qemu_iovec_init(&acb->raw_qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
blkverify_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
bdrv_aio_readv(s->test_file, sector_num, qiov, nb_sectors,
blkverify_aio_cb, acb);
@@ -249,9 +317,9 @@ static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
return &acb->common;
}
static BlockAIOCB *blkverify_aio_writev(BlockDriverState *bs,
static BlockDriverAIOCB *blkverify_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, true, sector_num, qiov,
@@ -264,9 +332,9 @@ static BlockAIOCB *blkverify_aio_writev(BlockDriverState *bs,
return &acb->common;
}
static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
static BlockDriverAIOCB *blkverify_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
@@ -274,82 +342,20 @@ static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
return bdrv_aio_flush(s->test_file, cb, opaque);
}
static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
BlockDriverState *candidate)
{
BDRVBlkverifyState *s = bs->opaque;
bool perm = bdrv_recurse_is_first_non_filter(bs->file, candidate);
if (perm) {
return true;
}
return bdrv_recurse_is_first_non_filter(s->test_file, candidate);
}
/* Propagate AioContext changes to ->test_file */
static void blkverify_detach_aio_context(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_detach_aio_context(s->test_file);
}
static void blkverify_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_attach_aio_context(s->test_file, new_context);
}
static void blkverify_refresh_filename(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
/* bs->file has already been refreshed */
bdrv_refresh_filename(s->test_file);
if (bs->file->full_open_options && s->test_file->full_open_options) {
QDict *opts = qdict_new();
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));
QINCREF(bs->file->full_open_options);
qdict_put_obj(opts, "raw", QOBJECT(bs->file->full_open_options));
QINCREF(s->test_file->full_open_options);
qdict_put_obj(opts, "test", QOBJECT(s->test_file->full_open_options));
bs->full_open_options = opts;
}
if (bs->file->exact_filename[0] && s->test_file->exact_filename[0]) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkverify:%s:%s",
bs->file->exact_filename, s->test_file->exact_filename);
}
}
static BlockDriver bdrv_blkverify = {
.format_name = "blkverify",
.protocol_name = "blkverify",
.instance_size = sizeof(BDRVBlkverifyState),
.format_name = "blkverify",
.protocol_name = "blkverify",
.bdrv_parse_filename = blkverify_parse_filename,
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.bdrv_getlength = blkverify_getlength,
.bdrv_refresh_filename = blkverify_refresh_filename,
.instance_size = sizeof(BDRVBlkverifyState),
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
.bdrv_getlength = blkverify_getlength,
.bdrv_attach_aio_context = blkverify_attach_aio_context,
.bdrv_detach_aio_context = blkverify_detach_aio_context,
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.is_filter = true,
.bdrv_recurse_is_first_non_filter = blkverify_recurse_is_first_non_filter,
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
};
static void bdrv_blkverify_init(void)

View File

@@ -1,915 +0,0 @@
/*
* QEMU Block backends
*
* Copyright (C) 2014 Red Hat, Inc.
*
* Authors:
* Markus Armbruster <armbru@redhat.com>,
*
* This work is licensed under the terms of the GNU LGPL, version 2.1
* or later. See the COPYING.LIB file in the top-level directory.
*/
#include "sysemu/block-backend.h"
#include "block/block_int.h"
#include "sysemu/blockdev.h"
#include "qapi-event.h"
/* Number of coroutines to reserve per attached device model */
#define COROUTINE_POOL_RESERVATION 64
struct BlockBackend {
char *name;
int refcnt;
BlockDriverState *bs;
DriveInfo *legacy_dinfo; /* null unless created by drive_new() */
QTAILQ_ENTRY(BlockBackend) link; /* for blk_backends */
void *dev; /* attached device model, if any */
/* TODO change to DeviceState when all users are qdevified */
const BlockDevOps *dev_ops;
void *dev_opaque;
};
typedef struct BlockBackendAIOCB {
BlockAIOCB common;
QEMUBH *bh;
int ret;
} BlockBackendAIOCB;
static const AIOCBInfo block_backend_aiocb_info = {
.aiocb_size = sizeof(BlockBackendAIOCB),
};
static void drive_info_del(DriveInfo *dinfo);
/* All the BlockBackends (except for hidden ones) */
static QTAILQ_HEAD(, BlockBackend) blk_backends =
QTAILQ_HEAD_INITIALIZER(blk_backends);
/*
* Create a new BlockBackend with @name, with a reference count of one.
* @name must not be null or empty.
* Fail if a BlockBackend with this name already exists.
* Store an error through @errp on failure, unless it's null.
* Return the new BlockBackend on success, null on failure.
*/
BlockBackend *blk_new(const char *name, Error **errp)
{
BlockBackend *blk;
assert(name && name[0]);
if (!id_wellformed(name)) {
error_setg(errp, "Invalid device name");
return NULL;
}
if (blk_by_name(name)) {
error_setg(errp, "Device with id '%s' already exists", name);
return NULL;
}
if (bdrv_find_node(name)) {
error_setg(errp,
"Device name '%s' conflicts with an existing node name",
name);
return NULL;
}
blk = g_new0(BlockBackend, 1);
blk->name = g_strdup(name);
blk->refcnt = 1;
QTAILQ_INSERT_TAIL(&blk_backends, blk, link);
return blk;
}
/*
* Create a new BlockBackend with a new BlockDriverState attached.
* Otherwise just like blk_new(), which see.
*/
BlockBackend *blk_new_with_bs(const char *name, Error **errp)
{
BlockBackend *blk;
BlockDriverState *bs;
blk = blk_new(name, errp);
if (!blk) {
return NULL;
}
bs = bdrv_new_root();
blk->bs = bs;
bs->blk = blk;
return blk;
}
/*
* Calls blk_new_with_bs() and then calls bdrv_open() on the BlockDriverState.
*
* Just as with bdrv_open(), after having called this function the reference to
* @options belongs to the block layer (even on failure).
*
* TODO: Remove @filename and @flags; it should be possible to specify a whole
* BDS tree just by specifying the @options QDict (or @reference,
* alternatively). At the time of adding this function, this is not possible,
* though, so callers of this function have to be able to specify @filename and
* @flags.
*/
BlockBackend *blk_new_open(const char *name, const char *filename,
const char *reference, QDict *options, int flags,
Error **errp)
{
BlockBackend *blk;
int ret;
blk = blk_new_with_bs(name, errp);
if (!blk) {
QDECREF(options);
return NULL;
}
ret = bdrv_open(&blk->bs, filename, reference, options, flags, NULL, errp);
if (ret < 0) {
blk_unref(blk);
return NULL;
}
return blk;
}
static void blk_delete(BlockBackend *blk)
{
assert(!blk->refcnt);
assert(!blk->dev);
if (blk->bs) {
assert(blk->bs->blk == blk);
blk->bs->blk = NULL;
bdrv_unref(blk->bs);
blk->bs = NULL;
}
/* Avoid double-remove after blk_hide_on_behalf_of_hmp_drive_del() */
if (blk->name[0]) {
QTAILQ_REMOVE(&blk_backends, blk, link);
}
g_free(blk->name);
drive_info_del(blk->legacy_dinfo);
g_free(blk);
}
static void drive_info_del(DriveInfo *dinfo)
{
if (!dinfo) {
return;
}
qemu_opts_del(dinfo->opts);
g_free(dinfo->serial);
g_free(dinfo);
}
/*
* Increment @blk's reference count.
* @blk must not be null.
*/
void blk_ref(BlockBackend *blk)
{
blk->refcnt++;
}
/*
* Decrement @blk's reference count.
* If this drops it to zero, destroy @blk.
* For convenience, do nothing if @blk is null.
*/
void blk_unref(BlockBackend *blk)
{
if (blk) {
assert(blk->refcnt > 0);
if (!--blk->refcnt) {
blk_delete(blk);
}
}
}
/*
* Return the BlockBackend after @blk.
* If @blk is null, return the first one.
* Else, return @blk's next sibling, which may be null.
*
* To iterate over all BlockBackends, do
* for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
* ...
* }
*/
BlockBackend *blk_next(BlockBackend *blk)
{
return blk ? QTAILQ_NEXT(blk, link) : QTAILQ_FIRST(&blk_backends);
}
/*
* Return @blk's name, a non-null string.
* Wart: the name is empty iff @blk has been hidden with
* blk_hide_on_behalf_of_hmp_drive_del().
*/
const char *blk_name(BlockBackend *blk)
{
return blk->name;
}
/*
* Return the BlockBackend with name @name if it exists, else null.
* @name must not be null.
*/
BlockBackend *blk_by_name(const char *name)
{
BlockBackend *blk;
assert(name);
QTAILQ_FOREACH(blk, &blk_backends, link) {
if (!strcmp(name, blk->name)) {
return blk;
}
}
return NULL;
}
/*
* Return the BlockDriverState attached to @blk if any, else null.
*/
BlockDriverState *blk_bs(BlockBackend *blk)
{
return blk->bs;
}
/*
* Return @blk's DriveInfo if any, else null.
*/
DriveInfo *blk_legacy_dinfo(BlockBackend *blk)
{
return blk->legacy_dinfo;
}
/*
* Set @blk's DriveInfo to @dinfo, and return it.
* @blk must not have a DriveInfo set already.
* No other BlockBackend may have the same DriveInfo set.
*/
DriveInfo *blk_set_legacy_dinfo(BlockBackend *blk, DriveInfo *dinfo)
{
assert(!blk->legacy_dinfo);
return blk->legacy_dinfo = dinfo;
}
/*
* Return the BlockBackend with DriveInfo @dinfo.
* It must exist.
*/
BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
{
BlockBackend *blk;
QTAILQ_FOREACH(blk, &blk_backends, link) {
if (blk->legacy_dinfo == dinfo) {
return blk;
}
}
abort();
}
/*
* Hide @blk.
* @blk must not have been hidden already.
* Make attached BlockDriverState, if any, anonymous.
* Once hidden, @blk is invisible to all functions that don't receive
* it as argument. For example, blk_by_name() won't return it.
* Strictly for use by do_drive_del().
* TODO get rid of it!
*/
void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend *blk)
{
QTAILQ_REMOVE(&blk_backends, blk, link);
blk->name[0] = 0;
if (blk->bs) {
bdrv_make_anon(blk->bs);
}
}
/*
* Attach device model @dev to @blk.
* Return 0 on success, -EBUSY when a device model is attached already.
*/
int blk_attach_dev(BlockBackend *blk, void *dev)
/* TODO change to DeviceState *dev when all users are qdevified */
{
if (blk->dev) {
return -EBUSY;
}
blk_ref(blk);
blk->dev = dev;
bdrv_iostatus_reset(blk->bs);
return 0;
}
/*
* Attach device model @dev to @blk.
* @blk must not have a device model attached already.
* TODO qdevified devices don't use this, remove when devices are qdevified
*/
void blk_attach_dev_nofail(BlockBackend *blk, void *dev)
{
if (blk_attach_dev(blk, dev) < 0) {
abort();
}
}
/*
* Detach device model @dev from @blk.
* @dev must be currently attached to @blk.
*/
void blk_detach_dev(BlockBackend *blk, void *dev)
/* TODO change to DeviceState *dev when all users are qdevified */
{
assert(blk->dev == dev);
blk->dev = NULL;
blk->dev_ops = NULL;
blk->dev_opaque = NULL;
bdrv_set_guest_block_size(blk->bs, 512);
blk_unref(blk);
}
/*
* Return the device model attached to @blk if any, else null.
*/
void *blk_get_attached_dev(BlockBackend *blk)
/* TODO change to return DeviceState * when all users are qdevified */
{
return blk->dev;
}
/*
* Set @blk's device model callbacks to @ops.
* @opaque is the opaque argument to pass to the callbacks.
* This is for use by device models.
*/
void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
void *opaque)
{
blk->dev_ops = ops;
blk->dev_opaque = opaque;
}
/*
* Notify @blk's attached device model of media change.
* If @load is true, notify of media load.
* Else, notify of media eject.
* Also send DEVICE_TRAY_MOVED events as appropriate.
*/
void blk_dev_change_media_cb(BlockBackend *blk, bool load)
{
if (blk->dev_ops && blk->dev_ops->change_media_cb) {
bool tray_was_closed = !blk_dev_is_tray_open(blk);
blk->dev_ops->change_media_cb(blk->dev_opaque, load);
if (tray_was_closed) {
/* tray open */
qapi_event_send_device_tray_moved(blk_name(blk),
true, &error_abort);
}
if (load) {
/* tray close */
qapi_event_send_device_tray_moved(blk_name(blk),
false, &error_abort);
}
}
}
/*
* Does @blk's attached device model have removable media?
* %true if no device model is attached.
*/
bool blk_dev_has_removable_media(BlockBackend *blk)
{
return !blk->dev || (blk->dev_ops && blk->dev_ops->change_media_cb);
}
/*
* Notify @blk's attached device model of a media eject request.
* If @force is true, the medium is about to be yanked out forcefully.
*/
void blk_dev_eject_request(BlockBackend *blk, bool force)
{
if (blk->dev_ops && blk->dev_ops->eject_request_cb) {
blk->dev_ops->eject_request_cb(blk->dev_opaque, force);
}
}
/*
* Does @blk's attached device model have a tray, and is it open?
*/
bool blk_dev_is_tray_open(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->is_tray_open) {
return blk->dev_ops->is_tray_open(blk->dev_opaque);
}
return false;
}
/*
* Does @blk's attached device model have the medium locked?
* %false if the device model has no such lock.
*/
bool blk_dev_is_medium_locked(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->is_medium_locked) {
return blk->dev_ops->is_medium_locked(blk->dev_opaque);
}
return false;
}
/*
* Notify @blk's attached device model of a backend size change.
*/
void blk_dev_resize_cb(BlockBackend *blk)
{
if (blk->dev_ops && blk->dev_ops->resize_cb) {
blk->dev_ops->resize_cb(blk->dev_opaque);
}
}
void blk_iostatus_enable(BlockBackend *blk)
{
bdrv_iostatus_enable(blk->bs);
}
static int blk_check_byte_request(BlockBackend *blk, int64_t offset,
size_t size)
{
int64_t len;
if (size > INT_MAX) {
return -EIO;
}
if (!blk_is_inserted(blk)) {
return -ENOMEDIUM;
}
len = blk_getlength(blk);
if (len < 0) {
return len;
}
if (offset < 0) {
return -EIO;
}
if (offset > len || len - offset < size) {
return -EIO;
}
return 0;
}
static int blk_check_request(BlockBackend *blk, int64_t sector_num,
int nb_sectors)
{
if (sector_num < 0 || sector_num > INT64_MAX / BDRV_SECTOR_SIZE) {
return -EIO;
}
if (nb_sectors < 0 || nb_sectors > INT_MAX / BDRV_SECTOR_SIZE) {
return -EIO;
}
return blk_check_byte_request(blk, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE);
}
int blk_read(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_read(blk->bs, sector_num, buf, nb_sectors);
}
int blk_read_unthrottled(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_read_unthrottled(blk->bs, sector_num, buf, nb_sectors);
}
int blk_write(BlockBackend *blk, int64_t sector_num, const uint8_t *buf,
int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_write(blk->bs, sector_num, buf, nb_sectors);
}
int blk_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_write_zeroes(blk->bs, sector_num, nb_sectors, flags);
}
static void error_callback_bh(void *opaque)
{
struct BlockBackendAIOCB *acb = opaque;
qemu_bh_delete(acb->bh);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
}
static BlockAIOCB *abort_aio_request(BlockBackend *blk, BlockCompletionFunc *cb,
void *opaque, int ret)
{
struct BlockBackendAIOCB *acb;
QEMUBH *bh;
acb = blk_aio_get(&block_backend_aiocb_info, blk, cb, opaque);
acb->ret = ret;
bh = aio_bh_new(blk_get_aio_context(blk), error_callback_bh, acb);
acb->bh = bh;
qemu_bh_schedule(bh);
return &acb->common;
}
BlockAIOCB *blk_aio_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
cb, opaque);
}
int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int count)
{
int ret = blk_check_byte_request(blk, offset, count);
if (ret < 0) {
return ret;
}
return bdrv_pread(blk->bs, offset, buf, count);
}
int blk_pwrite(BlockBackend *blk, int64_t offset, const void *buf, int count)
{
int ret = blk_check_byte_request(blk, offset, count);
if (ret < 0) {
return ret;
}
return bdrv_pwrite(blk->bs, offset, buf, count);
}
int64_t blk_getlength(BlockBackend *blk)
{
return bdrv_getlength(blk->bs);
}
void blk_get_geometry(BlockBackend *blk, uint64_t *nb_sectors_ptr)
{
bdrv_get_geometry(blk->bs, nb_sectors_ptr);
}
int64_t blk_nb_sectors(BlockBackend *blk)
{
return bdrv_nb_sectors(blk->bs);
}
BlockAIOCB *blk_aio_readv(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
}
BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
}
BlockAIOCB *blk_aio_flush(BlockBackend *blk,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_flush(blk->bs, cb, opaque);
}
BlockAIOCB *blk_aio_discard(BlockBackend *blk,
int64_t sector_num, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return abort_aio_request(blk, cb, opaque, ret);
}
return bdrv_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
}
void blk_aio_cancel(BlockAIOCB *acb)
{
bdrv_aio_cancel(acb);
}
void blk_aio_cancel_async(BlockAIOCB *acb)
{
bdrv_aio_cancel_async(acb);
}
int blk_aio_multiwrite(BlockBackend *blk, BlockRequest *reqs, int num_reqs)
{
int i, ret;
for (i = 0; i < num_reqs; i++) {
ret = blk_check_request(blk, reqs[i].sector, reqs[i].nb_sectors);
if (ret < 0) {
return ret;
}
}
return bdrv_aio_multiwrite(blk->bs, reqs, num_reqs);
}
int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
{
return bdrv_ioctl(blk->bs, req, buf);
}
BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
BlockCompletionFunc *cb, void *opaque)
{
return bdrv_aio_ioctl(blk->bs, req, buf, cb, opaque);
}
int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_co_discard(blk->bs, sector_num, nb_sectors);
}
int blk_co_flush(BlockBackend *blk)
{
return bdrv_co_flush(blk->bs);
}
int blk_flush(BlockBackend *blk)
{
return bdrv_flush(blk->bs);
}
int blk_flush_all(void)
{
return bdrv_flush_all();
}
void blk_drain_all(void)
{
bdrv_drain_all();
}
BlockdevOnError blk_get_on_error(BlockBackend *blk, bool is_read)
{
return bdrv_get_on_error(blk->bs, is_read);
}
BlockErrorAction blk_get_error_action(BlockBackend *blk, bool is_read,
int error)
{
return bdrv_get_error_action(blk->bs, is_read, error);
}
void blk_error_action(BlockBackend *blk, BlockErrorAction action,
bool is_read, int error)
{
bdrv_error_action(blk->bs, action, is_read, error);
}
int blk_is_read_only(BlockBackend *blk)
{
return bdrv_is_read_only(blk->bs);
}
int blk_is_sg(BlockBackend *blk)
{
return bdrv_is_sg(blk->bs);
}
int blk_enable_write_cache(BlockBackend *blk)
{
return bdrv_enable_write_cache(blk->bs);
}
void blk_set_enable_write_cache(BlockBackend *blk, bool wce)
{
bdrv_set_enable_write_cache(blk->bs, wce);
}
void blk_invalidate_cache(BlockBackend *blk, Error **errp)
{
bdrv_invalidate_cache(blk->bs, errp);
}
int blk_is_inserted(BlockBackend *blk)
{
return bdrv_is_inserted(blk->bs);
}
void blk_lock_medium(BlockBackend *blk, bool locked)
{
bdrv_lock_medium(blk->bs, locked);
}
void blk_eject(BlockBackend *blk, bool eject_flag)
{
bdrv_eject(blk->bs, eject_flag);
}
int blk_get_flags(BlockBackend *blk)
{
return bdrv_get_flags(blk->bs);
}
int blk_get_max_transfer_length(BlockBackend *blk)
{
return blk->bs->bl.max_transfer_length;
}
void blk_set_guest_block_size(BlockBackend *blk, int align)
{
bdrv_set_guest_block_size(blk->bs, align);
}
void *blk_blockalign(BlockBackend *blk, size_t size)
{
return qemu_blockalign(blk ? blk->bs : NULL, size);
}
bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp)
{
return bdrv_op_is_blocked(blk->bs, op, errp);
}
void blk_op_unblock(BlockBackend *blk, BlockOpType op, Error *reason)
{
bdrv_op_unblock(blk->bs, op, reason);
}
void blk_op_block_all(BlockBackend *blk, Error *reason)
{
bdrv_op_block_all(blk->bs, reason);
}
void blk_op_unblock_all(BlockBackend *blk, Error *reason)
{
bdrv_op_unblock_all(blk->bs, reason);
}
AioContext *blk_get_aio_context(BlockBackend *blk)
{
return bdrv_get_aio_context(blk->bs);
}
void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
{
bdrv_set_aio_context(blk->bs, new_context);
}
void blk_add_aio_context_notifier(BlockBackend *blk,
void (*attached_aio_context)(AioContext *new_context, void *opaque),
void (*detach_aio_context)(void *opaque), void *opaque)
{
bdrv_add_aio_context_notifier(blk->bs, attached_aio_context,
detach_aio_context, opaque);
}
void blk_remove_aio_context_notifier(BlockBackend *blk,
void (*attached_aio_context)(AioContext *,
void *),
void (*detach_aio_context)(void *),
void *opaque)
{
bdrv_remove_aio_context_notifier(blk->bs, attached_aio_context,
detach_aio_context, opaque);
}
void blk_add_close_notifier(BlockBackend *blk, Notifier *notify)
{
bdrv_add_close_notifier(blk->bs, notify);
}
void blk_io_plug(BlockBackend *blk)
{
bdrv_io_plug(blk->bs);
}
void blk_io_unplug(BlockBackend *blk)
{
bdrv_io_unplug(blk->bs);
}
BlockAcctStats *blk_get_stats(BlockBackend *blk)
{
return bdrv_get_stats(blk->bs);
}
void *blk_aio_get(const AIOCBInfo *aiocb_info, BlockBackend *blk,
BlockCompletionFunc *cb, void *opaque)
{
return qemu_aio_get(aiocb_info, blk_bs(blk), cb, opaque);
}
int coroutine_fn blk_co_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_co_write_zeroes(blk->bs, sector_num, nb_sectors, flags);
}
int blk_write_compressed(BlockBackend *blk, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_write_compressed(blk->bs, sector_num, buf, nb_sectors);
}
int blk_truncate(BlockBackend *blk, int64_t offset)
{
return bdrv_truncate(blk->bs, offset);
}
int blk_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors)
{
int ret = blk_check_request(blk, sector_num, nb_sectors);
if (ret < 0) {
return ret;
}
return bdrv_discard(blk->bs, sector_num, nb_sectors);
}
int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
int64_t pos, int size)
{
return bdrv_save_vmstate(blk->bs, buf, pos, size);
}
int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size)
{
return bdrv_load_vmstate(blk->bs, buf, pos, size);
}
int blk_probe_blocksizes(BlockBackend *blk, BlockSizes *bsz)
{
return bdrv_probe_blocksizes(blk->bs, bsz);
}
int blk_probe_geometry(BlockBackend *blk, HDGeometry *geo)
{
return bdrv_probe_geometry(blk->bs, geo);
}

View File

@@ -23,8 +23,8 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "block_int.h"
#include "module.h"
/**************************************************************/
@@ -39,41 +39,56 @@
// not allocated: 0xffffffff
// always little-endian
struct bochs_header {
char magic[32]; /* "Bochs Virtual HD Image" */
char type[16]; /* "Redolog" */
char subtype[16]; /* "Undoable" / "Volatile" / "Growing" */
struct bochs_header_v1 {
char magic[32]; // "Bochs Virtual HD Image"
char type[16]; // "Redolog"
char subtype[16]; // "Undoable" / "Volatile" / "Growing"
uint32_t version;
uint32_t header; /* size of header */
uint32_t catalog; /* num of entries */
uint32_t bitmap; /* bitmap size */
uint32_t extent; /* extent size */
uint32_t header; // size of header
union {
struct {
uint32_t reserved; /* for ??? */
uint64_t disk; /* disk size */
char padding[HEADER_SIZE - 64 - 20 - 12];
} QEMU_PACKED redolog;
struct {
uint64_t disk; /* disk size */
char padding[HEADER_SIZE - 64 - 20 - 8];
} QEMU_PACKED redolog_v1;
char padding[HEADER_SIZE - 64 - 20];
struct {
uint32_t catalog; // num of entries
uint32_t bitmap; // bitmap size
uint32_t extent; // extent size
uint64_t disk; // disk size
char padding[HEADER_SIZE - 64 - 8 - 20];
} redolog;
char padding[HEADER_SIZE - 64 - 8];
} extra;
} QEMU_PACKED;
};
// always little-endian
struct bochs_header {
char magic[32]; // "Bochs Virtual HD Image"
char type[16]; // "Redolog"
char subtype[16]; // "Undoable" / "Volatile" / "Growing"
uint32_t version;
uint32_t header; // size of header
union {
struct {
uint32_t catalog; // num of entries
uint32_t bitmap; // bitmap size
uint32_t extent; // extent size
uint32_t reserved; // for ???
uint64_t disk; // disk size
char padding[HEADER_SIZE - 64 - 8 - 24];
} redolog;
char padding[HEADER_SIZE - 64 - 8];
} extra;
};
typedef struct BDRVBochsState {
CoMutex lock;
uint32_t *catalog_bitmap;
uint32_t catalog_size;
int catalog_size;
uint32_t data_offset;
int data_offset;
uint32_t bitmap_blocks;
uint32_t extent_blocks;
uint32_t extent_size;
int bitmap_blocks;
int extent_blocks;
int extent_size;
} BDRVBochsState;
static int bochs_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -93,19 +108,17 @@ static int bochs_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static int bochs_open(BlockDriverState *bs, int flags)
{
BDRVBochsState *s = bs->opaque;
uint32_t i;
int i;
struct bochs_header bochs;
int ret;
struct bochs_header_v1 header_v1;
bs->read_only = 1; // no write support yet
ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
if (ret < 0) {
return ret;
if (bdrv_pread(bs->file, 0, &bochs, sizeof(bochs)) != sizeof(bochs)) {
goto fail;
}
if (strcmp(bochs.magic, HEADER_MAGIC) ||
@@ -113,107 +126,63 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
strcmp(bochs.subtype, GROWING_TYPE) ||
((le32_to_cpu(bochs.version) != HEADER_VERSION) &&
(le32_to_cpu(bochs.version) != HEADER_V1))) {
error_setg(errp, "Image not in Bochs format");
return -EINVAL;
}
if (le32_to_cpu(bochs.version) == HEADER_V1) {
bs->total_sectors = le64_to_cpu(bochs.extra.redolog_v1.disk) / 512;
} else {
bs->total_sectors = le64_to_cpu(bochs.extra.redolog.disk) / 512;
}
/* Limit to 1M entries to avoid unbounded allocation. This is what is
* needed for the largest image that bximage can create (~8 TB). */
s->catalog_size = le32_to_cpu(bochs.catalog);
if (s->catalog_size > 0x100000) {
error_setg(errp, "Catalog size is too large");
return -EFBIG;
}
s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
if (s->catalog_size && s->catalog_bitmap == NULL) {
error_setg(errp, "Could not allocate memory for catalog");
return -ENOMEM;
}
ret = bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
s->catalog_size * 4);
if (ret < 0) {
goto fail;
}
if (le32_to_cpu(bochs.version) == HEADER_V1) {
memcpy(&header_v1, &bochs, sizeof(bochs));
bs->total_sectors = le64_to_cpu(header_v1.extra.redolog.disk) / 512;
} else {
bs->total_sectors = le64_to_cpu(bochs.extra.redolog.disk) / 512;
}
s->catalog_size = le32_to_cpu(bochs.extra.redolog.catalog);
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
if (bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
s->catalog_size * 4) != s->catalog_size * 4)
goto fail;
for (i = 0; i < s->catalog_size; i++)
le32_to_cpus(&s->catalog_bitmap[i]);
s->data_offset = le32_to_cpu(bochs.header) + (s->catalog_size * 4);
s->bitmap_blocks = 1 + (le32_to_cpu(bochs.bitmap) - 1) / 512;
s->extent_blocks = 1 + (le32_to_cpu(bochs.extent) - 1) / 512;
s->bitmap_blocks = 1 + (le32_to_cpu(bochs.extra.redolog.bitmap) - 1) / 512;
s->extent_blocks = 1 + (le32_to_cpu(bochs.extra.redolog.extent) - 1) / 512;
s->extent_size = le32_to_cpu(bochs.extent);
if (s->extent_size < BDRV_SECTOR_SIZE) {
/* bximage actually never creates extents smaller than 4k */
error_setg(errp, "Extent size must be at least 512");
ret = -EINVAL;
goto fail;
} else if (!is_power_of_2(s->extent_size)) {
error_setg(errp, "Extent size %" PRIu32 " is not a power of two",
s->extent_size);
ret = -EINVAL;
goto fail;
} else if (s->extent_size > 0x800000) {
error_setg(errp, "Extent size %" PRIu32 " is too large",
s->extent_size);
ret = -EINVAL;
goto fail;
}
if (s->catalog_size < DIV_ROUND_UP(bs->total_sectors,
s->extent_size / BDRV_SECTOR_SIZE))
{
error_setg(errp, "Catalog size is too small for this disk size");
ret = -EINVAL;
goto fail;
}
s->extent_size = le32_to_cpu(bochs.extra.redolog.extent);
qemu_co_mutex_init(&s->lock);
return 0;
fail:
g_free(s->catalog_bitmap);
return ret;
fail:
return -1;
}
static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
{
BDRVBochsState *s = bs->opaque;
uint64_t offset = sector_num * 512;
uint64_t extent_index, extent_offset, bitmap_offset;
int64_t offset = sector_num * 512;
int64_t extent_index, extent_offset, bitmap_offset;
char bitmap_entry;
int ret;
// seek to sector
extent_index = offset / s->extent_size;
extent_offset = (offset % s->extent_size) / 512;
if (s->catalog_bitmap[extent_index] == 0xffffffff) {
return 0; /* not allocated */
return -1; /* not allocated */
}
bitmap_offset = s->data_offset +
(512 * (uint64_t) s->catalog_bitmap[extent_index] *
(s->extent_blocks + s->bitmap_blocks));
bitmap_offset = s->data_offset + (512 * s->catalog_bitmap[extent_index] *
(s->extent_blocks + s->bitmap_blocks));
/* read in bitmap for current extent */
ret = bdrv_pread(bs->file, bitmap_offset + (extent_offset / 8),
&bitmap_entry, 1);
if (ret < 0) {
return ret;
if (bdrv_pread(bs->file, bitmap_offset + (extent_offset / 8),
&bitmap_entry, 1) != 1) {
return -1;
}
if (!((bitmap_entry >> (extent_offset % 8)) & 1)) {
return 0; /* not allocated */
return -1; /* not allocated */
}
return bitmap_offset + (512 * (s->bitmap_blocks + extent_offset));
@@ -226,16 +195,13 @@ static int bochs_read(BlockDriverState *bs, int64_t sector_num,
while (nb_sectors > 0) {
int64_t block_offset = seek_to_sector(bs, sector_num);
if (block_offset < 0) {
return block_offset;
} else if (block_offset > 0) {
if (block_offset >= 0) {
ret = bdrv_pread(bs->file, block_offset, buf, 512);
if (ret < 0) {
return ret;
if (ret != 512) {
return -1;
}
} else {
} else
memset(buf, 0, 512);
}
nb_sectors--;
sector_num++;
buf += 512;

View File

@@ -22,13 +22,10 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "block_int.h"
#include "module.h"
#include <zlib.h>
/* Maximum compressed block size */
#define MAX_BLOCK_SIZE (64 * 1024 * 1024)
typedef struct BDRVCloopState {
CoMutex lock;
uint32_t block_size;
@@ -56,130 +53,46 @@ static int cloop_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static int cloop_open(BlockDriverState *bs, int flags)
{
BDRVCloopState *s = bs->opaque;
uint32_t offsets_size, max_compressed_block_size = 1, i;
int ret;
bs->read_only = 1;
/* read header */
ret = bdrv_pread(bs->file, 128, &s->block_size, 4);
if (ret < 0) {
return ret;
if (bdrv_pread(bs->file, 128, &s->block_size, 4) < 4) {
goto cloop_close;
}
s->block_size = be32_to_cpu(s->block_size);
if (s->block_size % 512) {
error_setg(errp, "block_size %" PRIu32 " must be a multiple of 512",
s->block_size);
return -EINVAL;
}
if (s->block_size == 0) {
error_setg(errp, "block_size cannot be zero");
return -EINVAL;
}
/* cloop's create_compressed_fs.c warns about block sizes beyond 256 KB but
* we can accept more. Prevent ridiculous values like 4 GB - 1 since we
* need a buffer this big.
*/
if (s->block_size > MAX_BLOCK_SIZE) {
error_setg(errp, "block_size %" PRIu32 " must be %u MB or less",
s->block_size,
MAX_BLOCK_SIZE / (1024 * 1024));
return -EINVAL;
}
ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
if (ret < 0) {
return ret;
if (bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4) < 4) {
goto cloop_close;
}
s->n_blocks = be32_to_cpu(s->n_blocks);
/* read offsets */
if (s->n_blocks > (UINT32_MAX - 1) / sizeof(uint64_t)) {
/* Prevent integer overflow */
error_setg(errp, "n_blocks %" PRIu32 " must be %zu or less",
s->n_blocks,
(UINT32_MAX - 1) / sizeof(uint64_t));
return -EINVAL;
offsets_size = s->n_blocks * sizeof(uint64_t);
s->offsets = g_malloc(offsets_size);
if (bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size) <
offsets_size) {
goto cloop_close;
}
offsets_size = (s->n_blocks + 1) * sizeof(uint64_t);
if (offsets_size > 512 * 1024 * 1024) {
/* Prevent ridiculous offsets_size which causes memory allocation to
* fail or overflows bdrv_pread() size. In practice the 512 MB
* offsets[] limit supports 16 TB images at 256 KB block size.
*/
error_setg(errp, "image requires too many offsets, "
"try increasing block size");
return -EINVAL;
}
s->offsets = g_try_malloc(offsets_size);
if (s->offsets == NULL) {
error_setg(errp, "Could not allocate offsets table");
return -ENOMEM;
}
ret = bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size);
if (ret < 0) {
goto fail;
}
for (i = 0; i < s->n_blocks + 1; i++) {
uint64_t size;
for(i=0;i<s->n_blocks;i++) {
s->offsets[i] = be64_to_cpu(s->offsets[i]);
if (i == 0) {
continue;
}
if (s->offsets[i] < s->offsets[i - 1]) {
error_setg(errp, "offsets not monotonically increasing at "
"index %" PRIu32 ", image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
size = s->offsets[i] - s->offsets[i - 1];
/* Compressed blocks should be smaller than the uncompressed block size
* but maybe compression performed poorly so the compressed block is
* actually bigger. Clamp down on unrealistic values to prevent
* ridiculous s->compressed_block allocation.
*/
if (size > 2 * MAX_BLOCK_SIZE) {
error_setg(errp, "invalid compressed block size at index %" PRIu32
", image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
if (size > max_compressed_block_size) {
max_compressed_block_size = size;
if (i > 0) {
uint32_t size = s->offsets[i] - s->offsets[i - 1];
if (size > max_compressed_block_size) {
max_compressed_block_size = size;
}
}
}
/* initialize zlib engine */
s->compressed_block = g_try_malloc(max_compressed_block_size + 1);
if (s->compressed_block == NULL) {
error_setg(errp, "Could not allocate compressed_block");
ret = -ENOMEM;
goto fail;
}
s->uncompressed_block = g_try_malloc(s->block_size);
if (s->uncompressed_block == NULL) {
error_setg(errp, "Could not allocate uncompressed_block");
ret = -ENOMEM;
goto fail;
}
s->compressed_block = g_malloc(max_compressed_block_size + 1);
s->uncompressed_block = g_malloc(s->block_size);
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
goto cloop_close;
}
s->current_block = s->n_blocks;
@@ -188,11 +101,8 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
qemu_co_mutex_init(&s->lock);
return 0;
fail:
g_free(s->offsets);
g_free(s->compressed_block);
g_free(s->uncompressed_block);
return ret;
cloop_close:
return -1;
}
static inline int cloop_read_block(BlockDriverState *bs, int block_num)
@@ -260,7 +170,9 @@ static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
static void cloop_close(BlockDriverState *bs)
{
BDRVCloopState *s = bs->opaque;
g_free(s->offsets);
if (s->n_blocks > 0) {
g_free(s->offsets);
}
g_free(s->compressed_block);
g_free(s->uncompressed_block);
inflateEnd(&s->zstream);

View File

@@ -1,273 +0,0 @@
/*
* Live block commit
*
* Copyright Red Hat, Inc. 2012
*
* Authors:
* Jeff Cody <jcody@redhat.com>
* Based on stream.c by Stefan Hajnoczi
*
* This work is licensed under the terms of the GNU LGPL, version 2 or later.
* See the COPYING.LIB file in the top-level directory.
*
*/
#include "trace.h"
#include "block/block_int.h"
#include "block/blockjob.h"
#include "qemu/ratelimit.h"
enum {
/*
* Size of data buffer for populating the image file. This should be large
* enough to process multiple clusters in a single call, so that populating
* contiguous regions of the image is efficient.
*/
COMMIT_BUFFER_SIZE = 512 * 1024, /* in bytes */
};
#define SLICE_TIME 100000000ULL /* ns */
typedef struct CommitBlockJob {
BlockJob common;
RateLimit limit;
BlockDriverState *active;
BlockDriverState *top;
BlockDriverState *base;
BlockdevOnError on_error;
int base_flags;
int orig_overlay_flags;
char *backing_file_str;
} CommitBlockJob;
static int coroutine_fn commit_populate(BlockDriverState *bs,
BlockDriverState *base,
int64_t sector_num, int nb_sectors,
void *buf)
{
int ret = 0;
ret = bdrv_read(bs, sector_num, buf, nb_sectors);
if (ret) {
return ret;
}
ret = bdrv_write(base, sector_num, buf, nb_sectors);
if (ret) {
return ret;
}
return 0;
}
typedef struct {
int ret;
} CommitCompleteData;
static void commit_complete(BlockJob *job, void *opaque)
{
CommitBlockJob *s = container_of(job, CommitBlockJob, common);
CommitCompleteData *data = opaque;
BlockDriverState *active = s->active;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
BlockDriverState *overlay_bs;
int ret = data->ret;
if (!block_job_is_cancelled(&s->common) && ret == 0) {
/* success */
ret = bdrv_drop_intermediate(active, top, base, s->backing_file_str);
}
/* restore base open flags here if appropriate (e.g., change the base back
* to r/o). These reopens do not need to be atomic, since we won't abort
* even on failure here */
if (s->base_flags != bdrv_get_flags(base)) {
bdrv_reopen(base, s->base_flags, NULL);
}
overlay_bs = bdrv_find_overlay(active, top);
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
block_job_completed(&s->common, ret);
g_free(data);
}
static void coroutine_fn commit_run(void *opaque)
{
CommitBlockJob *s = opaque;
CommitCompleteData *data;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
int64_t sector_num, end;
int ret = 0;
int n = 0;
void *buf = NULL;
int bytes_written = 0;
int64_t base_len;
ret = s->common.len = bdrv_getlength(top);
if (s->common.len < 0) {
goto out;
}
ret = base_len = bdrv_getlength(base);
if (base_len < 0) {
goto out;
}
if (base_len < s->common.len) {
ret = bdrv_truncate(base, s->common.len);
if (ret) {
goto out;
}
}
end = s->common.len >> BDRV_SECTOR_BITS;
buf = qemu_blockalign(top, COMMIT_BUFFER_SIZE);
for (sector_num = 0; sector_num < end; sector_num += n) {
uint64_t delay_ns = 0;
bool copy;
wait:
/* Note that even when no rate limit is applied we need to yield
* with no pending I/O here so that bdrv_drain_all() returns.
*/
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
/* Copy if allocated above the base */
ret = bdrv_is_allocated_above(top, base, sector_num,
COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
&n);
copy = (ret == 1);
trace_commit_one_iteration(s, sector_num, n, ret);
if (copy) {
if (s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, n);
if (delay_ns > 0) {
goto wait;
}
}
ret = commit_populate(top, base, sector_num, n, buf);
bytes_written += n * BDRV_SECTOR_SIZE;
}
if (ret < 0) {
if (s->on_error == BLOCKDEV_ON_ERROR_STOP ||
s->on_error == BLOCKDEV_ON_ERROR_REPORT||
(s->on_error == BLOCKDEV_ON_ERROR_ENOSPC && ret == -ENOSPC)) {
goto out;
} else {
n = 0;
continue;
}
}
/* Publish progress */
s->common.offset += n * BDRV_SECTOR_SIZE;
}
ret = 0;
out:
qemu_vfree(buf);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&s->common, commit_complete, data);
}
static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
{
CommitBlockJob *s = container_of(job, CommitBlockJob, common);
if (speed < 0) {
error_set(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static const BlockJobDriver commit_job_driver = {
.instance_size = sizeof(CommitBlockJob),
.job_type = BLOCK_JOB_TYPE_COMMIT,
.set_speed = commit_set_speed,
};
void commit_start(BlockDriverState *bs, BlockDriverState *base,
BlockDriverState *top, int64_t speed,
BlockdevOnError on_error, BlockCompletionFunc *cb,
void *opaque, const char *backing_file_str, Error **errp)
{
CommitBlockJob *s;
BlockReopenQueue *reopen_queue = NULL;
int orig_overlay_flags;
int orig_base_flags;
BlockDriverState *overlay_bs;
Error *local_err = NULL;
if ((on_error == BLOCKDEV_ON_ERROR_STOP ||
on_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
error_setg(errp, "Invalid parameter combination");
return;
}
assert(top != bs);
if (top == base) {
error_setg(errp, "Invalid files for merge: top and base are the same");
return;
}
overlay_bs = bdrv_find_overlay(bs, top);
if (overlay_bs == NULL) {
error_setg(errp, "Could not find overlay image for %s:", top->filename);
return;
}
orig_base_flags = bdrv_get_flags(base);
orig_overlay_flags = bdrv_get_flags(overlay_bs);
/* convert base & overlay_bs to r/w, if necessary */
if (!(orig_base_flags & BDRV_O_RDWR)) {
reopen_queue = bdrv_reopen_queue(reopen_queue, base,
orig_base_flags | BDRV_O_RDWR);
}
if (!(orig_overlay_flags & BDRV_O_RDWR)) {
reopen_queue = bdrv_reopen_queue(reopen_queue, overlay_bs,
orig_overlay_flags | BDRV_O_RDWR);
}
if (reopen_queue) {
bdrv_reopen_multiple(reopen_queue, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
return;
}
}
s = block_job_create(&commit_job_driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
s->base = base;
s->top = top;
s->active = bs;
s->base_flags = orig_base_flags;
s->orig_overlay_flags = orig_overlay_flags;
s->backing_file_str = g_strdup(backing_file_str);
s->on_error = on_error;
s->common.co = qemu_coroutine_create(commit_run);
trace_commit_start(bs, base, top, s, s->common.co, opaque);
qemu_coroutine_enter(s->common.co, s);
}

356
block/cow.c Normal file
View File

@@ -0,0 +1,356 @@
/*
* Block driver for the COW format
*
* Copyright (c) 2004 Fabrice Bellard
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block_int.h"
#include "module.h"
/**************************************************************/
/* COW block driver using file system holes */
/* user mode linux compatible COW file */
#define COW_MAGIC 0x4f4f4f4d /* MOOO */
#define COW_VERSION 2
struct cow_header_v2 {
uint32_t magic;
uint32_t version;
char backing_file[1024];
int32_t mtime;
uint64_t size;
uint32_t sectorsize;
};
typedef struct BDRVCowState {
CoMutex lock;
int64_t cow_sectors_offset;
} BDRVCowState;
static int cow_probe(const uint8_t *buf, int buf_size, const char *filename)
{
const struct cow_header_v2 *cow_header = (const void *)buf;
if (buf_size >= sizeof(struct cow_header_v2) &&
be32_to_cpu(cow_header->magic) == COW_MAGIC &&
be32_to_cpu(cow_header->version) == COW_VERSION)
return 100;
else
return 0;
}
static int cow_open(BlockDriverState *bs, int flags)
{
BDRVCowState *s = bs->opaque;
struct cow_header_v2 cow_header;
int bitmap_size;
int64_t size;
int ret;
/* see if it is a cow image */
ret = bdrv_pread(bs->file, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto fail;
}
if (be32_to_cpu(cow_header.magic) != COW_MAGIC) {
ret = -EINVAL;
goto fail;
}
if (be32_to_cpu(cow_header.version) != COW_VERSION) {
char version[64];
snprintf(version, sizeof(version),
"COW version %d", cow_header.version);
qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bs->device_name, "cow", version);
ret = -ENOTSUP;
goto fail;
}
/* cow image found */
size = be64_to_cpu(cow_header.size);
bs->total_sectors = size / 512;
pstrcpy(bs->backing_file, sizeof(bs->backing_file),
cow_header.backing_file);
bitmap_size = ((bs->total_sectors + 7) >> 3) + sizeof(cow_header);
s->cow_sectors_offset = (bitmap_size + 511) & ~511;
qemu_co_mutex_init(&s->lock);
return 0;
fail:
return ret;
}
/*
* XXX(hch): right now these functions are extremely inefficient.
* We should just read the whole bitmap we'll need in one go instead.
*/
static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
{
uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
uint8_t bitmap;
int ret;
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
bitmap |= (1 << (bitnum % 8));
ret = bdrv_pwrite_sync(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
return 0;
}
static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
{
uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
uint8_t bitmap;
int ret;
ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
if (ret < 0) {
return ret;
}
return !!(bitmap & (1 << (bitnum % 8)));
}
/* Return true if first block has been changed (ie. current version is
* in COW file). Set the number of continuous blocks for which that
* is true. */
static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *num_same)
{
int changed;
if (nb_sectors == 0) {
*num_same = nb_sectors;
return 0;
}
changed = is_bit_set(bs, sector_num);
if (changed < 0) {
return 0; /* XXX: how to return I/O errors? */
}
for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
if (is_bit_set(bs, sector_num + *num_same) != changed)
break;
}
return changed;
}
static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
int error = 0;
int i;
for (i = 0; i < nb_sectors; i++) {
error = cow_set_bit(bs, sector_num + i);
if (error) {
break;
}
}
return error;
}
static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret, n;
while (nb_sectors > 0) {
if (bdrv_co_is_allocated(bs, sector_num, nb_sectors, &n)) {
ret = bdrv_pread(bs->file,
s->cow_sectors_offset + sector_num * 512,
buf, n * 512);
if (ret < 0) {
return ret;
}
} else {
if (bs->backing_hd) {
/* read from the base image */
ret = bdrv_read(bs->backing_hd, sector_num, buf, n);
if (ret < 0) {
return ret;
}
} else {
memset(buf, 0, n * 512);
}
}
nb_sectors -= n;
sector_num += n;
buf += n * 512;
}
return 0;
}
static coroutine_fn int cow_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int cow_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
BDRVCowState *s = bs->opaque;
int ret;
ret = bdrv_pwrite(bs->file, s->cow_sectors_offset + sector_num * 512,
buf, nb_sectors * 512);
if (ret < 0) {
return ret;
}
return cow_update_bitmap(bs, sector_num, nb_sectors);
}
static coroutine_fn int cow_co_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCowState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cow_write(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static void cow_close(BlockDriverState *bs)
{
}
static int cow_create(const char *filename, QEMUOptionParameter *options)
{
struct cow_header_v2 cow_header;
struct stat st;
int64_t image_sectors = 0;
const char *image_filename = NULL;
int ret;
BlockDriverState *cow_bs;
/* Read out options */
while (options && options->name) {
if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
image_sectors = options->value.n / 512;
} else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
image_filename = options->value.s;
}
options++;
}
ret = bdrv_create_file(filename, options);
if (ret < 0) {
return ret;
}
ret = bdrv_file_open(&cow_bs, filename, BDRV_O_RDWR);
if (ret < 0) {
return ret;
}
memset(&cow_header, 0, sizeof(cow_header));
cow_header.magic = cpu_to_be32(COW_MAGIC);
cow_header.version = cpu_to_be32(COW_VERSION);
if (image_filename) {
/* Note: if no file, we put a dummy mtime */
cow_header.mtime = cpu_to_be32(0);
if (stat(image_filename, &st) != 0) {
goto mtime_fail;
}
cow_header.mtime = cpu_to_be32(st.st_mtime);
mtime_fail:
pstrcpy(cow_header.backing_file, sizeof(cow_header.backing_file),
image_filename);
}
cow_header.sectorsize = cpu_to_be32(512);
cow_header.size = cpu_to_be64(image_sectors * 512);
ret = bdrv_pwrite(cow_bs, 0, &cow_header, sizeof(cow_header));
if (ret < 0) {
goto exit;
}
/* resize to include at least all the bitmap */
ret = bdrv_truncate(cow_bs,
sizeof(cow_header) + ((image_sectors + 7) >> 3));
if (ret < 0) {
goto exit;
}
exit:
bdrv_delete(cow_bs);
return ret;
}
static QEMUOptionParameter cow_create_options[] = {
{
.name = BLOCK_OPT_SIZE,
.type = OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_BACKING_FILE,
.type = OPT_STRING,
.help = "File name of a base image"
},
{ NULL }
};
static BlockDriver bdrv_cow = {
.format_name = "cow",
.instance_size = sizeof(BDRVCowState),
.bdrv_probe = cow_probe,
.bdrv_open = cow_open,
.bdrv_close = cow_close,
.bdrv_create = cow_create,
.bdrv_read = cow_co_read,
.bdrv_write = cow_co_write,
.bdrv_co_is_allocated = cow_co_is_allocated,
.create_options = cow_create_options,
};
static void bdrv_cow_init(void)
{
bdrv_register(&bdrv_cow);
}
block_init(bdrv_cow_init);

View File

@@ -22,11 +22,10 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qapi/qmp/qbool.h"
#include "block_int.h"
#include <curl/curl.h>
// #define DEBUG_CURL
// #define DEBUG
// #define DEBUG_VERBOSE
#ifdef DEBUG_CURL
@@ -35,51 +34,19 @@
#define DPRINTF(fmt, ...) do { } while (0)
#endif
#if LIBCURL_VERSION_NUM >= 0x071000
/* The multi interface timer callback was introduced in 7.16.0 */
#define NEED_CURL_TIMER_CALLBACK
#define HAVE_SOCKET_ACTION
#endif
#ifndef HAVE_SOCKET_ACTION
/* If curl_multi_socket_action isn't available, define it statically here in
* terms of curl_multi_socket. Note that ev_bitmask will be ignored, which is
* less efficient but still safe. */
static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
curl_socket_t sockfd,
int ev_bitmask,
int *running_handles)
{
return curl_multi_socket(multi_handle, sockfd, running_handles);
}
#define curl_multi_socket_action __curl_multi_socket_action
#endif
#define PROTOCOLS (CURLPROTO_HTTP | CURLPROTO_HTTPS | \
CURLPROTO_FTP | CURLPROTO_FTPS | \
CURLPROTO_TFTP)
#define CURL_NUM_STATES 8
#define CURL_NUM_ACB 8
#define SECTOR_SIZE 512
#define READ_AHEAD_DEFAULT (256 * 1024)
#define CURL_TIMEOUT_DEFAULT 5
#define CURL_TIMEOUT_MAX 10000
#define READ_AHEAD_SIZE (256 * 1024)
#define FIND_RET_NONE 0
#define FIND_RET_OK 1
#define FIND_RET_WAIT 2
#define CURL_BLOCK_OPT_URL "url"
#define CURL_BLOCK_OPT_READAHEAD "readahead"
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
struct BDRVCURLState;
typedef struct CURLAIOCB {
BlockAIOCB common;
BlockDriverAIOCB common;
QEMUBH *bh;
QEMUIOVector *qiov;
@@ -95,7 +62,6 @@ typedef struct CURLState
struct BDRVCURLState *s;
CURLAIOCB *acb[CURL_NUM_ACB];
CURL *curl;
curl_socket_t sock_fd;
char *orig_buf;
size_t buf_start;
size_t buf_off;
@@ -107,78 +73,47 @@ typedef struct CURLState
typedef struct BDRVCURLState {
CURLM *multi;
QEMUTimer timer;
size_t len;
CURLState states[CURL_NUM_STATES];
char *url;
size_t readahead_size;
bool sslverify;
uint64_t timeout;
char *cookie;
bool accept_range;
AioContext *aio_context;
} BDRVCURLState;
static void curl_clean_state(CURLState *s);
static void curl_multi_do(void *arg);
static void curl_multi_read(void *arg);
#ifdef NEED_CURL_TIMER_CALLBACK
static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
{
BDRVCURLState *s = opaque;
DPRINTF("CURL: timer callback timeout_ms %ld\n", timeout_ms);
if (timeout_ms == -1) {
timer_del(&s->timer);
} else {
int64_t timeout_ns = (int64_t)timeout_ms * 1000 * 1000;
timer_mod(&s->timer,
qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + timeout_ns);
}
return 0;
}
#endif
static int curl_aio_flush(void *opaque);
static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
void *userp, void *sp)
void *s, void *sp)
{
BDRVCURLState *s;
CURLState *state = NULL;
curl_easy_getinfo(curl, CURLINFO_PRIVATE, (char **)&state);
state->sock_fd = fd;
s = state->s;
DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
switch (action) {
case CURL_POLL_IN:
aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
NULL, state);
qemu_aio_set_fd_handler(fd, curl_multi_do, NULL, curl_aio_flush, s);
break;
case CURL_POLL_OUT:
aio_set_fd_handler(s->aio_context, fd, NULL, curl_multi_do, state);
qemu_aio_set_fd_handler(fd, NULL, curl_multi_do, curl_aio_flush, s);
break;
case CURL_POLL_INOUT:
aio_set_fd_handler(s->aio_context, fd, curl_multi_read,
curl_multi_do, state);
qemu_aio_set_fd_handler(fd, curl_multi_do, curl_multi_do,
curl_aio_flush, s);
break;
case CURL_POLL_REMOVE:
aio_set_fd_handler(s->aio_context, fd, NULL, NULL, NULL);
qemu_aio_set_fd_handler(fd, NULL, NULL, NULL, NULL);
break;
}
return 0;
}
static size_t curl_header_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
static size_t curl_size_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
{
BDRVCURLState *s = opaque;
CURLState *s = ((CURLState*)opaque);
size_t realsize = size * nmemb;
const char *accept_line = "Accept-Ranges: bytes";
size_t fsize;
if (realsize >= strlen(accept_line)
&& strncmp((char *)ptr, accept_line, strlen(accept_line)) == 0) {
s->accept_range = true;
if(sscanf(ptr, "Content-Length: %zd", &fsize) == 1) {
s->s->len = fsize;
}
return realsize;
@@ -193,13 +128,8 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
DPRINTF("CURL: Just reading %zd bytes\n", realsize);
if (!s || !s->orig_buf)
return 0;
goto read_end;
if (s->buf_off >= s->buf_len) {
/* buffer full, read nothing */
return 0;
}
realsize = MIN(realsize, s->buf_len - s->buf_off);
memcpy(s->orig_buf + s->buf_off, ptr, realsize);
s->buf_off += realsize;
@@ -213,11 +143,12 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
acb->end - acb->start);
acb->common.cb(acb->common.opaque, 0);
qemu_aio_unref(acb);
qemu_aio_release(acb);
s->acb[i] = NULL;
}
}
read_end:
return realsize;
}
@@ -252,8 +183,7 @@ static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
}
// Wait for unfinished chunks
if (state->in_use &&
(start >= state->buf_start) &&
if ((start >= state->buf_start) &&
(start <= buf_fend) &&
(end >= state->buf_start) &&
(end <= buf_fend))
@@ -275,90 +205,64 @@ static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
return FIND_RET_NONE;
}
static void curl_multi_check_completion(BDRVCURLState *s)
static void curl_multi_do(void *arg)
{
BDRVCURLState *s = (BDRVCURLState *)arg;
int running;
int r;
int msgs_in_queue;
if (!s->multi)
return;
do {
r = curl_multi_socket_all(s->multi, &running);
} while(r == CURLM_CALL_MULTI_PERFORM);
/* Try to find done transfers, so we can free the easy
* handle again. */
for (;;) {
do {
CURLMsg *msg;
msg = curl_multi_info_read(s->multi, &msgs_in_queue);
/* Quit when there are no more completions */
if (!msg)
break;
if (msg->msg == CURLMSG_DONE) {
CURLState *state = NULL;
curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE,
(char **)&state);
/* ACBs for successful messages get completed in curl_read_cb */
if (msg->data.result != CURLE_OK) {
int i;
for (i = 0; i < CURL_NUM_ACB; i++) {
CURLAIOCB *acb = state->acb[i];
if (acb == NULL) {
continue;
}
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
state->acb[i] = NULL;
}
}
curl_clean_state(state);
if (msg->msg == CURLMSG_NONE)
break;
switch (msg->msg) {
case CURLMSG_DONE:
{
CURLState *state = NULL;
curl_easy_getinfo(msg->easy_handle, CURLINFO_PRIVATE, (char**)&state);
/* ACBs for successful messages get completed in curl_read_cb */
if (msg->data.result != CURLE_OK) {
int i;
for (i = 0; i < CURL_NUM_ACB; i++) {
CURLAIOCB *acb = state->acb[i];
if (acb == NULL) {
continue;
}
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_release(acb);
state->acb[i] = NULL;
}
}
curl_clean_state(state);
break;
}
default:
msgs_in_queue = 0;
break;
}
}
} while(msgs_in_queue);
}
static void curl_multi_do(void *arg)
{
CURLState *s = (CURLState *)arg;
int running;
int r;
if (!s->s->multi) {
return;
}
do {
r = curl_multi_socket_action(s->s->multi, s->sock_fd, 0, &running);
} while(r == CURLM_CALL_MULTI_PERFORM);
}
static void curl_multi_read(void *arg)
{
CURLState *s = (CURLState *)arg;
curl_multi_do(arg);
curl_multi_check_completion(s->s);
}
static void curl_multi_timeout_do(void *arg)
{
#ifdef NEED_CURL_TIMER_CALLBACK
BDRVCURLState *s = (BDRVCURLState *)arg;
int running;
if (!s->multi) {
return;
}
curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
curl_multi_check_completion(s);
#else
abort();
#endif
}
static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
static CURLState *curl_init_state(BDRVCURLState *s)
{
CURLState *state = NULL;
int i, j;
@@ -376,47 +280,33 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
break;
}
if (!state) {
aio_poll(bdrv_get_aio_context(bs), true);
g_usleep(100);
curl_multi_do(s);
}
} while(!state);
if (!state->curl) {
state->curl = curl_easy_init();
if (!state->curl) {
return NULL;
}
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
(long) s->sslverify);
if (s->cookie) {
curl_easy_setopt(state->curl, CURLOPT_COOKIE, s->cookie);
}
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, (long)s->timeout);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION,
(void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_WRITEDATA, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_PRIVATE, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_AUTOREFERER, 1);
curl_easy_setopt(state->curl, CURLOPT_FOLLOWLOCATION, 1);
curl_easy_setopt(state->curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(state->curl, CURLOPT_ERRORBUFFER, state->errmsg);
curl_easy_setopt(state->curl, CURLOPT_FAILONERROR, 1);
if (state->curl)
goto has_curl;
/* Restrict supported protocols to avoid security issues in the more
* obscure protocols. For example, do not allow POP3/SMTP/IMAP see
* CVE-2013-0249.
*
* Restricting protocols is only supported from 7.19.4 upwards.
*/
#if LIBCURL_VERSION_NUM >= 0x071304
curl_easy_setopt(state->curl, CURLOPT_PROTOCOLS, PROTOCOLS);
curl_easy_setopt(state->curl, CURLOPT_REDIR_PROTOCOLS, PROTOCOLS);
#endif
state->curl = curl_easy_init();
if (!state->curl)
return NULL;
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_TIMEOUT, 5);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION, (void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_WRITEDATA, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_PRIVATE, (void *)state);
curl_easy_setopt(state->curl, CURLOPT_AUTOREFERER, 1);
curl_easy_setopt(state->curl, CURLOPT_FOLLOWLOCATION, 1);
curl_easy_setopt(state->curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(state->curl, CURLOPT_ERRORBUFFER, state->errmsg);
curl_easy_setopt(state->curl, CURLOPT_FAILONERROR, 1);
#ifdef DEBUG_VERBOSE
curl_easy_setopt(state->curl, CURLOPT_VERBOSE, 1);
curl_easy_setopt(state->curl, CURLOPT_VERBOSE, 1);
#endif
}
has_curl:
state->s = s;
@@ -430,136 +320,52 @@ static void curl_clean_state(CURLState *s)
s->in_use = 0;
}
static void curl_parse_filename(const char *filename, QDict *options,
Error **errp)
{
qdict_put(options, CURL_BLOCK_OPT_URL, qstring_from_str(filename));
}
static void curl_detach_aio_context(BlockDriverState *bs)
{
BDRVCURLState *s = bs->opaque;
int i;
for (i = 0; i < CURL_NUM_STATES; i++) {
if (s->states[i].in_use) {
curl_clean_state(&s->states[i]);
}
if (s->states[i].curl) {
curl_easy_cleanup(s->states[i].curl);
s->states[i].curl = NULL;
}
g_free(s->states[i].orig_buf);
s->states[i].orig_buf = NULL;
}
if (s->multi) {
curl_multi_cleanup(s->multi);
s->multi = NULL;
}
timer_del(&s->timer);
}
static void curl_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVCURLState *s = bs->opaque;
aio_timer_init(new_context, &s->timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
curl_multi_timeout_do, s);
assert(!s->multi);
s->multi = curl_multi_init();
s->aio_context = new_context;
curl_multi_setopt(s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb);
#ifdef NEED_CURL_TIMER_CALLBACK
curl_multi_setopt(s->multi, CURLMOPT_TIMERDATA, s);
curl_multi_setopt(s->multi, CURLMOPT_TIMERFUNCTION, curl_timer_cb);
#endif
}
static QemuOptsList runtime_opts = {
.name = "curl",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = CURL_BLOCK_OPT_URL,
.type = QEMU_OPT_STRING,
.help = "URL to open",
},
{
.name = CURL_BLOCK_OPT_READAHEAD,
.type = QEMU_OPT_SIZE,
.help = "Readahead size",
},
{
.name = CURL_BLOCK_OPT_SSLVERIFY,
.type = QEMU_OPT_BOOL,
.help = "Verify SSL certificate"
},
{
.name = CURL_BLOCK_OPT_TIMEOUT,
.type = QEMU_OPT_NUMBER,
.help = "Curl timeout"
},
{
.name = CURL_BLOCK_OPT_COOKIE,
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{ /* end of list */ }
},
};
static int curl_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static int curl_open(BlockDriverState *bs, const char *filename, int flags)
{
BDRVCURLState *s = bs->opaque;
CURLState *state = NULL;
QemuOpts *opts;
Error *local_err = NULL;
const char *file;
const char *cookie;
double d;
#define RA_OPTSTR ":readahead="
char *file;
char *ra;
const char *ra_val;
int parse_state = 0;
static int inited = 0;
if (flags & BDRV_O_RDWR) {
error_setg(errp, "curl block device does not support writes");
return -EROFS;
file = g_strdup(filename);
s->readahead_size = READ_AHEAD_SIZE;
/* Parse a trailing ":readahead=#:" param, if present. */
ra = file + strlen(file) - 1;
while (ra >= file) {
if (parse_state == 0) {
if (*ra == ':')
parse_state++;
else
break;
} else if (parse_state == 1) {
if (*ra > '9' || *ra < '0') {
char *opt_start = ra - strlen(RA_OPTSTR) + 1;
if (opt_start > file &&
strncmp(opt_start, RA_OPTSTR, strlen(RA_OPTSTR)) == 0) {
ra_val = ra + 1;
ra -= strlen(RA_OPTSTR) - 1;
*ra = '\0';
s->readahead_size = atoi(ra_val);
break;
} else {
break;
}
}
}
ra--;
}
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto out_noclean;
}
s->readahead_size = qemu_opt_get_size(opts, CURL_BLOCK_OPT_READAHEAD,
READ_AHEAD_DEFAULT);
if ((s->readahead_size & 0x1ff) != 0) {
error_setg(errp, "HTTP_READAHEAD_SIZE %zd is not a multiple of 512",
s->readahead_size);
goto out_noclean;
}
s->timeout = qemu_opt_get_number(opts, CURL_BLOCK_OPT_TIMEOUT,
CURL_TIMEOUT_DEFAULT);
if (s->timeout > CURL_TIMEOUT_MAX) {
error_setg(errp, "timeout parameter is too large or negative");
goto out_noclean;
}
s->sslverify = qemu_opt_get_bool(opts, CURL_BLOCK_OPT_SSLVERIFY, true);
cookie = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE);
s->cookie = g_strdup(cookie);
file = qemu_opt_get(opts, CURL_BLOCK_OPT_URL);
if (file == NULL) {
error_setg(errp, "curl block driver requires an 'url' option");
fprintf(stderr, "HTTP_READAHEAD_SIZE %zd is not a multiple of 512\n",
s->readahead_size);
goto out_noclean;
}
@@ -569,64 +375,78 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
}
DPRINTF("CURL: Opening %s\n", file);
s->aio_context = bdrv_get_aio_context(bs);
s->url = g_strdup(file);
state = curl_init_state(bs, s);
s->url = file;
state = curl_init_state(s);
if (!state)
goto out_noclean;
// Get file size
s->accept_range = false;
curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1);
curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION,
curl_header_cb);
curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION, (void *)curl_size_cb);
if (curl_easy_perform(state->curl))
goto out;
curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &d);
curl_easy_setopt(state->curl, CURLOPT_WRITEFUNCTION, (void *)curl_read_cb);
curl_easy_setopt(state->curl, CURLOPT_NOBODY, 0);
if (d)
s->len = (size_t)d;
else if(!s->len)
goto out;
if ((!strncasecmp(s->url, "http://", strlen("http://"))
|| !strncasecmp(s->url, "https://", strlen("https://")))
&& !s->accept_range) {
pstrcpy(state->errmsg, CURL_ERROR_SIZE,
"Server does not support 'range' (byte ranges).");
goto out;
}
DPRINTF("CURL: Size = %zd\n", s->len);
curl_clean_state(state);
curl_easy_cleanup(state->curl);
state->curl = NULL;
curl_attach_aio_context(bs, bdrv_get_aio_context(bs));
// Now we know the file exists and its size, so let's
// initialize the multi interface!
s->multi = curl_multi_init();
curl_multi_setopt( s->multi, CURLMOPT_SOCKETDATA, s);
curl_multi_setopt( s->multi, CURLMOPT_SOCKETFUNCTION, curl_sock_cb );
curl_multi_do(s);
qemu_opts_del(opts);
return 0;
out:
error_setg(errp, "CURL: Error opening file: %s", state->errmsg);
fprintf(stderr, "CURL: Error opening file: %s\n", state->errmsg);
curl_easy_cleanup(state->curl);
state->curl = NULL;
out_noclean:
g_free(s->cookie);
g_free(s->url);
qemu_opts_del(opts);
g_free(file);
return -EINVAL;
}
static const AIOCBInfo curl_aiocb_info = {
static int curl_aio_flush(void *opaque)
{
BDRVCURLState *s = opaque;
int i, j;
for (i=0; i < CURL_NUM_STATES; i++) {
for(j=0; j < CURL_NUM_ACB; j++) {
if (s->states[i].acb[j]) {
return 1;
}
}
}
return 0;
}
static void curl_aio_cancel(BlockDriverAIOCB *blockacb)
{
// Do we have to implement canceling? Seems to work without...
}
static AIOPool curl_aio_pool = {
.aiocb_size = sizeof(CURLAIOCB),
.cancel = curl_aio_cancel,
};
static void curl_readv_bh_cb(void *p)
{
CURLState *state;
int running;
CURLAIOCB *acb = p;
BDRVCURLState *s = acb->common.bs->opaque;
@@ -641,7 +461,7 @@ static void curl_readv_bh_cb(void *p)
// we can just call the callback and be done.
switch (curl_find_buf(s, start, acb->nb_sectors * SECTOR_SIZE, acb)) {
case FIND_RET_OK:
qemu_aio_unref(acb);
qemu_aio_release(acb);
// fall through
case FIND_RET_WAIT:
return;
@@ -650,10 +470,10 @@ static void curl_readv_bh_cb(void *p)
}
// No cache found, so let's start a new request
state = curl_init_state(acb->common.bs, s);
state = curl_init_state(s);
if (!state) {
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
qemu_aio_release(acb);
return;
}
@@ -661,17 +481,12 @@ static void curl_readv_bh_cb(void *p)
acb->end = (acb->nb_sectors * SECTOR_SIZE);
state->buf_off = 0;
g_free(state->orig_buf);
if (state->orig_buf)
g_free(state->orig_buf);
state->buf_start = start;
state->buf_len = acb->end + s->readahead_size;
end = MIN(start + state->buf_len, s->len) - 1;
state->orig_buf = g_try_malloc(state->buf_len);
if (state->buf_len && state->orig_buf == NULL) {
curl_clean_state(state);
acb->common.cb(acb->common.opaque, -ENOMEM);
qemu_aio_unref(acb);
return;
}
state->orig_buf = g_malloc(state->buf_len);
state->acb[0] = acb;
snprintf(state->range, 127, "%zd-%zd", start, end);
@@ -680,24 +495,29 @@ static void curl_readv_bh_cb(void *p)
curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
curl_multi_add_handle(s->multi, state->curl);
curl_multi_do(s);
/* Tell curl it needs to kick things off */
curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
}
static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
static BlockDriverAIOCB *curl_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
BlockDriverCompletionFunc *cb, void *opaque)
{
CURLAIOCB *acb;
acb = qemu_aio_get(&curl_aiocb_info, bs, cb, opaque);
acb = qemu_aio_get(&curl_aio_pool, bs, cb, opaque);
acb->qiov = qiov;
acb->sector_num = sector_num;
acb->nb_sectors = nb_sectors;
acb->bh = aio_bh_new(bdrv_get_aio_context(bs), curl_readv_bh_cb, acb);
acb->bh = qemu_bh_new(curl_readv_bh_cb, acb);
if (!acb->bh) {
DPRINTF("CURL: qemu_bh_new failed\n");
return NULL;
}
qemu_bh_schedule(acb->bh);
return &acb->common;
}
@@ -705,11 +525,23 @@ static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
static void curl_close(BlockDriverState *bs)
{
BDRVCURLState *s = bs->opaque;
int i;
DPRINTF("CURL: Close\n");
curl_detach_aio_context(bs);
g_free(s->cookie);
for (i=0; i<CURL_NUM_STATES; i++) {
if (s->states[i].in_use)
curl_clean_state(&s->states[i]);
if (s->states[i].curl) {
curl_easy_cleanup(s->states[i].curl);
s->states[i].curl = NULL;
}
if (s->states[i].orig_buf) {
g_free(s->states[i].orig_buf);
s->states[i].orig_buf = NULL;
}
}
if (s->multi)
curl_multi_cleanup(s->multi);
g_free(s->url);
}
@@ -720,83 +552,63 @@ static int64_t curl_getlength(BlockDriverState *bs)
}
static BlockDriver bdrv_http = {
.format_name = "http",
.protocol_name = "http",
.format_name = "http",
.protocol_name = "http",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_https = {
.format_name = "https",
.protocol_name = "https",
.format_name = "https",
.protocol_name = "https",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_ftp = {
.format_name = "ftp",
.protocol_name = "ftp",
.format_name = "ftp",
.protocol_name = "ftp",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_ftps = {
.format_name = "ftps",
.protocol_name = "ftps",
.format_name = "ftps",
.protocol_name = "ftps",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static BlockDriver bdrv_tftp = {
.format_name = "tftp",
.protocol_name = "tftp",
.format_name = "tftp",
.protocol_name = "tftp",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.instance_size = sizeof(BDRVCURLState),
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
.bdrv_aio_readv = curl_aio_readv,
};
static void curl_block_init(void)

View File

@@ -22,22 +22,10 @@
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/bswap.h"
#include "qemu/module.h"
#include "block_int.h"
#include "bswap.h"
#include "module.h"
#include <zlib.h>
#ifdef CONFIG_BZIP2
#include <bzlib.h>
#endif
#include <glib.h>
enum {
/* Limit chunk sizes to prevent unreasonable amounts of memory being used
* or truncating when converting to 32-bit types
*/
DMG_LENGTHS_MAX = 64 * 1024 * 1024, /* 64 MB */
DMG_SECTORCOUNTS_MAX = DMG_LENGTHS_MAX / 512,
};
typedef struct BDRVDMGState {
CoMutex lock;
@@ -59,599 +47,221 @@ typedef struct BDRVDMGState {
uint8_t *compressed_chunk;
uint8_t *uncompressed_chunk;
z_stream zstream;
#ifdef CONFIG_BZIP2
bz_stream bzstream;
#endif
} BDRVDMGState;
static int dmg_probe(const uint8_t *buf, int buf_size, const char *filename)
{
int len;
if (!filename) {
return 0;
}
len = strlen(filename);
if (len > 4 && !strcmp(filename + len - 4, ".dmg")) {
return 2;
}
int len=strlen(filename);
if(len>4 && !strcmp(filename+len-4,".dmg"))
return 2;
return 0;
}
static int read_uint64(BlockDriverState *bs, int64_t offset, uint64_t *result)
static off_t read_off(BlockDriverState *bs, int64_t offset)
{
uint64_t buffer;
int ret;
ret = bdrv_pread(bs->file, offset, &buffer, 8);
if (ret < 0) {
return ret;
}
*result = be64_to_cpu(buffer);
return 0;
uint64_t buffer;
if (bdrv_pread(bs->file, offset, &buffer, 8) < 8)
return 0;
return be64_to_cpu(buffer);
}
static int read_uint32(BlockDriverState *bs, int64_t offset, uint32_t *result)
static off_t read_uint32(BlockDriverState *bs, int64_t offset)
{
uint32_t buffer;
int ret;
ret = bdrv_pread(bs->file, offset, &buffer, 4);
if (ret < 0) {
return ret;
}
*result = be32_to_cpu(buffer);
return 0;
uint32_t buffer;
if (bdrv_pread(bs->file, offset, &buffer, 4) < 4)
return 0;
return be32_to_cpu(buffer);
}
static inline uint64_t buff_read_uint64(const uint8_t *buffer, int64_t offset)
{
return be64_to_cpu(*(uint64_t *)&buffer[offset]);
}
static inline uint32_t buff_read_uint32(const uint8_t *buffer, int64_t offset)
{
return be32_to_cpu(*(uint32_t *)&buffer[offset]);
}
/* Increase max chunk sizes, if necessary. This function is used to calculate
* the buffer sizes needed for compressed/uncompressed chunk I/O.
*/
static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
uint32_t *max_compressed_size,
uint32_t *max_sectors_per_chunk)
{
uint32_t compressed_size = 0;
uint32_t uncompressed_sectors = 0;
switch (s->types[chunk]) {
case 0x80000005: /* zlib compressed */
case 0x80000006: /* bzip2 compressed */
compressed_size = s->lengths[chunk];
uncompressed_sectors = s->sectorcounts[chunk];
break;
case 1: /* copy */
uncompressed_sectors = (s->lengths[chunk] + 511) / 512;
break;
case 2: /* zero */
/* as the all-zeroes block may be large, it is treated specially: the
* sector is not copied from a large buffer, a simple memset is used
* instead. Therefore uncompressed_sectors does not need to be set. */
break;
}
if (compressed_size > *max_compressed_size) {
*max_compressed_size = compressed_size;
}
if (uncompressed_sectors > *max_sectors_per_chunk) {
*max_sectors_per_chunk = uncompressed_sectors;
}
}
static int64_t dmg_find_koly_offset(BlockDriverState *file_bs, Error **errp)
{
int64_t length;
int64_t offset = 0;
uint8_t buffer[515];
int i, ret;
/* bdrv_getlength returns a multiple of block size (512), rounded up. Since
* dmg images can have odd sizes, try to look for the "koly" magic which
* marks the begin of the UDIF trailer (512 bytes). This magic can be found
* in the last 511 bytes of the second-last sector or the first 4 bytes of
* the last sector (search space: 515 bytes) */
length = bdrv_getlength(file_bs);
if (length < 0) {
error_setg_errno(errp, -length,
"Failed to get file size while reading UDIF trailer");
return length;
} else if (length < 512) {
error_setg(errp, "dmg file must be at least 512 bytes long");
return -EINVAL;
}
if (length > 511 + 512) {
offset = length - 511 - 512;
}
length = length < 515 ? length : 515;
ret = bdrv_pread(file_bs, offset, buffer, length);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed while reading UDIF trailer");
return ret;
}
for (i = 0; i < length - 3; i++) {
if (buffer[i] == 'k' && buffer[i+1] == 'o' &&
buffer[i+2] == 'l' && buffer[i+3] == 'y') {
return offset + i;
}
}
error_setg(errp, "Could not locate UDIF trailer in dmg file");
return -EINVAL;
}
/* used when building the sector table */
typedef struct DmgHeaderState {
/* used internally by dmg_read_mish_block to remember offsets of blocks
* across calls */
uint64_t data_fork_offset;
/* exported for dmg_open */
uint32_t max_compressed_size;
uint32_t max_sectors_per_chunk;
} DmgHeaderState;
static bool dmg_is_known_block_type(uint32_t entry_type)
{
switch (entry_type) {
case 0x00000001: /* uncompressed */
case 0x00000002: /* zeroes */
case 0x80000005: /* zlib */
#ifdef CONFIG_BZIP2
case 0x80000006: /* bzip2 */
#endif
return true;
default:
return false;
}
}
static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
uint8_t *buffer, uint32_t count)
{
uint32_t type, i;
int ret;
size_t new_size;
uint32_t chunk_count;
int64_t offset = 0;
uint64_t data_offset;
uint64_t in_offset = ds->data_fork_offset;
uint64_t out_offset;
type = buff_read_uint32(buffer, offset);
/* skip data that is not a valid MISH block (invalid magic or too small) */
if (type != 0x6d697368 || count < 244) {
/* assume success for now */
return 0;
}
/* chunk offsets are relative to this sector number */
out_offset = buff_read_uint64(buffer, offset + 8);
/* location in data fork for (compressed) blob (in bytes) */
data_offset = buff_read_uint64(buffer, offset + 0x18);
in_offset += data_offset;
/* move to begin of chunk entries */
offset += 204;
chunk_count = (count - 204) / 40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size / 2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
for (i = s->n_chunks; i < s->n_chunks + chunk_count; i++) {
s->types[i] = buff_read_uint32(buffer, offset);
if (!dmg_is_known_block_type(s->types[i])) {
chunk_count--;
i--;
offset += 40;
continue;
}
/* sector number */
s->sectors[i] = buff_read_uint64(buffer, offset + 8);
s->sectors[i] += out_offset;
/* sector count */
s->sectorcounts[i] = buff_read_uint64(buffer, offset + 0x10);
/* all-zeroes sector (type 2) does not need to be "uncompressed" and can
* therefore be unbounded. */
if (s->types[i] != 2 && s->sectorcounts[i] > DMG_SECTORCOUNTS_MAX) {
error_report("sector count %" PRIu64 " for chunk %" PRIu32
" is larger than max (%u)",
s->sectorcounts[i], i, DMG_SECTORCOUNTS_MAX);
ret = -EINVAL;
goto fail;
}
/* offset in (compressed) data fork */
s->offsets[i] = buff_read_uint64(buffer, offset + 0x18);
s->offsets[i] += in_offset;
/* length in (compressed) data fork */
s->lengths[i] = buff_read_uint64(buffer, offset + 0x20);
if (s->lengths[i] > DMG_LENGTHS_MAX) {
error_report("length %" PRIu64 " for chunk %" PRIu32
" is larger than max (%u)",
s->lengths[i], i, DMG_LENGTHS_MAX);
ret = -EINVAL;
goto fail;
}
update_max_chunk_size(s, i, &ds->max_compressed_size,
&ds->max_sectors_per_chunk);
offset += 40;
}
s->n_chunks += chunk_count;
return 0;
fail:
return ret;
}
static int dmg_read_resource_fork(BlockDriverState *bs, DmgHeaderState *ds,
uint64_t info_begin, uint64_t info_length)
static int dmg_open(BlockDriverState *bs, int flags)
{
BDRVDMGState *s = bs->opaque;
int ret;
uint32_t count, rsrc_data_offset;
uint8_t *buffer = NULL;
uint64_t info_end;
uint64_t offset;
/* read offset from begin of resource fork (info_begin) to resource data */
ret = read_uint32(bs, info_begin, &rsrc_data_offset);
if (ret < 0) {
goto fail;
} else if (rsrc_data_offset > info_length) {
ret = -EINVAL;
goto fail;
}
/* read length of resource data */
ret = read_uint32(bs, info_begin + 8, &count);
if (ret < 0) {
goto fail;
} else if (count == 0 || rsrc_data_offset + count > info_length) {
ret = -EINVAL;
goto fail;
}
/* begin of resource data (consisting of one or more resources) */
offset = info_begin + rsrc_data_offset;
/* end of resource data (there is possibly a following resource map
* which will be ignored). */
info_end = offset + count;
/* read offsets (mish blocks) from one or more resources in resource data */
while (offset < info_end) {
/* size of following resource */
ret = read_uint32(bs, offset, &count);
if (ret < 0) {
goto fail;
} else if (count == 0 || count > info_end - offset) {
ret = -EINVAL;
goto fail;
}
offset += 4;
buffer = g_realloc(buffer, count);
ret = bdrv_pread(bs->file, offset, buffer, count);
if (ret < 0) {
goto fail;
}
ret = dmg_read_mish_block(s, ds, buffer, count);
if (ret < 0) {
goto fail;
}
/* advance offset by size of resource */
offset += count;
}
ret = 0;
fail:
g_free(buffer);
return ret;
}
static int dmg_read_plist_xml(BlockDriverState *bs, DmgHeaderState *ds,
uint64_t info_begin, uint64_t info_length)
{
BDRVDMGState *s = bs->opaque;
int ret;
uint8_t *buffer = NULL;
char *data_begin, *data_end;
/* Have at least some length to avoid NULL for g_malloc. Attempt to set a
* safe upper cap on the data length. A test sample had a XML length of
* about 1 MiB. */
if (info_length == 0 || info_length > 16 * 1024 * 1024) {
ret = -EINVAL;
goto fail;
}
buffer = g_malloc(info_length + 1);
buffer[info_length] = '\0';
ret = bdrv_pread(bs->file, info_begin, buffer, info_length);
if (ret != info_length) {
ret = -EINVAL;
goto fail;
}
/* look for <data>...</data>. The data is 284 (0x11c) bytes after base64
* decode. The actual data element has 431 (0x1af) bytes which includes tabs
* and line feeds. */
data_end = (char *)buffer;
while ((data_begin = strstr(data_end, "<data>")) != NULL) {
guchar *mish;
gsize out_len = 0;
data_begin += 6;
data_end = strstr(data_begin, "</data>");
/* malformed XML? */
if (data_end == NULL) {
ret = -EINVAL;
goto fail;
}
*data_end++ = '\0';
mish = g_base64_decode(data_begin, &out_len);
ret = dmg_read_mish_block(s, ds, mish, (uint32_t)out_len);
g_free(mish);
if (ret < 0) {
goto fail;
}
}
ret = 0;
fail:
g_free(buffer);
return ret;
}
static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVDMGState *s = bs->opaque;
DmgHeaderState ds;
uint64_t rsrc_fork_offset, rsrc_fork_length;
uint64_t plist_xml_offset, plist_xml_length;
off_t info_begin,info_end,last_in_offset,last_out_offset;
uint32_t count;
uint32_t max_compressed_size=1,max_sectors_per_chunk=1,i;
int64_t offset;
int ret;
bs->read_only = 1;
s->n_chunks = 0;
s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
/* used by dmg_read_mish_block to keep track of the current I/O position */
ds.data_fork_offset = 0;
ds.max_compressed_size = 1;
ds.max_sectors_per_chunk = 1;
/* locate the UDIF trailer */
offset = dmg_find_koly_offset(bs->file, errp);
/* read offset of info blocks */
offset = bdrv_getlength(bs->file);
if (offset < 0) {
ret = offset;
goto fail;
}
offset -= 0x1d8;
info_begin = read_off(bs, offset);
if (info_begin == 0) {
goto fail;
}
if (read_uint32(bs, info_begin) != 0x100) {
goto fail;
}
/* offset of data fork (DataForkOffset) */
ret = read_uint64(bs, offset + 0x18, &ds.data_fork_offset);
if (ret < 0) {
goto fail;
} else if (ds.data_fork_offset > offset) {
ret = -EINVAL;
count = read_uint32(bs, info_begin + 4);
if (count == 0) {
goto fail;
}
info_end = info_begin + count;
/* offset of resource fork (RsrcForkOffset) */
ret = read_uint64(bs, offset + 0x28, &rsrc_fork_offset);
if (ret < 0) {
goto fail;
}
ret = read_uint64(bs, offset + 0x30, &rsrc_fork_length);
if (ret < 0) {
goto fail;
}
if (rsrc_fork_offset >= offset ||
rsrc_fork_length > offset - rsrc_fork_offset) {
ret = -EINVAL;
goto fail;
}
/* offset of property list (XMLOffset) */
ret = read_uint64(bs, offset + 0xd8, &plist_xml_offset);
if (ret < 0) {
goto fail;
}
ret = read_uint64(bs, offset + 0xe0, &plist_xml_length);
if (ret < 0) {
goto fail;
}
if (plist_xml_offset >= offset ||
plist_xml_length > offset - plist_xml_offset) {
ret = -EINVAL;
goto fail;
}
ret = read_uint64(bs, offset + 0x1ec, (uint64_t *)&bs->total_sectors);
if (ret < 0) {
goto fail;
}
if (bs->total_sectors < 0) {
ret = -EINVAL;
goto fail;
}
if (rsrc_fork_length != 0) {
ret = dmg_read_resource_fork(bs, &ds,
rsrc_fork_offset, rsrc_fork_length);
if (ret < 0) {
goto fail;
}
} else if (plist_xml_length != 0) {
ret = dmg_read_plist_xml(bs, &ds, plist_xml_offset, plist_xml_length);
if (ret < 0) {
goto fail;
}
} else {
ret = -EINVAL;
goto fail;
offset = info_begin + 0x100;
/* read offsets */
last_in_offset = last_out_offset = 0;
while (offset < info_end) {
uint32_t type;
count = read_uint32(bs, offset);
if(count==0)
goto fail;
offset += 4;
type = read_uint32(bs, offset);
if (type == 0x6d697368 && count >= 244) {
int new_size, chunk_count;
offset += 4;
offset += 200;
chunk_count = (count-204)/40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size/2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
for(i=s->n_chunks;i<s->n_chunks+chunk_count;i++) {
s->types[i] = read_uint32(bs, offset);
offset += 4;
if(s->types[i]!=0x80000005 && s->types[i]!=1 && s->types[i]!=2) {
if(s->types[i]==0xffffffff) {
last_in_offset = s->offsets[i-1]+s->lengths[i-1];
last_out_offset = s->sectors[i-1]+s->sectorcounts[i-1];
}
chunk_count--;
i--;
offset += 36;
continue;
}
offset += 4;
s->sectors[i] = last_out_offset+read_off(bs, offset);
offset += 8;
s->sectorcounts[i] = read_off(bs, offset);
offset += 8;
s->offsets[i] = last_in_offset+read_off(bs, offset);
offset += 8;
s->lengths[i] = read_off(bs, offset);
offset += 8;
if(s->lengths[i]>max_compressed_size)
max_compressed_size = s->lengths[i];
if(s->sectorcounts[i]>max_sectors_per_chunk)
max_sectors_per_chunk = s->sectorcounts[i];
}
s->n_chunks+=chunk_count;
}
}
/* initialize zlib engine */
s->compressed_chunk = qemu_try_blockalign(bs->file,
ds.max_compressed_size + 1);
s->uncompressed_chunk = qemu_try_blockalign(bs->file,
512 * ds.max_sectors_per_chunk);
if (s->compressed_chunk == NULL || s->uncompressed_chunk == NULL) {
ret = -ENOMEM;
goto fail;
}
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
}
s->compressed_chunk = g_malloc(max_compressed_size+1);
s->uncompressed_chunk = g_malloc(512*max_sectors_per_chunk);
if(inflateInit(&s->zstream) != Z_OK)
goto fail;
s->current_chunk = s->n_chunks;
qemu_co_mutex_init(&s->lock);
return 0;
fail:
g_free(s->types);
g_free(s->offsets);
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
return ret;
return -1;
}
static inline int is_sector_in_chunk(BDRVDMGState* s,
uint32_t chunk_num, uint64_t sector_num)
uint32_t chunk_num,int sector_num)
{
if (chunk_num >= s->n_chunks || s->sectors[chunk_num] > sector_num ||
s->sectors[chunk_num] + s->sectorcounts[chunk_num] <= sector_num) {
return 0;
} else {
return -1;
}
if(chunk_num>=s->n_chunks || s->sectors[chunk_num]>sector_num ||
s->sectors[chunk_num]+s->sectorcounts[chunk_num]<=sector_num)
return 0;
else
return -1;
}
static inline uint32_t search_chunk(BDRVDMGState *s, uint64_t sector_num)
static inline uint32_t search_chunk(BDRVDMGState* s,int sector_num)
{
/* binary search */
uint32_t chunk1 = 0, chunk2 = s->n_chunks, chunk3;
while (chunk1 != chunk2) {
chunk3 = (chunk1 + chunk2) / 2;
if (s->sectors[chunk3] > sector_num) {
chunk2 = chunk3;
} else if (s->sectors[chunk3] + s->sectorcounts[chunk3] > sector_num) {
return chunk3;
} else {
chunk1 = chunk3;
}
uint32_t chunk1=0,chunk2=s->n_chunks,chunk3;
while(chunk1!=chunk2) {
chunk3 = (chunk1+chunk2)/2;
if(s->sectors[chunk3]>sector_num)
chunk2 = chunk3;
else if(s->sectors[chunk3]+s->sectorcounts[chunk3]>sector_num)
return chunk3;
else
chunk1 = chunk3;
}
return s->n_chunks; /* error */
}
static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
static inline int dmg_read_chunk(BlockDriverState *bs, int sector_num)
{
BDRVDMGState *s = bs->opaque;
if (!is_sector_in_chunk(s, s->current_chunk, sector_num)) {
int ret;
uint32_t chunk = search_chunk(s, sector_num);
#ifdef CONFIG_BZIP2
uint64_t total_out;
#endif
if(!is_sector_in_chunk(s,s->current_chunk,sector_num)) {
int ret;
uint32_t chunk = search_chunk(s,sector_num);
if (chunk >= s->n_chunks) {
return -1;
}
if(chunk>=s->n_chunks)
return -1;
s->current_chunk = s->n_chunks;
switch (s->types[chunk]) { /* block entry type */
case 0x80000005: { /* zlib compressed */
/* we need to buffer, because only the chunk as whole can be
* inflated. */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->compressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
s->current_chunk = s->n_chunks;
switch(s->types[chunk]) {
case 0x80000005: { /* zlib compressed */
int i;
s->zstream.next_in = s->compressed_chunk;
s->zstream.avail_in = s->lengths[chunk];
s->zstream.next_out = s->uncompressed_chunk;
s->zstream.avail_out = 512 * s->sectorcounts[chunk];
ret = inflateReset(&s->zstream);
if (ret != Z_OK) {
return -1;
}
ret = inflate(&s->zstream, Z_FINISH);
if (ret != Z_STREAM_END ||
s->zstream.total_out != 512 * s->sectorcounts[chunk]) {
return -1;
}
break; }
#ifdef CONFIG_BZIP2
case 0x80000006: /* bzip2 compressed */
/* we need to buffer, because only the chunk as whole can be
* inflated. */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->compressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
/* we need to buffer, because only the chunk as whole can be
* inflated. */
i=0;
do {
ret = bdrv_pread(bs->file, s->offsets[chunk] + i,
s->compressed_chunk+i, s->lengths[chunk]-i);
if(ret<0 && errno==EINTR)
ret=0;
i+=ret;
} while(ret>=0 && ret+i<s->lengths[chunk]);
ret = BZ2_bzDecompressInit(&s->bzstream, 0, 0);
if (ret != BZ_OK) {
return -1;
}
s->bzstream.next_in = (char *)s->compressed_chunk;
s->bzstream.avail_in = (unsigned int) s->lengths[chunk];
s->bzstream.next_out = (char *)s->uncompressed_chunk;
s->bzstream.avail_out = (unsigned int) 512 * s->sectorcounts[chunk];
ret = BZ2_bzDecompress(&s->bzstream);
total_out = ((uint64_t)s->bzstream.total_out_hi32 << 32) +
s->bzstream.total_out_lo32;
BZ2_bzDecompressEnd(&s->bzstream);
if (ret != BZ_STREAM_END ||
total_out != 512 * s->sectorcounts[chunk]) {
return -1;
}
break;
#endif /* CONFIG_BZIP2 */
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
if (ret != s->lengths[chunk])
return -1;
s->zstream.next_in = s->compressed_chunk;
s->zstream.avail_in = s->lengths[chunk];
s->zstream.next_out = s->uncompressed_chunk;
s->zstream.avail_out = 512*s->sectorcounts[chunk];
ret = inflateReset(&s->zstream);
if(ret != Z_OK)
return -1;
ret = inflate(&s->zstream, Z_FINISH);
if(ret != Z_STREAM_END || s->zstream.total_out != 512*s->sectorcounts[chunk])
return -1;
break; }
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->uncompressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
break;
case 2: /* zero */
/* see dmg_read, it is treated specially. No buffer needs to be
* pre-filled, the zeroes can be set directly. */
break;
}
s->current_chunk = chunk;
if (ret != s->lengths[chunk])
return -1;
break;
case 2: /* zero */
memset(s->uncompressed_chunk, 0, 512*s->sectorcounts[chunk]);
break;
}
s->current_chunk = chunk;
}
return 0;
}
@@ -662,21 +272,12 @@ static int dmg_read(BlockDriverState *bs, int64_t sector_num,
BDRVDMGState *s = bs->opaque;
int i;
for (i = 0; i < nb_sectors; i++) {
uint32_t sector_offset_in_chunk;
if (dmg_read_chunk(bs, sector_num + i) != 0) {
return -1;
}
/* Special case: current chunk is all zeroes. Do not perform a memcpy as
* s->uncompressed_chunk may be too small to cover the large all-zeroes
* section. dmg_read_chunk is called to find s->current_chunk */
if (s->types[s->current_chunk] == 2) { /* all zeroes block entry */
memset(buf + i * 512, 0, 512);
continue;
}
sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
memcpy(buf + i * 512,
s->uncompressed_chunk + sector_offset_in_chunk * 512, 512);
for(i=0;i<nb_sectors;i++) {
uint32_t sector_offset_in_chunk;
if(dmg_read_chunk(bs, sector_num+i) != 0)
return -1;
sector_offset_in_chunk = sector_num+i-s->sectors[s->current_chunk];
memcpy(buf+i*512,s->uncompressed_chunk+sector_offset_in_chunk*512,512);
}
return 0;
}
@@ -695,25 +296,25 @@ static coroutine_fn int dmg_co_read(BlockDriverState *bs, int64_t sector_num,
static void dmg_close(BlockDriverState *bs)
{
BDRVDMGState *s = bs->opaque;
g_free(s->types);
g_free(s->offsets);
g_free(s->lengths);
g_free(s->sectors);
g_free(s->sectorcounts);
qemu_vfree(s->compressed_chunk);
qemu_vfree(s->uncompressed_chunk);
if(s->n_chunks>0) {
free(s->types);
free(s->offsets);
free(s->lengths);
free(s->sectors);
free(s->sectorcounts);
}
free(s->compressed_chunk);
free(s->uncompressed_chunk);
inflateEnd(&s->zstream);
}
static BlockDriver bdrv_dmg = {
.format_name = "dmg",
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_read = dmg_co_read,
.bdrv_close = dmg_close,
.format_name = "dmg",
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_read = dmg_co_read,
.bdrv_close = dmg_close,
};
static void bdrv_dmg_init(void)

View File

@@ -1,833 +0,0 @@
/*
* GlusterFS backend for QEMU
*
* Copyright (C) 2012 Bharata B Rao <bharata@linux.vnet.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include <glusterfs/api/glfs.h>
#include "block/block_int.h"
#include "qemu/uri.h"
typedef struct GlusterAIOCB {
int64_t size;
int ret;
QEMUBH *bh;
Coroutine *coroutine;
AioContext *aio_context;
} GlusterAIOCB;
typedef struct BDRVGlusterState {
struct glfs *glfs;
struct glfs_fd *fd;
} BDRVGlusterState;
typedef struct GlusterConf {
char *server;
int port;
char *volname;
char *image;
char *transport;
} GlusterConf;
static void qemu_gluster_gconf_free(GlusterConf *gconf)
{
if (gconf) {
g_free(gconf->server);
g_free(gconf->volname);
g_free(gconf->image);
g_free(gconf->transport);
g_free(gconf);
}
}
static int parse_volume_options(GlusterConf *gconf, char *path)
{
char *p, *q;
if (!path) {
return -EINVAL;
}
/* volume */
p = q = path + strspn(path, "/");
p += strcspn(p, "/");
if (*p == '\0') {
return -EINVAL;
}
gconf->volname = g_strndup(q, p - q);
/* image */
p += strspn(p, "/");
if (*p == '\0') {
return -EINVAL;
}
gconf->image = g_strdup(p);
return 0;
}
/*
* file=gluster[+transport]://[server[:port]]/volname/image[?socket=...]
*
* 'gluster' is the protocol.
*
* 'transport' specifies the transport type used to connect to gluster
* management daemon (glusterd). Valid transport types are
* tcp, unix and rdma. If a transport type isn't specified, then tcp
* type is assumed.
*
* 'server' specifies the server where the volume file specification for
* the given volume resides. This can be either hostname, ipv4 address
* or ipv6 address. ipv6 address needs to be within square brackets [ ].
* If transport type is 'unix', then 'server' field should not be specified.
* The 'socket' field needs to be populated with the path to unix domain
* socket.
*
* 'port' is the port number on which glusterd is listening. This is optional
* and if not specified, QEMU will send 0 which will make gluster to use the
* default port. If the transport type is unix, then 'port' should not be
* specified.
*
* 'volname' is the name of the gluster volume which contains the VM image.
*
* 'image' is the path to the actual VM image that resides on gluster volume.
*
* Examples:
*
* file=gluster://1.2.3.4/testvol/a.img
* file=gluster+tcp://1.2.3.4/testvol/a.img
* file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img
* file=gluster+tcp://[1:2:3:4:5:6:7:8]/testvol/dir/a.img
* file=gluster+tcp://[1:2:3:4:5:6:7:8]:24007/testvol/dir/a.img
* file=gluster+tcp://server.domain.com:24007/testvol/dir/a.img
* file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket
* file=gluster+rdma://1.2.3.4:24007/testvol/a.img
*/
static int qemu_gluster_parseuri(GlusterConf *gconf, const char *filename)
{
URI *uri;
QueryParams *qp = NULL;
bool is_unix = false;
int ret = 0;
uri = uri_parse(filename);
if (!uri) {
return -EINVAL;
}
/* transport */
if (!uri->scheme || !strcmp(uri->scheme, "gluster")) {
gconf->transport = g_strdup("tcp");
} else if (!strcmp(uri->scheme, "gluster+tcp")) {
gconf->transport = g_strdup("tcp");
} else if (!strcmp(uri->scheme, "gluster+unix")) {
gconf->transport = g_strdup("unix");
is_unix = true;
} else if (!strcmp(uri->scheme, "gluster+rdma")) {
gconf->transport = g_strdup("rdma");
} else {
ret = -EINVAL;
goto out;
}
ret = parse_volume_options(gconf, uri->path);
if (ret < 0) {
goto out;
}
qp = query_params_parse(uri->query);
if (qp->n > 1 || (is_unix && !qp->n) || (!is_unix && qp->n)) {
ret = -EINVAL;
goto out;
}
if (is_unix) {
if (uri->server || uri->port) {
ret = -EINVAL;
goto out;
}
if (strcmp(qp->p[0].name, "socket")) {
ret = -EINVAL;
goto out;
}
gconf->server = g_strdup(qp->p[0].value);
} else {
gconf->server = g_strdup(uri->server ? uri->server : "localhost");
gconf->port = uri->port;
}
out:
if (qp) {
query_params_free(qp);
}
uri_free(uri);
return ret;
}
static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename,
Error **errp)
{
struct glfs *glfs = NULL;
int ret;
int old_errno;
ret = qemu_gluster_parseuri(gconf, filename);
if (ret < 0) {
error_setg(errp, "Usage: file=gluster[+transport]://[server[:port]]/"
"volname/image[?socket=...]");
errno = -ret;
goto out;
}
glfs = glfs_new(gconf->volname);
if (!glfs) {
goto out;
}
ret = glfs_set_volfile_server(glfs, gconf->transport, gconf->server,
gconf->port);
if (ret < 0) {
goto out;
}
/*
* TODO: Use GF_LOG_ERROR instead of hard code value of 4 here when
* GlusterFS makes GF_LOG_* macros available to libgfapi users.
*/
ret = glfs_set_logging(glfs, "-", 4);
if (ret < 0) {
goto out;
}
ret = glfs_init(glfs);
if (ret) {
error_setg_errno(errp, errno,
"Gluster connection failed for server=%s port=%d "
"volume=%s image=%s transport=%s", gconf->server,
gconf->port, gconf->volname, gconf->image,
gconf->transport);
/* glfs_init sometimes doesn't set errno although docs suggest that */
if (errno == 0)
errno = EINVAL;
goto out;
}
return glfs;
out:
if (glfs) {
old_errno = errno;
glfs_fini(glfs);
errno = old_errno;
}
return NULL;
}
static void qemu_gluster_complete_aio(void *opaque)
{
GlusterAIOCB *acb = (GlusterAIOCB *)opaque;
qemu_bh_delete(acb->bh);
acb->bh = NULL;
qemu_coroutine_enter(acb->coroutine, NULL);
}
/*
* AIO callback routine called from GlusterFS thread.
*/
static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
{
GlusterAIOCB *acb = (GlusterAIOCB *)arg;
if (!ret || ret == acb->size) {
acb->ret = 0; /* Success */
} else if (ret < 0) {
acb->ret = ret; /* Read/Write failed */
} else {
acb->ret = -EIO; /* Partial read/write - fail it */
}
acb->bh = aio_bh_new(acb->aio_context, qemu_gluster_complete_aio, acb);
qemu_bh_schedule(acb->bh);
}
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "gluster",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "URL to the gluster image",
},
{ /* end of list */ }
},
};
static void qemu_gluster_parse_flags(int bdrv_flags, int *open_flags)
{
assert(open_flags != NULL);
*open_flags |= O_BINARY;
if (bdrv_flags & BDRV_O_RDWR) {
*open_flags |= O_RDWR;
} else {
*open_flags |= O_RDONLY;
}
if ((bdrv_flags & BDRV_O_NOCACHE)) {
*open_flags |= O_DIRECT;
}
}
static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
int bdrv_flags, Error **errp)
{
BDRVGlusterState *s = bs->opaque;
int open_flags = 0;
int ret = 0;
GlusterConf *gconf = g_new0(GlusterConf, 1);
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
}
filename = qemu_opt_get(opts, "filename");
s->glfs = qemu_gluster_init(gconf, filename, errp);
if (!s->glfs) {
ret = -errno;
goto out;
}
qemu_gluster_parse_flags(bdrv_flags, &open_flags);
s->fd = glfs_open(s->glfs, gconf->image, open_flags);
if (!s->fd) {
ret = -errno;
}
out:
qemu_opts_del(opts);
qemu_gluster_gconf_free(gconf);
if (!ret) {
return ret;
}
if (s->fd) {
glfs_close(s->fd);
}
if (s->glfs) {
glfs_fini(s->glfs);
}
return ret;
}
typedef struct BDRVGlusterReopenState {
struct glfs *glfs;
struct glfs_fd *fd;
} BDRVGlusterReopenState;
static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
{
int ret = 0;
BDRVGlusterReopenState *reop_s;
GlusterConf *gconf = NULL;
int open_flags = 0;
assert(state != NULL);
assert(state->bs != NULL);
state->opaque = g_new0(BDRVGlusterReopenState, 1);
reop_s = state->opaque;
qemu_gluster_parse_flags(state->flags, &open_flags);
gconf = g_new0(GlusterConf, 1);
reop_s->glfs = qemu_gluster_init(gconf, state->bs->filename, errp);
if (reop_s->glfs == NULL) {
ret = -errno;
goto exit;
}
reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
if (reop_s->fd == NULL) {
/* reops->glfs will be cleaned up in _abort */
ret = -errno;
goto exit;
}
exit:
/* state->opaque will be freed in either the _abort or _commit */
qemu_gluster_gconf_free(gconf);
return ret;
}
static void qemu_gluster_reopen_commit(BDRVReopenState *state)
{
BDRVGlusterReopenState *reop_s = state->opaque;
BDRVGlusterState *s = state->bs->opaque;
/* close the old */
if (s->fd) {
glfs_close(s->fd);
}
if (s->glfs) {
glfs_fini(s->glfs);
}
/* use the newly opened image / connection */
s->fd = reop_s->fd;
s->glfs = reop_s->glfs;
g_free(state->opaque);
state->opaque = NULL;
return;
}
static void qemu_gluster_reopen_abort(BDRVReopenState *state)
{
BDRVGlusterReopenState *reop_s = state->opaque;
if (reop_s == NULL) {
return;
}
if (reop_s->fd) {
glfs_close(reop_s->fd);
}
if (reop_s->glfs) {
glfs_fini(reop_s->glfs);
}
g_free(state->opaque);
state->opaque = NULL;
return;
}
#ifdef CONFIG_GLUSTERFS_ZEROFILL
static coroutine_fn int qemu_gluster_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
BDRVGlusterState *s = bs->opaque;
off_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
acb->size = size;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
ret = glfs_zerofill_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
}
static inline bool gluster_supports_zerofill(void)
{
return 1;
}
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
return glfs_zerofill(fd, offset, size);
}
#else
static inline bool gluster_supports_zerofill(void)
{
return 0;
}
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
return 0;
}
#endif
static int qemu_gluster_create(const char *filename,
QemuOpts *opts, Error **errp)
{
struct glfs *glfs;
struct glfs_fd *fd;
int ret = 0;
int prealloc = 0;
int64_t total_size = 0;
char *tmp = NULL;
GlusterConf *gconf = g_new0(GlusterConf, 1);
glfs = qemu_gluster_init(gconf, filename, errp);
if (!glfs) {
ret = -errno;
goto out;
}
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!tmp || !strcmp(tmp, "off")) {
prealloc = 0;
} else if (!strcmp(tmp, "full") &&
gluster_supports_zerofill()) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'"
" or GlusterFS doesn't support zerofill API",
tmp);
ret = -EINVAL;
goto out;
}
fd = glfs_creat(glfs, gconf->image,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | S_IWUSR);
if (!fd) {
ret = -errno;
} else {
if (!glfs_ftruncate(fd, total_size)) {
if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
ret = -errno;
}
} else {
ret = -errno;
}
if (glfs_close(fd) != 0) {
ret = -errno;
}
}
out:
g_free(tmp);
qemu_gluster_gconf_free(gconf);
if (glfs) {
glfs_fini(glfs);
}
return ret;
}
static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov, int write)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
BDRVGlusterState *s = bs->opaque;
size_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
acb->size = size;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
if (write) {
ret = glfs_pwritev_async(s->fd, qiov->iov, qiov->niov, offset, 0,
&gluster_finish_aiocb, acb);
} else {
ret = glfs_preadv_async(s->fd, qiov->iov, qiov->niov, offset, 0,
&gluster_finish_aiocb, acb);
}
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
}
static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset)
{
int ret;
BDRVGlusterState *s = bs->opaque;
ret = glfs_ftruncate(s->fd, offset);
if (ret < 0) {
return -errno;
}
return 0;
}
static coroutine_fn int qemu_gluster_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 0);
}
static coroutine_fn int qemu_gluster_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 1);
}
static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
BDRVGlusterState *s = bs->opaque;
acb->size = 0;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
ret = glfs_fsync_async(s->fd, &gluster_finish_aiocb, acb);
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
}
#ifdef CONFIG_GLUSTERFS_DISCARD
static coroutine_fn int qemu_gluster_co_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors)
{
int ret;
GlusterAIOCB *acb = g_slice_new(GlusterAIOCB);
BDRVGlusterState *s = bs->opaque;
size_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
acb->size = 0;
acb->ret = 0;
acb->coroutine = qemu_coroutine_self();
acb->aio_context = bdrv_get_aio_context(bs);
ret = glfs_discard_async(s->fd, offset, size, &gluster_finish_aiocb, acb);
if (ret < 0) {
ret = -errno;
goto out;
}
qemu_coroutine_yield();
ret = acb->ret;
out:
g_slice_free(GlusterAIOCB, acb);
return ret;
}
#endif
static int64_t qemu_gluster_getlength(BlockDriverState *bs)
{
BDRVGlusterState *s = bs->opaque;
int64_t ret;
ret = glfs_lseek(s->fd, 0, SEEK_END);
if (ret < 0) {
return -errno;
} else {
return ret;
}
}
static int64_t qemu_gluster_allocated_file_size(BlockDriverState *bs)
{
BDRVGlusterState *s = bs->opaque;
struct stat st;
int ret;
ret = glfs_fstat(s->fd, &st);
if (ret < 0) {
return -errno;
} else {
return st.st_blocks * 512;
}
}
static void qemu_gluster_close(BlockDriverState *bs)
{
BDRVGlusterState *s = bs->opaque;
if (s->fd) {
glfs_close(s->fd);
s->fd = NULL;
}
glfs_fini(s->glfs);
}
static int qemu_gluster_has_zero_init(BlockDriverState *bs)
{
/* GlusterFS volume could be backed by a block device */
return 0;
}
static QemuOptsList qemu_gluster_create_opts = {
.name = "qemu-gluster-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qemu_gluster_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
{
.name = BLOCK_OPT_PREALLOC,
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, full)"
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_gluster = {
.format_name = "gluster",
.protocol_name = "gluster",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
};
static BlockDriver bdrv_gluster_tcp = {
.format_name = "gluster",
.protocol_name = "gluster+tcp",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
};
static BlockDriver bdrv_gluster_unix = {
.format_name = "gluster",
.protocol_name = "gluster+unix",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
};
static BlockDriver bdrv_gluster_rdma = {
.format_name = "gluster",
.protocol_name = "gluster+rdma",
.instance_size = sizeof(BDRVGlusterState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_gluster_open,
.bdrv_reopen_prepare = qemu_gluster_reopen_prepare,
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
.bdrv_co_readv = qemu_gluster_co_readv,
.bdrv_co_writev = qemu_gluster_co_writev,
.bdrv_co_flush_to_disk = qemu_gluster_co_flush_to_disk,
.bdrv_has_zero_init = qemu_gluster_has_zero_init,
#ifdef CONFIG_GLUSTERFS_DISCARD
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
#endif
.create_opts = &qemu_gluster_create_opts,
};
static void bdrv_gluster_init(void)
{
bdrv_register(&bdrv_gluster_rdma);
bdrv_register(&bdrv_gluster_unix);
bdrv_register(&bdrv_gluster_tcp);
bdrv_register(&bdrv_gluster);
}
block_init(bdrv_gluster_init);

2603
block/io.c

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -1,337 +0,0 @@
/*
* Linux native AIO support.
*
* Copyright (C) 2009 IBM, Corp.
* Copyright (C) 2009 Red Hat, Inc.
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu-common.h"
#include "block/aio.h"
#include "qemu/queue.h"
#include "block/raw-aio.h"
#include "qemu/event_notifier.h"
#include <libaio.h>
/*
* Queue size (per-device).
*
* XXX: eventually we need to communicate this to the guest and/or make it
* tunable by the guest. If we get more outstanding requests at a time
* than this we will get EAGAIN from io_submit which is communicated to
* the guest as an I/O error.
*/
#define MAX_EVENTS 128
#define MAX_QUEUED_IO 128
struct qemu_laiocb {
BlockAIOCB common;
struct qemu_laio_state *ctx;
struct iocb iocb;
ssize_t ret;
size_t nbytes;
QEMUIOVector *qiov;
bool is_read;
QSIMPLEQ_ENTRY(qemu_laiocb) next;
};
typedef struct {
int plugged;
unsigned int n;
bool blocked;
QSIMPLEQ_HEAD(, qemu_laiocb) pending;
} LaioQueue;
struct qemu_laio_state {
io_context_t ctx;
EventNotifier e;
/* io queue for submit at batch */
LaioQueue io_q;
/* I/O completion processing */
QEMUBH *completion_bh;
struct io_event events[MAX_EVENTS];
int event_idx;
int event_max;
};
static void ioq_submit(struct qemu_laio_state *s);
static inline ssize_t io_event_ret(struct io_event *ev)
{
return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
}
/*
* Completes an AIO request (calls the callback and frees the ACB).
*/
static void qemu_laio_process_completion(struct qemu_laio_state *s,
struct qemu_laiocb *laiocb)
{
int ret;
ret = laiocb->ret;
if (ret != -ECANCELED) {
if (ret == laiocb->nbytes) {
ret = 0;
} else if (ret >= 0) {
/* Short reads mean EOF, pad with zeros. */
if (laiocb->is_read) {
qemu_iovec_memset(laiocb->qiov, ret, 0,
laiocb->qiov->size - ret);
} else {
ret = -EINVAL;
}
}
}
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
}
/* The completion BH fetches completed I/O requests and invokes their
* callbacks.
*
* The function is somewhat tricky because it supports nested event loops, for
* example when a request callback invokes aio_poll(). In order to do this,
* the completion events array and index are kept in qemu_laio_state. The BH
* reschedules itself as long as there are completions pending so it will
* either be called again in a nested event loop or will be called after all
* events have been completed. When there are no events left to complete, the
* BH returns without rescheduling.
*/
static void qemu_laio_completion_bh(void *opaque)
{
struct qemu_laio_state *s = opaque;
/* Fetch more completion events when empty */
if (s->event_idx == s->event_max) {
do {
struct timespec ts = { 0 };
s->event_max = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS,
s->events, &ts);
} while (s->event_max == -EINTR);
s->event_idx = 0;
if (s->event_max <= 0) {
s->event_max = 0;
return; /* no more events */
}
}
/* Reschedule so nested event loops see currently pending completions */
qemu_bh_schedule(s->completion_bh);
/* Process completion events */
while (s->event_idx < s->event_max) {
struct iocb *iocb = s->events[s->event_idx].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&s->events[s->event_idx]);
s->event_idx++;
qemu_laio_process_completion(s, laiocb);
}
if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
}
static void qemu_laio_completion_cb(EventNotifier *e)
{
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
if (event_notifier_test_and_clear(&s->e)) {
qemu_bh_schedule(s->completion_bh);
}
}
static void laio_cancel(BlockAIOCB *blockacb)
{
struct qemu_laiocb *laiocb = (struct qemu_laiocb *)blockacb;
struct io_event event;
int ret;
if (laiocb->ret != -EINPROGRESS) {
return;
}
ret = io_cancel(laiocb->ctx->ctx, &laiocb->iocb, &event);
laiocb->ret = -ECANCELED;
if (ret != 0) {
/* iocb is not cancelled, cb will be called by the event loop later */
return;
}
laiocb->common.cb(laiocb->common.opaque, laiocb->ret);
}
static const AIOCBInfo laio_aiocb_info = {
.aiocb_size = sizeof(struct qemu_laiocb),
.cancel_async = laio_cancel,
};
static void ioq_init(LaioQueue *io_q)
{
QSIMPLEQ_INIT(&io_q->pending);
io_q->plugged = 0;
io_q->n = 0;
io_q->blocked = false;
}
static void ioq_submit(struct qemu_laio_state *s)
{
int ret, len;
struct qemu_laiocb *aiocb;
struct iocb *iocbs[MAX_QUEUED_IO];
QSIMPLEQ_HEAD(, qemu_laiocb) completed;
do {
len = 0;
QSIMPLEQ_FOREACH(aiocb, &s->io_q.pending, next) {
iocbs[len++] = &aiocb->iocb;
if (len == MAX_QUEUED_IO) {
break;
}
}
ret = io_submit(s->ctx, len, iocbs);
if (ret == -EAGAIN) {
break;
}
if (ret < 0) {
abort();
}
s->io_q.n -= ret;
aiocb = container_of(iocbs[ret - 1], struct qemu_laiocb, iocb);
QSIMPLEQ_SPLIT_AFTER(&s->io_q.pending, aiocb, next, &completed);
} while (ret == len && !QSIMPLEQ_EMPTY(&s->io_q.pending));
s->io_q.blocked = (s->io_q.n > 0);
}
void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
{
struct qemu_laio_state *s = aio_ctx;
s->io_q.plugged++;
}
void laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
{
struct qemu_laio_state *s = aio_ctx;
assert(s->io_q.plugged > 0 || !unplug);
if (unplug && --s->io_q.plugged > 0) {
return;
}
if (!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
}
BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
{
struct qemu_laio_state *s = aio_ctx;
struct qemu_laiocb *laiocb;
struct iocb *iocbs;
off_t offset = sector_num * 512;
laiocb = qemu_aio_get(&laio_aiocb_info, bs, cb, opaque);
laiocb->nbytes = nb_sectors * 512;
laiocb->ctx = s;
laiocb->ret = -EINPROGRESS;
laiocb->is_read = (type == QEMU_AIO_READ);
laiocb->qiov = qiov;
iocbs = &laiocb->iocb;
switch (type) {
case QEMU_AIO_WRITE:
io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset);
break;
case QEMU_AIO_READ:
io_prep_preadv(iocbs, fd, qiov->iov, qiov->niov, offset);
break;
/* Currently Linux kernel does not support other operations */
default:
fprintf(stderr, "%s: invalid AIO request type 0x%x.\n",
__func__, type);
goto out_free_aiocb;
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
QSIMPLEQ_INSERT_TAIL(&s->io_q.pending, laiocb, next);
s->io_q.n++;
if (!s->io_q.blocked &&
(!s->io_q.plugged || s->io_q.n >= MAX_QUEUED_IO)) {
ioq_submit(s);
}
return &laiocb->common;
out_free_aiocb:
qemu_aio_unref(laiocb);
return NULL;
}
void laio_detach_aio_context(void *s_, AioContext *old_context)
{
struct qemu_laio_state *s = s_;
aio_set_event_notifier(old_context, &s->e, NULL);
qemu_bh_delete(s->completion_bh);
}
void laio_attach_aio_context(void *s_, AioContext *new_context)
{
struct qemu_laio_state *s = s_;
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
}
void *laio_init(void)
{
struct qemu_laio_state *s;
s = g_malloc0(sizeof(*s));
if (event_notifier_init(&s->e, false) < 0) {
goto out_free_state;
}
if (io_setup(MAX_EVENTS, &s->ctx) != 0) {
goto out_close_efd;
}
ioq_init(&s->io_q);
return s;
out_close_efd:
event_notifier_cleanup(&s->e);
out_free_state:
g_free(s);
return NULL;
}
void laio_cleanup(void *s_)
{
struct qemu_laio_state *s = s_;
event_notifier_cleanup(&s->e);
if (io_destroy(s->ctx) != 0) {
fprintf(stderr, "%s: destroy AIO context %p failed\n",
__func__, &s->ctx);
}
g_free(s);
}

View File

@@ -1,782 +0,0 @@
/*
* Image mirroring
*
* Copyright Red Hat, Inc. 2012
*
* Authors:
* Paolo Bonzini <pbonzini@redhat.com>
*
* This work is licensed under the terms of the GNU LGPL, version 2 or later.
* See the COPYING.LIB file in the top-level directory.
*
*/
#include "trace.h"
#include "block/blockjob.h"
#include "block/block_int.h"
#include "qemu/ratelimit.h"
#include "qemu/bitmap.h"
#define SLICE_TIME 100000000ULL /* ns */
#define MAX_IN_FLIGHT 16
/* The mirroring buffer is a list of granularity-sized chunks.
* Free chunks are organized in a list.
*/
typedef struct MirrorBuffer {
QSIMPLEQ_ENTRY(MirrorBuffer) next;
} MirrorBuffer;
typedef struct MirrorBlockJob {
BlockJob common;
RateLimit limit;
BlockDriverState *target;
BlockDriverState *base;
/* The name of the graph node to replace */
char *replaces;
/* The BDS to replace */
BlockDriverState *to_replace;
/* Used to block operations on the drive-mirror-replace target */
Error *replace_blocker;
bool is_none_mode;
BlockdevOnError on_source_error, on_target_error;
bool synced;
bool should_complete;
int64_t sector_num;
int64_t granularity;
size_t buf_size;
int64_t bdev_length;
unsigned long *cow_bitmap;
BdrvDirtyBitmap *dirty_bitmap;
HBitmapIter hbi;
uint8_t *buf;
QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
int buf_free_count;
unsigned long *in_flight_bitmap;
int in_flight;
int sectors_in_flight;
int ret;
} MirrorBlockJob;
typedef struct MirrorOp {
MirrorBlockJob *s;
QEMUIOVector qiov;
int64_t sector_num;
int nb_sectors;
} MirrorOp;
static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
int error)
{
s->synced = false;
if (read) {
return block_job_error_action(&s->common, s->common.bs,
s->on_source_error, true, error);
} else {
return block_job_error_action(&s->common, s->target,
s->on_target_error, false, error);
}
}
static void mirror_iteration_done(MirrorOp *op, int ret)
{
MirrorBlockJob *s = op->s;
struct iovec *iov;
int64_t chunk_num;
int i, nb_chunks, sectors_per_chunk;
trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors, ret);
s->in_flight--;
s->sectors_in_flight -= op->nb_sectors;
iov = op->qiov.iov;
for (i = 0; i < op->qiov.niov; i++) {
MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base;
QSIMPLEQ_INSERT_TAIL(&s->buf_free, buf, next);
s->buf_free_count++;
}
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
chunk_num = op->sector_num / sectors_per_chunk;
nb_chunks = op->nb_sectors / sectors_per_chunk;
bitmap_clear(s->in_flight_bitmap, chunk_num, nb_chunks);
if (ret >= 0) {
if (s->cow_bitmap) {
bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
}
s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
}
qemu_iovec_destroy(&op->qiov);
g_slice_free(MirrorOp, op);
/* Enter coroutine when it is not sleeping. The coroutine sleeps to
* rate-limit itself. The coroutine will eventually resume since there is
* a sleep timeout so don't wake it early.
*/
if (s->common.busy) {
qemu_coroutine_enter(s->common.co, NULL);
}
}
static void mirror_write_complete(void *opaque, int ret)
{
MirrorOp *op = opaque;
MirrorBlockJob *s = op->s;
if (ret < 0) {
BlockErrorAction action;
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, false, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
}
}
mirror_iteration_done(op, ret);
}
static void mirror_read_complete(void *opaque, int ret)
{
MirrorOp *op = opaque;
MirrorBlockJob *s = op->s;
if (ret < 0) {
BlockErrorAction action;
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, true, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
}
mirror_iteration_done(op, ret);
return;
}
bdrv_aio_writev(s->target, op->sector_num, &op->qiov, op->nb_sectors,
mirror_write_complete, op);
}
static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
{
BlockDriverState *source = s->common.bs;
int nb_sectors, sectors_per_chunk, nb_chunks;
int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
uint64_t delay_ns = 0;
MirrorOp *op;
s->sector_num = hbitmap_iter_next(&s->hbi);
if (s->sector_num < 0) {
bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
s->sector_num = hbitmap_iter_next(&s->hbi);
trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
assert(s->sector_num >= 0);
}
hbitmap_next_sector = s->sector_num;
sector_num = s->sector_num;
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
end = s->bdev_length / BDRV_SECTOR_SIZE;
/* Extend the QEMUIOVector to include all adjacent blocks that will
* be copied in this operation.
*
* We have to do this if we have no backing file yet in the destination,
* and the cluster size is very large. Then we need to do COW ourselves.
* The first time a cluster is copied, copy it entirely. Note that,
* because both the granularity and the cluster size are powers of two,
* the number of sectors to copy cannot exceed one cluster.
*
* We also want to extend the QEMUIOVector to include more adjacent
* dirty blocks if possible, to limit the number of I/O operations and
* run efficiently even with a small granularity.
*/
nb_chunks = 0;
nb_sectors = 0;
next_sector = sector_num;
next_chunk = sector_num / sectors_per_chunk;
/* Wait for I/O to this cluster (from a previous iteration) to be done. */
while (test_bit(next_chunk, s->in_flight_bitmap)) {
trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
qemu_coroutine_yield();
}
do {
int added_sectors, added_chunks;
if (!bdrv_get_dirty(source, s->dirty_bitmap, next_sector) ||
test_bit(next_chunk, s->in_flight_bitmap)) {
assert(nb_sectors > 0);
break;
}
added_sectors = sectors_per_chunk;
if (s->cow_bitmap && !test_bit(next_chunk, s->cow_bitmap)) {
bdrv_round_to_clusters(s->target,
next_sector, added_sectors,
&next_sector, &added_sectors);
/* On the first iteration, the rounding may make us copy
* sectors before the first dirty one.
*/
if (next_sector < sector_num) {
assert(nb_sectors == 0);
sector_num = next_sector;
next_chunk = next_sector / sectors_per_chunk;
}
}
added_sectors = MIN(added_sectors, end - (sector_num + nb_sectors));
added_chunks = (added_sectors + sectors_per_chunk - 1) / sectors_per_chunk;
/* When doing COW, it may happen that there is not enough space for
* a full cluster. Wait if that is the case.
*/
while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
qemu_coroutine_yield();
}
if (s->buf_free_count < nb_chunks + added_chunks) {
trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
break;
}
/* We have enough free space to copy these sectors. */
bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks);
nb_sectors += added_sectors;
nb_chunks += added_chunks;
next_sector += added_sectors;
next_chunk += added_chunks;
if (!s->synced && s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, added_sectors);
}
} while (delay_ns == 0 && next_sector < end);
/* Allocate a MirrorOp that is used as an AIO callback. */
op = g_slice_new(MirrorOp);
op->s = s;
op->sector_num = sector_num;
op->nb_sectors = nb_sectors;
/* Now make a QEMUIOVector taking enough granularity-sized chunks
* from s->buf_free.
*/
qemu_iovec_init(&op->qiov, nb_chunks);
next_sector = sector_num;
while (nb_chunks-- > 0) {
MirrorBuffer *buf = QSIMPLEQ_FIRST(&s->buf_free);
size_t remaining = (nb_sectors * BDRV_SECTOR_SIZE) - op->qiov.size;
QSIMPLEQ_REMOVE_HEAD(&s->buf_free, next);
s->buf_free_count--;
qemu_iovec_add(&op->qiov, buf, MIN(s->granularity, remaining));
/* Advance the HBitmapIter in parallel, so that we do not examine
* the same sector twice.
*/
if (next_sector > hbitmap_next_sector
&& bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
hbitmap_next_sector = hbitmap_iter_next(&s->hbi);
}
next_sector += sectors_per_chunk;
}
bdrv_reset_dirty_bitmap(s->dirty_bitmap, sector_num, nb_sectors);
/* Copy the dirty cluster. */
s->in_flight++;
s->sectors_in_flight += nb_sectors;
trace_mirror_one_iteration(s, sector_num, nb_sectors);
bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
mirror_read_complete, op);
return delay_ns;
}
static void mirror_free_init(MirrorBlockJob *s)
{
int granularity = s->granularity;
size_t buf_size = s->buf_size;
uint8_t *buf = s->buf;
assert(s->buf_free_count == 0);
QSIMPLEQ_INIT(&s->buf_free);
while (buf_size != 0) {
MirrorBuffer *cur = (MirrorBuffer *)buf;
QSIMPLEQ_INSERT_TAIL(&s->buf_free, cur, next);
s->buf_free_count++;
buf_size -= granularity;
buf += granularity;
}
}
static void mirror_drain(MirrorBlockJob *s)
{
while (s->in_flight > 0) {
qemu_coroutine_yield();
}
}
typedef struct {
int ret;
} MirrorExitData;
static void mirror_exit(BlockJob *job, void *opaque)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
MirrorExitData *data = opaque;
AioContext *replace_aio_context = NULL;
if (s->to_replace) {
replace_aio_context = bdrv_get_aio_context(s->to_replace);
aio_context_acquire(replace_aio_context);
}
if (s->should_complete && data->ret == 0) {
BlockDriverState *to_replace = s->common.bs;
if (s->to_replace) {
to_replace = s->to_replace;
}
if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
}
bdrv_swap(s->target, to_replace);
if (s->common.driver->job_type == BLOCK_JOB_TYPE_COMMIT) {
/* drop the bs loop chain formed by the swap: break the loop then
* trigger the unref from the top one */
BlockDriverState *p = s->base->backing_hd;
bdrv_set_backing_hd(s->base, NULL);
bdrv_unref(p);
}
}
if (s->to_replace) {
bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
error_free(s->replace_blocker);
bdrv_unref(s->to_replace);
}
if (replace_aio_context) {
aio_context_release(replace_aio_context);
}
g_free(s->replaces);
bdrv_unref(s->target);
block_job_completed(&s->common, data->ret);
g_free(data);
}
static void coroutine_fn mirror_run(void *opaque)
{
MirrorBlockJob *s = opaque;
MirrorExitData *data;
BlockDriverState *bs = s->common.bs;
int64_t sector_num, end, sectors_per_chunk, length;
uint64_t last_pause_ns;
BlockDriverInfo bdi;
char backing_filename[2]; /* we only need 2 characters because we are only
checking for a NULL string */
int ret = 0;
int n;
if (block_job_is_cancelled(&s->common)) {
goto immediate_exit;
}
s->bdev_length = bdrv_getlength(bs);
if (s->bdev_length < 0) {
ret = s->bdev_length;
goto immediate_exit;
} else if (s->bdev_length == 0) {
/* Report BLOCK_JOB_READY and wait for complete. */
block_job_event_ready(&s->common);
s->synced = true;
while (!block_job_is_cancelled(&s->common) && !s->should_complete) {
block_job_yield(&s->common);
}
s->common.cancelled = false;
goto immediate_exit;
}
length = DIV_ROUND_UP(s->bdev_length, s->granularity);
s->in_flight_bitmap = bitmap_new(length);
/* If we have no backing file yet in the destination, we cannot let
* the destination do COW. Instead, we copy sectors around the
* dirty data if needed. We need a bitmap to do that.
*/
bdrv_get_backing_filename(s->target, backing_filename,
sizeof(backing_filename));
if (backing_filename[0] && !s->target->backing_hd) {
ret = bdrv_get_info(s->target, &bdi);
if (ret < 0) {
goto immediate_exit;
}
if (s->granularity < bdi.cluster_size) {
s->buf_size = MAX(s->buf_size, bdi.cluster_size);
s->cow_bitmap = bitmap_new(length);
}
}
end = s->bdev_length / BDRV_SECTOR_SIZE;
s->buf = qemu_try_blockalign(bs, s->buf_size);
if (s->buf == NULL) {
ret = -ENOMEM;
goto immediate_exit;
}
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
mirror_free_init(s);
if (!s->is_none_mode) {
/* First part, loop on the sectors and initialize the dirty bitmap. */
BlockDriverState *base = s->base;
for (sector_num = 0; sector_num < end; ) {
int64_t next = (sector_num | (sectors_per_chunk - 1)) + 1;
ret = bdrv_is_allocated_above(bs, base,
sector_num, next - sector_num, &n);
if (ret < 0) {
goto immediate_exit;
}
assert(n > 0);
if (ret == 1) {
bdrv_set_dirty_bitmap(s->dirty_bitmap, sector_num, n);
sector_num = next;
} else {
sector_num += n;
}
}
}
bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
for (;;) {
uint64_t delay_ns = 0;
int64_t cnt;
bool should_complete;
if (s->ret < 0) {
ret = s->ret;
goto immediate_exit;
}
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
/* s->common.offset contains the number of bytes already processed so
* far, cnt is the number of dirty sectors remaining and
* s->sectors_in_flight is the number of sectors currently being
* processed; together those are the current total operation length */
s->common.len = s->common.offset +
(cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
/* Note that even when no rate limit is applied we need to yield
* periodically with no pending I/O so that bdrv_drain_all() returns.
* We do so every SLICE_TIME nanoseconds, or when there is an error,
* or when the source is clean, whichever comes first.
*/
if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - last_pause_ns < SLICE_TIME &&
s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
if (s->in_flight == MAX_IN_FLIGHT || s->buf_free_count == 0 ||
(cnt == 0 && s->in_flight > 0)) {
trace_mirror_yield(s, s->in_flight, s->buf_free_count, cnt);
qemu_coroutine_yield();
continue;
} else if (cnt != 0) {
delay_ns = mirror_iteration(s);
}
}
should_complete = false;
if (s->in_flight == 0 && cnt == 0) {
trace_mirror_before_flush(s);
ret = bdrv_flush(s->target);
if (ret < 0) {
if (mirror_error_action(s, false, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
goto immediate_exit;
}
} else {
/* We're out of the streaming phase. From now on, if the job
* is cancelled we will actually complete all pending I/O and
* report completion. This way, block-job-cancel will leave
* the target in a consistent state.
*/
if (!s->synced) {
block_job_event_ready(&s->common);
s->synced = true;
}
should_complete = s->should_complete ||
block_job_is_cancelled(&s->common);
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
}
}
if (cnt == 0 && should_complete) {
/* The dirty bitmap is not updated while operations are pending.
* If we're about to exit, wait for pending operations before
* calling bdrv_get_dirty_count(bs), or we may exit while the
* source has dirty data to copy!
*
* Note that I/O can be submitted by the guest while
* mirror_populate runs.
*/
trace_mirror_before_drain(s, cnt);
bdrv_drain(bs);
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
}
ret = 0;
trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
if (!s->synced) {
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
} else if (!should_complete) {
delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
} else if (cnt == 0) {
/* The two disks are in sync. Exit and report successful
* completion.
*/
assert(QLIST_EMPTY(&bs->tracked_requests));
s->common.cancelled = false;
break;
}
last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
}
immediate_exit:
if (s->in_flight > 0) {
/* We get here only if something went wrong. Either the job failed,
* or it was cancelled prematurely so that we do not guarantee that
* the target is a copy of the source.
*/
assert(ret < 0 || (!s->synced && block_job_is_cancelled(&s->common)));
mirror_drain(s);
}
assert(s->in_flight == 0);
qemu_vfree(s->buf);
g_free(s->cow_bitmap);
g_free(s->in_flight_bitmap);
bdrv_release_dirty_bitmap(bs, s->dirty_bitmap);
bdrv_iostatus_disable(s->target);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&s->common, mirror_exit, data);
}
static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
if (speed < 0) {
error_set(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static void mirror_iostatus_reset(BlockJob *job)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
bdrv_iostatus_reset(s->target);
}
static void mirror_complete(BlockJob *job, Error **errp)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
Error *local_err = NULL;
int ret;
ret = bdrv_open_backing_file(s->target, NULL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return;
}
if (!s->synced) {
error_set(errp, QERR_BLOCK_JOB_NOT_READY,
bdrv_get_device_name(job->bs));
return;
}
/* check the target bs is not blocked and block all operations on it */
if (s->replaces) {
AioContext *replace_aio_context;
s->to_replace = check_to_replace_node(s->replaces, &local_err);
if (!s->to_replace) {
error_propagate(errp, local_err);
return;
}
replace_aio_context = bdrv_get_aio_context(s->to_replace);
aio_context_acquire(replace_aio_context);
error_setg(&s->replace_blocker,
"block device is in use by block-job-complete");
bdrv_op_block_all(s->to_replace, s->replace_blocker);
bdrv_ref(s->to_replace);
aio_context_release(replace_aio_context);
}
s->should_complete = true;
block_job_enter(&s->common);
}
static const BlockJobDriver mirror_job_driver = {
.instance_size = sizeof(MirrorBlockJob),
.job_type = BLOCK_JOB_TYPE_MIRROR,
.set_speed = mirror_set_speed,
.iostatus_reset= mirror_iostatus_reset,
.complete = mirror_complete,
};
static const BlockJobDriver commit_active_job_driver = {
.instance_size = sizeof(MirrorBlockJob),
.job_type = BLOCK_JOB_TYPE_COMMIT,
.set_speed = mirror_set_speed,
.iostatus_reset
= mirror_iostatus_reset,
.complete = mirror_complete,
};
static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, uint32_t granularity,
int64_t buf_size,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockCompletionFunc *cb,
void *opaque, Error **errp,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base)
{
MirrorBlockJob *s;
if (granularity == 0) {
granularity = bdrv_get_default_bitmap_granularity(target);
}
assert ((granularity & (granularity - 1)) == 0);
if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
!bdrv_iostatus_is_enabled(bs)) {
error_set(errp, QERR_INVALID_PARAMETER, "on-source-error");
return;
}
s = block_job_create(driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
s->target = target;
s->is_none_mode = is_none_mode;
s->base = base;
s->granularity = granularity;
s->buf_size = MAX(buf_size, granularity);
s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
if (!s->dirty_bitmap) {
return;
}
bdrv_set_enable_write_cache(s->target, true);
bdrv_set_on_error(s->target, on_target_error, on_target_error);
bdrv_iostatus_enable(s->target);
s->common.co = qemu_coroutine_create(mirror_run);
trace_mirror_start(bs, s, s->common.co, opaque);
qemu_coroutine_enter(s->common.co, s);
}
void mirror_start(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, uint32_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
bool is_none_mode;
BlockDriverState *base;
if (mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
error_setg(errp, "Sync mode 'dirty-bitmap' not supported");
return;
}
is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
base = mode == MIRROR_SYNC_MODE_TOP ? bs->backing_hd : NULL;
mirror_start_job(bs, target, replaces,
speed, granularity, buf_size,
on_source_error, on_target_error, cb, opaque, errp,
&mirror_job_driver, is_none_mode, base);
}
void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
int64_t speed,
BlockdevOnError on_error,
BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
int64_t length, base_length;
int orig_base_flags;
int ret;
Error *local_err = NULL;
orig_base_flags = bdrv_get_flags(base);
if (bdrv_reopen(base, bs->open_flags, errp)) {
return;
}
length = bdrv_getlength(bs);
if (length < 0) {
error_setg_errno(errp, -length,
"Unable to determine length of %s", bs->filename);
goto error_restore_flags;
}
base_length = bdrv_getlength(base);
if (base_length < 0) {
error_setg_errno(errp, -base_length,
"Unable to determine length of %s", base->filename);
goto error_restore_flags;
}
if (length > base_length) {
ret = bdrv_truncate(base, length);
if (ret < 0) {
error_setg_errno(errp, -ret,
"Top image %s is larger than base image %s, and "
"resize of base image failed",
bs->filename, base->filename);
goto error_restore_flags;
}
}
bdrv_ref(base);
mirror_start_job(bs, base, NULL, speed, 0, 0,
on_error, on_error, cb, opaque, &local_err,
&commit_active_job_driver, false, base);
if (local_err) {
error_propagate(errp, local_err);
goto error_restore_flags;
}
return;
error_restore_flags:
/* ignore error and errp for bdrv_reopen, because we want to propagate
* the original error */
bdrv_reopen(base, orig_base_flags, NULL);
return;
}

View File

@@ -1,407 +0,0 @@
/*
* QEMU Block driver for NBD
*
* Copyright (C) 2008 Bull S.A.S.
* Author: Laurent Vivier <Laurent.Vivier@bull.net>
*
* Some parts:
* Copyright (C) 2007 Anthony Liguori <anthony@codemonkey.ws>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "nbd-client.h"
#include "qemu/sockets.h"
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs))
static void nbd_recv_coroutines_enter_all(NbdClientSession *s)
{
int i;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
}
}
}
static void nbd_teardown_connection(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
/* finish any pending coroutines */
shutdown(client->sock, 2);
nbd_recv_coroutines_enter_all(client);
nbd_client_detach_aio_context(bs);
closesocket(client->sock);
client->sock = -1;
}
static void nbd_reply_ready(void *opaque)
{
BlockDriverState *bs = opaque;
NbdClientSession *s = nbd_get_client_session(bs);
uint64_t i;
int ret;
if (s->reply.handle == 0) {
/* No reply already in flight. Fetch a header. It is possible
* that another thread has done the same thing in parallel, so
* the socket is not readable anymore.
*/
ret = nbd_receive_reply(s->sock, &s->reply);
if (ret == -EAGAIN) {
return;
}
if (ret < 0) {
s->reply.handle = 0;
goto fail;
}
}
/* There's no need for a mutex on the receive side, because the
* handler acts as a synchronization point and ensures that only
* one coroutine is called until the reply finishes. */
i = HANDLE_TO_INDEX(s, s->reply.handle);
if (i >= MAX_NBD_REQUESTS) {
goto fail;
}
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i], NULL);
return;
}
fail:
nbd_teardown_connection(bs);
}
static void nbd_restart_write(void *opaque)
{
BlockDriverState *bs = opaque;
qemu_coroutine_enter(nbd_get_client_session(bs)->send_coroutine, NULL);
}
static int nbd_co_send_request(BlockDriverState *bs,
struct nbd_request *request,
QEMUIOVector *qiov, int offset)
{
NbdClientSession *s = nbd_get_client_session(bs);
AioContext *aio_context;
int rc, ret, i;
qemu_co_mutex_lock(&s->send_mutex);
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i] == NULL) {
s->recv_coroutine[i] = qemu_coroutine_self();
break;
}
}
assert(i < MAX_NBD_REQUESTS);
request->handle = INDEX_TO_HANDLE(s, i);
s->send_coroutine = qemu_coroutine_self();
aio_context = bdrv_get_aio_context(bs);
aio_set_fd_handler(aio_context, s->sock,
nbd_reply_ready, nbd_restart_write, bs);
if (qiov) {
if (!s->is_unix) {
socket_set_cork(s->sock, 1);
}
rc = nbd_send_request(s->sock, request);
if (rc >= 0) {
ret = qemu_co_sendv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
rc = -EIO;
}
}
if (!s->is_unix) {
socket_set_cork(s->sock, 0);
}
} else {
rc = nbd_send_request(s->sock, request);
}
aio_set_fd_handler(aio_context, s->sock, nbd_reply_ready, NULL, bs);
s->send_coroutine = NULL;
qemu_co_mutex_unlock(&s->send_mutex);
return rc;
}
static void nbd_co_receive_reply(NbdClientSession *s,
struct nbd_request *request, struct nbd_reply *reply,
QEMUIOVector *qiov, int offset)
{
int ret;
/* Wait until we're woken up by the read handler. TODO: perhaps
* peek at the next reply and avoid yielding if it's ours? */
qemu_coroutine_yield();
*reply = s->reply;
if (reply->handle != request->handle) {
reply->error = EIO;
} else {
if (qiov && reply->error == 0) {
ret = qemu_co_recvv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
reply->error = EIO;
}
}
/* Tell the read handler to read another header. */
s->reply.handle = 0;
}
}
static void nbd_coroutine_start(NbdClientSession *s,
struct nbd_request *request)
{
/* Poor man semaphore. The free_sema is locked when no other request
* can be accepted, and unlocked after receiving one reply. */
if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
qemu_co_mutex_lock(&s->free_sema);
assert(s->in_flight < MAX_NBD_REQUESTS);
}
s->in_flight++;
/* s->recv_coroutine[i] is set as soon as we get the send_lock. */
}
static void nbd_coroutine_end(NbdClientSession *s,
struct nbd_request *request)
{
int i = HANDLE_TO_INDEX(s, request->handle);
s->recv_coroutine[i] = NULL;
if (s->in_flight-- == MAX_NBD_REQUESTS) {
qemu_co_mutex_unlock(&s->free_sema);
}
}
static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_READ };
struct nbd_reply reply;
ssize_t ret;
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, qiov, offset);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_WRITE };
struct nbd_reply reply;
ssize_t ret;
if (!bdrv_enable_write_cache(bs) &&
(client->nbdflags & NBD_FLAG_SEND_FUA)) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, qiov, offset);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL, 0);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
/* qemu-nbd has a limit of slightly less than 1M per request. Try to
* remain aligned to 4K. */
#define NBD_MAX_SECTORS 2040
int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_readv_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_readv_1(bs, sector_num, nb_sectors, qiov, offset);
}
int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
int offset = 0;
int ret;
while (nb_sectors > NBD_MAX_SECTORS) {
ret = nbd_co_writev_1(bs, sector_num, NBD_MAX_SECTORS, qiov, offset);
if (ret < 0) {
return ret;
}
offset += NBD_MAX_SECTORS * 512;
sector_num += NBD_MAX_SECTORS;
nb_sectors -= NBD_MAX_SECTORS;
}
return nbd_co_writev_1(bs, sector_num, nb_sectors, qiov, offset);
}
int nbd_client_co_flush(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_FLUSH };
struct nbd_reply reply;
ssize_t ret;
if (!(client->nbdflags & NBD_FLAG_SEND_FLUSH)) {
return 0;
}
if (client->nbdflags & NBD_FLAG_SEND_FUA) {
request.type |= NBD_CMD_FLAG_FUA;
}
request.from = 0;
request.len = 0;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL, 0);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_TRIM };
struct nbd_reply reply;
ssize_t ret;
if (!(client->nbdflags & NBD_FLAG_SEND_TRIM)) {
return 0;
}
request.from = sector_num * 512;
request.len = nb_sectors * 512;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL, 0);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL, 0);
}
nbd_coroutine_end(client, &request);
return -reply.error;
}
void nbd_client_detach_aio_context(BlockDriverState *bs)
{
aio_set_fd_handler(bdrv_get_aio_context(bs),
nbd_get_client_session(bs)->sock, NULL, NULL, NULL);
}
void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sock,
nbd_reply_ready, NULL, bs);
}
void nbd_client_close(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = {
.type = NBD_CMD_DISC,
.from = 0,
.len = 0
};
if (client->sock == -1) {
return;
}
nbd_send_request(client->sock, &request);
nbd_teardown_connection(bs);
}
int nbd_client_init(BlockDriverState *bs, int sock, const char *export,
Error **errp)
{
NbdClientSession *client = nbd_get_client_session(bs);
int ret;
/* NBD handshake */
logout("session init %s\n", export);
qemu_set_block(sock);
ret = nbd_receive_negotiate(sock, export,
&client->nbdflags, &client->size, errp);
if (ret < 0) {
logout("Failed to negotiate with the NBD server\n");
closesocket(sock);
return ret;
}
qemu_co_mutex_init(&client->send_mutex);
qemu_co_mutex_init(&client->free_sema);
client->sock = sock;
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
qemu_set_nonblock(sock);
nbd_client_attach_aio_context(bs, bdrv_get_aio_context(bs));
logout("Established connection with NBD server\n");
return 0;
}

View File

@@ -1,53 +0,0 @@
#ifndef NBD_CLIENT_H
#define NBD_CLIENT_H
#include "qemu-common.h"
#include "block/nbd.h"
#include "block/block_int.h"
/* #define DEBUG_NBD */
#if defined(DEBUG_NBD)
#define logout(fmt, ...) \
fprintf(stderr, "nbd\t%-24s" fmt, __func__, ##__VA_ARGS__)
#else
#define logout(fmt, ...) ((void)0)
#endif
#define MAX_NBD_REQUESTS 16
typedef struct NbdClientSession {
int sock;
uint32_t nbdflags;
off_t size;
CoMutex send_mutex;
CoMutex free_sema;
Coroutine *send_coroutine;
int in_flight;
Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
struct nbd_reply reply;
bool is_unix;
} NbdClientSession;
NbdClientSession *nbd_get_client_session(BlockDriverState *bs);
int nbd_client_init(BlockDriverState *bs, int sock, const char *export_name,
Error **errp);
void nbd_client_close(BlockDriverState *bs);
int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors);
int nbd_client_co_flush(BlockDriverState *bs);
int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);
void nbd_client_detach_aio_context(BlockDriverState *bs);
void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context);
#endif /* NBD_CLIENT_H */

Some files were not shown because too many files have changed in this diff Show More