Compare commits

..

785 Commits

Author SHA1 Message Date
Peter Maydell
0e87fdc966 Update version for v2.12.0-rc2 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-04 20:37:20 +01:00
Peter Maydell
f1a639aaf2 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Wed 04 Apr 2018 17:07:57 BST
# gpg:                using RSA key BDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block/rbd: remove processed options from qdict

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-04 17:48:18 +01:00
Jeff Cody
bfb15b4bec block/rbd: remove processed options from qdict
Commit 4bfb274 added some QAPIfication of option parsing in
qemu_rbd_open().  We need to remove all the options we processed,
otherwise in bdrv_open_inherit() we will think the remaining options are
invalid.

(This needs to go in 2.12 to avoid a regression that prevents rbd
from being opened.)

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2018-04-04 12:05:13 -04:00
Laurent Vivier
74912f6dad tcg: fix 16-byte vector operations detection
configure tries to detect if the compiler
supports 16-byte vector operations.

As stated in the comment of the detection
program, there is a problem with the system
compiler on GCC on Centos 7.

This program doesn't actually detect the problem
with GCC on RHEL7 on PPC64LE (Red Hat 4.8.5-28).

This patch updates the test to look more like
it is in QEMU helpers, and now detects the problem.

The error reported is:

  CC      ppc64-softmmu/accel/tcg/tcg-runtime-gvec.o
  ..//accel/tcg/tcg-runtime-gvec.c: In function ‘helper_gvec_shl8i’:
  ../accel/tcg/tcg-runtime-gvec.c:558:26: internal compiler error: in emit_move_insn, at expr.c:3495
           *(vec8 *)(d + i) = *(vec8 *)(a + i) << shift;
                            ^
Fixes: db43267 "tcg: Add generic vector expanders"
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Message-id: 20180328133152.24623-1-lvivier@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-04 16:23:57 +01:00
Peter Maydell
fd69ad866b Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Tue 03 Apr 2018 16:48:53 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  iotests: Test abnormally large size in compressed cluster descriptor
  qemu-iotests: Use ppc64 qemu_arch on ppc64le host
  iotests: Test preallocated truncate of 2G image
  block/file-posix: Fix fully preallocated truncate
  iotests: fix 208 for luks format
  iotests: Update 186 after commit ac64273c66
  iotests: Update 051 and 186 after commit 1454509726
  block: handle invalid lseek returns gracefully
  gluster: Fix blockdev-add with server.N.type=unix

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-04 14:00:07 +01:00
Peter Maydell
e5efa1f5f2 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Tue 03 Apr 2018 17:10:22 BST
# gpg:                using RSA key BDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  gluster: Fix blockdev-add with server.N.type=unix
  blockjob: use qapi enum helpers
  blockjob: leak fix, remove from txn when failing early

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-04 12:33:23 +01:00
Peter Maydell
094b62cd9c Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
Fix memory leaks when using object_property_get_str()

# gpg: Signature made Tue 03 Apr 2018 15:00:10 BST
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-next-pull-request:
  sev/i386: fix memory leak in sev_guest_init()
  exec: fix memory leak in find_max_supported_pagesize()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-04 11:13:52 +01:00
Peter Maydell
71ad102baa Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-2.12-pull-request' into staging
# gpg: Signature made Tue 03 Apr 2018 11:33:31 BST
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-2.12-pull-request:
  linux-user: fix TARGET___O_TMPFILE for sparc
  linux-user: define TARGET_ARCH_HAS_KA_RESTORER
  linux-user: fix alpha signal emulation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-04 09:36:14 +01:00
Peter Maydell
f7481f651c Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20180403' into staging
- fix a memory leak in the ipl code introduced with this release
- increase timeout in the bios to avoid hangs during migration (and
  rebuild bios to activate the change)

# gpg: Signature made Tue 03 Apr 2018 09:28:30 BST
# gpg:                using RSA key DECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg:                 aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg:                 aka "Cornelia Huck <cohuck@kernel.org>"
# gpg:                 aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20180403:
  pc-bios/s390-ccw: update image
  pc-bios/s390-ccw: Increase virtio timeout to 30 seconds
  hw/s390x: fix memory leak in s390_init_ipl_dev()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-03 23:23:58 +01:00
Peter Maydell
9abfc88af3 Merge remote-tracking branch 'remotes/xtensa/tags/20180402-xtensa' into staging
xtensa-specific fixes for linux-user:

- fix flushing registers for signal processing in call8 and call12 frames;
- fix PC value for restarted syscalls;
- fix sysv IPC structures;
- fix fadvise64 syscall;

generic fixes for linux-user:

- fix QEMU assertion in multithreaded application by calling cpu_copy
  under clone_lock;
- fix mq_getsetattr implementation;
- fix error propagation in clock_gettime;
- implement clock_settime.

# gpg: Signature made Mon 02 Apr 2018 18:07:08 BST
# gpg:                using RSA key 51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <filippov@cadence.com>"
# gpg:                 aka "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB  17D8 51F9 CC91 F83F A044

* remotes/xtensa/tags/20180402-xtensa:
  target/xtensa: linux-user: fix fadvise64 call
  linux-user: implement clock_settime
  linux-user: fix error propagation in clock_gettime
  target/xtensa: linux-user: fix sysv IPC structures
  linux-user: fix mq_getsetattr implementation
  linux-user: call cpu_copy under clone_lock
  target/xtensa: linux-user: rewind pc for restarted syscall
  target/xtensa: fix flush_window_regs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-03 19:02:46 +01:00
Kevin Wolf
9c1386d3ff Merge remote-tracking branch 'mreitz/tags/pull-block-2018-04-03' into queue-block
A fix for preallocated truncation, a new iotest, and a fix to make the iotests work more comfortably on ppc64

# gpg: Signature made Tue Apr  3 17:40:57 2018 CEST
# gpg:                using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* mreitz/tags/pull-block-2018-04-03:
  iotests: Test abnormally large size in compressed cluster descriptor
  qemu-iotests: Use ppc64 qemu_arch on ppc64le host
  iotests: Test preallocated truncate of 2G image
  block/file-posix: Fix fully preallocated truncate

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-04-03 17:48:45 +02:00
Alberto Garcia
abd3622cc0 iotests: Test abnormally large size in compressed cluster descriptor
L2 entries for compressed clusters have a field that indicates the
number of sectors used to store the data in the image.

That's however not the size of the compressed data itself, just the
number of sectors where that data is located. The actual data size is
usually not a multiple of the sector size, and therefore cannot be
represented with this field.

The way it works is that QEMU reads all the specified sectors and
starts decompressing the data until there's enough to recover the
original uncompressed cluster. If there are any bytes left that
haven't been decompressed they are simply ignored.

One consequence of this is that even if the size field is larger than
it needs to be QEMU can handle it just fine: it will read more data
from disk but it will ignore the extra bytes.

This test creates an image with two compressed clusters that use 5
sectors (2.5 KB) each, increases the size field to the maximum (8192
sectors, or 4 MB) and verifies that the data can be read without
problems.

This test is important because while the decompressed data takes
exactly one cluster, the maximum value allowed in the compressed size
field is twice the cluster size. So although QEMU won't produce images
with such large values we need to make sure that it can handle them.

Another effect of increasing the size field is that it can make
it include data from the following host cluster(s). In this case
'qemu-img check' will detect that the refcounts are not correct, and
we'll need to rebuild them.

Additionally, this patch also tests that decreasing the size corrupts
the image since the original data can no longer be recovered. In this
case QEMU returns an error when trying to read the compressed data,
but 'qemu-img check' doesn't see anything wrong if the refcounts are
consistent.

One possible task for the future is to make 'qemu-img check' verify
the sizes of the compressed clusters, by trying to decompress the data
and checking that the size stored in the L2 entry is correct.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180329120745.11154-1-berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-04-03 17:39:37 +02:00
Lukáš Doktor
96914159b7 qemu-iotests: Use ppc64 qemu_arch on ppc64le host
The qemu target does not always correspond to the host machine type. For
example ppc64le machine target is ppc64. Let's introduce "qemu_arch"
variable to store the matching qemu architecture related to the current
architecture and use it when auto-detecting the default qemu binary.

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Message-id: 20180329112053.5399-2-ldoktor@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-04-03 17:39:37 +02:00
Max Reitz
733d1dce0f iotests: Test preallocated truncate of 2G image
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180228131315.30194-3-mreitz@redhat.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-04-03 17:39:37 +02:00
Max Reitz
82b45e0a0b block/file-posix: Fix fully preallocated truncate
Storing the lseek() result in an int results in it overflowing when the
file is at least 2 GB big.  Then, we have a 50 % chance of the result
being "negative" and thus thinking an error occurred when actually
everything went just fine.

So we should use the correct type for storing the result: off_t.

Reported-by: Daniel P. Berrange <berrange@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1549231
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180228131315.30194-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-04-03 17:39:37 +02:00
Vladimir Sementsov-Ogievskiy
eb42e7193e iotests: fix 208 for luks format
Support luks images creatins like in 205

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-04-03 17:13:51 +02:00
Peter Maydell
13b65ec54d Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-04-02' into staging
nbd patches for 2018-04-02

- Eric Blake: nbd: Fix 32-bit compilation on BLOCK_STATUS
- Eric Blake: nbd/client: Correctly handle bad server REP_META_CONTEXT
- Eric Blake: nbd: trace meta context negotiation

# gpg: Signature made Mon 02 Apr 2018 15:15:01 BST
# gpg:                using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2018-04-02:
  nbd: trace meta context negotiation
  nbd/client: Correctly handle bad server REP_META_CONTEXT
  nbd: Fix 32-bit compilation on BLOCK_STATUS

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-03 15:59:50 +01:00
Alberto Garcia
627f607e3d iotests: Update 186 after commit ac64273c66
Commit ac64273c66 modified the output of iotest 186, changing
the QOM path of floppy drives from /machine/unattached/device[17] to
/machine/unattached/device[13].

Instead of updating the test output to reflect this change, this patch
adds a new filter that hides all QOM paths from the 'Attached to:'
line of the 'info block' command.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-04-03 16:58:48 +02:00
Alberto Garcia
242c172132 iotests: Update 051 and 186 after commit 1454509726
SCSI controllers are no longer created automatically for
-drive if=scsi, so this patch updates the tests that relied
on that.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-04-03 16:58:48 +02:00
Kevin Wolf
9dae635afa gluster: Fix blockdev-add with server.N.type=unix
The legacy command line interface gets the socket path from an option
called 'socket'. QAPI in contract uses SocketAddress, where the
corresponding option is called 'path'.

Fix the gluster block driver to accept both 'socket' and 'path', with
'path' being the preferred syntax.

https://bugzilla.redhat.com/show_bug.cgi?id=1545155

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20180403110810.25624-1-kwolf@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2018-04-03 09:57:14 -04:00
Marc-André Lureau
604343ced7 blockjob: use qapi enum helpers
QAPI generator provide #define helpers for looking up enum string.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180327153011.29569-1-marcandre.lureau@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2018-04-03 09:56:55 -04:00
Marc-André Lureau
a865cebb82 blockjob: leak fix, remove from txn when failing early
This fixes leaks found by ASAN such as:
  GTESTER tests/test-blockjob
=================================================================
==31442==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7f88483cba38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7f8845e1bd77 in g_malloc0 ../glib/gmem.c:129
    #2 0x7f8845e1c04b in g_malloc0_n ../glib/gmem.c:360
    #3 0x5584d2732498 in block_job_txn_new /home/elmarco/src/qemu/blockjob.c:172
    #4 0x5584d2739b28 in block_job_create /home/elmarco/src/qemu/blockjob.c:973
    #5 0x5584d270ae31 in mk_job /home/elmarco/src/qemu/tests/test-blockjob.c:34
    #6 0x5584d270b1c1 in do_test_id /home/elmarco/src/qemu/tests/test-blockjob.c:57
    #7 0x5584d270b65c in test_job_ids /home/elmarco/src/qemu/tests/test-blockjob.c:118
    #8 0x7f8845e40b69 in test_case_run ../glib/gtestutils.c:2255
    #9 0x7f8845e40f29 in g_test_run_suite_internal ../glib/gtestutils.c:2339
    #10 0x7f8845e40fd2 in g_test_run_suite_internal ../glib/gtestutils.c:2351
    #11 0x7f8845e411e9 in g_test_run_suite ../glib/gtestutils.c:2426
    #12 0x7f8845e3fe72 in g_test_run ../glib/gtestutils.c:1692
    #13 0x5584d270d6e2 in main /home/elmarco/src/qemu/tests/test-blockjob.c:377
    #14 0x7f8843641f29 in __libc_start_main (/lib64/libc.so.6+0x20f29)

Add an assert to make sure that the job doesn't have associated txn before free().

[Jeff Cody: N.B., used updated patch provided by John Snow]

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2018-04-03 09:56:55 -04:00
Jeff Cody
a03083a017 block: handle invalid lseek returns gracefully
In commit 223a23c198, we implemented a
workaround in the gluster driver to handle invalid values returned for
SEEK_DATA or SEEK_HOLE.

In some instances, these same invalid values can be seen in the posix
file handler as well - for example, it has been reported on FUSE gluster
mounts.

Calling assert() for these invalid values is overly harsh; we can safely
return -EIO and allow this case to be treated as a "learned nothing"
case (e.g., D4 / H4, as commented in the code).

This patch does the same thing that 223a23c198 did for gluster.c,
except in file-posix.c

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-04-03 15:25:17 +02:00
Kevin Wolf
3e4d88eabf gluster: Fix blockdev-add with server.N.type=unix
The legacy command line interface gets the socket path from an option
called 'socket'. QAPI in contract uses SocketAddress, where the
corresponding option is called 'path'.

Fix the gluster block driver to accept both 'socket' and 'path', with
'path' being the preferred syntax.

https://bugzilla.redhat.com/show_bug.cgi?id=1545155

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-04-03 15:20:36 +02:00
Laurent Vivier
3ea7f4a226 linux-user: fix TARGET___O_TMPFILE for sparc
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180402102453.9883-3-laurent@vivier.eu>
2018-04-03 11:50:24 +02:00
Laurent Vivier
5de154e82f linux-user: define TARGET_ARCH_HAS_KA_RESTORER
Sparc as an extended sigaction structure containing
the field ka_restorer used in place of sa_restorer.

Define TARGET_ARCH_HAS_KA_RESTORER and use it
with sparc.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180402102453.9883-2-laurent@vivier.eu>
2018-04-03 11:50:15 +02:00
Laurent Vivier
95a29a4e3e linux-user: fix alpha signal emulation
setup_frame() doesn't set correctly the address of the trampoline code.
The offset of retcode array must be added to the stack frame address.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180401204653.14211-1-laurent@vivier.eu>
2018-04-03 11:49:49 +02:00
Cornelia Huck
d3b6e3bb6d pc-bios/s390-ccw: update image
Contains the following commits:
- pc-bios/s390-ccw: Move string arrays from bootmap header to .c file
- pc-bios/s390-ccw: Increase virtio timeout to 30 seconds

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-04-03 10:03:38 +02:00
Thomas Huth
23bf419c1c pc-bios/s390-ccw: Increase virtio timeout to 30 seconds
The current timeout is set to only three seconds - and considering that
vring_wait_reply() or rather get_second() is not doing any rounding,
the real timeout is likely rather 2 seconds in most cases. When the
host is really badly loaded, it's possible that we hit this timeout by
mistake; it's even more likely if we run the guest in TCG mode instead
of KVM.

So let's increase the timeout to 30 seconds instead to ease this situation
(30 seconds is also the timeout that is used by the Linux SCSI subsystem
for example, so this seems to be a sane value for block IO timeout).

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1549079
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1522316251-16399-1-git-send-email-thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
[CH: tweaked commit message]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-04-03 10:03:38 +02:00
Greg Kurz
d9b06db813 hw/s390x: fix memory leak in s390_init_ipl_dev()
The string returned by object_property_get_str() is dynamically allocated.

Fixes: 3c4e9baacf
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <152231460685.69730.14860451936216690693.stgit@bahia.lan>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-04-03 10:03:38 +02:00
Greg Kurz
5d7bc72a43 sev/i386: fix memory leak in sev_guest_init()
The string returned by object_property_get_str() is dynamically allocated.

Fixes: d8575c6c02
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <152231462116.69730.14119625999092384450.stgit@bahia.lan>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-04-02 23:05:26 -03:00
Greg Kurz
72a841d2a4 exec: fix memory leak in find_max_supported_pagesize()
The string returned by object_property_get_str() is dynamically allocated.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <152231458624.69730.1752893648612848392.stgit@bahia.lan>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-04-02 23:05:15 -03:00
Eric Blake
2b53af2523 nbd: trace meta context negotiation
Having a more detailed log of the interaction between client and
server is invaluable in debugging how meta context negotiation
actually works.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180330130950.1931229-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2018-04-02 09:10:49 -05:00
Eric Blake
260e34dbb7 nbd/client: Correctly handle bad server REP_META_CONTEXT
It's never a good idea to blindly read for size bytes as
returned by the server without first validating that the size
is within bounds; a malicious or buggy server could cause us
to hang or get out of sync from reading further messages.

It may be smarter to try and teach the client to cope with
unexpected context ids by silently ignoring them instead of
hanging up on the server, but for now, if the server doesn't
reply with exactly the one context we expect, it's easier to
just give up - however, if we give up for any reason other
than an I/O failure, we might as well try to politely tell
the server we are quitting rather than continuing.

Fix some typos in the process.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180329231837.1914680-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2018-04-02 08:59:34 -05:00
Eric Blake
00d96a4612 nbd: Fix 32-bit compilation on BLOCK_STATUS
iotests 123 and 209 fail on 32-bit platforms.  The culprit:
sizeof(extent) is wrong; we want sizeof(*extent).  But since
the struct is 8 bytes, it happened to work on 64-bit platforms
where the pointer is also 8 bytes (nasty).

Fixes: 78a33ab58
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180327210517.1804242-1-eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2018-04-02 08:45:21 -05:00
Max Filippov
64a563dd8d target/xtensa: linux-user: fix fadvise64 call
fadvise64_64 on xtensa passes advice as the second argument and so must
be handled similar to PPC.

This fixes glibc testsuite tests posix/tst-posix_fadvise and
posix/tst-posix_fadvise64.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-04-02 04:15:35 -07:00
Max Filippov
12e3340c23 linux-user: implement clock_settime
This fixes glibc testsuite test rt/tst-clock2.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-04-01 14:23:17 -07:00
Max Filippov
b9f9908e2d linux-user: fix error propagation in clock_gettime
host_to_target_timespec may return error if target address could not be
locked, but it is ignored.
Propagate return value of host_to_target_timespec to the caller of
clock_gettime.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-04-01 14:22:04 -07:00
Max Filippov
a3da8be512 target/xtensa: linux-user: fix sysv IPC structures
- make target_ipc_perm fields match kernel definitions for xtensa;
- add target_semid64_ds with proper order of times and reserved fields
  for little/big endian specific for xtensa;
- add missing reserved fields after time fields to the target_shmid_ds;
- fix types of shm_cpid, shm_lpid and shm_nattch fields of
  target_shmid_ds to match kernel definitions for xtensa.

These changes fix guest ipcs output and fix glibc testsuite tests
sysvipc/test-sysvsem and sysvipc/test-sysvshm.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-04-01 14:05:23 -07:00
Max Filippov
a23ea40982 linux-user: fix mq_getsetattr implementation
mq_getsetattr implementation does not set errno correctly in case of
error. Also in the presence of both 2nd and 3rd arguments it calls both
mq_getattr and mq_setattr, whereas only the latter call would suffice.

Don't call mq_getattr in the presence of the 2nd argument. Don't copy
output back to user in case of error. Use get_errno to set errno value.

This fixes test rt/tst-mqueue2 from the glibc testsuite.

Cc: Lionel Landwerlin <lionel.landwerlin@openwide.fr>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-31 14:06:36 -07:00
Max Filippov
73a988d957 linux-user: call cpu_copy under clone_lock
cpu_copy adds newly created CPU object to container/machine/unattached,
but does it w/o proper locking. As a result when multiple threads create
threads rapidly QEMU may abort with the following message:

  GLib-CRITICAL **: g_hash_table_iter_next: assertion
  'ri->version == ri->hash_table->version' failed

  ERROR:qemu/qom/object.c:1663:object_get_canonical_path_component:
  code should not be reached

E.g. this issue is observed when running glibc test nptl/tst-eintr1.
Move cpu_copy invocation under clone_lock to fix that.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-31 14:06:36 -07:00
Max Filippov
4a6bf7adb9 target/xtensa: linux-user: rewind pc for restarted syscall
In case of syscall restart request set pc back to the syscall
instruction.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-31 14:06:35 -07:00
Max Filippov
20ef667060 target/xtensa: fix flush_window_regs
flush_window_regs uses wrong stack frame to save overflow registers in
call8 and call12 frames, which results in wrong register values in
callers of a function that received a signal.
Reimplement flush_window_regs closely following window overflow
sequence.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-31 14:06:35 -07:00
Peter Maydell
f184de7553 Merge remote-tracking branch 'remotes/riscv/tags/riscv-qemu-2.12-critical-fixes' into staging
RISC-V: Critical fixes for QEMU 2.12

This series includes changes that are considered release critical,
such as floating point register file corruption under SMP Linux
due to incorrect handling of mstatus.FS.

This workaround will be replaced with a more comprehensive fix
for mstatus.FS handling in QEMU 2.13.

# gpg: Signature made Thu 29 Mar 2018 18:22:42 BST
# gpg:                using DSA key 6BF1D7B357EF3E4F
# gpg: Good signature from "Michael Clark <michaeljclark@mac.com>"
# gpg:                 aka "Michael Clark <mjc@sifive.com>"
# gpg:                 aka "Michael Clark <michael@metaparadigm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7C99 930E B17C D8BA 073D  5EFA 6BF1 D7B3 57EF 3E4F

* remotes/riscv/tags/riscv-qemu-2.12-critical-fixes:
  RISC-V: Workaround for critical mstatus.FS bug

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-31 09:42:33 +01:00
Peter Maydell
b60d667d3d Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Fri 30 Mar 2018 04:49:42 BST
# gpg:                using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  qemu-doc: Rework the network options chapter to make "-net" less prominent

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-31 08:39:08 +01:00
Peter Maydell
eae3aace85 Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2018-03-29-1' into staging
Merge tpm 2018/03/29 v1

# gpg: Signature made Fri 30 Mar 2018 01:04:47 BST
# gpg:                using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE  C66B 75AD 6580 2A0B 4211

* remotes/stefanberger/tags/pull-tpm-2018-03-29-1:
  tests: Tests more flags of the CRB interface
  tpm: CRB: Enforce locality is requested before processing buffer
  tpm: CRB: Reset Granted flag when relinquishing locality
  tpm: CRB: set the Idle flag by default

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-30 23:05:19 +01:00
Peter Maydell
4cd327ded7 Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20180329a' into staging
Migration pull (small fixes)

A pair of two small fixes for 2.12.

# gpg: Signature made Thu 29 Mar 2018 14:55:17 BST
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20180329a:
  migration: Don't activate block devices if using -S
  migration: fix pfd leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-30 19:58:17 +01:00
Thomas Huth
abbbb0350d qemu-doc: Rework the network options chapter to make "-net" less prominent
"-net" is clearly a legacy option. Yet we still use it in almost all
examples in the qemu documentation, and many other spots in the network
chapter. We should make it less prominent that users are not lured into
using it so often anymore. So instead of starting the network chapter with
"-net nic" and documenting "-net <backend>" below "-netdev <backend>"
everywhere, all the "-net" related documentation is now moved to the end
of the chapter. The new "-nic" option is moved to the beginning of the
chapter instead, with a new example that should demonstrate how "-nic"
can be used to shortcut "-device" with "-netdev". The examples in this
chapter are changed to use the "-device" and "-netdev" options or
"-nic" instead of "-net nic -net <backend>".

While we're at it, also remove a legacy remark about very old Linux
distributions. Also remove the "[...]" from the examples in this chapter
since we are not using this ellipsis in any other examples in our docu-
mentation.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-03-30 11:44:11 +08:00
Stefan Berger
4d0d1c077e tests: Tests more flags of the CRB interface
Test and modify more flags of the CRB interface.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-29 17:41:03 -04:00
Stefan Berger
384cf1fc64 tpm: CRB: Enforce locality is requested before processing buffer
Section 5.5.3.2.2 of the CRB specs states that use of the TPM
through the localty control method must first be requested,
otherwise the command will be dropped.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-29 17:41:02 -04:00
Stefan Berger
025bc93619 tpm: CRB: Reset Granted flag when relinquishing locality
Reset the Granted flag when relinquishing a locality.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-29 17:41:02 -04:00
Stefan Berger
3a3c873502 tpm: CRB: set the Idle flag by default
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-29 17:41:02 -04:00
Michael Clark
b02403363f RISC-V: Workaround for critical mstatus.FS bug
This change is a workaround for a bug where mstatus.FS
is not correctly reporting dirty after operations that
modify floating point registers. This a critical bug
or RISC-V in QEMU as it results in floating point
register file corruption when running SMP Linux due to
task migration and possibly uniprocessor Linux if
more than one process is using the FPU.

This workaround will return dirty if mstatus.FS is
switched from off to initial or clean. According to
the specification it is legal for an implementation
to return only off, or dirty.

Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2018-03-29 10:22:26 -07:00
Dr. David Alan Gilbert
0746a92612 migration: Don't activate block devices if using -S
Activating the block devices causes the locks to be taken on
the backing file.  If we're running with -S and the destination libvirt
hasn't started the destination with 'cont', it's expecting the locks are
still untaken.

Don't activate the block devices if we're not going to autostart the VM;
'cont' already will do that anyway.

bz: https://bugzilla.redhat.com/show_bug.cgi?id=1560854
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180328170207.49512-1-dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-29 14:53:16 +01:00
Marc-André Lureau
fc6008f37a migration: fix pfd leak
Fix leak spotted by ASAN:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7fe1abb80a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7fe1aaf1bf75 in g_malloc0 ../glib/gmem.c:124
    #2 0x7fe1aaf1c249 in g_malloc0_n ../glib/gmem.c:355
    #3 0x55f4841cfaa9 in postcopy_ram_fault_thread /home/elmarco/src/qemu/migration/postcopy-ram.c:596
    #4 0x55f48479447b in qemu_thread_start /home/elmarco/src/qemu/util/qemu-thread-posix.c:504
    #5 0x7fe1a043550a in start_thread (/lib64/libpthread.so.0+0x750a)

Regression introduced with commit 00fa4fc85b.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180321113644.21899-1-marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-29 14:53:16 +01:00
Peter Maydell
47d3b60858 Merge remote-tracking branch 'remotes/riscv/tags/riscv-qemu-2.12-important-fixes' into staging
RISC-V: Important fixes for QEMU 2.12

This series includes changes that are considered important.
i.e. correct user-visible bugs that are exercised by common
operations such as -cpu list (CPU model changes) or -d in_asm
(fix for disassembly of addiw)

# gpg: Signature made Wed 28 Mar 2018 21:34:57 BST
# gpg:                using DSA key 6BF1D7B357EF3E4F
# gpg: Good signature from "Michael Clark <michaeljclark@mac.com>"
# gpg:                 aka "Michael Clark <mjc@sifive.com>"
# gpg:                 aka "Michael Clark <michael@metaparadigm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7C99 930E B17C D8BA 073D  5EFA 6BF1 D7B3 57EF 3E4F

* remotes/riscv/tags/riscv-qemu-2.12-important-fixes:
  RISC-V: Fix incorrect disassembly for addiw
  RISC-V: Convert cpu definition to future model

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-28 22:13:38 +01:00
Michael Clark
33b4f859f1 RISC-V: Fix incorrect disassembly for addiw
This fixes a bug in the disassembler constraints used
to lift instructions into pseudo-instructions, whereby
addiw instructions are always lifted to sext.w instead
of just lifting addiw with a zero immediate.

An associated fix has been made to the metadata used to
machine generate the disseasembler:

https://github.com/michaeljclark/riscv-meta/
commit/4a6b2f3898430768acfe201405224d2ea31e1477

Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Clark <mjc@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2018-03-28 11:12:02 -07:00
Michael Clark
eab1586257 RISC-V: Convert cpu definition to future model
- Model borrowed from target/sh4/cpu.c
- Rewrote riscv_cpu_list to use object_class_get_list
- Dropped 'struct RISCVCPUInfo' and used TypeInfo array
- Replaced riscv_cpu_register_types with DEFINE_TYPES
- Marked base class as abstract
- Fixes -cpu list

Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2018-03-28 11:12:02 -07:00
Peter Maydell
043289bef4 Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180328' into staging
Fix muluh_i64 and mulsh_i64 flags

# gpg: Signature made Wed 28 Mar 2018 05:46:45 BST
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20180328:
  tcg: Mark muluh_i64 and mulsh_i64 as 64-bit ops

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-28 13:30:10 +01:00
Richard Henderson
f2f1dde751 tcg: Mark muluh_i64 and mulsh_i64 as 64-bit ops
Failure to do so results in the tcg optimizer sign-extending
any constant fold from 32-bits.  This turns out to be visible
in the RISC-V testsuite using a host that emits these opcodes
(e.g. any non-x86_64).

Reported-by: Michael Clark <mjc@sifive.com>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-03-28 12:45:16 +08:00
Peter Maydell
fa3704d877 Update version for v2.12.0-rc1 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 22:04:23 +01:00
KONRAD Frederic
1bb982b8fc gdbstub: send a termination packet instead of crashing gdb
Since the commit:
commit 4486e89c21
Author: Stefan Hajnoczi <stefanha@redhat.com>
Date:   Wed Mar 7 14:42:05 2018 +0000

    vl: introduce vm_shutdown()

GDB crashes when qemu exits (at least on sparc-softmmu):
Remote communication error.  Target disconnected.: Connection reset by peer.
Quitting: putpkt: write failed: Broken pipe.

So send a packet to exit GDB before we exit QEMU:
[Inferior 1 (Thread 0) exited normally]

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Message-id: 1521538773-30802-1-git-send-email-frederic.konrad@adacore.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 21:16:27 +01:00
Peter Maydell
f55e88f2ab Merge remote-tracking branch 'remotes/ericb/tags/pull-qapi-2018-03-27-v2' into staging
qapi patches for 2018-03-27, 2.12-rc1

- Marc-André Lureau: qmp-test: fix response leak
- Eric Blake: tests: Silence false positive warning on generated test name
- Laurent Vivier: 0/4 (partial) coccinelle: re-run scripts from scripst/coccinelle
- Peter Xu: 0/8 Monitor: some oob related patches (fixes, new param, tests)
- Satheesh Rajendran: hmp.c: Revert hmp_info_cpus output format change

# gpg: Signature made Tue 27 Mar 2018 16:18:36 BST
# gpg:                using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-qapi-2018-03-27-v2:
  hmp.c: Revert hmp_info_cpus output format change
  tests: qmp-test: add test for new "x-oob"
  tests: Add parameter to qtest_init_without_qmp_handshake
  monitor: new parameter "x-oob"
  qmp: cleanup qmp queues properly
  tests: add oob-test for qapi-schema
  tests: let qapi-schema tests detect oob
  qapi: restrict allow-oob value to be "true"
  qmp: fix qmp_capabilities error regression
  qdict: remove useless cast
  error: Remove NULL checks on error_propagate() calls
  error: Strip trailing '\n' from error string arguments (again again)
  tests: Silence false positive warning on generated test name
  qmp-test: fix response leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 19:20:57 +01:00
Peter Maydell
6cf38cbf29 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Tue 27 Mar 2018 15:41:11 BST
# gpg:                using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  MAINTAINERS: add include/block/aio-wait.h
  coroutine: add test-aio coroutine queue chaining test case
  coroutine: avoid co_queue_wakeup recursion
  queue: add QSIMPLEQ_PREPEND()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 17:11:33 +01:00
Peter Maydell
dfe732fb68 Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging
# gpg: Signature made Tue 27 Mar 2018 05:56:19 BST
# gpg:                using RSA key 7DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/ide-pull-request:
  macio: fix NULL pointer dereference when issuing IDE trim
  ide: fix invalid TRIM range abortion for macio

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 16:25:12 +01:00
Satheesh Rajendran
0dfddbb537 hmp.c: Revert hmp_info_cpus output format change
Commit 137b5cb6 refactored 'info cpus' output, changing
'thread_id' to 'thread-id'.  While HMP is not a stable
interface, it is trivial to keep the spelling consistent
for test frameworks that have not yet updated to using QMP.

This patch just reverts back output format to 'thread_id'.

CC: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Message-Id: <20180327123800.28851-1-sathnaga@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
[eblake: improve commit message]
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:48 -05:00
Peter Xu
fa198ad9bd tests: qmp-test: add test for new "x-oob"
Test the new OOB capability. It's mostly the reverted OOB test
(see commit 4fd78ad7), but differs in that:

- It uses the new qtest_init_without_qmp_handshake() parameter to
  create the monitor with "x-oob"
- Squashed the capability tests on greeting message
- Don't use qtest_global any more, instead use self-maintained
  QTestState, which is the trend

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180326063901.27425-9-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qtest_init changes]
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:45 -05:00
Eric Blake
ddee57e017 tests: Add parameter to qtest_init_without_qmp_handshake
Allow callers to choose whether to allow OOB support during a test;
for now, all existing callers pass false, but the next patch will
add a new caller.  Also, rewrite the monitor setup to be generic
(using the -qmp shorthand is insufficient for honoring the parameter).

Based on an idea by Peter Xu, in <20180326063901.27425-8-peterx@redhat.com>

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180327013620.1644387-4-eblake@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-27 10:17:45 -05:00
Peter Xu
be933ffc23 monitor: new parameter "x-oob"
Add new parameter to optionally enable Out-Of-Band for a QMP server.

An example command line:

  ./qemu-system-x86_64 -chardev stdio,id=char0 \
                       -mon chardev=char0,mode=control,x-oob=on

By default, Out-Of-Band is off.

It is not allowed if either MUX or non-QMP is detected, since
Out-Of-Band is currently only for QMP, and non-MUX chardev backends.

Note that the client STILL has to request 'oob' during qmp_capabilities;
in part because the x-oob command line option may disappear in the
future if we decide the capabilities negotiation is sufficient.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180326063901.27425-4-peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: enhance commit message]
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:45 -05:00
Peter Xu
6d2d563f8c qmp: cleanup qmp queues properly
Marc-André Lureau reported that we can have this happen:

1. client1 connects, send command C1
2. client1 disconnects before getting response for C1
3. client2 connects, who might receive response of C1

However client2 should not receive remaining responses for client1.

Basically, we should clean up the request/response queue elements when:

- after a session is closed
- before destroying the queues

Some helpers are introduced to achieve that.  We need to make sure we're
with the lock when operating on those queues.  This also needed the
declaration of QMPRequest moved earlier.

Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180326063901.27425-3-peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: drop pointless qmp_response_free(), drop queue flush on connect
since a clean queue on disconnect is sufficient]
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:45 -05:00
Peter Xu
1a1b11dc0f tests: add oob-test for qapi-schema
It simply tests the new OOB capability, and make sure the QAPISchema can
parse it correctly.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180326063901.27425-7-peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:45 -05:00
Peter Xu
4bebca1e42 tests: let qapi-schema tests detect oob
The allow_oob parameter was passed in but not used in tests.  Now
reflect that in the tests, so we need to touch up other command testers
with that new change.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180326063901.27425-6-peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:45 -05:00
Peter Xu
9408860165 qapi: restrict allow-oob value to be "true"
It was missed in the first version of OOB series.  We should check this
to make sure we throw the right error when fault value is passed in.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180326063901.27425-5-peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:45 -05:00
Peter Xu
9ddb7456c8 qmp: fix qmp_capabilities error regression
When someone sends a command before QMP handshake, the error used to be
like this:

 {"execute": "query-cpus"}
 {"error": {"class": "CommandNotFound", "desc":
            "Expecting capabilities negotiation with 'qmp_capabilities'"}}

While after cf869d5317 it becomes:

 {"execute": "query-cpus"}
 {"error": {"class": "CommandNotFound", "desc":
            "The command query-cpus has not been found"}}

Fix it back to the nicer one.

Fixes: cf869d5317 ("qmp: support out-of-band (oob) execution", 2018-03-19)
Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180326063901.27425-2-peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: commit message grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:34 -05:00
Laurent Vivier
625eaca9e5 qdict: remove useless cast
Re-run Coccinelle script scripts/coccinelle/qobject.cocci

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20180323143202.28879-5-lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:32 -05:00
Laurent Vivier
710c263407 error: Remove NULL checks on error_propagate() calls
Re-run Coccinelle patch
scripts/coccinelle/error_propagate_null.cocci

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20180323143202.28879-4-lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:32 -05:00
Laurent Vivier
2d9178d90f error: Strip trailing '\n' from error string arguments (again again)
Re-run Coccinelle script scripts/coccinelle/err-bad-newline.cocci,
and found new error_report() occurrences with '\n'.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20180323143202.28879-3-lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:32 -05:00
Eric Blake
fdf235ba15 tests: Silence false positive warning on generated test name
Running 'make check' on rawhide with gcc 8.0.1 fails:

tests/test-visitor-serialization.c: In function 'main':
tests/test-visitor-serialization.c:1127:34: error: '/primitives/' directive writing 12 bytes into a region of size between 1 and 128 [-Werror=format-overflow=]

The warning is a false positive (we have two buffers of size 128,
so yes, if we FULLY used the first buffer, then sprint'ing it into
the second will overflow the second).  But in practice, our first
buffer will not be longer than "/visitor/serialization/String",
so sizing it smaller is enough to let gcc see that we don't
overflow the second.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180323204341.1501664-1-eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-27 10:17:32 -05:00
Marc-André Lureau
fa15cf8b5c qmp-test: fix response leak
Apparently introduced in commit a4f90923b5.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180326172041.21009-1-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-27 10:17:32 -05:00
Peter Maydell
62d0289662 Merge remote-tracking branch 'remotes/xtensa/tags/20180326-xtensa' into staging
target/xtensa fixes for 2.12:

- add .inc. to non-top level source file names under target/xtensa;
- fix #include <xtensa-isa.h> in the import_core.sh script;
- remove stray linux-user/xtensa/syscall.h;
- fix timers test.

# gpg: Signature made Mon 26 Mar 2018 22:40:20 BST
# gpg:                using RSA key 51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <filippov@cadence.com>"
# gpg:                 aka "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB  17D8 51F9 CC91 F83F A044

* remotes/xtensa/tags/20180326-xtensa:
  target/xtensa: fix timers test
  linux-user/xtensa: remove stray syscall.h
  target/xtensa/import_core.sh: fix #include <xtensa-isa.h>
  target/xtensa: add .inc. to non-top level source file names

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 15:23:38 +01:00
Peter Maydell
bdc408e91b Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2018-03-26' into staging
A fix for dirty bitmap migration through shared storage, and a VMDK
patch keeping us from creating too large extents.

# gpg: Signature made Mon 26 Mar 2018 21:17:05 BST
# gpg:                using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2018-03-26:
  vmdk: return ERROR when cluster sector is larger than vmdk limitation
  iotests: enable shared migration cases in 169
  qcow2: fix bitmaps loading when bitmaps already exist
  qcow2-bitmap: add qcow2_reopen_bitmaps_rw_hint()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 14:11:30 +01:00
Stefan Hajnoczi
f5a53faad4 MAINTAINERS: add include/block/aio-wait.h
The include/block/aio-wait.h header file was added by commit
7719f3c968 ("block: extract
AIO_WAIT_WHILE() from BlockDriverState") without updating MAINTAINERS.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180312132204.23683-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-27 13:05:48 +01:00
Stefan Hajnoczi
35111583aa coroutine: add test-aio coroutine queue chaining test case
Check that two coroutines can queue each other repeatedly without
hitting stack exhaustion.

Switch to qemu_init_main_loop() in main() because coroutines use
qemu_get_aio_context() - they don't know about test-aio's ctx variable.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180322152834.12656-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-27 13:05:28 +01:00
Stefan Hajnoczi
c40a254570 coroutine: avoid co_queue_wakeup recursion
qemu_aio_coroutine_enter() is (indirectly) called recursively when
processing co_queue_wakeup.  This can lead to stack exhaustion.

This patch rewrites co_queue_wakeup in an iterative fashion (instead of
recursive) with bounded memory usage to prevent stack exhaustion.

qemu_co_queue_run_restart() is inlined into qemu_aio_coroutine_enter()
and the qemu_coroutine_enter() call is turned into a loop to avoid
recursion.

There is one change that is worth mentioning:  Previously, when
coroutine A queued coroutine B, qemu_co_queue_run_restart() entered
coroutine B from coroutine A.  If A was terminating then it would still
stay alive until B yielded.  After this patch B is entered by A's parent
so that a A can be deleted immediately if it is terminating.

It is safe to make this change since B could never interact with A if it
was terminating anyway.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180322152834.12656-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-27 13:05:28 +01:00
Stefan Hajnoczi
67a74148d8 queue: add QSIMPLEQ_PREPEND()
QSIMPLEQ_CONCAT(a, b) joins a = a + b.  The new QSIMPLEQ_PREPEND(a, b)
API joins a = b + a.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180322152834.12656-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-27 13:05:28 +01:00
Peter Maydell
78a0f82bf7 Merge remote-tracking branch 'remotes/rth/tags/pull-hppa-20180327' into staging
Fix glibc 2.27 for hppa-linux-user

# gpg: Signature made Mon 26 Mar 2018 17:48:19 BST
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-hppa-20180327:
  target/hppa: Include priv level in user-only iaoq

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 11:54:24 +01:00
Peter Maydell
f58d9620aa Merge remote-tracking branch 'remotes/rth/tags/pull-dt-20180326' into staging
Fix a decodetree problem with 16-bit insns

# gpg: Signature made Mon 26 Mar 2018 15:35:04 BST
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-dt-20180326:
  scripts/decodetree: Fix insnmask not marked as global in main()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-27 10:27:34 +01:00
Mark Cave-Ayland
eb69953ecb macio: fix NULL pointer dereference when issuing IDE trim
Commit ef0e64a983 "ide: pass IDEState to trim AIO callback" changed the
IDE trim callback from using a BlockBackend to an IDEState but forgot to update
the dma_blk_io() call in hw/ide/macio.c accordingly.

Without this fix qemu-system-ppc segfaults when issuing an IDE trim command on
any of the PPC Mac machines (easily triggered by running the Debian installer).

Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Message-id: 20180223184700.28854-1-mark.cave-ayland@ilande.co.uk
Signed-off-by: John Snow <jsnow@redhat.com>
2018-03-27 00:38:00 -04:00
Anton Nefedov
caeadbc8ba ide: fix invalid TRIM range abortion for macio
commit 947858b0 "ide: abort TRIM operation for invalid range"
is incorrect for macio; just ide_dma_error() without doing a callback
is not enough for that errorpath.

Instead, pass -EINVAL to the callback and handle it there
(see related motivation for read/write in 58ac32113).

It will however catch possible EINVAL from the block layer too.

Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-id: 1520010495-58172-1-git-send-email-anton.nefedov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2018-03-27 00:38:00 -04:00
Max Filippov
d0ce7e9cfc target/xtensa: fix timers test
The value of CCOUNT special register is calculated as time elapsed
since CCOUNT == 0 multiplied by the core frequency. In icount mode time
increment between consecutive instructions that don't involve time
warps is constant, but unless the result of multiplication of this
constant by the core frequency is a whole number the CCOUNT increment
between these instructions may not be constant. E.g. with icount=7 each
instruction takes 128ns, with core clock of 10MHz CCOUNT values for
consecutive instructions are:

  502: (128 * 502 * 10000000) / 1000000000 = 642.56
  503: (128 * 503 * 10000000) / 1000000000 = 643.84
  504: (128 * 504 * 10000000) / 1000000000 = 645.12

I.e.the CCOUNT increments depend on the absolute time. This results in
varying CCOUNT differences for consecutive instructions in tests that
involve time warps and don't set CCOUNT explicitly.

Change frequency of the core used in tests so that clock cycle takes
exactly 64ns. Change icount power used in tests to 6, so that each
instruction takes exactly 1 clock cycle. With these changes CCOUNT
increments only depend on the number of executed instructions and that's
what timer tests expect, so they work correctly.

Longer story:
  http://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg04326.html

Cc: Pavel Dovgaluk <Pavel.Dovgaluk@ispras.ru>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-26 14:17:04 -07:00
Max Filippov
12ab0b33f1 linux-user/xtensa: remove stray syscall.h
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-26 14:17:03 -07:00
Max Filippov
2745c3bbf3 target/xtensa/import_core.sh: fix #include <xtensa-isa.h>
Change #include <xtensa-isa.h> to #include "xtensa-isa.h" in imported
files to make references to local files consistent.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-26 14:17:03 -07:00
Max Filippov
dda2441b2b target/xtensa: add .inc. to non-top level source file names
Fix definitions of existing cores and core importing script to follow
the rule of naming non-top level source files.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-26 14:17:03 -07:00
yuchenlin
a77672ea3d vmdk: return ERROR when cluster sector is larger than vmdk limitation
VMDK has a hard limitation of extent size, which is due to the size of grain
table entry is 32 bits. It means it can only point to a grain located at
offset = 2^32. To avoid writing the user data beyond limitation and record a useless offset
in grain table. We should return ERROR here.

Signed-off-by: yuchenlin <yuchenlin@synology.com>
Message-id: 20180322133337.28024-1-yuchenlin@synology.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-26 21:17:24 +02:00
Vladimir Sementsov-Ogievskiy
f7640f0dbc iotests: enable shared migration cases in 169
Shared migration for dirty bitmaps is fixed by previous patches,
so we can enable the test.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180320170521.32152-5-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-26 21:17:24 +02:00
Vladimir Sementsov-Ogievskiy
2d949dfcef qcow2: fix bitmaps loading when bitmaps already exist
On reopen with existing bitmaps, instead of loading bitmaps, lets
reopen them if needed. This also fixes bitmaps migration through
shared storage.
Consider the case. Persistent bitmaps are stored on bdrv_inactivate.
Then, on destination process_incoming_migration_bh() calls
bdrv_invalidate_cache_all() which leads to
qcow2_load_autoloading_dirty_bitmaps() which fails if bitmaps are
already loaded on destination start.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180320170521.32152-3-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-26 21:17:24 +02:00
Vladimir Sementsov-Ogievskiy
b1336cc2ec qcow2-bitmap: add qcow2_reopen_bitmaps_rw_hint()
Add version of qcow2_reopen_bitmaps_rw, which do the same work but
also return a hint about was header updated or not. This will be
used in the following fix for bitmaps reloading after migration.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180320170521.32152-2-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-26 21:17:24 +02:00
Peter Maydell
62a2b55e8d Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Mon 26 Mar 2018 15:33:01 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  qemu-iotests: Test vhdx image creation with QMP
  vhdx: Check for 4 GB maximum log size on creation
  vhdx: Don't use error_setg_errno() with constant errno
  vhdx: Require power-of-two block size on create
  qemu-iotests: Test parallels image creation with QMP
  parallels: Check maximum cluster size on create
  qemu-iotests: Test invalid resize on luks
  luks: Turn another invalid assertion into check
  qemu-iotests: Enable 025 for luks
  qemu-iotests: Test vdi image creation with QMP
  vdi: Fix build with CONFIG_VDI_DEBUG
  vdi: Change 'static' create option to 'preallocation' in QMP
  qcow2: Reset free_cluster_index when allocating a new refcount block
  include/block/block_int: Document protocol related functions
  block/blkreplay: Remove protocol-related fields
  block/throttle: Remove protocol-related fields
  block/quorum: Remove protocol-related fields
  block/replication: Remove protocol_name field
  iotests: 163 is not quick

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-26 17:13:31 +01:00
Richard Henderson
ebd0e15114 target/hppa: Include priv level in user-only iaoq
A recent glibc change relies on the fact that the iaoq must be 3,
and computes an address based on that.  QEMU had been ignoring the
priv level for user-only, which produced an incorrect address.

Reported-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-03-26 22:56:57 +08:00
Bastian Koppelmann
83d7c40c92 scripts/decodetree: Fix insnmask not marked as global in main()
if '-w 16' was given as a cmdline args a local copy of insnmask
is set and not the global one.

Signed-off-by: Peer Adelt <peer.adelt@hni.uni-paderborn.de>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20180319115846.9662-1-kbastian@mail.uni-paderborn.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-03-26 22:34:07 +08:00
Peter Maydell
7b93d78a04 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Miscellaenous bugfixes, including crash fixes from Alexey, Peter M. and
Thomas.

# gpg: Signature made Mon 26 Mar 2018 13:37:38 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  qemu-pr-helper: Actually allow users to specify pidfile
  chardev/char-fe: Allow NULL chardev in qemu_chr_fe_init()
  iothread: fix breakage on windows
  scsi: turn "is this a SCSI device?" into a conditional hint
  chardev-socket: remove useless if
  tcg: Really fix cpu_io_recompile
  vhost-user-test: add back memfd check
  vhost-user-test: do not hang if chardev creation failed
  scripts/device-crash-test: Remove fixed isapc-with-iommu entry
  hw/audio: Fix crashes when devices are used on ISA bus without DMA
  fdc: Exit if ISA controller does not support DMA
  hw/net/can: Fix segfaults when using the devices without bus
  WHPX improve vcpu_post_run perf
  WHPX fix WHvSetPartitionProperty in PropertyCode
  WHPX fix WHvGetCapability out WrittenSizeInBytes
  scripts/get_maintainer.pl: Print proper error message for missing $file

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-26 15:17:25 +01:00
Michal Privoznik
f8e1a98964 qemu-pr-helper: Actually allow users to specify pidfile
Due to wrong specification of arguments to getopt_long() any
attempt to set pidfile resulted in:

1) the default to be leaked
2) the @pidfile variable to be set to NULL (because optarg is
NULL without this patch).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-Id: <6f10cd53d361a395aa0e85a9311ec4e9a8fc11e5.1521868451.git.mprivozn@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:15 +02:00
Peter Maydell
12051d82f0 chardev/char-fe: Allow NULL chardev in qemu_chr_fe_init()
All the functions in char-fe.c handle the CharBackend
having a NULL Chardev pointer, which means that the
backend exists but is not connected to anything. The
exception is qemu_chr_fe_init(), which will crash if
passed a NULL Chardev pointer argument. This can happen
for various boards if they're started with 'nodefaults':
 arm-softmmu/qemu-system-arm -S -nodefaults -M cubieboard
 riscv32-softmmu/qemu-system-riscv32 -nodefaults -M sifive_e

Make qemu_chr_fe_init() accept a NULL chardev. This allows
UART models to handle NULL chardev properties without
generally needing to special case them or to manually
create a NullChardev.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180323152948.27048-1-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:15 +02:00
Peter Xu
90c558beca iothread: fix breakage on windows
OOB can enable iothread for parsing even on Windows.  We need some tunes
to enable that on Windows otherwise it'll break Windows users.  This
patch fixes the breakage on Windows with qemu-system-ppc.exe.

Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Tested-by: Howard Spoelstra <hsp.cat7@gmail.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180322085630.23654-1-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:15 +02:00
Paolo Bonzini
09c2c6ffda scsi: turn "is this a SCSI device?" into a conditional hint
If the user does not have permissions to send ioctls to the device (due to
SELinux or cgroups, for example), the output can look like

qemu-kvm: -device scsi-block,drive=disk: cannot get SG_IO version number:
  Operation not permitted.  Is this a SCSI device?

but this is confusing because the ioctl was blocked _before_ the device
even received the SG_GET_VERSION_NUM ioctl.  Therefore, for EPERM errors
the suggestion should be eliminated.  To make that simpler, change the
code to use error_append_hint.

Reported-by: Ala Hino <ahino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:15 +02:00
Paolo Bonzini
ff82fab792 chardev-socket: remove useless if
This trips Coverity, which believes the subsequent qio_channel_create_watch
can dereference a NULL pointer.  In reality, tcp_chr_connect's callers
all have s->ioc properly initialized, since they are all rooted at
tcp_chr_new_client.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:14 +02:00
Richard Henderson
87f963be66 tcg: Really fix cpu_io_recompile
We have confused the number of instructions that have been
executed in the TB with the number of instructions needed
to repeat the I/O instruction.

We have used cpu_restore_state_from_tb, which means that
the guest pc is pointing to the I/O instruction.  The only
time the answer to the later question is not 1 is when
MIPS or SH4 need to re-execute the branch for the delay
slot as well.

We must rely on cpu->cflags_next_tb to generate the next TB,
as otherwise we have a race condition with other guest cpus
within the TB cache.

Fixes: 0790f86861
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180319031545.29359-1-richard.henderson@linaro.org>
Tested-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:14 +02:00
Marc-André Lureau
8e029fd64e vhost-user-test: add back memfd check
This revert commit fb68096da3, and
modify test_read_guest_mem() to use different chardev names, when
using memfd (_test_server_free(), where the chardev is removed, runs
in idle).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180215212552.26997-4-marcandre.lureau@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:13 +02:00
Marc-André Lureau
642e065a15 vhost-user-test: do not hang if chardev creation failed
Before the chardev name fix, the following error may happen: "attempt
to add duplicate property 'chr-test' to object (type 'container')",
due to races.

Sadly, error_vprintf() uses g_test_message(), so you have to use
read the cryptic --debug-log to see it. Later, it would make sense to
use g_critical() instead, and catch errors with
g_test_expect_message() (in glib 2.34).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180215212552.26997-5-marcandre.lureau@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:13 +02:00
Thomas Huth
6ff8d9b03a scripts/device-crash-test: Remove fixed isapc-with-iommu entry
Fixed in a0c167a184 ("x86_iommu: check
if machine has PCI bus").

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1521193892-15552-5-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:13 +02:00
Thomas Huth
c9073238fc hw/audio: Fix crashes when devices are used on ISA bus without DMA
The cs4231a, gus and sb16 sound cards crash QEMU when the user tries
to instantiate them on a machine with DMA-less ISA bus (for example
with "qemu-system-mips64el -M mips -device sb16"). Add proper checks
to the realize functions to avoid the crashes.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1521193892-15552-4-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:13 +02:00
Alexey Kardashevskiy
b3da551389 fdc: Exit if ISA controller does not support DMA
A "powernv" machine type defines an ISA bus but it does not add any DMA
controller to it so it is possible to hit assert(fdctrl->dma) by
adding "-machine powernv -device isa-fdc".

This replaces assert() with an error message.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[thuth: Slightly adjusted error message and updated scripts/device-crash-test]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1521193892-15552-3-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:13 +02:00
Thomas Huth
089eac81e1 hw/net/can: Fix segfaults when using the devices without bus
The CAN devices can currently be used to crash QEMU, e.g.:

$ x86_64-softmmu/qemu-system-x86_64 -device kvaser_pci
Segmentation fault (core dumped)

So we've got to add a proper check here that the corresponding
bus is available.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1521193892-15552-2-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:13 +02:00
Justin Terry (VM)
4e286099fe WHPX improve vcpu_post_run perf
This removes the additional call to WHvGetVirtualProcessorRegisters in
whpx_vcpu_post_run now that the WHV_VP_EXIT_CONTEXT is returned in all
WHV_RUN_VP_EXIT_CONTEXT structures.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1521039163-138-4-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:12 +02:00
Justin Terry (VM)
60168541da WHPX fix WHvSetPartitionProperty in PropertyCode
This fixes a breaking change to WHvSetPartitionProperty to pass the 'in'
PropertyCode on function invocation introduced in Windows Insider SDK 17110.
Usage of this indicates the PropertyCode of the opaque PropertyBuffer passed in
on function invocation.

Also fixes the removal of the PropertyCode parameter from the
WHV_PARTITION_PROPERTY struct as it is now passed to the function directly.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1521039163-138-3-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:12 +02:00
Justin Terry (VM)
3907e6318e WHPX fix WHvGetCapability out WrittenSizeInBytes
This fixes a breaking change to WHvGetCapability to include the 'out'
WrittenSizeInBytes introduced in Windows Insider SDK 17110.

This specifies on return the safe length to read into the WHV_CAPABILITY
structure passed to the call.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1521039163-138-2-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-26 14:37:12 +02:00
Ian Jackson
36b4cf1934 scripts/get_maintainer.pl: Print proper error message for missing $file
If you pass scripts/get_maintainer.pl the name of a FIFO or other
exciting object (/dev/stdin, for example), it would falsely print
"file not found".  Instead: stat the object rather than using -f so
that we do not mind if the object is not a file; and print the errno
value in the error message.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Thomas Huth <thuth@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <1520535787-6223-13-git-send-email-ian.jackson@eu.citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
2018-03-26 14:37:12 +02:00
Kevin Wolf
0b7e7f6681 qemu-iotests: Test vhdx image creation with QMP
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-26 12:17:43 +02:00
Kevin Wolf
6f16f7c562 vhdx: Check for 4 GB maximum log size on creation
It's unclear what the real maximum is, but we use an uint32_t to store
the log size in vhdx_co_create(), so we should check that the given
value fits in 32 bits.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-03-26 12:17:43 +02:00
Kevin Wolf
0fcc38e7d0 vhdx: Don't use error_setg_errno() with constant errno
error_setg_errno() is meant for cases where we got an errno from the OS
that can add useful extra information to an error message. It's
pointless if we pass a constant errno, these cases should use plain
error_setg().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-03-26 12:17:43 +02:00
Kevin Wolf
b412f49407 vhdx: Require power-of-two block size on create
Images with a non-power-of-two block size are invalid and cannot be
opened. Reject such block sizes when creating an image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-03-26 12:17:43 +02:00
Kevin Wolf
e8f6ea6fb6 qemu-iotests: Test parallels image creation with QMP
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-26 12:17:43 +02:00
Kevin Wolf
2332d82589 parallels: Check maximum cluster size on create
It's unclear what the real maximum cluster size is for the Parallels
format, but let's at least make sure that we don't get integer
overflows in our .bdrv_co_create implementation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-26 12:17:43 +02:00
Kevin Wolf
50880f25c8 qemu-iotests: Test invalid resize on luks
This tests that the .bdrv_truncate implementation for luks doesn't crash
for invalid image sizes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-26 12:17:40 +02:00
Kevin Wolf
120bc742c0 luks: Turn another invalid assertion into check
Commit e39e959e fixed an invalid assertion in the .bdrv_length
implementation, but left a similar assertion in place for
.bdrv_truncate. Instead of crashing when the user requests a too large
image size, fail gracefully.

A file size of exactly INT64_MAX caused failure before, but is actually
legal.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-26 12:17:40 +02:00
Kevin Wolf
633c175f8c qemu-iotests: Enable 025 for luks
We want to test resizing even for luks. The only change that is needed
is to explicitly zero out new space for luks because it's undefined.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-26 12:17:40 +02:00
Kevin Wolf
b7de0777dc qemu-iotests: Test vdi image creation with QMP
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-26 12:17:35 +02:00
Kevin Wolf
95a14d51b2 vdi: Fix build with CONFIG_VDI_DEBUG
Use qemu_uuid_unparse() instead of uuid_unparse() to make vdi.c compile
again when CONFIG_VDI_DEBUG is set. In order to prevent future bitrot,
replace '#ifdef CONFIG_VDI_DEBUG' by 'if (VDI_DEBUG)' so that the
compiler always sees the code.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-26 12:16:12 +02:00
Kevin Wolf
61fa64871d vdi: Change 'static' create option to 'preallocation' in QMP
What static=on really does is what we call metadata preallocation for
other block drivers. While we can still change the QMP interface, make
it more consistent by using 'preallocation' for VDI, too.

This doesn't implement any new functionality, so the only supported
preallocation modes are 'off' and 'metadata' for now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-26 12:16:12 +02:00
Alberto Garcia
abf754fe40 qcow2: Reset free_cluster_index when allocating a new refcount block
When we try to allocate new clusters we first look for available ones
starting from s->free_cluster_index and once we find them we increase
their reference counts. Before we get to call update_refcount() to do
this last step s->free_cluster_index is already pointing to the next
cluster after the ones we are trying to allocate.

During update_refcount() it may happen however that we also need to
allocate a new refcount block in order to store the refcounts of these
new clusters (and to complicate things further that may also require
us to grow the refcount table). After all this we don't know if the
clusters that we originally tried to allocate are still available, so
we return -EAGAIN to ask the caller to restart the search for free
clusters.

This is what can happen in a common scenario:

  1) We want to allocate a new cluster and we see that cluster N is
     free.

  2) We try to increase N's refcount but all refcount blocks are full,
     so we allocate a new one at N+1 (where s->free_cluster_index was
     pointing at).

  3) Once we're done we return -EAGAIN to look again for a free
     cluster, but now s->free_cluster_index points at N+2, so that's
     the one we allocate. Cluster N remains unallocated and we have a
     hole in the qcow2 file.

This can be reproduced easily:

     qemu-img create -f qcow2 -o cluster_size=512 hd.qcow2 1M
     qemu-io -c 'write 0 124k' hd.qcow2

After this the image has 132608 bytes (256 clusters), and the refcount
block is full. If we write 512 more bytes it should allocate two new
clusters: the data cluster itself and a new refcount block.

     qemu-io -c 'write 124k 512' hd.qcow2

However the image has now three new clusters (259 in total), and the
first one of them is empty (and unallocated):

     dd if=hd.qcow2 bs=512c skip=256 count=1 | hexdump -C

If we write larger amounts of data in the last step instead of the 512
bytes used in this example we can create larger holes in the qcow2
file.

What this patch does is reset s->free_cluster_index to its previous
value when alloc_refcount_block() returns -EAGAIN. This way the caller
will try to allocate again the original clusters if they are still
free.

The output of iotest 026 also needs to be updated because now that
images have no holes some tests fail at a different point and the
number of leaked clusters is different.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-26 12:16:00 +02:00
Fabiano Rosas
1e486cf30a include/block/block_int: Document protocol related functions
Clarify that:

- for protocols the brdv_file_open function is used instead
of bdrv_open;

- when protocol_name is set, a driver should expect
to be given only a filename and no other options.

Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-26 12:16:00 +02:00
Fabiano Rosas
8140e786f0 block/blkreplay: Remove protocol-related fields
The blkreplay driver is not a protocol so it should implement bdrv_open
instead of bdrv_file_open and not provide a protocol_name.

Attempts to invoke this driver using protocol syntax
(i.e. blkreplay:<filename:options:...>) will now fail gracefully:

  $ qemu-img info blkreplay:foo
  qemu-img: Could not open 'blkreplay:foo': Unknown protocol 'blkreplay'

Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
Reviewed-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-26 12:16:00 +02:00
Fabiano Rosas
a7328ba55f block/throttle: Remove protocol-related fields
The throttle driver is not a protocol so it should implement bdrv_open
instead of bdrv_file_open and not provide a protocol_name.

Attempts to invoke this driver using protocol syntax
(i.e. throttle:<filename:options:...>) will now fail gracefully:

  $ qemu-img info throttle:foo
  qemu-img: Could not open 'throttle:foo': Unknown protocol 'throttle'

Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-26 12:16:00 +02:00
Fabiano Rosas
65d2c3e2f6 block/quorum: Remove protocol-related fields
The quorum driver is not a protocol so it should implement bdrv_open
instead of bdrv_file_open and not provide a protocol_name.

Attempts to invoke this driver using protocol syntax
(i.e. quorum:<filename:options:...>) will now fail gracefully:

  $ qemu-img info quorum:foo
  qemu-img: Could not open 'quorum:foo': Unknown protocol 'quorum'

Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-26 12:16:00 +02:00
Fabiano Rosas
cb83d2efe1 block/replication: Remove protocol_name field
The protocol_name field is used when selecting a driver via protocol
syntax (i.e. <protocol_name>:<filename:options:...>). Drivers that are
only selected explicitly (e.g. driver=replication,mode=primary,...)
should not have a protocol_name.

This patch removes the protocol_name field from the brdv_replication
structure so that attempts to invoke this driver using protocol syntax
will fail gracefully:

  $ qemu-img info replication:foo
  qemu-img: Could not open 'replication:': Unknown protocol 'replication'

Buglink: https://bugs.launchpad.net/qemu/+bug/1726733
Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-26 12:16:00 +02:00
Eric Blake
71b74b2544 iotests: 163 is not quick
Testing on ext4, most 'quick' qcow2 tests took less than 5 seconds,
but 163 took more than 20.  Let's remove it from the quick set.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-26 12:16:00 +02:00
Peter Maydell
2ffd221d07 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 26 Mar 2018 07:53:27 BST
# gpg:                using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net/vde: print error on vde_open() failure
  virtio_net: flush uncompleted TX on reset

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-26 11:02:50 +01:00
Julia Suvorova via Qemu-devel
7587855cd2 net/vde: print error on vde_open() failure
Despite the fact that now when the initialization of vde fails, qemu
does not end silently, no informative error is printed. The patch
generates an error and pushes it through the calling function.

Related bug: https://bugs.launchpad.net/qemu/+bug/676029

Signed-off-by: Julia Suvorova <jusual@mail.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-03-26 14:52:43 +08:00
Greg Kurz
94b52958b7 virtio_net: flush uncompleted TX on reset
If the backend could not transmit a packet right away for some reason,
the packet is queued for asynchronous sending. The corresponding vq
element is tracked in the async_tx.elem field of the VirtIONetQueue,
for later freeing when the transmission is complete.

If a reset happens before completion, virtio_net_tx_complete() will push
async_tx.elem back to the guest anyway, and we end up with the inuse flag
of the vq being equal to -1. The next call to virtqueue_pop() is then
likely to fail with "Virtqueue size exceeded".

This can be reproduced easily by starting a guest with an hubport backend
that is not connected to a functional network, eg,

 -device virtio-net-pci,netdev=hub0 -netdev hubport,id=hub0,hubid=0

and no other -netdev hubport,hubid=0 on the command line.

The appropriate fix is to ensure that such an asynchronous transmission
cannot survive a device reset. So for all queues, we first try to send
the packet again, and eventually we purge it if the backend still could
not deliver it.

CC: qemu-stable@nongnu.org
Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Buglink: https://github.com/open-power-host-os/qemu/issues/37
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-03-26 14:49:17 +08:00
Peter Maydell
7b1db0908d Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180323' into staging
target-arm queue:
 * arm/translate-a64: don't lose interrupts after unmasking via write to DAIF
 * sdhci: fix incorrect use of Error *
 * hw/intc/arm_gicv3: Fix secure-GIC NS ICC_PMR and ICC_RPR accesses
 * hw/arm/bcm2836: Use the Cortex-A7 instead of Cortex-A15
 * i.MX: Support serial RS-232 break properly
 * mach-virt: Set VM's SMBIOS system version to mc->name
 * target/arm: Honour MDCR_EL2.TDE when routing exceptions due to BKPT/BRK
 * target/arm: Factor out code to calculate FSR for debug exceptions
 * target/arm: Set FSR for BKPT, BRK when raising exception
 * target/arm: Always set FAR to a known unknown value for debug exceptions

# gpg: Signature made Fri 23 Mar 2018 18:48:57 GMT
# gpg:                using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20180323:
  target/arm: Always set FAR to a known unknown value for debug exceptions
  target/arm: Set FSR for BKPT, BRK when raising exception
  target/arm: Factor out code to calculate FSR for debug exceptions
  target/arm: Honour MDCR_EL2.TDE when routing exceptions due to BKPT/BRK
  mach-virt: Set VM's SMBIOS system version to mc->name
  i.MX: Support serial RS-232 break properly
  hw/arm/bcm2836: Use the Cortex-A7 instead of Cortex-A15
  hw/intc/arm_gicv3: Fix secure-GIC NS ICC_PMR and ICC_RPR accesses
  sdhci: fix incorrect use of Error *
  arm/translate-a64: treat DISAS_UPDATE as variant of DISAS_EXIT

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-25 13:51:33 +01:00
Peter Maydell
77fea92dbb Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20180323a' into staging
Migration fixes for 2.12

All small fixes.  Dan's is a missing piece
of a cleanup that finally completes something,
and between Paolo, Dan and myself we recon it's
still on the edge of being a bug fix.

# gpg: Signature made Fri 23 Mar 2018 20:17:40 GMT
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20180323a:
  migration: Fix block migration flag case
  migration/block: compare only read blocks against the rate limiter
  migration/block: limit the number of parallel I/O requests
  migration: Fix rate limiting issue on RDMA migration
  migration: convert socket server to QIONetListener

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-24 19:26:11 +00:00
Peter Maydell
ed4916e8f8 Merge remote-tracking branch 'remotes/marcel/tags/rdma-pull-request' into staging
* fix PVRDMA compilation errors and warnings
* implement query_qp for the PVRDMA device
* fix make - switch from -I to -iquote

# gpg: Signature made Fri 23 Mar 2018 15:39:23 GMT
# gpg:                using RSA key 36D4C0F0CF2FE46D
# gpg: Good signature from "Marcel Apfelbaum <marcel@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B1C6 3A57 F92E 08F2 640F  31F5 36D4 C0F0 CF2F E46D

* remotes/marcel/tags/rdma-pull-request:
  hw/rdma: Fix 32-bit compilation
  hw/rdma: Use correct print format in CHK_ATTR macro
  hw/rdma: Change host_virt to void *
  hw/rdma: fix clang compilation errors
  make: switch from -I to -iquote
  rdma: fix up include directives
  hw/rdma: Add support for Query QP verb to pvrdma device
  hw/rdma: Add Query QP operation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-24 16:34:06 +00:00
Peter Maydell
548f514cf8 target/arm: Always set FAR to a known unknown value for debug exceptions
For debug exceptions due to breakpoints or the BKPT instruction which
are taken to AArch32, the Fault Address Register is architecturally
UNKNOWN.  We were using that as license to simply not set
env->exception.vaddress, but this isn't correct, because it will
expose to the guest whatever old value was in that field when
arm_cpu_do_interrupt_aarch32() writes it to the guest IFSR.  That old
value might be a FAR for a previous guest EL2 or secure exception, in
which case we shouldn't show it to an EL1 or non-secure exception
handler. It might also be a non-deterministic value, which is bad
for record-and-replay.

Clear env->exception.vaddress before taking breakpoint debug
exceptions, to avoid this minor information leak.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180320134114.30418-5-peter.maydell@linaro.org
2018-03-23 18:26:46 +00:00
Peter Maydell
62b94f31d0 target/arm: Set FSR for BKPT, BRK when raising exception
Now that we have a helper function specifically for the BRK and
BKPT instructions, we can set the exception.fsr there rather
than in arm_cpu_do_interrupt_aarch32(). This allows us to
use our new arm_debug_exception_fsr() helper.

In particular this fixes a bug where we were hardcoding the
short-form IFSR value, which is wrong if the target exception
level has LPAE enabled.

Fixes: https://bugs.launchpad.net/qemu/+bug/1756927
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180320134114.30418-4-peter.maydell@linaro.org
2018-03-23 18:26:46 +00:00
Peter Maydell
81621d9ab8 target/arm: Factor out code to calculate FSR for debug exceptions
When a debug exception is taken to AArch32, it appears as a Prefetch
Abort, and the Instruction Fault Status Register (IFSR) must be set.
The IFSR has two possible formats, depending on whether LPAE is in
use. Factor out the code in arm_debug_excp_handler() which picks
an FSR value into its own utility function, update it to use
arm_fi_to_lfsc() and arm_fi_to_sfsc() rather than hard-coded constants,
and use the correct condition to select long or short format.

In particular this fixes a bug where we could select the short
format because we're at EL0 and the EL1 translation regime is
not using LPAE, but then route the debug exception to EL2 because
of MDCR_EL2.TDE and hand EL2 the wrong format FSR.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180320134114.30418-3-peter.maydell@linaro.org
2018-03-23 18:26:46 +00:00
Peter Maydell
c900a2e62d target/arm: Honour MDCR_EL2.TDE when routing exceptions due to BKPT/BRK
The MDCR_EL2.TDE bit allows the exception level targeted by debug
exceptions to be set to EL2 for code executing at EL0.  We handle
this in the arm_debug_target_el() function, but this is only used for
hardware breakpoint and watchpoint exceptions, not for the exception
generated when the guest executes an AArch32 BKPT or AArch64 BRK
instruction.  We don't have enough information for a translate-time
equivalent of arm_debug_target_el(), so instead make BKPT and BRK
call a special purpose helper which can do the routing, rather than
the generic exception_with_syndrome helper.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180320134114.30418-2-peter.maydell@linaro.org
2018-03-23 18:26:46 +00:00
Wei Huang
dfadc3bfb4 mach-virt: Set VM's SMBIOS system version to mc->name
Instead of using "1.0" as the system version of SMBIOS, we should use
mc->name for mach-virt machine type to be consistent other architectures.
With this patch, "dmidecode -t 1" (e.g., "-M virt-2.12,accel=kvm") will
show:

    Handle 0x0100, DMI type 1, 27 bytes
    System Information
            Manufacturer: QEMU
            Product Name: KVM Virtual Machine
            Version: virt-2.12
            Serial Number: Not Specified
            ...

instead of:

    Handle 0x0100, DMI type 1, 27 bytes
    System Information
            Manufacturer: QEMU
            Product Name: KVM Virtual Machine
            Version: 1.0
            Serial Number: Not Specified
            ...

For backward compatibility, we allow older machine types to keep "1.0"
as the default system version.

Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20180322212318.7182-1-wei@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-23 18:26:46 +00:00
Trent Piepho
478a573a7d i.MX: Support serial RS-232 break properly
Linux does not detect a break from this IMX serial driver as a magic
sysrq.  Nor does it note a break in the port error counts.

The former is because the Linux driver uses the BRCD bit in the USR2
register to trigger the RS-232 break handler in the kernel, which is
where sysrq hooks in.  The emulated UART was not setting this status
bit.

The latter is because the Linux driver expects, in addition to the BRK
bit, that the ERR bit is set when a break is read in the FIFO.  A break
should also count as a frame error, so add that bit too.

Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Message-id: 20180320013657.25038-1-tpiepho@impinj.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-23 18:26:45 +00:00
Peter Maydell
2b0b93210a hw/arm/bcm2836: Use the Cortex-A7 instead of Cortex-A15
The BCM2836 uses a Cortex-A7, not a Cortex-A15. Update the device to
use the correct CPU.
https://www.raspberrypi.org/documentation/hardware/raspberrypi/bcm2836/QA7_rev3.4.pdf

When the BCM2836 was introduced (bad5623690) the Cortex-A7 was not
available, so the very similar Cortex-A15 was used. Since dcf578ed8c
we can model the correct core.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair@alistair23.me>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180319110215.16755-1-peter.maydell@linaro.org
2018-03-23 18:26:45 +00:00
Peter Maydell
a2e2d7fc46 hw/intc/arm_gicv3: Fix secure-GIC NS ICC_PMR and ICC_RPR accesses
If the GIC has the security extension support enabled, then a
non-secure access to ICC_PMR must take account of the non-secure
view of interrupt priorities, where real priorities 0x00..0x7f
are secure-only and not visible to the non-secure guest, and
priorities 0x80..0xff are shown to the guest as if they were
0x00..0xff. We had the logic here wrong:
 * on reads, the priority is in the secure range if bit 7
   is clear, not if it is set
 * on writes, we want to set bit 7, not mask everything else

Our ICC_RPR read code had the same error as ICC_PMR.

(Compare the GICv3 spec pseudocode functions ICC_RPR_EL1
and ICC_PMR_EL1.)

Fixes: https://bugs.launchpad.net/qemu/+bug/1748434
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20180315133441.24149-1-peter.maydell@linaro.org
2018-03-23 18:26:45 +00:00
Paolo Bonzini
544156efcf sdhci: fix incorrect use of Error *
Detected by Coverity (CID 1386072, 1386073, 1386076, 1386077).  local_err
was unused, and this made the static analyzer unhappy.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180320151355.25854-1-pbonzini@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-23 18:26:45 +00:00
Victor Kamensky
a75a52d624 arm/translate-a64: treat DISAS_UPDATE as variant of DISAS_EXIT
In OE project 4.15 linux kernel boot hang was observed under
single cpu aarch64 qemu. Kernel code was in a loop waiting for
vtimer arrival, spinning in TC generated blocks, while interrupt
was pending unprocessed. This happened because when qemu tried to
handle vtimer interrupt target had interrupts disabled, as
result flag indicating TCG exit, cpu->icount_decr.u16.high,
was cleared but arm_cpu_exec_interrupt function did not call
arm_cpu_do_interrupt to process interrupt. Later when target
reenabled interrupts, it happened without exit into main loop, so
following code that waited for result of interrupt execution
run in infinite loop.

To solve the problem instructions that operate on CPU sys state
(i.e enable/disable interrupt), and marked as DISAS_UPDATE,
should be considered as DISAS_EXIT variant, and should be
forced to exit back to main loop so qemu will have a chance
processing pending CPU state updates, including pending
interrupts.

This change brings consistency with how DISAS_UPDATE is treated
in aarch32 case.

CC: Peter Maydell <peter.maydell@linaro.org>
CC: Alex Bennée <alex.bennee@linaro.org>
CC: qemu-stable@nongnu.org
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1521526368-1996-1-git-send-email-kamensky@cisco.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-23 18:26:45 +00:00
Dr. David Alan Gilbert
09576e74db migration: Fix block migration flag case
Fix the case where when a migration with a bad protocol is tried,
we leave the block migration capability set.

(This is a cut down version of my 'migration: Fix block failure cases'
where it's other case was fixed by Peter's dd0ee30cae )

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180316202114.32345-1-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-23 18:24:11 +00:00
Peter Maydell
66793daa03 Merge remote-tracking branch 'remotes/ericb/tags/pull-qapi-2018-03-23' into staging
qapi patches for 2018-03-12, 2.12-rc1

- Peter Xu: 0/4 Turn OOB off for 2.12-rc1, revert OOB tests
- Eric Blake: qapi: Force UTF8 encoding when parsing qapi files

# gpg: Signature made Fri 23 Mar 2018 17:35:49 GMT
# gpg:                using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-qapi-2018-03-23:
  qapi: Force UTF8 encoding when parsing qapi files
  Revert "monitor: enable IO thread for (qmp & !mux) typed"
  Revert "tests: qmp-test: verify command batching"
  Revert "tests: qmp-test: add oob test"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-23 17:41:56 +00:00
Eric Blake
39615354fc qapi: Force UTF8 encoding when parsing qapi files
Commit d4e5ec877 already fixed things to work around Python 3's
lame bug of having LC_ALL=C not be 8-bit clean, when parsing the
main QMP qapi files; but failed to do likewise in the tests
directory.  As a result, running 'LC_ALL=C make check' fails on
escape-too-big and unicode-str when using python 3 with a nasty
stack trace instead of the intended graceful error message that
QAPI doesn't yet support 8-bit data (the two tests contain
Unicode é, when parsed in UTF-8; they represent something
different when parsed in a proper single-byte C locale, but that
doesn't matter to the error message printed out, provided that
brain-dead Python hasn't first choked on the input instead of
being 8-bit clean).

Ideally, we'd teach the qapi generator scripts to automatically
slurp things in using UTF-8 regardless of locale, and to honor
content that is not limited to 7 bit data rather than gracefully
erroring out; but until then, since our graceful error depends
on python parsing 8-bit data (even if nothing we generate uses
8-bit data), our quick fix is to use the right locale when
running these tests.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180319205040.1113423-1-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-23 12:29:07 -05:00
Peter Xu
a4f90923b5 Revert "monitor: enable IO thread for (qmp & !mux) typed"
This reverts commit 3fd2457d18.

Enabling OOB caused several iotests failures; due to the imminent
2.12 release, the safest action is to disable OOB for now.  If
other patches fix the issues that iotests exposed, it may be turned
back on in time for the release, otherwise it will be 2.13 material;
either way, the framework changes not reverted now do not hurt if
they remain as part of the 2.12 release.

Additionally, revert the tests in the patch 02130314d8 ("qmp: introduce
QMPCapability", 2018-03-19), as both parts must be reverted at once
to keep 'make check' passing.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180323140821.28957-2-peterx@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
[eblake: reorder/squash commits, enhance commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-23 12:28:00 -05:00
Peter Xu
cc797607c0 Revert "tests: qmp-test: verify command batching"
This reverts commit 91ad45061a.

Enabling OOB caused several iotests failures; due to the imminent
2.12 release, the safest action is to disable OOB, but first we
have to revert tests that rely on OOB.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180323140821.28957-4-peterx@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
[eblake: reorder commits, enhance commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-23 12:24:19 -05:00
Peter Xu
4fd78ad793 Revert "tests: qmp-test: add oob test"
This reverts commit d003f7a8f9.

Enabling OOB caused several iotests failures; due to the imminent
2.12 release, the safest action is to disable OOB, but first we
have to revert tests that rely on OOB.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180323140821.28957-3-peterx@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
[eblake: reorder commits, enhance commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-23 12:20:22 -05:00
Peter Lieven
b47d1e9fe0 migration/block: compare only read blocks against the rate limiter
only read_done blocks are in the queued to be flushed to the migration
stream. submitted blocks are still in flight.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1520507908-16743-6-git-send-email-pl@kamp.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-23 16:45:18 +00:00
Peter Lieven
44815334e1 migration/block: limit the number of parallel I/O requests
the current implementation submits up to 512 I/O requests in parallel
which is much to high especially for a background task.
This patch adds a maximum limit of 16 I/O requests that can
be submitted in parallel to avoid monopolizing the I/O device.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1520507908-16743-5-git-send-email-pl@kamp.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-23 16:45:03 +00:00
Lidong Chen
e8a0f2f9a1 migration: Fix rate limiting issue on RDMA migration
RDMA migration implement save_page function for QEMUFile, but
ram_control_save_page do not increase bytes_xfer. So when doing
RDMA migration, it will use whole bandwidth.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Message-Id: <1520692378-1835-1-git-send-email-lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-23 16:37:15 +00:00
Daniel P. Berrange
bdd847a026 migration: convert socket server to QIONetListener
Instead of creating a QIOChannelSocket directly for the migration
server socket, use a QIONetListener. This provides the ability
to listen on multiple sockets at the same time, so enables
full support for IPv4/IPv6 dual stack.

For example,   '$QEMU -incoming tcp::9000' now correctly listens
on both 0.0.0.0 and :: at the same time, instead of only on 0.0.0.0.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180312141714.7223-1-berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-23 16:27:24 +00:00
Yuval Shaia
6f559013c8 hw/rdma: Fix 32-bit compilation
Use the correct printf formats, so that a 32-bit compile doesn't spit
out lots of warnings about %lx being incompatible with uint64_t.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180322095220.9976-4-yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-03-23 18:38:55 +03:00
Yuval Shaia
94f480b8db hw/rdma: Use correct print format in CHK_ATTR macro
Macro should not cast the given variable to u64 instead it should use
the supplied format argument (fmt).

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180322095220.9976-3-yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-03-23 18:38:55 +03:00
Yuval Shaia
9bbb8d3577 hw/rdma: Change host_virt to void *
To avoid compilation warnings on 32-bit machines:
rdma_backend.c: In function 'rdma_backend_create_mr':
rdma_backend.c:409:37: error: cast to pointer from integer of different
size [-Werror=int-to-pointer-cast]
	mr->ibmr = ibv_reg_mr(pd->ibpd, (void *)addr, length, access);

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180322095220.9976-2-yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-03-23 18:38:55 +03:00
Marcel Apfelbaum
197053e212 hw/rdma: fix clang compilation errors
Fix some enum castings and extra parentheses.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Message-Id: <20180321140316.96045-1-marcel@redhat.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
2018-03-23 18:38:55 +03:00
Michael S. Tsirkin
9edc19c939 make: switch from -I to -iquote
Our rule right now is to use <> for external headers,
"" for internal ones. The idea was to avoid conflicts
between e.g. a system file named <trace.h> and an
internal one by the same name.

Unfortunately we use -I compiler flag so it does not
help: a system file doing #include <trace.h> will
still pick up ours first.

To fix, switch to -iquote which is supported by both
gcc and clang and only affects #include "" directives.

As a side effect, this catches any future uses of
 #include <> for internal headers.

Suggested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-03-23 18:38:55 +03:00
Michael S. Tsirkin
0efc9511aa rdma: fix up include directives
Our rule right now is to use <> for external headers only.
RDMA code violates that, fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-03-23 18:38:55 +03:00
Yuval Shaia
79cfdca7aa hw/rdma: Add support for Query QP verb to pvrdma device
This IB verb is needed by some applications - implement it.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-03-23 18:38:55 +03:00
Yuval Shaia
c99f217431 hw/rdma: Add Query QP operation
This operation is needed by rdma devices - implement it.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-03-23 18:38:55 +03:00
Peter Maydell
4c2c101590 Merge remote-tracking branch 'remotes/borntraeger/tags/s390x-20180323' into staging
s390x: Fixes for 2.12

- Fix for the s390 cpumodel
- Forbid multifunction PCI devices

# gpg: Signature made Fri 23 Mar 2018 09:06:31 GMT
# gpg:                using RSA key 117BBC80B5A61C7C
# gpg: Good signature from "Christian Borntraeger (IBM) <borntraeger@de.ibm.com>"
# Primary key fingerprint: F922 9381 A334 08F9 DBAB  FBCA 117B BC80 B5A6 1C7C

* remotes/borntraeger/tags/s390x-20180323:
  s390x/cpumodel: fix feature groups and breakage of MSA8
  s390x/pci: forbid multifunction pci device

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-23 10:20:54 +00:00
Christian Borntraeger
06a97edac1 s390x/cpumodel: fix feature groups and breakage of MSA8
Since commit 46a99c9f73 ("s390x/cpumodel: model PTFF subfunctions
for Multiple-epoch facility") -cpu help no longer shows the MSA8
feature group. Turns out that we forgot to add the new MEPOCH_PTFF
group enum.

Fixes: 46a99c9f73 ("s390x/cpumodel: model PTFF subfunctions for Multiple-epoch facility")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-03-23 09:05:42 +00:00
Yi Min Zhao
57da367b9e s390x/pci: forbid multifunction pci device
Currently we don't support pci multifunction. If a pci with
multifucntion is plugged, the guest will spin forever. This patch fixes
this.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-03-23 09:05:18 +00:00
Peter Maydell
d522e0bd18 gitmodules: Use the QEMU mirror of qemu-palcode
We have a mirror of the qemu-palcode repository on
git.qemu.org; use that instead of the upstream github,
in line with our general policy of keeping and using
a mirror for submodules.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180319131743.3885-1-peter.maydell@linaro.org
2018-03-22 19:24:16 +00:00
Peter Maydell
211d626020 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Multiboot patches

# gpg: Signature made Wed 21 Mar 2018 14:38:36 GMT
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  tests/multiboot: Add .gitignore
  tests/multiboot: Add tests for the a.out kludge
  tests/multiboot: Test exit code for every qemu run
  multiboot: Check validity of mh_header_addr
  multiboot: Reject kernels exceeding the address space

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-22 14:01:29 +00:00
Peter Maydell
99728ba3ec Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging
Pull request

# gpg: Signature made Wed 21 Mar 2018 14:37:05 GMT
# gpg:                using RSA key DAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/dump-pull-request:
  dump-guest-memory: more descriptive lookup_type failure
  dump.c: allow fd_write_vmcore to return errno on failure

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-22 13:15:52 +00:00
Peter Maydell
b2ce07de4e Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2018-03-21-1' into staging
Merge tpm 2018/03/21 v1

# gpg: Signature made Wed 21 Mar 2018 12:02:06 GMT
# gpg:                using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE  C66B 75AD 6580 2A0B 4211

* remotes/stefanberger/tags/pull-tpm-2018-03-21-1:
  tpm: CRB: query backend for TPM established flag
  tpm: CRB: reset locAssigned upon relinquishing locality
  tpm: CRB: set registers to 0 by default
  tpm: CRB: Set tpmRegValidSts flag to '1' in device reset

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-22 12:13:43 +00:00
Peter Maydell
3be2e41b33 Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-2.12-pull-request' into staging
# gpg: Signature made Tue 20 Mar 2018 20:43:37 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-2.12-pull-request:
  linux-user: init_guest_space: Try to make ARM space+commpage continuous

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-22 11:10:37 +00:00
Kevin Wolf
e2679395d5 tests/multiboot: Add .gitignore
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-21 15:13:40 +01:00
Kevin Wolf
1c8c426fb4 tests/multiboot: Add tests for the a.out kludge
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
2018-03-21 15:13:25 +01:00
Kevin Wolf
49713c413a tests/multiboot: Test exit code for every qemu run
Testing the exit code only once after a whole group of tests has
completed is not enough, it catches errors only in the very last qemu
invocation. We need to have the check after each qemu run.

The logging and diff with the reference output is still done once per
group to keep things more managable. This is not a problem because the
log file accumulates the output of all runs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
2018-03-21 15:13:25 +01:00
Kevin Wolf
dbf2dce7aa multiboot: Check validity of mh_header_addr
I couldn't find a case where this prevents something bad from happening
that isn't already caught by other checks, but let's err on the safe
side and check that mh_header_addr is as expected.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
2018-03-21 15:13:25 +01:00
Kevin Wolf
b17a9054a0 multiboot: Reject kernels exceeding the address space
The code path where mh_load_end_addr is non-zero in the Multiboot
header checks that mh_load_end_addr >= mh_load_addr and so
mb_load_size is checked.  However, mb_load_size is not checked when
calculated from the file size, when mh_load_end_addr is 0.

If the kernel binary size is larger than can fit in the address space
after load_addr, we ended up with a kernel_size that is smaller than
load_size, which means that we read the file into a too small buffer.

Add a check to reject kernel files with such Multiboot headers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jack Schwartz <jack.schwartz@oracle.com>
2018-03-21 15:13:25 +01:00
Andrew Jones
4b17bc933f dump-guest-memory: more descriptive lookup_type failure
We've seen a few reports of

 (gdb) source /usr/share/qemu-kvm/dump-guest-memory.py
 Traceback (most recent call last):
   File "/usr/share/qemu-kvm/dump-guest-memory.py", line 19, in <module>
     UINTPTR_T = gdb.lookup_type("uintptr_t")
 gdb.error: No type named uintptr_t.

This occurs when symbols haven't been loaded first, i.e. neither a
QEMU binary was loaded nor a QEMU process was attached first. Let's
better inform the user of how to fix the issue themselves in order
to avoid more reports.

Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20180314153820.18426-1-drjones@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-21 15:02:00 +01:00
Yasmin Beatriz
0c33659d09 dump.c: allow fd_write_vmcore to return errno on failure
fd_write_vmcore can fail to execute for a lot of reasons that can be
retrieved by errno, but it only returns -1. This makes difficult for
the caller to know what happened and only a generic error message is
propagated back to the user. This is an example using dump-guest-memory:

(qemu) dump-guest-memory /home/yasmin/mnt/test.dump
dump: failed to save memory

All callers of fd_write_vmcore of dump.c does error handling via
error_setg(), so at first it seems feasible to add the Error pointer as
an argument of fd_write_vmcore. This proved to be more complex than it
first looked. fd_write_vmcore is used by write_elf64_notes and
write_elf32_notes as a WriteCoreDumpFunction prototype. WriteCoreDumpFunction
is declared in include/qom/cpu.h and is used all around the code. This
leaves us with few alternatives:

- change the WriteCoreDumpFunction prototype to include an error pointer.
This would require to change all functions that implements this prototype
to also receive an Error pointer;

- change both write_elf64_notes and write_elf32_notes to no use the
WriteCoreDumpFunction. These functions use not only fd_write_vmcore
but also buf_write_note, so this would require to change buf_write_note
to handle an Error pointer. Considerable easier than the alternative
above, but it's still a lot of code just for the benefit of the callers
of fd_write_vmcore.

This patch presents an easier solution that benefits all fd_write_vmcore
callers:

- instead of returning -1 on error, return -errno. All existing callers
already checks for ret < 0 so there is no need to change the caller's
logic too much. This also allows the retrieval of the errno.

- all callers were updated to use error_setg_errno instead of just
errno_setg. Now that fd_write_vmcore can return an errno, let's update
all callers so they can benefit from a more detailed error message.

This is the same dump-guest-memory example with this patch applied:

(qemu) dump-guest-memory /home/yasmin/mnt/test.dump
dump: failed to save memory: No space left on device
(qemu)

This example illustrates an error of fd_write_vmcore when called
from write_data. All other callers will benefit from better
error messages as well.

Reported-by: yilzhang@redhat.com
Cc: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: Yasmin Beatriz <yasmins@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Message-Id: <20180212142506.28445-2-danielhb@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-21 14:55:49 +01:00
Stefan Berger
ffbf24bdb2 tpm: CRB: query backend for TPM established flag
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-21 08:01:03 -04:00
Stefan Berger
de4a22d0fa tpm: CRB: reset locAssigned upon relinquishing locality
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-21 08:00:57 -04:00
Stefan Berger
e1880ed80a tpm: CRB: set registers to 0 by default
Initialize all registers of the CRB device to 0. This clears a few
flags upon a reset.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-21 08:00:50 -04:00
Stefan Berger
be052a3b3d tpm: CRB: Set tpmRegValidSts flag to '1' in device reset
Fix the initialization of the tpmRegValidSts flag and set it to '1'
during device reset without expecting a write to another register.
This seems to also be the default behavior of real hardware.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-03-21 08:00:31 -04:00
Peter Maydell
f1a63fcfcd Update version for v2.12.0-rc0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-20 19:04:22 +00:00
Peter Maydell
a9b47e53e8 Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20180320' into staging
HMP fixes for 2.12

# gpg: Signature made Tue 20 Mar 2018 12:39:24 GMT
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-hmp-20180320:
  hmp: free sev info
  HMP: Initialize err before using

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-20 18:03:10 +00:00
Luke Shumaker
2a53535af4 linux-user: init_guest_space: Try to make ARM space+commpage continuous
At a fixed distance after the usable memory that init_guest_space maps, for
32-bit ARM targets we also need to map a commpage.  The normal
init_guest_space logic doesn't keep this in mind when searching for an
address range.

If !host_start, then try to find a big continuous segment where we can put
both the usable memory and the commpage; we then munmap that segment and
set current_start to that address; and let the normal code mmap the usable
memory and the commpage separately.  That is: if we don't have hint of
where to start looking for memory, come up with one that is better than
NULL.  Depending on host_size and guest_start, there may or may not be a
gap between the usable memory and the commpage, so this is slightly more
restrictive than it needs to be; but it's only a hint, so that's OK.

We only do that for !host start, because if host_start, then either:
 - we got an address passed in with -B, in which case we don't want to
   interfere with what the user said;
 - or host_start is based off of the ELF image's loaddr.  The check "if
   (host_start && real_start != current_start)" suggests that we really
   want lowest available address that is >= loaddr.  I don't know why that
   is, but I'm trusting that Paul Brook knew what he was doing when he
   wrote the original version of that check in
   c581deda32 way back in 2010.

Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-11-lukeshu@lukeshu.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-20 18:26:40 +01:00
Peter Maydell
ed627b2ad3 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio,vhost,pci,pc: features, cleanups

SRAT tables for DIMM devices
new virtio net flags for speed/duplex
post-copy migration support in vhost
cleanups in pci

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 20 Mar 2018 14:40:43 GMT
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (51 commits)
  postcopy shared docs
  libvhost-user: Claim support for postcopy
  postcopy: Allow shared memory
  vhost: Huge page align and merge
  vhost+postcopy: Wire up POSTCOPY_END notify
  vhost-user: Add VHOST_USER_POSTCOPY_END message
  libvhost-user: mprotect & madvises for postcopy
  vhost+postcopy: Call wakeups
  vhost+postcopy: Add vhost waker
  postcopy: postcopy_notify_shared_wake
  postcopy: helper for waking shared
  vhost+postcopy: Resolve client address
  postcopy-ram: add a stub for postcopy_request_shared_page
  vhost+postcopy: Helper to send requests to source for shared pages
  vhost+postcopy: Stash RAMBlock and offset
  vhost+postcopy: Send address back to qemu
  libvhost-user+postcopy: Register new regions with the ufd
  migration/ram: ramblock_recv_bitmap_test_byte_offset
  postcopy+vhost-user: Split set_mem_table for postcopy
  vhost+postcopy: Transmit 'listen' to slave
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	scripts/update-linux-headers.sh
2018-03-20 15:48:34 +00:00
Dr. David Alan Gilbert
1dc61e7b37 postcopy shared docs
Add some notes to the migration documentation for shared memory
postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
4275cd99c6 libvhost-user: Claim support for postcopy
Tell QEMU we understand the protocol features needed for postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
29d8fa7f73 postcopy: Allow shared memory
Now that we have the mechanisms in here, allow shared memory in a
postcopy.

Note that QEMU can't tell who all the users of shared regions are
and thus can't tell whether all the users of the shared regions
have appropriate support for postcopy.  Those devices that explicitly
support shared memory (e.g. vhost-user) must check, but it doesn't
stop weirder configurations causing problems.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
c1ece84e7c vhost: Huge page align and merge
Align RAMBlocks to page size alignment, and adjust the merging code
to deal with partial overlap due to that alignment.

This is needed for postcopy so that we can place/fetch whole hugepages
when under userfault.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
46343570c0 vhost+postcopy: Wire up POSTCOPY_END notify
Wire up a call to VHOST_USER_POSTCOPY_END message to the vhost clients
right before we ask the listener thread to shutdown.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
c639187e33 vhost-user: Add VHOST_USER_POSTCOPY_END message
This message is sent just before the end of postcopy to get the
client to stop using userfault since we wont respond to any more
requests.  It should close userfaultfd so that any other pages
get mapped to the backing file automatically by the kernel, since
at this point we know we've received everything.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
0185cfb30b libvhost-user: mprotect & madvises for postcopy
Clear the area and turn off THP.
PROT_NONE the area until after we've userfault advised it
to catch any unexpected changes.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
dedfb4b21a vhost+postcopy: Call wakeups
Cause the vhost-user client to be woken up whenever:
  a) We place a page in postcopy mode
  b) We get a fault and the page has already been received

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
c07e36158f vhost+postcopy: Add vhost waker
Register a waker function in vhost-user code to be notified when
pages arrive or requests to previously mapped pages get requested.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
d488b349a3 postcopy: postcopy_notify_shared_wake
Add a hook to allow a client userfaultfd to be 'woken'
when a page arrives, and a walker that calls that
hook for relevant clients given a RAMBlock and offset.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert
5efc356403 postcopy: helper for waking shared
Provide a helper to send a 'wake' request on a userfaultfd for
a shared process.
The address in the clients address space is specified together
with the RAMBlock it was resolved to.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:35 +02:00
Dr. David Alan Gilbert
375318d03f vhost+postcopy: Resolve client address
Resolve fault addresses read off the clients UFD into RAMBlock
and offset, and call back to the postcopy code to ask for the page.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:19 +02:00
Michael S. Tsirkin
c188c53927 postcopy-ram: add a stub for postcopy_request_shared_page
This fixes the build on systems without userfaultfd.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:10 +02:00
Peter Maydell
4aafb1b192 Merge remote-tracking branch 'remotes/vivier/tags/m68k-for-2.12-pull-request' into staging
# gpg: Signature made Tue 20 Mar 2018 09:07:55 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier/tags/m68k-for-2.12-pull-request:
  target/m68k: add a mechanism to automatically free TCGv
  target/m68k: add DisasContext parameter to gen_extend()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-20 14:19:23 +00:00
Peter Maydell
036793aebf Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging
Machine and x86 queue, 2018-03-19

* cpu_model/cpu_type cleanups
* x86: Fix on Intel Processor Trace CPUID checks

# gpg: Signature made Mon 19 Mar 2018 20:07:14 GMT
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-next-pull-request:
  i386: Disable Intel PT if packets IP payloads have LIP values
  cpu: drop unnecessary NULL check and cpu_common_class_by_name()
  cpu: get rid of unused cpu_init() defines
  Use cpu_create(type) instead of cpu_init(cpu_model)
  cpu: add CPU_RESOLVING_TYPE macro
  tests: add machine 'none' with -cpu test
  nios2: 10m50_devboard: replace cpu_model with cpu_type

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-20 12:56:20 +00:00
Marc-André Lureau
95372184b7 hmp: free sev info
Found thanks to ASAN:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7efe20417a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7efe1f7b2f75 in g_malloc0 ../glib/gmem.c:124
    #2 0x7efe1f7b3249 in g_malloc0_n ../glib/gmem.c:355
    #3 0x558272879162 in sev_get_info /home/elmarco/src/qemu/target/i386/sev.c:414
    #4 0x55827285113b in hmp_info_sev /home/elmarco/src/qemu/target/i386/monitor.c:684
    #5 0x5582724043b8 in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3333

Fixes: 63036314
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180319175823.22111-1-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-20 12:32:06 +00:00
zhangjixiang
32cd6550f7 HMP: Initialize err before using
When bdrv_snapshot_delete return fail, the errp will not be
assigned a valid value in error_propagate as errp didn't be
initialized in hmp_delvm, then error_reportf_err will use an
uninitialized value(call by hmp_delvm), and qemu crash.

Signed-off-by: zhangjixiang <jixiang_zhang@h3c.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-20 12:32:06 +00:00
Michael Clark
d1fd31f822 RISC-V: Fix riscv_isa_string memory size bug
This version uses a constant size memory buffer sized for
the maximum possible ISA string length. It also uses g_new
instead of g_new0, uses more efficient logic to append
extensions and adds manual zero termination of the string.

Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Clark <mjc@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMM: Use qemu_tolower() rather than tolower()]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-20 11:45:55 +00:00
Peter Maydell
4bdc24fa01 Merge remote-tracking branch 'remotes/ericb/tags/pull-qapi-2018-03-12-v4' into staging
qapi patches for 2018-03-12, 2.12 softfreeze

- Marc-André Lureau: 0/4 qapi: generate a literal qobject for introspection
- Max Reitz: 0/7 block: Handle null backing link
- Daniel P. Berrange: chardev: tcp: postpone TLS work until machine done
- Peter Xu: 00/23 QMP: out-of-band (OOB) execution support
- Vladimir Sementsov-Ogievskiy: 0/2 block latency histogram
- Eric Blake: qapi: Pass '-u' when doing non-silent diff

# gpg: Signature made Mon 19 Mar 2018 19:59:04 GMT
# gpg:                using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-qapi-2018-03-12-v4: (38 commits)
  qapi: Pass '-u' when doing non-silent diff
  qapi: add block latency histogram interface
  block/accounting: introduce latency histogram
  tests: qmp-test: add oob test
  tests: qmp-test: verify command batching
  qmp: add command "x-oob-test"
  monitor: enable IO thread for (qmp & !mux) typed
  qmp: isolate responses into io thread
  qmp: support out-of-band (oob) execution
  qapi: introduce new cmd option "allow-oob"
  monitor: send event when command queue full
  qmp: add new event "command-dropped"
  monitor: separate QMP parser and dispatcher
  monitor: let suspend/resume work even with QMPs
  monitor: let suspend_cnt be thread safe
  monitor: introduce monitor_qmp_respond()
  qmp: introduce QMPCapability
  monitor: allow using IO thread for parsing
  monitor: let mon_list be tail queue
  monitor: unify global init
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-20 09:51:49 +00:00
Laurent Vivier
ecc207d2fc target/m68k: add a mechanism to automatically free TCGv
SRC_EA() and gen_extend() can return either a temporary
TCGv or a memory allocated one. Mark them when they are
allocated, and free them automatically at end of the
instruction translation.

We want to free locally allocated TCGv to avoid
overflow in sequence like:

  0xc00ae406:  movel %fp@(-132),%fp@(-268)
  0xc00ae40c:  movel %fp@(-128),%fp@(-264)
  0xc00ae412:  movel %fp@(-20),%fp@(-212)
  0xc00ae418:  movel %fp@(-16),%fp@(-208)
  0xc00ae41e:  movel %fp@(-60),%fp@(-220)
  0xc00ae424:  movel %fp@(-56),%fp@(-216)
  0xc00ae42a:  movel %fp@(-124),%fp@(-252)
  0xc00ae430:  movel %fp@(-120),%fp@(-248)
  0xc00ae436:  movel %fp@(-12),%fp@(-260)
  0xc00ae43c:  movel %fp@(-8),%fp@(-256)
  0xc00ae442:  movel %fp@(-52),%fp@(-276)
  0xc00ae448:  movel %fp@(-48),%fp@(-272)
  ...

That can fill a lot of TCGv entries in a sequence,
especially since 15fa08f845 ("tcg: Dynamically allocate TCGOps")
we have no limit to fill the TCGOps cache and we can fill
the entire TCG variables array and overflow it.

Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180319113544.704-3-laurent@vivier.eu>
2018-03-20 09:38:58 +01:00
Laurent Vivier
3f215a147b target/m68k: add DisasContext parameter to gen_extend()
This parameter will be needed to manage automatic release
of temporary allocated TCG variables.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180319113544.704-2-laurent@vivier.eu>
2018-03-20 09:38:51 +01:00
Dr. David Alan Gilbert
096bf4c852 vhost+postcopy: Helper to send requests to source for shared pages
Provide a helper to be used by shared waker functions to request
shared pages from the source.
The last_rb pointer is moved into the incoming state since this
helper can update it as well as the main fault thread function.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:29 +02:00
Dr. David Alan Gilbert
905125d0e2 vhost+postcopy: Stash RAMBlock and offset
Stash the RAMBlock and offset for later use looking up
addresses.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert
9bb3801994 vhost+postcopy: Send address back to qemu
We need a better way, but at the moment we need the address of the
mappings sent back to qemu so it can interpret the messages on the
userfaultfd it reads.

This is done as a 3 stage set:
   QEMU -> client
      set_mem_table

   mmap stuff, get addresses

   client -> qemu
       here are the addresses

   qemu -> client
       OK - now you can use them

That ensures that qemu has registered the new addresses in it's
userfault code before the client starts accessing them.

Note: We don't ask for the default 'ack' reply since we've got our own.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert
51a5d6e5b2 libvhost-user+postcopy: Register new regions with the ufd
When new regions are sent to the client using SET_MEM_TABLE, register
them with the userfaultfd.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert
1cba9f6e66 migration/ram: ramblock_recv_bitmap_test_byte_offset
Utility for testing the map when you already know the offset
in the RAMBlock.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert
55d754b307 postcopy+vhost-user: Split set_mem_table for postcopy
Split the set_mem_table routines in both qemu and libvhost-user
because the postcopy versions are going to be quite different
once changes in the later patches are added. However, this patch
doesn't produce any functional change, just the split.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert
6864a7b5ac vhost+postcopy: Transmit 'listen' to slave
Notify the vhost-user slave on reception of the 'postcopy-listen'
event from the source.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert
f82c11165f vhost+postcopy: Register shared ufd with postcopy
Register the UFD that comes in as the response to the 'advise' method
with the postcopy code.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert
00fa4fc85b postcopy: Allow registering of fd handler
Allow other userfaultfd's to be registered into the fault thread
so that handlers for shared memory can get responses.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert
2a84ffc0be libvhost-user: Open userfaultfd
Open a userfaultfd (on a postcopy_advise) and send it back in
the reply to the qemu for it to monitor.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert
d6e47717b0 libvhost-user: Support sending fds back to qemu
Allow replies with fds (for postcopy)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert
d3dff7a5a1 vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' message
Wire up a notifier to send a VHOST_USER_POSTCOPY_ADVISE
message on an incoming advise.

Later patches will fill in the behaviour/contents of the
message.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert
9ccbfe14dd postcopy: Add vhost-user flag for postcopy and check it
Add a vhost feature flag for postcopy support, and
use the postcopy notifier to check it before allowing postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert
1693c64c27 postcopy: Add notifier chain
Add a notifier chain for postcopy with a 'reason' flag
and an opportunity for a notifier member to return an error.

Call it when enabling postcopy.

This will initially used to enable devices to declare they're unable
to postcopy and later to notify of devices of stages within postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert
2ce16640b4 postcopy: use UFFDIO_ZEROPAGE only when available
Use a flag on the RAMBlock to state whether it has the
UFFDIO_ZEROPAGE capability, use it when it's available.

This allows the use of postcopy on tmpfs as well as hugepage
backed files.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert
f90bb71bfd qemu_ram_block_host_offset
Utility to give the offset of a host pointer within a RAMBlock
(assuming we already know it's in that RAMBlock)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:26 +02:00
Dr. David Alan Gilbert
db144f7000 migrate: Update ram_block_discard_range for shared
The choice of call to discard a block is getting more complicated
for other cases.   We use fallocate PUNCH_HOLE in any file cases;
it works for both hugepage and for tmpfs.
We use the DONTNEED for non-hugepage cases either where they're
anonymous or where they're private.

Care should be taken when trying other backing files.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:26 +02:00
Michael S. Tsirkin
9578f8cc3e Makefile: add target to print generated files
This is helpful for automatic code analysis.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:46:48 +02:00
Haozhong Zhang
e0e5c8583d test/acpi-test-data: add ACPI tables for dimmpxm test
Reviewers can use ACPI tables in this patch to run
test_acpi_{piix4,q35}_tcg_dimm_pxm cases.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Haozhong Zhang
adae91ce32 tests/bios-tables-test: add test cases for DIMM proximity
QEMU now builds one SRAT memory affinity structure for each PC-DIMM
and NVDIMM device presented at boot time with the proximity domain
specified in the device option 'node', rather than only one SRAT
memory affinity structure covering the entire hotpluggable address
space with the proximity domain of the last node.

Add test cases on PC and Q35 machines with 4 proximity domains, and
one PC-DIMM and one NVDIMM attached to the 2nd and 3rd proximity
domains respectively. Check whether the QEMU-built SRAT tables match
with the expected ones.

The following ACPI tables need to be added for this test:
  tests/acpi-test-data/pc/APIC.dimmpxm
  tests/acpi-test-data/pc/DSDT.dimmpxm
  tests/acpi-test-data/pc/NFIT.dimmpxm
  tests/acpi-test-data/pc/SRAT.dimmpxm
  tests/acpi-test-data/pc/SSDT.dimmpxm
  tests/acpi-test-data/q35/APIC.dimmpxm
  tests/acpi-test-data/q35/DSDT.dimmpxm
  tests/acpi-test-data/q35/NFIT.dimmpxm
  tests/acpi-test-data/q35/SRAT.dimmpxm
  tests/acpi-test-data/q35/SSDT.dimmpxm
New APIC and DSDT are needed because of the multiple processors
configuration. New NFIT and SSDT are needed because of NVDIMM.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Haozhong Zhang
848a1cc1e8 hw/acpi-build: build SRAT memory affinity structures for DIMM devices
ACPI 6.2A Table 5-129 "SPA Range Structure" requires the proximity
domain of a NVDIMM SPA range must match with corresponding entry in
SRAT table.

The address ranges of vNVDIMM in QEMU are allocated from the
hot-pluggable address space, which is entirely covered by one SRAT
memory affinity structure. However, users can set the vNVDIMM
proximity domain in NFIT SPA range structure by the 'node' property of
'-device nvdimm' to a value different than the one in the above SRAT
memory affinity structure.

In order to solve such proximity domain mismatch, this patch builds
one SRAT memory affinity structure for each DIMM device present at
boot time, including both PC-DIMM and NVDIMM, with the proximity
domain specified in '-device pc-dimm' or '-device nvdimm'.

The remaining hot-pluggable address space is covered by one or multiple
SRAT memory affinity structures with the proximity domain of the last
node as before.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Haozhong Zhang
6388e18de9 qmp: distinguish PC-DIMM and NVDIMM in MemoryDeviceInfoList
It may need to treat PC-DIMM and NVDIMM differently, e.g., when
deciding the necessity of non-volatile flag bit in SRAT memory
affinity structures.

A new field 'nvdimm' is added to the union type MemoryDeviceInfo for
such purpose. Its type is currently PCDIMMDeviceInfo and will be
updated when necessary in the future.

It also fixes "info memory-devices"/query-memory-devices which
currently show nvdimm devices as dimm devices since
object_dynamic_cast(obj, TYPE_PC_DIMM) happily cast nvdimm to
TYPE_PC_DIMM which it's been inherited from.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Haozhong Zhang
52c95cae4e pc-dimm: make qmp_pc_dimm_device_list() sort devices by address
Make qmp_pc_dimm_device_list() return sorted by start address
list of devices so that it could be reused in places that
would need sorted list*. Reuse existing pc_dimm_built_list()
to get sorted list.

While at it hide recursive callbacks from callers, so that:

  qmp_pc_dimm_device_list(qdev_get_machine(), &list);

could be replaced with simpler:

  list = qmp_pc_dimm_device_list();

* follow up patch will use it in build_srat()

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> for ppc part
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Philippe Mathieu-Daudé
1d0cad532c hw/pci: remove obsolete PCIDevice->init()
All PCI devices are now QOM'ified.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Michael S. Tsirkin
056c339d97 standard-headers: update virtio_net.h
include speed/duplex fields

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Luwei Kang
c078ca968c i386: Disable Intel PT if packets IP payloads have LIP values
Intel processor trace should be disabled when
CPUID.(EAX=14H,ECX=0H).ECX.[bit31] is set.
Generated packets which contain IP payloads will have LIP
values when this bit is set, or IP payloads will have RIP
values.
Currently, The information of CPUID 14H is constant to make
live migration safty and this bit is always 0 in guest even
if host support LIP values.
Guest sees the bit is 0 will expect IP payloads with RIP
values, but the host CPU will generate IP payloads with
LIP values if this bit is set in HW.
To make sure the value of IP payloads correctly, Intel PT
should be disabled when bit[31] is set.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1520969191-18162-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-19 17:05:57 -03:00
Eric Blake
02e3092db3 qapi: Pass '-u' when doing non-silent diff
Ed-script diffs are awful compared to context diffs.  Fix another
'diff -q' while in the area (if the files are different, being
noisy makes it easier to diagnose why).

While at it, diff .err before .out, because if a test fails, .err
is more likely to contain the most important information for
fixing the failure.

Fixes: 46ec4fce
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180315125116.804342-1-eblake@redhat.com>
2018-03-19 14:58:38 -05:00
Vladimir Sementsov-Ogievskiy
7e5c776d15 qapi: add block latency histogram interface
Set (and clear) histograms through new command
block-latency-histogram-set and show new statistics in
query-blockstats results.

For now, the command is marked experimental with prefix 'x-',
to gain experience with the interface without being stuck
with design decisions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180309165212.97144-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[eblake: fix typos, mention x- prefix in commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:38 -05:00
Vladimir Sementsov-Ogievskiy
b741ae7417 block/accounting: introduce latency histogram
Introduce latency histogram statics for block devices.
For each accounted operation type, the latency region [0, +inf) is
divided into subregions by several points. Then, calculate
hits for each subregion.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180309165212.97144-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
d003f7a8f9 tests: qmp-test: add oob test
Test the new OOB capability.  Here we used the new "x-oob-test" command.
First, we send a lock=true and oob=false command to hang the main
thread.  Then send another lock=false and oob=true command (which will
be run inside parser this time) to free that hanged command.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-24-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
91ad45061a tests: qmp-test: verify command batching
OOB introduced DROP event for flow control.  This should not affect old
QMP clients.  Add a command batching check to make sure of it.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-23-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
469638f9cb qmp: add command "x-oob-test"
This command is only used to test OOB functionality.  It should not be
used for any other purposes.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-22-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
3fd2457d18 monitor: enable IO thread for (qmp & !mux) typed
Start to use dedicate IO thread for QMP monitors that are not using
MUXed chardev.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-21-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
abe3cd0ff7 qmp: isolate responses into io thread
For those monitors who have enabled IO thread, we'll offload the
responding procedure into IO thread.  The main reason is that chardev is
not thread safe, and we need to do all the read/write IOs in the same
thread.  For use_io_thr=true monitors, that thread is the IO thread.

We do this isolation in similar pattern as what we have done to the
request queue: we first create one response queue for each monitor, then
instead of replying directly in the main thread, we queue the responses
and kick the IO thread to do the rest of the job for us.

A funny thing after doing this is that, when the QMP clients send "quit"
to QEMU, it's possible that we close the IOThread even earlier than
replying to that "quit".  So another thing we need to do before cleaning
up the monitors is that we need to flush the response queue (we don't
need to do that for command queue; after all we are quitting) to make
sure replies for handled commands are always flushed back to clients.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-20-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
cf869d5317 qmp: support out-of-band (oob) execution
Having "allow-oob":true for a command does not mean that this command
will always be run in out-of-band mode.  The out-of-band quick path will
only be executed if we specify the extra "run-oob" flag when sending the
QMP request:

    { "execute":   "command-that-allows-oob",
      "arguments": { ... },
      "control":   { "run-oob": true } }

The "control" key is introduced to store this extra flag.  "control"
field is used to store arguments that are shared by all the commands,
rather than command specific arguments.  Let "run-oob" be the first.

Note that in the patch I exported qmp_dispatch_check_obj() to be used to
check the request earlier, and at the same time allowed "id" field to be
there since actually we always allow that.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-19-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to(), spelling fix]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
876c67512e qapi: introduce new cmd option "allow-oob"
Here "oob" stands for "Out-Of-Band".  When "allow-oob" is set, it means
the command allows out-of-band execution.

The "oob" idea is proposed by Markus Armbruster in following thread:

  https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg02057.html

This new "allow-oob" boolean will be exposed by "query-qmp-schema" as
well for command entries, so that QMP clients can know which commands
can be used in out-of-band calls. For example the command "migrate"
originally looks like:

  {"name": "migrate", "ret-type": "17", "meta-type": "command",
   "arg-type": "86"}

And it'll be changed into:

  {"name": "migrate", "ret-type": "17", "allow-oob": false,
   "meta-type": "command", "arg-type": "86"}

This patch only provides the QMP interface level changes.  It does not
contain the real out-of-band execution implementation yet.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-18-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase on introspection done by qlit]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
bf1e730174 monitor: send event when command queue full
Set maximum QMP command queue length to 8.  If the queue is full,
instead of queuing the command, we directly return a "command-dropped"
event, telling the client that a specific command is dropped.

Note that this flow control mechanism is only valid if OOB is enabled.
If it's not, the effective queue length will always be 1, which strictly
follows original behavior of QMP command handling (which never drops
messages).

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-17-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: commit message grammar, abort on failure to send event]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Eric Blake
8167d8bd36 qmp: add new event "command-dropped"
This event will be emitted if one QMP command is dropped.  Also,
declare an enum for the reasons.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-16-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
71da4667db monitor: separate QMP parser and dispatcher
Originally QMP goes through these steps:

  JSON Parser --> QMP Dispatcher --> Respond
      /|\    (2)                (3)     |
   (1) |                               \|/ (4)
       +---------  main thread  --------+

This patch does this:

  JSON Parser     QMP Dispatcher --> Respond
      /|\ |           /|\       (4)     |
       |  | (2)        | (3)            |  (5)
   (1) |  +----->      |               \|/
       +---------  main thread  <-------+

So the parsing job and the dispatching job is isolated now.  It gives us
a chance in follow up patches to totally move the parser outside.

The isolation is done using one QEMUBH. Only one dispatcher QEMUBH is
used for all the monitors.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-15-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks, rebase to qobject_to()]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
e3e977d45b monitor: let suspend/resume work even with QMPs
This patches allows QMP monitors to be suspended/resumed.

One thing to mention is that for QMPs that are using IOThreads, we need
an explicit kick for the IOThread in case it is sleeping.

Meanwhile, we need to take special care on non-interactive HMPs.
Currently only gdbserver is using that.  For these monitors, we still
don't allow suspend/resume operations.

Since at it, add traces for the operations.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-14-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
df152fb950 monitor: let suspend_cnt be thread safe
Monitor code now can be run in more than one thread.  Let it be thread
safe when accessing suspend_cnt counter.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-13-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
546aa56674 monitor: introduce monitor_qmp_respond()
A tiny refactoring, preparing to split the QMP dispatcher away.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-12-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to() usage]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
02130314d8 qmp: introduce QMPCapability
There were no QMP capabilities defined.  Define the first capability,
"oob", to allow out-of-band messages.

After this patch, we will allow QMP clients to enable QMP capabilities
when sending the first "qmp_capabilities" command.  Originally we are
starting QMP session with no arguments like:

  { "execute": "qmp_capabilities" }

Now we can enable some QMP capabilities using (take OOB as example,
which is the only capability that we support):

  { "execute": "qmp_capabilities",
    "arguments": { "enable": [ "oob" ] } }

When the "arguments" key is not provided, no capability is enabled.

For capability "oob", the monitor needs to be run on a dedicated IO
thread, otherwise the command will fail.  For example, trying to enable
OOB on a MUXed typed QMP monitor will fail.

One thing to mention is that QMP capabilities are per-monitor, and also
when the connection is closed due to some reason, the capabilities will
be reset.

Also, touch up qmp-test.c to test the new bits.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-11-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: touch up commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:37 -05:00
Peter Xu
a5ed352596 monitor: allow using IO thread for parsing
For each Monitor, add one field "use_io_thr" to show whether it will be
using the dedicated monitor IO thread to handle input/output.  When set,
monitor IO parsing work will be offloaded to the dedicated monitor IO
thread, rather than the original main loop thread.

This only works for QMP.  HMP will always be run on the main loop
thread.

Currently we're still keeping use_io_thr off always.  Will turn it on
later at some point.

One thing to mention is that we cannot set use_io_thr for every QMP
monitor.  The problem is that MUXed typed chardevs may not work well
with it now. When MUX is used, frontend of chardev can be the monitor
plus something else.  The only thing we know would be safe to be run
outside main thread so far is the monitor frontend. All the rest of the
frontends should still be run in main thread only.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-10-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: squash in Peter's followup patch to avoid test failures]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Peter Xu
238d9f34f4 monitor: let mon_list be tail queue
It was QLIST.  I want to use this list to do monitor priority job later,
which need tail insertion ability.  So switching to a tail queue.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-9-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Peter Xu
6adf08dd42 monitor: unify global init
There are many places where the monitor initializes its globals:

- monitor_init_qmp_commands() at the very beginning
- single function to init monitor_lock
- in the first entry of monitor_init() using "is_first_init"

Unify them a bit.

monitor_lock is not used before monitor_init() (as confirmed by code
analysis and gdb watchpoints); so we are safe delaying what was a
constructor-time initialization of the mutex into the later first call
to monitor_init().

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-8-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Peter Xu
227a07552f monitor: move the cur_mon hack deeper for QMP
In monitor_qmp_read(), we have the hack to temporarily replace the
cur_mon pointer.  Now we move this hack deeper inside the QMP dispatcher
routine since the Monitor pointer can be actually obtained using
container_of() upon the parser object, just like most of the other JSON
parser users do.

This does not make much sense as a single patch.  However, this will be
a big step for the next patch, when the QMP dispatcher routine will be
split from the QMP parser.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-7-peterx@redhat.com>
[eblake: rebase context of qobject_to() macro]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Peter Xu
88a95d1090 monitor: move skip_flush into monitor_data_init
It's part of the data init.  Collect it.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-6-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Peter Xu
aafb21a0b9 qobject: let object_property_get_str() use new API
We can simplify object_property_get_str() using the new
qobject_get_try_str().

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-5-peterx@redhat.com>
[eblake: rebase context of qobject_to() macro]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Peter Xu
b26ae1cb8e qobject: introduce qobject_get_try_str()
A quick way to fetch string from qobject when it's a QString.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-4-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to() macro]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Peter Xu
775932020d qobject: introduce qstring_get_try_str()
The only difference from qstring_get_str() is that it allows the qstring
to be NULL.  If so, NULL is returned.

CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-3-peterx@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Peter Xu
378112b002 docs: update QMP documents for OOB commands
Update both the developer and spec for the new QMP OOB (Out-Of-Band)
command.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180309090006.10018-2-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Daniel P. Berrange
99f2f54174 chardev: tcp: postpone TLS work until machine done
TLS handshake may create background GSource tasks, while we won't know
the correct GMainContext until the whole chardev (including frontend)
inited.  Let's postpone the initial TLS handshake until machine done.

For dynamically created tcp chardev, we don't postpone that by checking
the init_machine_done variable.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[peterx: add missing include line, do unit test]
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180308140714.28906-1-peterx@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Max Reitz
4f7be2806e block: Deprecate "backing": ""
We have a clear replacement, so let's deprecate it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-8-mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Max Reitz
e59a0cf17b block: Handle null backing link
Instead of converting all "backing": null instances into "backing": "",
handle a null value directly in bdrv_open_inherit().

This enables explicitly null backing links for json:{} filenames.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-7-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to() parameter order and qapi headers split]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Max Reitz
532fb53284 qapi: Make more of qobject_to()
This patch reworks some places which use either qobject_type() checks
plus qobject_to(), where the latter alone is sufficient, or NULL checks
plus qobject_type() checks where we can simply do a qobject_to() != NULL
check.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-6-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase to qobject_to() parameter ordering]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Max Reitz
cb51b976ba qapi: Remove qobject_to_X() functions
They are no longer needed now.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-5-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Max Reitz
7dc847ebba qapi: Replace qobject_to_X(o) by qobject_to(X, o)
This patch was generated using the following Coccinelle script:

@@
expression Obj;
@@
(
- qobject_to_qnum(Obj)
+ qobject_to(QNum, Obj)
|
- qobject_to_qstring(Obj)
+ qobject_to(QString, Obj)
|
- qobject_to_qdict(Obj)
+ qobject_to(QDict, Obj)
|
- qobject_to_qlist(Obj)
+ qobject_to(QList, Obj)
|
- qobject_to_qbool(Obj)
+ qobject_to(QBool, Obj)
)

and a bit of manual fix-up for overly long lines and three places in
tests/check-qjson.c that Coccinelle did not find.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20180224154033.29559-4-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: swap order from qobject_to(o, X), rebase to master, also a fix
to latent false-positive compiler complaint about hw/i386/acpi-build.c]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:58:36 -05:00
Max Reitz
1a56b1e2ab qapi: Add qobject_to()
This is a dynamic casting macro that, given a QObject type, returns an
object as that type or NULL if the object is of a different type (or
NULL itself).

The macro uses lower-case letters because:
1. There does not seem to be a hard rule on whether qemu macros have to
   be upper-cased,
2. The current situation in qapi/qmp is inconsistent (compare e.g.
   QINCREF() vs. qdict_put()),
3. qobject_to() will evaluate its @obj parameter only once, thus it is
   generally not important to the caller whether it is a macro or not,
4. I prefer it aesthetically.

The macro parameter order is chosen with typename first for
consistency with other QAPI macros like QAPI_CLONE(), as well as
for legibility (read it as "qobject to" type "applied to" obj).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20180224154033.29559-3-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
[eblake: swap parameter order to list type first, avoid clang ubsan
warning on QOBJECT(NULL) and container_of(NULL,type,base)]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 14:49:04 -05:00
Peter Maydell
c26ef39204 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180319' into staging
target-arm queue:
 * fsl-imx6: Fix incorrect Ethernet interrupt defines
 * dump: Update correct kdump phys_base field for AArch64
 * char: i.MX: Add support for "TX complete" interrupt
 * bcm2836/raspi: Fix various bugs resulting in panics trying
   to boot a Debian Linux kernel on raspi3

# gpg: Signature made Mon 19 Mar 2018 18:30:33 GMT
# gpg:                using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20180319:
  hw/arm/raspi: Provide spin-loop code for AArch64 CPUs
  hw/arm/bcm2836: Hardcode correct CPU type
  hw/arm/bcm2836: Use correct affinity values for BCM2837
  hw/arm/bcm2836: Create proper bcm2837 device
  hw/arm/bcm2836: Rename bcm2836 type/struct to bcm283x
  hw/arm/bcm2386: Fix parent type of bcm2386
  hw/arm/boot: If booting a kernel in EL2, set SCR_EL3.HCE
  hw/arm/boot: assert that secure_boot and secure_board_setup are false for AArch64
  hw/arm/raspi: Don't do board-setup or secure-boot for raspi3
  char: i.MX: Add support for "TX complete" interrupt
  char: i.MX: Simplify imx_update()
  dump: Update correct kdump phys_base field for AArch64
  fsl-imx6: Swap Ethernet interrupt defines

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-19 19:20:45 +00:00
Peter Maydell
ff72cb6b46 hw/arm/raspi: Provide spin-loop code for AArch64 CPUs
The raspi3 has AArch64 CPUs, which means that our smpboot
code for keeping the secondary CPUs in a pen needs to have
a version for A64 as well as A32. Without this, the
secondary CPUs go into an infinite loop of taking undefined
instruction exceptions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-10-peter.maydell@linaro.org
2018-03-19 18:23:24 +00:00
Peter Maydell
210f47840d hw/arm/bcm2836: Hardcode correct CPU type
Now we have separate types for BCM2386 and BCM2387, we might as well
just hard-code the CPU type they use rather than having it passed
through as an object property. This then lets us put the initialization
of the CPU object in init rather than realize.

Note that this change means that it's no longer possible on
the command line to use -cpu to ask for a different kind of
CPU than the SoC supports. This was never a supported thing to
do anyway; we were just not sanity-checking the command line.

This does require us to only build the bcm2837 object on
TARGET_AARCH64 configs, since otherwise it won't instantiate
due to the missing cortex-a53 device and "make check" will fail.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-9-peter.maydell@linaro.org
2018-03-19 18:23:24 +00:00
Peter Maydell
1bcb4d16bb hw/arm/bcm2836: Use correct affinity values for BCM2837
The BCM2837 sets the Aff1 field of the MPIDR affinity values for the
CPUs to 0, whereas the BCM2836 uses 0xf. Set this correctly, as it
is required for Linux to boot.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-8-peter.maydell@linaro.org
2018-03-19 18:23:24 +00:00
Peter Maydell
0fd74f03ed hw/arm/bcm2836: Create proper bcm2837 device
The bcm2837 is pretty similar to the bcm2836, but it does have
some differences. Notably, the MPIDR affinity aff1 values it
sets for the CPUs are 0x0, rather than the 0xf that the bcm2836
uses, and if this is wrong Linux will not boot.

Rather than trying to have one device with properties that
configure it differently for the two cases, create two
separate QOM devices for the two SoCs. We use the same approach
as hw/arm/aspeed_soc.c and share code and have a data table
that might differ per-SoC. For the moment the two types don't
actually have different behaviour.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-7-peter.maydell@linaro.org
2018-03-19 18:23:24 +00:00
Peter Maydell
926dcdf073 hw/arm/bcm2836: Rename bcm2836 type/struct to bcm283x
Our BCM2836 type is really a generic one that can be any of
the bcm283x family. Rename it accordingly. We change only
the names which are visible via the header file to the
rest of the QEMU code, leaving private function names
in bcm2836.c as they are.

This is a preliminary to making bcm283x be an abstract
parent class to specific types for the bcm2836 and bcm2837.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-6-peter.maydell@linaro.org
2018-03-19 18:23:24 +00:00
Peter Maydell
3d260cf3c6 hw/arm/bcm2386: Fix parent type of bcm2386
The TypeInfo and state struct for bcm2386 disagree about what the
parent class is -- the TypeInfo says it's TYPE_SYS_BUS_DEVICE,
but the BCM2386State struct only defines the parent_obj field
as DeviceState. This would have caused problems if anything
actually tried to treat the object as a TYPE_SYS_BUS_DEVICE.
Fix the TypeInfo to use TYPE_DEVICE as the parent, since we don't
need any of the additional functionality TYPE_SYS_BUS_DEVICE
provides.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-5-peter.maydell@linaro.org
2018-03-19 18:23:24 +00:00
Peter Maydell
bda816f08a hw/arm/boot: If booting a kernel in EL2, set SCR_EL3.HCE
If we're directly booting a Linux kernel and the CPU supports both
EL3 and EL2, we start the kernel in EL2, as it expects. We must also
set the SCR_EL3.HCE bit in this situation, so that the HVC
instruction is enabled rather than UNDEFing. Otherwise at least some
kernels will panic when trying to initialize KVM in the guest.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180313153458.26822-4-peter.maydell@linaro.org
2018-03-19 18:23:24 +00:00
Peter Maydell
43118f4351 hw/arm/boot: assert that secure_boot and secure_board_setup are false for AArch64
Add some assertions that if we're about to boot an AArch64 kernel,
the board code has not mistakenly set either secure_boot or
secure_board_setup. It doesn't make sense to set secure_boot,
because all AArch64 kernels must be booted in non-secure mode.

It might in theory make sense to set secure_board_setup, but
we don't currently support that, because only the AArch32
bootloader[] code calls this hook; bootloader_aarch64[] does not.
Since we don't have a current need for this functionality, just
assert that we don't try to use it. If it's needed we'll add
it later.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-3-peter.maydell@linaro.org
2018-03-19 18:23:24 +00:00
Peter Maydell
01e02f5aa7 hw/arm/raspi: Don't do board-setup or secure-boot for raspi3
For the rpi1 and 2 we want to boot the Linux kernel via some
custom setup code that makes sure that the SMC instruction
acts as a no-op, because it's used for cache maintenance.
The rpi3 boots AArch64 kernels, which don't need SMC for
cache maintenance and always expect to be booted non-secure.
Don't fill in the aarch32-specific parts of the binfo struct.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180313153458.26822-2-peter.maydell@linaro.org
2018-03-19 18:23:24 +00:00
Andrey Smirnov
46d3fb634c char: i.MX: Add support for "TX complete" interrupt
Add support for "TX complete"/TXDC interrupt generate by real HW since
it is needed to support guests other than Linux.

Based on the patch by Bill Paul as found here:
https://bugs.launchpad.net/qemu/+bug/1753314

Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: Bill Paul <wpaul@windriver.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bill Paul <wpaul@windriver.com>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Message-id: 20180315191141.6789-2-andrew.smirnov@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-19 18:23:24 +00:00
Andrey Smirnov
824e4a12f3 char: i.MX: Simplify imx_update()
Code of imx_update() is slightly confusing since the "flags" variable
doesn't really corespond to anything in real hardware and server as a
kitchensink accumulating events normally reported via USR1 and USR2
registers.

Change the code to explicitly evaluate state of interrupts reported
via USR1 and USR2 against corresponding masking bits and use the to
detemine if IRQ line should be asserted or not.

NOTE: Check for UTS1_TXEMPTY being set has been dropped for two
reasons:

    1. Emulation code implements a single character FIFO, so this flag
       will always be set since characters are trasmitted as a part of
       the code emulating "push" into the FIFO

    2. imx_update() is really just a function doing ORing and maksing
       of reported events, so checking for UTS1_TXEMPTY should happen,
       if it's ever really needed should probably happen outside of
       it.

Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: Bill Paul <wpaul@windriver.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Message-id: 20180315191141.6789-1-andrew.smirnov@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-19 18:23:24 +00:00
Wei Huang
68cbecfdd7 dump: Update correct kdump phys_base field for AArch64
For guest kernel that supports KASLR, the load address can change every
time when guest VM runs. To find the physical base address correctly,
current QEMU dump searches VMCOREINFO for the string "NUMBER(phys_base)=".
However this string pattern is only available on x86_64. AArch64 uses a
different field, called "NUMBER(PHYS_OFFSET)=". This patch makes sure
QEMU dump uses the correct string on AArch64.

Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1520615003-20869-1-git-send-email-wei@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-19 18:23:24 +00:00
Guenter Roeck
6461d7e267 fsl-imx6: Swap Ethernet interrupt defines
The sabrelite machine model used by qemu-system-arm is based on the
Freescale/NXP i.MX6Q processor. This SoC has an on-board ethernet
controller which is supported in QEMU using the imx_fec.c module
(actually called imx.enet for this model.)

The include/hw/arm/fsm-imx6.h file defines the interrupt vectors for the
imx.enet device like this:

 #define FSL_IMX6_ENET_MAC_1588_IRQ 118
 #define FSL_IMX6_ENET_MAC_IRQ 119

According to https://www.nxp.com/docs/en/reference-manual/IMX6DQRM.pdf,
page 225, in Table 3-1. ARM Cortex A9 domain interrupt summary,
interrupts are as follows.

150 ENET MAC 0 IRQ
151 ENET MAC 0 1588 Timer interrupt

where

150 - 32 == 118
151 - 32 == 119

In other words, the vector definitions in the fsl-imx6.h file are reversed.

Fixing the interrupts alone causes problems with older Linux kernels:
The Ethernet interface will fail to probe with Linux v4.9 and earlier.
Linux v4.1 and earlier will crash due to a bug in Ethernet driver probe
error handling. This is a Linux kernel problem, not a qemu problem:
the Linux kernel only worked by accident since it requested both interrupts.

For backward compatibility, generate the Ethernet interrupt on both interrupt
lines. This was shown to work from all Linux kernel releases starting with
v3.16.

Link: https://bugs.launchpad.net/qemu/+bug/1753309
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 1520723090-22130-1-git-send-email-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-19 18:23:24 +00:00
Igor Mammedov
99193d8f2e cpu: drop unnecessary NULL check and cpu_common_class_by_name()
both do nothing as for the first all callers
   parse_cpu_model() and qmp_query_cpu_model_()
should provide non NULL value, so just abort if it's not so.

While at it drop cpu_common_class_by_name() which is not need
any more as every target has CPUClass::class_by_name callback
by now, though abort in case a new arch will forget to define one.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1518013857-4372-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-19 14:10:36 -03:00
Igor Mammedov
3f71e724e2 cpu: get rid of unused cpu_init() defines
cpu_init(cpu_model) were replaced by cpu_create(cpu_type) so
no users are left, remove it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc)
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1518000027-274608-6-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-19 14:10:36 -03:00
Igor Mammedov
2278b93941 Use cpu_create(type) instead of cpu_init(cpu_model)
With all targets defining CPU_RESOLVING_TYPE, refactor
cpu_parse_cpu_model(type, cpu_model) to parse_cpu_model(cpu_model)
so that callers won't have to know internal resolving cpu
type. Place it in exec.c so it could be called from both
target independed vl.c and *-user/main.c.

That allows us to stop abusing cpu type from
  MachineClass::default_cpu_type
as resolver class in vl.c which were confusing part of
cpu_parse_cpu_model().

Also with new parse_cpu_model(), the last users of cpu_init()
in null-machine.c and bsd/linux-user targets could be switched
to cpu_create() API and cpu_init() API will be removed by
follow up patch.

With no longer users left remove MachineState::cpu_model field,
new code should use MachineState::cpu_type instead and
leave cpu_model parsing to generic code in vl.c.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1518000027-274608-5-git-send-email-imammedo@redhat.com>
[ehabkost: Fix bsd-user build error]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-19 14:10:36 -03:00
Igor Mammedov
0dacec874f cpu: add CPU_RESOLVING_TYPE macro
it will be used for providing to cpu name resolving class for
parsing cpu model for system and user emulation code.

Along with change add target to null-machine tests, so
that when switch to CPU_RESOLVING_TYPE happens,
it would ensure that null-machine usecase still works.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu> (m68k)
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc)
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> (tricore)
Message-Id: <1518000027-274608-4-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: Added macro to riscv too]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-19 14:10:36 -03:00
Igor Mammedov
9c5a87e496 tests: add machine 'none' with -cpu test
Check that "$QEMU -M none -cpu FOO" starts QEMU without error

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1518000027-274608-3-git-send-email-imammedo@redhat.com>
[ehabkost: include qmp/qdict.h instead of qmp/types.h]
[ehabkost: add riscv targets to machine-none-test]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-19 14:09:57 -03:00
Igor Mammedov
253a5504ce nios2: 10m50_devboard: replace cpu_model with cpu_type
use cpu_create() instead of being removed cpu_generic_init()

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1518000027-274608-2-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-19 14:09:44 -03:00
Max Reitz
9139b56723 compiler: Add QEMU_BUILD_BUG_MSG() macro
_Static_assert() allows us to specify messages, and that may come in
handy.  Even without _Static_assert(), encouraging developers to put a
helpful message next to the QEMU_BUILD_BUG_* may make debugging easier
whenever it breaks.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20180224154033.29559-2-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 10:00:14 -05:00
Marc-André Lureau
7d0f982bfb qapi: generate a literal qobject for introspection
Replace the generated json string with a literal qobject. The later is
easier to deal with, at run time as well as compile time: adding #if
conditionals will be easier than in a json string.

The output of query-qmp-schema is not changed.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180305172951.2150-5-marcandre.lureau@redhat.com>
[eblake: fix python 3 failure]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 10:00:14 -05:00
Marc-André Lureau
3cf42b8b3a qlit: add qobject_from_qlit()
Instantiate a QObject* from a literal QLitObject.

LitObject only supports int64_t for now.  uint64_t and double aren't
implemented.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180305172951.2150-4-marcandre.lureau@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 10:00:14 -05:00
Marc-André Lureau
3d96ea44d4 qlit: use QType instead of int
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180305172951.2150-3-marcandre.lureau@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 10:00:14 -05:00
Marc-André Lureau
26ee12ad1f qapi2texi: minor python code simplification
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180305172951.2150-2-marcandre.lureau@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-19 10:00:14 -05:00
Peter Maydell
2c8cfc0b52 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Mon 19 Mar 2018 11:01:45 GMT
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (46 commits)
  iotests: Avoid realpath, for CentOS 6
  block: fix iotest 146 output expectations
  iscsi: fix iSER compilation
  block: Fix leak of ignore_children in error path
  vvfat: Fix inherit_options flags
  block/mirror: change the semantic of 'force' of block-job-cancel
  vpc: Require aligned size in .bdrv_co_create
  vpc: Support .bdrv_co_create
  vhdx: Support .bdrv_co_create
  vdi: Make comments consistent with other drivers
  qed: Support .bdrv_co_create
  qcow: Support .bdrv_co_create
  qemu-iotests: Enable write tests for parallels
  parallels: Support .bdrv_co_create
  iotests: Add regression test for commit base locking
  block: Fix flags in reopen queue
  vdi: Implement .bdrv_co_create
  vdi: Move file creation to vdi_co_create_opts
  vdi: Pull option parsing from vdi_co_create
  qemu-iotests: Test luks QMP image creation
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-19 11:44:26 +00:00
Eric Blake
63ca8406be iotests: Avoid realpath, for CentOS 6
CentOS 6 lacks a realpath binary on the base install, which makes
all iotests runs fail since the 2.11 release:

001         - output mismatch (see 001.out.bad)
./check: line 815: realpath: command not found
diff: missing operand after `/home/dummy/qemu/tests/qemu-iotests/001.out'
diff: Try `diff --help' for more information.

Many of the uses of 'realpath' in the check script were being
used on the output of 'type -p' - but that is already an
absolute file name.  While a canonical name can often be
shorter (realpath gets rid of /../), it can also be longer (due
to symlink expansion); and we really don't care if the name is
canonical, merely that it was an executable file with an
absolute path.  These were broken in commit cceaf1db.

The remaining use of realpath was to convert a possibly relative
filename into an absolute one before calling diff to make it
easier to copy-and-paste the filename for moving the .bad file
into place as the new reference file even when running iotests
out-of-tree (see commit 93e53fb6), but $PWD can achieve the same
purpose.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Jeff Cody
181bb8822b block: fix iotest 146 output expectations
Commit bff5554843 added "force_size" into the common.filter for
_filter_img_create(), but test 146 still expects it in the output.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Paolo Bonzini
b16a7cfac5 iscsi: fix iSER compilation
This fails in Fedora 28.

Reported-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Fam Zheng
2c860e797a block: Fix leak of ignore_children in error path
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Fam Zheng
4f8e3a1f22 vvfat: Fix inherit_options flags
Overriding flags violates the precedence rules of
bdrv_reopen_queue_child. Just like the read-only option, no-flush should
be put into the options. The same is done in bdrv_temp_snapshot_options.

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Liang Li
b76e4458b1 block/mirror: change the semantic of 'force' of block-job-cancel
When doing drive mirror to a low speed shared storage, if there was heavy
BLK IO write workload in VM after the 'ready' event, drive mirror block job
can't be canceled immediately, it would keep running until the heavy BLK IO
workload stopped in the VM.

Libvirt depends on the current block-job-cancel semantics, which is that
when used without a flag after the 'ready' event, the command blocks
until data is in sync.  However, these semantics are awkward in other
situations, for example, people may use drive mirror for realtime
backups while still wanting to use block live migration.  Libvirt cannot
start a block live migration while another drive mirror is in progress,
but the user would rather abandon the backup attempt as broken and
proceed with the live migration than be stuck waiting for the current
drive mirror backup to finish.

The drive-mirror command already includes a 'force' flag, which libvirt
does not use, although it documented the flag as only being useful to
quit a job which is paused.  However, since quitting a paused job has
the same effect as abandoning a backup in a non-paused job (namely, the
destination file is not in sync, and the command completes immediately),
we can just improve the documentation to make the force flag obviously
useful.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jeff Cody <jcody@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Reported-by: Huaitong Han <huanhuaitong@didichuxing.com>
Signed-off-by: Huaitong Han <huanhuaitong@didichuxing.com>
Signed-off-by: Liang Li <liliangleo@didichuxing.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
1cfeaf386e vpc: Require aligned size in .bdrv_co_create
Perform the rounding to match a CHS geometry only in the legacy code
path in .bdrv_co_create_opts. QMP now requires that the user already
passes a CHS aligned image size, unless force-size=true is given.

CHS alignment is required to make the image compatible with Virtual PC,
but not for use with newer Microsoft hypervisors.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
182c883550 vpc: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to vpc, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
09b68dab5a vhdx: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to vhdx, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
da23248f27 vdi: Make comments consistent with other drivers
This makes the .bdrv_co_create(_opts) implementation of vdi look more
like the other recently converted block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
959355a476 qed: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to qed, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
42a3e1ab36 qcow: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to qcow, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
e1473133f7 qemu-iotests: Enable write tests for parallels
Originally we added parallels as a read-only format to qemu-iotests
where we did just some tests with a binary image. Since then, write and
image creation support has been added to the driver, so we can now
enable it in _supported_fmt generic.

The driver doesn't support migration yet, though, so we need to add it
to the list of exceptions in 181.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
1511b49040 parallels: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to parallels, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2018-03-19 12:01:39 +01:00
Fam Zheng
de963500c5 iotests: Add regression test for commit base locking
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Fam Zheng
1a5297366f block: Fix flags in reopen queue
Reopen flags are not synchronized according to the
bdrv_reopen_queue_child precedence until bdrv_reopen_prepare. It is a
bit too late: we already check the consistency in bdrv_check_perm before
that.

This fixes the bug that when bdrv_reopen a RO node as RW, the flags for
backing child are wrong. Before, we could recurse with flags.rw=1; now,
role->inherit_options + update_flags_from_options will make sure to
clear the bit when necessary.  Note that this will not clear an
explicitly set bit, as in the case of parallel block jobs (e.g.
test_stream_parallel in 030), because the explicit options include
'read-only=false' (for an intermediate node used by a different job).

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Max Reitz
e38105748f vdi: Implement .bdrv_co_create
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Max Reitz
ec73f06071 vdi: Move file creation to vdi_co_create_opts
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Max Reitz
49858b5098 vdi: Pull option parsing from vdi_co_create
In preparation of QAPI-fying VDI image creation, we have to create a
BlockdevCreateOptionsVdi type which is received by a (future)
vdi_co_create().

vdi_co_create_opts() now converts the QemuOpts object into such a
BlockdevCreateOptionsVdi object.  The protocol-layer file is still
created in vdi_co_do_create() (and BlockdevCreateOptionsVdi.file is set
to an empty string), but that will be addressed by a follow-up patch.

Note that cluster-size is not part of the QAPI schema because it is not
supported by default.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
d06195e6a6 qemu-iotests: Test luks QMP image creation
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-19 12:01:39 +01:00
Kevin Wolf
3d7ed9c453 luks: Catch integer overflow for huge sizes
When you request an image size close to UINT64_MAX, the addition of the
crypto header may cause an integer overflow. Catch it instead of
silently truncating the image size.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-19 12:01:24 +01:00
Kevin Wolf
e39e959e89 luks: Turn invalid assertion into check
The .bdrv_getlength implementation of the crypto block driver asserted
that the payload offset isn't after EOF. This is an invalid assertion to
make as the image file could be corrupted. Instead, check it and return
-EIO if the file is too small for the payload offset.

Zero length images are fine, so trigger -EIO only on offset > len, not
on offset >= len as the assertion did before.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-19 12:01:24 +01:00
Kevin Wolf
1bedcaf120 luks: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to luks, which enables
image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-19 12:01:24 +01:00
Kevin Wolf
1ec4f4160a luks: Create block_crypto_co_create_generic()
Everything that refers to the protocol layer or QemuOpts is moved out of
block_crypto_create_generic(), so that the remaining function is
suitable to be called by a .bdrv_co_create implementation.

LUKS is the only driver that actually implements the old interface, and
we don't intend to use it in any new drivers, so put the moved out code
directly into a LUKS function rather than creating a generic
intermediate one.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-19 12:01:24 +01:00
Kevin Wolf
3b5a1f6aa4 luks: Separate image file creation from formatting
The crypto driver used to create the image file in a callback from the
crypto subsystem. If we want to implement .bdrv_co_create, this needs to
go away because that callback will get a reference to an already
existing block node.

Move the image file creation to block_crypto_create_generic().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
fb367e039f tests/test-blockjob: test cancellations
Whatever the state a blockjob is in, it should be able to be canceled
by the block layer.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
6d8be96762 iotests: test manual job dismissal
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
b40dacdc7c blockjobs: Expose manual property
Expose the "manual" property via QAPI for the backup-related jobs.
As of this commit, this allows the management API to request the
"concluded" and "dismiss" semantics for backup jobs.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
11b61fbc0d blockjobs: add block-job-finalize
Instead of automatically transitioning from PENDING to CONCLUDED, gate
the .prepare() and .commit() phases behind an explicit acknowledgement
provided by the QMP monitor if auto_finalize = false has been requested.

This allows us to perform graph changes in prepare and/or commit so that
graph changes do not occur autonomously without knowledge of the
controlling management layer.

Transactions that have reached the "PENDING" state together can all be
moved to invoke their finalization methods by issuing block_job_finalize
to any one job in the transaction.

Jobs in a transaction with mixed job->auto_finalize settings will all
remain stuck in the "PENDING" state, as if the entire transaction was
specified with auto_finalize = false. Jobs that specified
auto_finalize = true, however, will still not emit the PENDING event.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
5f241594c4 blockjobs: add PENDING status and event
For jobs utilizing the new manual workflow, we intend to prohibit
them from modifying the block graph until the management layer provides
an explicit ACK via block-job-finalize to move the process forward.

To distinguish this runstate from "ready" or "waiting," we add a new
"pending" event and status.

For now, the transition from PENDING to CONCLUDED/ABORTING is automatic,
but a future commit will add the explicit block-job-finalize step.

Transitions:
Waiting -> Pending:   Normal transition.
Pending -> Concluded: Normal transition.
Pending -> Aborting:  Late transactional failures and cancellations.

Removed Transitions:
Waiting -> Concluded: Jobs must go to PENDING first.

Verbs:
Cancel: Can be applied to a pending job.

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED+-----------------+
   |         +--+----+                 |
   |            |                      |
   |         +--+----+     +------+    |
   +---------+RUNNING<----->PAUSED|    |
   |         +--+-+--+     +------+    |
   |            | |                    |
   |            | +------------------+ |
   |            |                    | |
   |         +--v--+       +-------+ | |
   +---------+READY<------->STANDBY| | |
   |         +--+--+       +-------+ | |
   |            |                    | |
   |         +--v----+               | |
   +---------+WAITING<---------------+ |
   |         +--+----+                 |
   |            |                      |
   |         +--v----+                 |
   +---------+PENDING|                 |
   |         +--+----+                 |
   |            |                      |
+--v-----+   +--v------+               |
|ABORTING+--->CONCLUDED|               |
+--------+   +--+------+               |
                |                      |
             +--v-+                    |
             |NULL<--------------------+
             +----+

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
e8af5686ff blockjobs: add waiting status
For jobs that are stuck waiting on others in a transaction, it would
be nice to know that they are no longer "running" in that sense, but
instead are waiting on other jobs in the transaction.

Jobs that are "waiting" in this sense cannot be meaningfully altered
any longer as they have left their running loop. The only meaningful
user verb for jobs in this state is "cancel," which will cancel the
whole transaction, too.

Transitions:
Running -> Waiting:   Normal transition.
Ready   -> Waiting:   Normal transition.
Waiting -> Aborting:  Transactional cancellation.
Waiting -> Concluded: Normal transition.

Removed Transitions:
Running -> Concluded: Jobs must go to WAITING first.
Ready   -> Concluded: Jobs must go to WAITING first.

Verbs:
Cancel: Can be applied to WAITING jobs.

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED+-----------------+
   |         +--+----+                 |
   |            |                      |
   |         +--v----+     +------+    |
   +---------+RUNNING<----->PAUSED|    |
   |         +--+-+--+     +------+    |
   |            | |                    |
   |            | +------------------+ |
   |            |                    | |
   |         +--v--+       +-------+ | |
   +---------+READY<------->STANDBY| | |
   |         +--+--+       +-------+ | |
   |            |                    | |
   |         +--v----+               | |
   +---------+WAITING<---------------+ |
   |         +--+----+                 |
   |            |                      |
+--v-----+   +--v------+               |
|ABORTING+--->CONCLUDED|               |
+--------+   +--+------+               |
                |                      |
             +--v-+                    |
             |NULL<--------------------+
             +----+

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
2da4617a54 blockjobs: add prepare callback
Some jobs upon finalization may need to perform some work that can
still fail. If these jobs are part of a transaction, it's important
that these callbacks fail the entire transaction.

We allow for a new callback in addition to commit/abort/clean that
allows us the opportunity to have fairly late-breaking failures
in the transactional process.

The expected flow is:

- All jobs in a transaction converge to the PENDING state,
  added in a forthcoming commit.
- Upon being finalized, either automatically or explicitly
  by the user, jobs prepare to complete.
- If any job fails preparation, all jobs call .abort.
- Otherwise, they succeed and call .commit.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
efe4d4b7b2 blockjobs: add block_job_txn_apply function
Simply apply a function transaction-wide.
A few more uses of this in forthcoming patches.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
43628d9336 blockjobs: add commit, abort, clean helpers
The completed_single function is getting a little mucked up with
checking to see which callbacks exist, so let's factor them out.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
35d6b368f2 blockjobs: ensure abort is called for cancelled jobs
Presently, even if a job is canceled post-completion as a result of
a failing peer in a transaction, it will still call .commit because
nothing has updated or changed its return code.

The reason why this does not cause problems currently is because
backup's implementation of .commit checks for cancellation itself.

I'd like to simplify this contract:

(1) Abort is called if the job/transaction fails
(2) Commit is called if the job/transaction succeeds

To this end: A job's return code, if 0, will be forcibly set as
-ECANCELED if that job has already concluded. Remove the now
redundant check in the backup job implementation.

We need to check for cancellation in both block_job_completed
AND block_job_completed_single, because jobs may be cancelled between
those two calls; for instance in transactions. This also necessitates
an ABORTING -> ABORTING transition to be allowed.

The check in block_job_completed could be removed, but there's no
point in starting to attempt to succeed a transaction that we know
in advance will fail.

This does NOT affect mirror jobs that are "canceled" during their
synchronous phase. The mirror job itself forcibly sets the canceled
property to false prior to ceding control, so such cases will invoke
the "commit" callback.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
75f710599f blockjobs: add block_job_dismiss
For jobs that have reached their CONCLUDED state, prior to having their
last reference put down (meaning jobs that have completed successfully,
unsuccessfully, or have been canceled), allow the user to dismiss the
job's lingering status report via block-job-dismiss.

This gives management APIs the chance to conclusively determine if a job
failed or succeeded, even if the event broadcast was missed.

Note: block_job_do_dismiss and block_job_decommission happen to do
exactly the same thing, but they're called from different semantic
contexts, so both aliases are kept to improve readability.

Note 2: Don't worry about the 0x04 flag definition for AUTO_DISMISS, she
has a friend coming in a future patch to fill the hole where 0x02 is.

Verbs:
Dismiss: operates on CONCLUDED jobs only.
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
3925cd3bc7 blockjobs: add NULL state
Add a new state that specifically demarcates when we begin to permanently
demolish a job after it has performed all work. This makes the transition
explicit in the STM table and highlights conditions under which a job may
be demolished.

Alongside this state, add a new helper command "block_job_decommission",
which transitions to the NULL state and puts down our implicit reference.
This separates instances in the code for "block_job_unref" which merely
undo a matching "block_job_ref" with instances intended to initiate the
full destruction of the object.

This decommission action also sets a number of fields to make sure that
block internals or external users that are holding a reference to a job
to see when it "finishes" are convinced that the job object is "done."
This is necessary, for instance, to do a block_job_cancel_sync on a
created object which will not make any progress.

Now, all jobs must go through block_job_decommission prior to being
freed, giving us start-to-finish state machine coverage for jobs.

Transitions:
Created   -> Null: Early failure event before the job is started
Concluded -> Null: Standard transition.

Verbs:
None. This should not ever be visible to the monitor.

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED+------------------+
   |         +--+----+                  |
   |            |                       |
   |         +--v----+     +------+     |
   +---------+RUNNING<----->PAUSED|     |
   |         +--+-+--+     +------+     |
   |            | |                     |
   |            | +------------------+  |
   |            |                    |  |
   |         +--v--+       +-------+ |  |
   +---------+READY<------->STANDBY| |  |
   |         +--+--+       +-------+ |  |
   |            |                    |  |
+--v-----+   +--v------+             |  |
|ABORTING+--->CONCLUDED<-------------+  |
+--------+   +--+------+                |
                |                       |
             +--v-+                     |
             |NULL<---------------------+
             +----+

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
e0cf03647a blockjobs: add CONCLUDED state
add a new state "CONCLUDED" that identifies a job that has ceased all
operations. The wording was chosen to avoid any phrasing that might
imply success, error, or cancellation. The task has simply ceased all
operation and can never again perform any work.

("finished", "done", and "completed" might all imply success.)

Transitions:
Running  -> Concluded: normal completion
Ready    -> Concluded: normal completion
Aborting -> Concluded: error and cancellations

Verbs:
None as of this commit. (a future commit adds 'dismiss')

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED|
   |         +--+----+
   |            |
   |         +--v----+     +------+
   +---------+RUNNING<----->PAUSED|
   |         +--+-+--+     +------+
   |            | |
   |            | +------------------+
   |            |                    |
   |         +--v--+       +-------+ |
   +---------+READY<------->STANDBY| |
   |         +--+--+       +-------+ |
   |            |                    |
+--v-----+   +--v------+             |
|ABORTING+--->CONCLUDED<-------------+
+--------+   +---------+

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
10a3fbb0f7 blockjobs: add ABORTING state
Add a new state ABORTING.

This makes transitions from normative states to error states explicit
in the STM, and serves as a disambiguation for which states may complete
normally when normal end-states (CONCLUDED) are added in future commits.

Notably, Paused/Standby jobs do not transition directly to aborting,
as they must wake up first and cooperate in their cancellation.

Transitions:
Created -> Aborting: can be cancelled (by the system)
Running -> Aborting: can be cancelled or encounter an error
Ready   -> Aborting: can be cancelled or encounter an error

Verbs:
None. The job must finish cleaning itself up and report its final status.

             +---------+
             |UNDEFINED|
             +--+------+
                |
             +--v----+
   +---------+CREATED|
   |         +--+----+
   |            |
   |         +--v----+     +------+
   +---------+RUNNING<----->PAUSED|
   |         +--+----+     +------+
   |            |
   |         +--v--+       +-------+
   +---------+READY<------->STANDBY|
   |         +-----+       +-------+
   |
+--v-----+
|ABORTING|
+--------+

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
0ec4dfb8d6 blockjobs: add block_job_verb permission table
Which commands ("verbs") are appropriate for jobs in which state is
also somewhat burdensome to keep track of.

As of this commit, it looks rather useless, but begins to look more
interesting the more states we add to the STM table.

A recurring theme is that no verb will apply to an 'undefined' job.

Further, it's not presently possible to restrict the "pause" or "resume"
verbs any more than they are in this commit because of the asynchronous
nature of how jobs enter the PAUSED state; justifications for some
seemingly erroneous applications are given below.

=====
Verbs
=====

Cancel:    Any state except undefined.
Pause:     Any state except undefined;
           'created': Requests that the job pauses as it starts.
           'running': Normal usage. (PAUSED)
           'paused':  The job may be paused for internal reasons,
                      but the user may wish to force an indefinite
                      user-pause, so this is allowed.
           'ready':   Normal usage. (STANDBY)
           'standby': Same logic as above.
Resume:    Any state except undefined;
           'created': Will lift a user's pause-on-start request.
           'running': Will lift a pause request before it takes effect.
           'paused':  Normal usage.
           'ready':   Will lift a pause request before it takes effect.
           'standby': Normal usage.
Set-speed: Any state except undefined, though ready may not be meaningful.
Complete:  Only a 'ready' job may accept a complete request.

=======
Changes
=======

(1)

To facilitate "nice" error checking, all five major block-job verb
interfaces in blockjob.c now support an errp parameter:

- block_job_user_cancel is added as a new interface.
- block_job_user_pause gains an errp paramter
- block_job_user_resume gains an errp parameter
- block_job_set_speed already had an errp parameter.
- block_job_complete already had an errp parameter.

(2)

block-job-pause and block-job-resume will no longer no-op when trying
to pause an already paused job, or trying to resume a job that isn't
paused. These functions will now report that they did not perform the
action requested because it was not possible.

iotests have been adjusted to address this new behavior.

(3)

block-job-complete doesn't worry about checking !block_job_started,
because the permission table guards against this.

(4)

test-bdrv-drain's job implementation needs to announce that it is
'ready' now, in order to be completed.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
f03d9d243f iotests: add pause_wait
Split out the pause command into the actual pause and the wait.
Not every usage presently needs to resubmit a pause request.

The intent with the next commit will be to explicitly disallow
redundant or meaningless pause/resume requests, so the tests
need to become more judicious to reflect that.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
c9de40505f blockjobs: add state transition table
The state transition table has mostly been implied. We're about to make
it a bit more complex, so let's make the STM explicit instead.

Perform state transitions with a function that for now just asserts the
transition is appropriate.

Transitions:
Undefined -> Created: During job initialization.
Created   -> Running: Once the job is started.
                      Jobs cannot transition from "Created" to "Paused"
                      directly, but will instead synchronously transition
                      to running to paused immediately.
Running   -> Paused:  Normal workflow for pauses.
Running   -> Ready:   Normal workflow for jobs reaching their sync point.
                      (e.g. mirror)
Ready     -> Standby: Normal workflow for pausing ready jobs.
Paused    -> Running: Normal resume.
Standby   -> Ready:   Resume of a Standby job.

+---------+
|UNDEFINED|
+--+------+
   |
+--v----+
|CREATED|
+--+----+
   |
+--v----+     +------+
|RUNNING<----->PAUSED|
+--+----+     +------+
   |
+--v--+       +-------+
|READY<------->STANDBY|
+-----+       +-------+

Notably, there is no state presently defined as of this commit that
deals with a job after the "running" or "ready" states, so this table
will be adjusted alongside the commits that introduce those states.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
58b295ba52 blockjobs: add status enum
We're about to add several new states, and booleans are becoming
unwieldly and difficult to reason about. It would help to have a
more explicit bookkeeping of the state of blockjobs. To this end,
add a new "status" field and add our existing states in a redundant
manner alongside the bools they are replacing:

UNDEFINED: Placeholder, default state. Not currently visible to QMP
           unless changes occur in the future to allow creating jobs
           without starting them via QMP.
CREATED:   replaces !!job->co && paused && !busy
RUNNING:   replaces effectively (!paused && busy)
PAUSED:    Nearly redundant with info->paused, which shows pause_count.
           This reports the actual status of the job, which almost always
           matches the paused request status. It differs in that it is
           strictly only true when the job has actually gone dormant.
READY:     replaces job->ready.
STANDBY:   Paused, but job->ready is true.

New state additions in coming commits will not be quite so redundant:

WAITING:   Waiting on transaction. This job has finished all the work
           it can until the transaction converges, fails, or is canceled.
PENDING:   Pending authorization from user. This job has finished all the
           work it can until the job or transaction is finalized via
           block_job_finalize. This implies the transaction has converged
           and left the WAITING phase.
ABORTING:  Job has encountered an error condition and is in the process
           of aborting.
CONCLUDED: Job has ceased all operations and has a return code available
           for query and may be dismissed via block_job_dismiss.
NULL:      Job has been dismissed and (should) be destroyed. Should never
           be visible to QMP.

Some of these states appear somewhat superfluous, but it helps define the
expected flow of a job; so some of the states wind up being synchronous
empty transitions. Importantly, jobs can be in only one of these states
at any given time, which helps code and external users alike reason about
the current condition of a job unambiguously.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
62bfdf0ca1 Blockjobs: documentation touchup
Trivial; Document what the job creation flags do,
and some general tidying.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
75859b9420 blockjobs: model single jobs as transactions
model all independent jobs as single job transactions.

It's one less case we have to worry about when we add more states to the
transition machine. This way, we can just treat all job lifetimes exactly
the same. This helps tighten assertions of the STM graph and removes some
conditionals that would have been needed in the coming commits adding a
more explicit job lifetime management API.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
John Snow
d4fce18844 blockjobs: fix set-speed kick
If speed is '0' it's not actually "less than" the previous speed.
Kick the job in this case too.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-19 12:01:24 +01:00
Peter Maydell
590a3914be Merge remote-tracking branch 'remotes/kraxel/tags/seabios-1.11.1-20180319-pull-request' into staging
update seabios to 1.11.1

# gpg: Signature made Mon 19 Mar 2018 10:26:35 GMT
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/seabios-1.11.1-20180319-pull-request:
  update seabios to 1.11.1

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-19 10:57:21 +00:00
Gerd Hoffmann
9cdd2a736b update seabios to 1.11.1
git shortlog rel-1.11.0..rel-1.11.1
===================================

Kevin O'Connor (3):
      build: Use git describe --always
      shadow: Don't invoke a shutdown on reboot unless in a reboot loop
      paravirt: Only enable sercon in NOGRAPHIC mode if no other console specified

Marcel Apfelbaum (1):
      pci: fix 'io hints' capability for RedHat PCI bridges

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2018-03-19 11:18:29 +01:00
Peter Maydell
870c75b543 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.12-20180319' into staging
ppc patch queue for 2018-03-19

This pull request supersedes the one for 2018-03-15.  The only
difference is one patch is removed, since it exposed some code which
triggers ubsan warnings.

Here's the set of accumulated patches now that we're into soft freeze.
I've split new functionality into a ppc-for-2.13 branch, so this only
has bugfixes.  Well.. and a couple of simple cleanups to make bugfixes
easier, some test improvements and a trivial change to make command
line options more obvious.  I think those are all acceptable for soft
freeze.

# gpg: Signature made Mon 19 Mar 2018 00:34:56 GMT
# gpg:                using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.12-20180319:
  target/ppc: fix tlbsync to check privilege level depending on GTSE
  ppc440_pcix: Change some error_report to qemu_log_mask(LOG_UNIMP, ...)
  hw/ppc/spapr: Allow "spapr-vlan" as NIC model name beside "ibmveth"
  PPC e500: Fix gap between u-boot and kernel
  hw/misc/macio: Mark the macio devices with user_creatable = false
  hw/ppc/prep: Fix implicit creation of "-drive if=scsi" devices
  tests/boot-serial: Check the 40p machine, too
  sii3112: Remove unneeded exit function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-19 09:39:29 +00:00
Cédric Le Goater
91c60f124e target/ppc: fix tlbsync to check privilege level depending on GTSE
tlbsync also needs to check the Guest Translation Shootdown Enable
(GTSE) bit in the Logical Partition Control Register (LPCR) to
determine at which privilege level it is running.

See commit c6fd28fd57 ("target/ppc: Update tlbie to check privilege
level based on GTSE")

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-18 21:03:20 +11:00
BALATON Zoltan
21a5a442ae ppc440_pcix: Change some error_report to qemu_log_mask(LOG_UNIMP, ...)
Using log unimp is more appropriate for these messages and this also
silences them by default so they won't clobber make check output when
tests are added for this board.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-18 18:27:23 +11:00
Thomas Huth
3c3a4e7afa hw/ppc/spapr: Allow "spapr-vlan" as NIC model name beside "ibmveth"
With the new "--nic" command line parameter option, the "old" way of
specifying a NIC model via the nd_table[] is becoming more prominent
again. But for the pseries "spapr-vlan" device, there is a confusing
discrepancy between the model name that is used for "--device" (i.e.
"spapr-vlan") and the model name that has to be used for "--net nic"
or the new "--nic" parameter (i.e. "ibmveth"). Since "spapr-vlan" is
the "real" name of the device, let's allow "spapr-vlan" to be used
as model name for the nd_table[] entries, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-18 18:27:23 +11:00
David Engraf
b4a5f24a17 PPC e500: Fix gap between u-boot and kernel
This patch moves the gap between u-boot and kernel at the correct location.

Signed-off-by: David Engraf <david.engraf@sysgo.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-18 18:27:23 +11:00
Thomas Huth
1ca15d85ab hw/misc/macio: Mark the macio devices with user_creatable = false
The macio devices currently cause a crash when the user tries to
instantiate them on a different machine:

$ ppc64-softmmu/qemu-system-ppc64 -device macio-newworld
Unexpected error in qemu_chr_fe_init() at chardev/char-fe.c:222:
qemu-system-ppc64: -device macio-newworld: Device 'serial0' is in use
Aborted (core dumped)

These devices are clearly not intended to be creatable by the user
since they are using serial_hds[] directly in their instance_init
function. So let's mark them with user_creatable = false.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-18 18:27:23 +11:00
Thomas Huth
b891538e81 hw/ppc/prep: Fix implicit creation of "-drive if=scsi" devices
The global hack for creating SCSI devices has recently been removed,
but this apparently broke SCSI devices on some boards that were not
ready for this change yet. For the 40p machine you now get:

$ ppc64-softmmu/qemu-system-ppc64 -M 40p -cdrom x.iso
qemu-system-ppc64: -cdrom x.iso: machine type does not support if=scsi,bus=0,unit=2

Fix it by providing a lsi53c810_create() function that takes care
of calling scsi_bus_legacy_handle_cmdline() after creating the
corresponding SCSI controller.

Fixes: 1454509726
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-18 18:27:23 +11:00
Thomas Huth
d7d15a6e34 tests/boot-serial: Check the 40p machine, too
The "40p" machine is using the Open Hack'Ware BIOS, just like the "prep"
machine, so we can test it accordingly with the boot-serial tester, too.
While we're at it, also change the strings that we are using for the
"prep" machine, so that this test now also checks some CLI parameters.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-18 18:27:23 +11:00
BALATON Zoltan
d0f17aba75 sii3112: Remove unneeded exit function
An exit function was mistakenly left here but it's not needed because
the PCI bars are organised differently in this device. Calling this
exit function during device_del was causing an abort with
memory_region_del_subregion: `Assertion subregion->container == mr' failed.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-18 18:27:23 +11:00
Peter Maydell
e1e44a9916 Merge remote-tracking branch 'remotes/xtensa/tags/20180316-xtensa' into staging
target/xtensa linux-user support.

- small cleanup for xtensa registers dumping (-d cpu);
- add support for debugging linux-user process with xtensa-linux-gdb
  (as opposed to xtensa-elf-gdb), which can only access unprivileged
  registers;
- enable MTTCG for target/xtensa;
- cleanup in linux-user/mmap area making sure that it works correctly
  with limited 30-bit-wide user address space;
- import xtensa-specific definitions from the linux kernel,
  conditionalize user-only/softmmu-only code and add handlers for
  signals, exceptions, process/thread creation and core registers dumping.

# gpg: Signature made Fri 16 Mar 2018 16:46:19 GMT
# gpg:                using RSA key 51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <filippov@cadence.com>"
# gpg:                 aka "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB  17D8 51F9 CC91 F83F A044

* remotes/xtensa/tags/20180316-xtensa:
  MAINTAINERS: fix W: address for xtensa
  qemu-binfmt-conf.sh: add qemu-xtensa
  target/xtensa: add linux-user support
  linux-user: drop unused target_msync function
  linux-user: fix target_mprotect/target_munmap error return values
  linux-user: fix assertion in shmdt
  linux-user: fix mmap/munmap/mprotect/mremap/shmat
  target/xtensa: support MTTCG
  target/xtensa: use correct number of registers in gdbstub
  target/xtensa: mark register windows in the dump
  target/xtensa: dump correct physical registers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	linux-user/syscall.c
2018-03-17 14:15:03 +00:00
Peter Maydell
979454028c Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging
PC compat bug fix for 2.12

# gpg: Signature made Fri 16 Mar 2018 19:30:25 GMT
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-next-pull-request:
  pc: correct misspelled CPU model-id for pc 2.2

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-17 12:11:08 +00:00
Wang Xin
0ab126f165 pc: correct misspelled CPU model-id for pc 2.2
Signed-off-by: Wang Xin <wangxinxin.wang@huawei.com>
Message-Id: <1517367668-25048-1-git-send-email-wangxinxin.wang@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-16 16:29:07 -03:00
Peter Maydell
2bb39a657a Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180316' into staging
Queued TCG patches

# gpg: Signature made Thu 15 Mar 2018 17:17:31 GMT
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20180316:
  tcg: Add choose_vector_size
  tcg/i386: Support INDEX_op_dup2_vec for -m32
  tcg: Improve tcg_gen_muli_i32/i64

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-16 17:25:33 +00:00
Max Filippov
b8105d2194 MAINTAINERS: fix W: address for xtensa
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-16 09:40:35 -07:00
Max Filippov
4b27922525 qemu-binfmt-conf.sh: add qemu-xtensa
Register qemu-xtensa and qemu-xtensaeb for transparent linux userspace
emulation.

Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-16 09:40:35 -07:00
Max Filippov
ba7651fba5 target/xtensa: add linux-user support
Import list of syscalls from the kernel source. Conditionalize code/data
that is only used with softmmu. Implement exception handlers. Implement
signal hander (only the core registers for now, no coprocessors or TIE).

Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-16 09:40:34 -07:00
Peter Maydell
9cc7d0cf6a Merge remote-tracking branch 'remotes/jnsnow/tags/bitmaps-pull-request' into staging
# gpg: Signature made Tue 13 Mar 2018 21:11:43 GMT
# gpg:                using RSA key 7DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/bitmaps-pull-request:
  iotests: add dirty bitmap postcopy test
  iotests: add dirty bitmap migration test
  migration: add postcopy migration of dirty bitmaps
  migration: allow qmp command migrate-start-postcopy for any postcopy
  migration: add is_active_iterate handler
  migration/qemu-file: add qemu_put_counted_string()
  migration: include migrate_dirty_bitmaps in migrate_postcopy
  qapi: add dirty-bitmaps migration capability
  migration: introduce postcopy-only pending
  dirty-bitmap: add locked state
  block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap
  block/dirty-bitmap: fix locking in bdrv_reclaim_dirty_bitmap
  block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-16 14:15:18 +00:00
Peter Maydell
475fe4576f Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-03-13-v2' into staging
nbd patches for 2018-03-13

- Eric Blake: iotests: Fix stuck NBD process on 33
- Vladimir Sementsov-Ogievskiy: 0/5 nbd server fixing and refactoring before BLOCK_STATUS
- Eric Blake: nbd/server: Honor FUA request on NBD_CMD_TRIM
- Stefan Hajnoczi: 0/2 block: fix nbd-server-stop crash after blockdev-snapshot-sync
- Vladimir Sementsov-Ogievskiy: nbd block status base:allocation

# gpg: Signature made Tue 13 Mar 2018 20:48:37 GMT
# gpg:                using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2018-03-13-v2:
  iotests: new test 209 for NBD BLOCK_STATUS
  iotests: add file_path helper
  iotests.py: tiny refactor: move system imports up
  nbd: BLOCK_STATUS for standard get_block_status function: client part
  block/nbd-client: save first fatal error in nbd_iter_error
  nbd: BLOCK_STATUS for standard get_block_status function: server part
  nbd/server: add nbd_read_opt_name helper
  nbd/server: add nbd_opt_invalid helper
  iotests: add 208 nbd-server + blockdev-snapshot-sync test case
  block: let blk_add/remove_aio_context_notifier() tolerate BDS changes
  nbd/server: Honor FUA request on NBD_CMD_TRIM
  nbd/server: refactor nbd_trip: split out nbd_handle_request
  nbd/server: refactor nbd_trip: cmd_read and generic reply
  nbd/server: fix: check client->closing before sending reply
  nbd/server: fix sparse read
  nbd/server: move nbd_co_send_structured_error up
  iotests: Fix stuck NBD process on 33

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-16 13:14:07 +00:00
Peter Maydell
3788c7b6e5 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Record-replay lockstep execution, log dumper and fixes (Alex, Pavel)
* SCSI fix to pass maximum transfer size (Daniel Barboza)
* chardev fixes and improved iothread support (Daniel Berrangé, Peter)
* checkpatch tweak (Eric)
* make help tweak (Marc-André)
* make more PCI NICs available with -net or -nic (myself)
* change default q35 NIC to e1000e (myself)
* SCSI support for NDOB bit (myself)
* membarrier system call support (myself)
* SuperIO refactoring (Philippe)
* miscellaneous cleanups and fixes (Thomas)

# gpg: Signature made Mon 12 Mar 2018 16:10:52 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (69 commits)
  tcg: fix cpu_io_recompile
  replay: update documentation
  replay: save vmstate of the asynchronous events
  replay: don't process async events when warping the clock
  scripts/replay-dump.py: replay log dumper
  replay: avoid recursive call of checkpoints
  replay: check return values of fwrite
  replay: push replay_mutex_lock up the call tree
  replay: don't destroy mutex at exit
  replay: make locking visible outside replay code
  replay/replay-internal.c: track holding of replay_lock
  replay/replay.c: bump REPLAY_VERSION again
  replay: save prior value of the host clock
  replay: added replay log format description
  replay: fix save/load vm for non-empty queue
  replay: fixed replay_enable_events
  replay: fix processing async events
  cpu-exec: fix exception_index handling
  hw/i386/pc: Factor out the superio code
  hw/alpha/dp264: Use the TYPE_SMC37C669_SUPERIO
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	default-configs/i386-softmmu.mak
#	default-configs/x86_64-softmmu.mak
2018-03-16 11:05:03 +00:00
Peter Maydell
a57946ff2a Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20180313.0' into staging
VFIO updates 2018-03-13

 - Display support for vGPUs (Gerd Hoffmann)

 - Enable new kernel support for mmaps overlapping MSI-X vector table,
   disable MSI-X emulation on POWER (Alexey Kardashevskiy)

# gpg: Signature made Tue 13 Mar 2018 19:48:49 GMT
# gpg:                using RSA key 239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-update-20180313.0:
  ppc/spapr, vfio: Turn off MSIX emulation for VFIO devices
  vfio-pci: Allow mmap of MSIX BAR
  vfio/pci: Relax DMA map errors for MMIO regions
  vfio/display: adding dmabuf support
  vfio/display: adding region support
  vfio/display: core & wireup
  vfio/common: cleanup in vfio_region_finalize
  secondary-vga: properly close QemuConsole on unplug
  console: minimal hotplug suport
  ui/pixman: add qemu_drm_format_to_pixman()
  standard-headers: add drm/drm_fourcc.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-16 09:51:47 +00:00
Peter Maydell
58888f8cdd Merge remote-tracking branch 'remotes/berrange/tags/socket-next-pull-request' into staging
# gpg: Signature made Tue 13 Mar 2018 18:12:14 GMT
# gpg:                using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/socket-next-pull-request:
  char: allow passing pre-opened socket file descriptor at startup
  char: refactor parsing of socket address information
  sockets: allow SocketAddress 'fd' to reference numeric file descriptors
  sockets: check that the named file descriptor is a socket
  sockets: move fd_is_socket() into common sockets code
  sockets: strengthen test suite IP protocol availability checks
  sockets: pull code for testing IP availability out of specific test
  cutils: add qemu_strtoi & qemu_strtoui parsers for int/unsigned int types
  char: don't silently skip tn3270 protocol init when TLS is enabled

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-15 18:53:07 +00:00
Peter Maydell
55901900ec Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-2.12-pull-request' into staging
# gpg: Signature made Tue 13 Mar 2018 17:33:03 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-2.12-pull-request:
  linux-user: init_guest_space: Add a comment about search strategy
  linux-user: init_guest_space: Don't try to align if we'll reject it
  linux-user: init_guest_space: Clean up control flow a bit
  linux-user: init_guest_commpage: Add a comment about size check
  linux-user: init_guest_space: Clarify page alignment logic
  linux-user: init_guest_space: Correctly handle guest_start in commpage initialization
  linux-user: init_guest_space: Clean up if we can't initialize the commpage
  linux-user: Rename validate_guest_space => init_guest_commpage
  linux-user: Use #if to only call validate_guest_space for 32-bit ARM target
  qemu-binfmt-conf.sh: add qemu-xtensa
  linux-user: drop unused target_msync function
  linux-user: fix target_mprotect/target_munmap error return values
  linux-user: fix assertion in shmdt
  linux-user: fix mmap/munmap/mprotect/mremap/shmat
  linux-user: Support f_flags in statfs when available.
  linux-user: allows to use "--systemd ALL" with qemu-binfmt-conf.sh
  linux-user: Remove the unused "not implemented" signal handling stubs
  linux-user: Drop unicore32 code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-15 17:58:28 +00:00
Richard Henderson
adb196cbd5 tcg: Add choose_vector_size
This unifies 5 copies of checks for supported vector size,
and in the process fixes a missing check in tcg_gen_gvec_2s.

This lead to an assertion failure for 64-bit vector multiply,
which is not available in the AVX instruction set.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-03-16 00:55:04 +08:00
Richard Henderson
7f34ed4bcd tcg/i386: Support INDEX_op_dup2_vec for -m32
Unknown why -m32 was passing with gcc but not clang; it should have
failed for both.  This would be used for tcg_gen_dup_i64_vec, and
visible with the right TB and an aarch64 guest.

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-03-16 00:55:04 +08:00
Richard Henderson
b2e3ae9452 tcg: Improve tcg_gen_muli_i32/i64
Convert multiplication by power of two to left shift.

Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2018-03-16 00:55:04 +08:00
Peter Maydell
5bdd374347 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream-sev' into staging
* Migrate MSR_SMI_COUNT (Liran)
* Update kernel headers (Gerd, myself)
* SEV support (Brijesh)

I have not tested non-x86 compilation, but I reordered the SEV patches
so that all non-x86-specific changes go first to catch any possible
issues (which weren't there anyway :)).

# gpg: Signature made Tue 13 Mar 2018 16:37:06 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream-sev: (22 commits)
  sev/i386: add sev_get_capabilities()
  sev/i386: qmp: add query-sev-capabilities command
  sev/i386: qmp: add query-sev-launch-measure command
  sev/i386: hmp: add 'info sev' command
  cpu/i386: populate CPUID 0x8000_001F when SEV is active
  sev/i386: add migration blocker
  sev/i386: finalize the SEV guest launch flow
  sev/i386: add support to LAUNCH_MEASURE command
  target/i386: encrypt bios rom
  sev/i386: add command to encrypt guest memory region
  sev/i386: add command to create launch memory encryption context
  sev/i386: register the guest memory range which may contain encrypted data
  sev/i386: add command to initialize the memory encryption context
  include: add psp-sev.h header file
  sev/i386: qmp: add query-sev command
  target/i386: add Secure Encrypted Virtualization (SEV) object
  kvm: introduce memory encryption APIs
  kvm: add memory encryption context
  docs: add AMD Secure Encrypted Virtualization (SEV)
  machine: add memory-encryption option
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-15 16:49:30 +00:00
Peter Maydell
56e8698ffa Merge remote-tracking branch 'remotes/stsquad/tags/pull-travis-speedup-130318-1' into staging
Some updates to reduce timeouts in Travis

# gpg: Signature made Tue 13 Mar 2018 16:44:46 GMT
# gpg:                using RSA key FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-travis-speedup-130318-1:
  .travis.yml: add --disable-user with the rest of the disables
  .travis.yml: split default config into system and user
  .travis.yml: drop setting default log output

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-15 14:48:09 +00:00
Peter Maydell
6265e23b1c Merge remote-tracking branch 'remotes/vivier/tags/m68k-for-2.12-pull-request' into staging
# gpg: Signature made Tue 13 Mar 2018 15:58:42 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier/tags/m68k-for-2.12-pull-request:
  target/m68k: implement fcosh
  target/m68k: implement fsinh
  target/m68k: implement ftanh
  target/m68k: implement fatanh
  target/m68k: implement facos
  target/m68k: implement fasin
  target/m68k: implement fatan
  target/m68k: implement fsincos
  target/m68k: implement fcos
  target/m68k: implement fsin
  target/m68k: implement ftan

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-15 10:00:25 +00:00
Igor Mammedov
89d47c1927 tests: acpi: don't read all fields in test_acpi_fadt_table()
there is no point to read fields here but not actually
checking them so drop it and read only header + dsdt/facs
addresses since it's needed later to fetch that tables.

With this cleanup we can get rid of AcpiFadtDescriptorRev3/
ACPI_FADT_COMMON_DEF which have no users left.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:50 +02:00
Igor Mammedov
dd1b2037a3 virt_arm: acpi: reuse common build_fadt()
Extend generic build_fadt() to support rev5.1 FADT
and reuse it for 'virt' board, it would allow to
phase out usage of AcpiFadtDescriptorRev5_1 and
later ACPI_FADT_COMMON_DEF.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:50 +02:00
Igor Mammedov
8612f8bd9f acpi: move build_fadt() from i386 specific to generic ACPI source
It will be extended and reused by follow up patch for ARM target.

PS:
Since it's generic function now, don't patch FIRMWARE_CTRL, DSDT
fields if they don't point to tables since platform might not
provide them and use X_ variants instead if applicable.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:50 +02:00
Igor Mammedov
5d7a334f7c pc: acpi: use build_append_foo() API to construct FADT
build_append_foo() API doesn't need explicit endianness
conversions which eliminates a source of errors and
it makes build_fadt() look like declarative definition of
FADT table in ACPI spec, which makes it easy to review.
Also it allows easily extending FADT to support other
revisions which will be used by follow up patches
where build_fadt() will be reused for ARM target.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:50 +02:00
Igor Mammedov
937d1b5871 pc: acpi: isolate FADT specific data into AcpiFadtData structure
move FADT data initialization out of fadt_setup() into dedicated
init_fadt_data() that will set common for pc/q35 values in
AcpiFadtData structure and acpi_get_pm_info() will complement
it with pc/q35 specific values initialization.

That will allow to get rid of fadt_setup() and generalize
build_fadt() so it could be easily extended for rev5 and
reused by ARM target.

While at it also move facs/dsdt/xdsdt offsets from build_fadt()
arg list into AcpiFadtData, as they belong to the same dataset.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:50 +02:00
Igor Mammedov
f8eaef67a3 acpi: move ACPI_PORT_SMI_CMD define to header it belongs to
ACPI_PORT_SMI_CMD is alias for APM_CNT_IOPORT,
so make it really one instead of duplicating its value.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:50 +02:00
Igor Mammedov
d0384d9020 acpi: add build_append_gas() helper for Generic Address Structure
it will help to add Generic Address Structure to ACPI tables
without using packed C structures and avoid endianness
issues as API doesn't need an explicit conversion.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:50 +02:00
Igor Mammedov
b8e0f58982 acpi: reuse AcpiGenericAddress instead of Acpi20GenericAddress
Drop duplicate in form of Acpi20GenericAddress and reuse
AcpiGenericAddress.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:50 +02:00
Igor Mammedov
697155cdf1 pc: replace pm object initialization with one-liner in acpi_get_pm_info()
next patch will need it before it gets to piix4/lpc branches
that initializes 'obj' now.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:50 +02:00
Igor Mammedov
13b1881aac acpi: remove unused acpi-dsdt.aml
SeaBIOS blob which is currently shipped with QEMU
doesn't need acpi-dsdt.aml nor is able to use it
and code that loaded it in QEMU was removed by
(commit 9fb7aaaf4c "pc: drop external DSDT loading")
in 2013.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:49 +02:00
Jason Baron
9473939ed7 virtio-net: add linkspeed and duplex settings to virtio-net
Although linkspeed and duplex can be set in a linux guest via 'ethtool -s',
this requires custom ethtool commands for virtio-net by default.

Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows
the hypervisor to export a linkspeed and duplex setting. The user can
subsequently overwrite it later if desired via: 'ethtool -s'.

Linkspeed and duplex settings can be set as:
'-device virtio-net,speed=10000,duplex=full'

where speed is [0...INT_MAX], and duplex is ["half"|"full"].

Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtio-dev@lists.oasis-open.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:49 +02:00
Jason Baron
127833eeea virtio-net: use 64-bit values for feature flags
In prepartion for using some of the high order feature bits, make sure that
virtio-net uses 64-bit values everywhere.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtio-dev@lists.oasis-open.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-13 23:09:49 +02:00
Jason Baron
d3b7b37445 scripts/update-linux-headers: add ethtool.h and update to 4.16.0-rc4
A subsequent patch to add support for setting linkspeed/duplex in
virtio-net, requires a few definitions from ethtool.h, which ends up
pulling in kernel.h and sysinfo.h as well.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtio-dev@lists.oasis-open.org
2018-03-13 23:09:49 +02:00
Vladimir Sementsov-Ogievskiy
ac8bd439bb iotests: add dirty bitmap postcopy test
Test
- start two vms (vm_a, vm_b)

- in a
    - do writes from set A
    - do writes from set B
    - fix bitmap sha256
    - clear bitmap
    - do writes from set A
    - start migration
- than, in b
    - wait vm start (postcopy should start)
    - do writes from set B
    - check bitmap sha256

The test should verify postcopy migration and then merging with delta
(changes in target, during postcopy process).

Reduce supported cache modes to only 'none', because with cache on time
from source.STOP to target.RESUME is unpredictable and we can fail with
timout while waiting for target.RESUME.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180313180320.339796-14-vsementsov@virtuozzo.com
2018-03-13 17:06:32 -04:00
Vladimir Sementsov-Ogievskiy
33dac6f343 iotests: add dirty bitmap migration test
The test starts two vms (vm_a, vm_b), create dirty bitmap in
the first one, do several writes to corresponding device and
then migrate vm_a to vm_b.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180313180320.339796-13-vsementsov@virtuozzo.com
2018-03-13 17:06:26 -04:00
Vladimir Sementsov-Ogievskiy
b35ebdf076 migration: add postcopy migration of dirty bitmaps
Postcopy migration of dirty bitmaps. Only named dirty bitmaps are migrated.

If destination qemu is already containing a dirty bitmap with the same name
as a migrated bitmap (for the same node), then, if their granularities are
the same the migration will be done, otherwise the error will be generated.

If destination qemu doesn't contain such bitmap it will be created.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20180313180320.339796-12-vsementsov@virtuozzo.com
[Changed '+' to '*' as per list discussion. --js]
Signed-off-by: John Snow <jsnow@redhat.com>
2018-03-13 17:06:09 -04:00
Vladimir Sementsov-Ogievskiy
16b0fd3252 migration: allow qmp command migrate-start-postcopy for any postcopy
Allow migrate-start-postcopy for any postcopy type

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20180313180320.339796-11-vsementsov@virtuozzo.com
2018-03-13 17:06:03 -04:00
Vladimir Sementsov-Ogievskiy
c865d84872 migration: add is_active_iterate handler
Only-postcopy savevm states (dirty-bitmap) don't need live iteration, so
to disable them and stop transporting empty sections there is a new
savevm handler.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180313180320.339796-10-vsementsov@virtuozzo.com
2018-03-13 17:05:58 -04:00
Vladimir Sementsov-Ogievskiy
f0d64cb729 migration/qemu-file: add qemu_put_counted_string()
Add function opposite to qemu_get_counted_string.
qemu_put_counted_string puts one-byte length of the string (string
should not be longer than 255 characters), and then it puts the string,
without last zero byte.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180313180320.339796-9-vsementsov@virtuozzo.com
2018-03-13 17:05:55 -04:00
Vladimir Sementsov-Ogievskiy
dd6bb91450 migration: include migrate_dirty_bitmaps in migrate_postcopy
Enable postcopy if dirty bitmap migration is enabled.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180313180320.339796-8-vsementsov@virtuozzo.com
2018-03-13 17:05:51 -04:00
Vladimir Sementsov-Ogievskiy
55efc8c2ff qapi: add dirty-bitmaps migration capability
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180313180320.339796-7-vsementsov@virtuozzo.com
2018-03-13 17:05:45 -04:00
Vladimir Sementsov-Ogievskiy
4799502640 migration: introduce postcopy-only pending
There would be savevm states (dirty-bitmap) which can migrate only in
postcopy stage. The corresponding pending is introduced here.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20180313180320.339796-6-vsementsov@virtuozzo.com
2018-03-13 17:05:41 -04:00
Vladimir Sementsov-Ogievskiy
4f43e9535b dirty-bitmap: add locked state
Add special state, when qmp operations on the bitmap are disabled.
It is needed during bitmap migration. "Frozen" state is not
appropriate here, because it looks like bitmap is unchanged.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180207155837.92351-5-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2018-03-13 17:05:00 -04:00
Vladimir Sementsov-Ogievskiy
044ee8e143 block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180207155837.92351-4-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2018-03-13 17:04:54 -04:00
Vladimir Sementsov-Ogievskiy
604ab74bb5 block/dirty-bitmap: fix locking in bdrv_reclaim_dirty_bitmap
Like other setters here these functions should take a lock.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180207155837.92351-3-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2018-03-13 17:04:48 -04:00
Vladimir Sementsov-Ogievskiy
65374c1aa6 iotests: new test 209 for NBD BLOCK_STATUS
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-9-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:44:09 -05:00
Vladimir Sementsov-Ogievskiy
ef6e92280e iotests: add file_path helper
Simple way to have auto generated filenames with auto cleanup. Like
FilePath but without using 'with' statement and without additional
indentation of the whole test.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-8-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:44:09 -05:00
Vladimir Sementsov-Ogievskiy
02f3a91199 iotests.py: tiny refactor: move system imports up
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180312152126.286890-7-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:44:09 -05:00
Vladimir Sementsov-Ogievskiy
78a33ab587 nbd: BLOCK_STATUS for standard get_block_status function: client part
Minimal realization: only one extent in server answer is supported.
Flag NBD_CMD_FLAG_REQ_ONE is used to force this behavior.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks, fix min_block check and 32-bit cap, use -1
instead of errno on failure in nbd_negotiate_simple_meta_context,
ensure that block status makes progress on success]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:43:48 -05:00
Vladimir Sementsov-Ogievskiy
1e98efc029 block/nbd-client: save first fatal error in nbd_iter_error
It is ok, that fatal error hides previous not fatal, but hiding
first fatal error is a bad feature.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180312152126.286890-5-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:56 -05:00
Vladimir Sementsov-Ogievskiy
e7b1948d51 nbd: BLOCK_STATUS for standard get_block_status function: server part
Minimal realization: only one extent in server answer is supported.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: tweak whitespace, move constant from .h to .c, improve
logic of check_meta_export_name, simplify nbd_negotiate_options
by doing more in nbd_negotiate_meta_queries]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:56 -05:00
Vladimir Sementsov-Ogievskiy
12296459f4 nbd/server: add nbd_read_opt_name helper
Add helper to read name in format:

  uint32 len       (<= NBD_MAX_NAME_SIZE)
  len bytes string (not 0-terminated)

The helper will be reused in following patch.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar fixes, actually check error]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy
2e425fd568 nbd/server: add nbd_opt_invalid helper
NBD_REP_ERR_INVALID is often parameter to nbd_opt_drop and it would
be used more in following patches. So, let's add a helper.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180312152126.286890-2-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Stefan Hajnoczi
44a8174e0a iotests: add 208 nbd-server + blockdev-snapshot-sync test case
This test case adds an NBD server export and then invokes
blockdev-snapshot-sync, which changes the BlockDriverState node that the
NBD server's BlockBackend points to.  This is an interesting scenario to
test and exercises the code path fixed by the previous commit.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180306204819.11266-3-stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Stefan Hajnoczi
d03654eacd block: let blk_add/remove_aio_context_notifier() tolerate BDS changes
Commit 2019ba0a01 ("block: Add AioContextNotifier functions to BB")
added blk_add/remove_aio_context_notifier() and implemented them by
passing through the bdrv_*() equivalent.

This doesn't work across bdrv_append(), which detaches child->bs and
re-attaches it to a new BlockDriverState.  When
blk_remove_aio_context_notifier() is called we will access the new BDS
instead of the one where the notifier was added!

>From the point of view of the blk_*() API user, changes to the root BDS
should be transparent.

This patch maintains a list of AioContext notifiers in BlockBackend and
adds/removes them from the BlockDriverState as needed.

Reported-by: Stefano Panella <spanella@gmail.com>
Cc: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180306204819.11266-2-stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Eric Blake
65529782f8 nbd/server: Honor FUA request on NBD_CMD_TRIM
The NBD spec states that since trim requests can affect disk contents,
then they should allow for FUA semantics just like writes for ensuring
the disk has settled before returning.  As bdrv_[co_]pdiscard() does
not support a flags argument, we can't pass FUA down the block layer
stack, and must therefore emulate it with a flush at the NBD layer.

Note that in all reality, generic well-behaved clients will never
send TRIM+FUA (in fact, qemu as a client never does, and we have no
intention to plumb flags into bdrv_pdiscard).  This is because the
NBD protocol states that it is unspecified to READ a trimmed area
(you might read stale data, all zeroes, or even random unrelated
data) without first rewriting it, and even the experimental
BLOCK_STATUS extension states that TRIM need not affect reported
status.  Thus, in the general case, a client cannot tell the
difference between an arbitrary server that ignores TRIM, a server
that had a power outage without flushing to disk, and a server that
actually affected the disk before returning; so waiting for the
trim actions to flush to disk makes little sense.  However, for a
specific client and server pair, where the client knows the server
treats TRIM'd areas as guaranteed reads-zero, waiting for a flush
makes sense, hence why the protocol documents that FUA is valid on
trim.  So, even though the NBD protocol doesn't have a way for the
server to advertise what effects (if any) TRIM will actually have,
and thus any client that relies on specific effects is probably
in error, we can at least support a client that requests TRIM+FUA.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180307225732.155835-1-eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy
6f302e6093 nbd/server: refactor nbd_trip: split out nbd_handle_request
Split out request handling logic.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: touch up blank line placement]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy
6a4175997b nbd/server: refactor nbd_trip: cmd_read and generic reply
nbd_trip has difficult logic when sending replies: it tries to use one
code path for all replies. It is ok for simple replies, but is not
comfortable for structured replies. Also, two types of error (and
corresponding messages in local_err) - fatal (leading to disconnect)
and not-fatal (just to be sent to the client) are difficult to follow.

To make things a bit clearer, the following is done:
 - split CMD_READ logic to separate function. It is the most difficult
   command for now, and it is definitely cramped inside nbd_trip. Also,
   it is difficult to follow CMD_READ logic, shared between
   "case NBD_CMD_READ" and "if"s under "reply:" label.
 - create separate helper function nbd_send_generic_reply() and use it
   both in new nbd_do_cmd_read and for other commands in nbd_trip instead
   of common code-path under "reply:" label in nbd_trip. The helper
   supports an error message, so logic with local_err in nbd_trip is
   simplified.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks and blank line placement]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy
a0d7ce20a9 nbd/server: fix: check client->closing before sending reply
Since the unchanged code has just set client->recv_coroutine to
NULL before calling nbd_client_receive_next_request(), we are
spawning a new coroutine unconditionally, but the first thing
that coroutine will do is check for client->closing, making it
a no-op if we have already detected that the client is going
away.  Furthermore, for any error other than EIO (where we
disconnect, which itself sets client->closing), if the client
has already gone away, we'll probably encounter EIO later
in the function and attempt disconnect at that point.  Logically,
as soon as we know the connection is closing, there is no need
to try a likely-to-fail a response or spawn a no-op coroutine.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: squash in further reordering: hoist check before spawning
next coroutine, and document rationale in commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy
37e02aebf8 nbd/server: fix sparse read
In case of io error in nbd_co_send_sparse_read we should not
"goto reply:", as it was a fatal error and the common behavior
is to disconnect in this case. We should not try to send the
client an additional error reply, since we already hit a
channel-io error on our previous attempt to send one.

Fix this by handling block-status error in nbd_co_send_sparse_read,
so nbd_co_send_sparse_read fails only on io error. Then just skip
common "reply:" code path in nbd_trip.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-3-vsementsov@virtuozzo.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy
60ace2bacf nbd/server: move nbd_co_send_structured_error up
To be reused in nbd_co_send_sparse_read() in the following patch.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Eric Blake
6eba9f01bb iotests: Fix stuck NBD process on 33
Commit afe35cde6 added additional actions to test 33, but forgot
to reset the image between tests.  As a result, './check -nbd 33'
fails because the qemu-nbd process from the first half is still
occupying the port, preventing the second half from starting a
new qemu-nbd process.  Worse, the failure leaves a rogue qemu-nbd
process behind even after the test fails, which causes knock-on
failures to later tests that also want to start qemu-nbd.

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180312211156.452139-1-eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy
e73a265e9f block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()
Enabling bitmap successor is necessary to enable successors of bitmaps
being migrated before target vm start.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20180207155837.92351-2-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2018-03-13 15:33:59 -04:00
Max Filippov
bf9c3a5a96 linux-user: drop unused target_msync function
target_msync is not used, remove its declaration and implementation.

Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-13 11:30:22 -07:00
Max Filippov
72d75bb316 linux-user: fix target_mprotect/target_munmap error return values
target_mprotect/target_munmap return value goes through get_errno at the
call site, thus the functions must either set errno to host error code
and return -1 or return negative guest error code. Do the latter.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-13 11:30:22 -07:00
Max Filippov
21b869a321 linux-user: fix assertion in shmdt
shmdt fails to call mmap_lock/mmap_unlock around page_set_flags,
resulting in the following assertion:
  page_set_flags: Assertion `have_mmap_lock()' failed.

Wrap shmdt internals into mmap_lock/mmap_unlock.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-13 11:30:22 -07:00
Max Filippov
e530acd7de linux-user: fix mmap/munmap/mprotect/mremap/shmat
In linux-user QEMU that runs for a target with TARGET_ABI_BITS bigger
than L1_MAP_ADDR_SPACE_BITS an assertion in page_set_flags fires when
mmap, munmap, mprotect, mremap or shmat is called for an address outside
the guest address space. mmap and mprotect should return ENOMEM in such
case.

Change definition of GUEST_ADDR_MAX to always be the last valid guest
address. Account for this change in open_self_maps.
Add macro guest_addr_valid that verifies if the guest address is valid.
Add function guest_range_valid that verifies if address range is within
guest address space and does not wrap around. Use that macro in
mmap/munmap/mprotect/mremap/shmat for error checking.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-13 11:30:22 -07:00
Max Filippov
9fb40342d4 target/xtensa: support MTTCG
- emit TCG barriers for MEMW, EXTW, S32RI and L32AI;
- do atomic_cmpxchg_i32 for S32C1I.

Cc: Emilio G. Cota <cota@braap.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-13 11:30:22 -07:00
Max Filippov
1b7b26e474 target/xtensa: use correct number of registers in gdbstub
System emulation should provide access to all registers, userspace
emulation should only provide access to unprivileged registers.
Record register flags from GDB register map definition, calculate both
num_regs and num_core_regs if either is zero. Use num_regs in system
emulation, num_core_regs in userspace emulation gdbstub.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-13 11:30:22 -07:00
Max Filippov
b9317a2a69 target/xtensa: mark register windows in the dump
Add arrows that mark beginning of register windows and position of the
current window in the windowed register file.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-13 11:30:21 -07:00
Max Filippov
b55b1afda9 target/xtensa: dump correct physical registers
xtensa_cpu_dump_state outputs CPU physical registers as is, without
synchronization from current window. That may result in different values
printed for the current window and corresponding physical registers.
Synchronize physical registers from window before dumping.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-03-13 11:30:21 -07:00
Peter Maydell
3a2e46ae1d Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Tue 13 Mar 2018 12:28:21 GMT
# gpg:                using RSA key BDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block: include original filename when reporting invalid URIs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-13 18:29:28 +00:00
Daniel P. Berrange
0935700f85 char: allow passing pre-opened socket file descriptor at startup
When starting QEMU management apps will usually setup a monitor socket, and
then open it immediately after startup. If not using QEMU's own -daemonize
arg, this process can be troublesome to handle correctly. The mgmt app will
need to repeatedly call connect() until it succeeds, because it does not
know when QEMU has created the listener socket. If can't retry connect()
forever though, because an error might have caused QEMU to exit before it
even creates the monitor.

The obvious way to fix this kind of problem is to just pass in a pre-opened
socket file descriptor for the QEMU monitor to listen on. The management
app can now immediately call connect() just once. If connect() fails it
knows that QEMU has exited with an error.

The SocketAddress(Legacy) structs allow for FD passing via the monitor, and
now via inherited file descriptors from the process that spawned QEMU. The
final missing piece is adding a 'fd' parameter in the socket chardev
options.

This allows both HMP usage, pass any FD number with SCM_RIGHTS, then
running HMP commands:

   getfd myfd
   chardev-add socket,fd=myfd

Note that numeric FDs cannot be referenced directly in HMP, only named FDs.

And also CLI usage, by leak FD 3 from parent by clearing O_CLOEXEC, then
spawning QEMU with

  -chardev socket,fd=3,id=mon
  -mon chardev=mon,mode=control

Note that named FDs cannot be referenced in CLI args, only numeric FDs.

We do not wire this up in the legacy chardev syntax, so you cannot use FD
passing with '-qmp', you must use the modern '-mon' + '-chardev' pair.

When passing pre-opened FDs there is a restriction on use of TLS encryption.
It can be used on a server socket chardev, but cannot be used for a client
socket chardev. This is because when validating a server's certificate, the
client needs to have a hostname available to match against the certificate
identity.

An illustrative example of usage is:

  #!/usr/bin/perl

  use IO::Socket::UNIX;
  use Fcntl;

  unlink "/tmp/qmp";
  my $srv = IO::Socket::UNIX->new(
    Type => SOCK_STREAM(),
    Local => "/tmp/qmp",
    Listen => 1,
  );

  my $flags = fcntl $srv, F_GETFD, 0;
  fcntl $srv, F_SETFD, $flags & ~FD_CLOEXEC;

  my $fd = $srv->fileno();

  exec "qemu-system-x86_64", \
      "-chardev", "socket,fd=$fd,server,nowait,id=mon", \
      "-mon", "chardev=mon,mode=control";

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2018-03-13 18:06:06 +00:00
Daniel P. Berrange
9bb4060c99 char: refactor parsing of socket address information
To prepare for handling more address types, refactor the parsing of
socket address information to make it more robust and extensible.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2018-03-13 18:06:06 +00:00
Daniel P. Berrange
1723d6b1cf sockets: allow SocketAddress 'fd' to reference numeric file descriptors
The SocketAddress 'fd' kind accepts the name of a file descriptor passed
to the monitor with the 'getfd' command. This makes it impossible to use
the 'fd' kind in cases where a monitor is not available. This can apply in
handling command line argv at startup, or simply if internal code wants to
use SocketAddress and pass a numeric FD it has acquired from elsewhere.

Fortunately the 'getfd' command mandated that the FD names must not start
with a leading digit. We can thus safely extend semantics of the
SocketAddress 'fd' kind, to allow a purely numeric name to reference an
file descriptor that QEMU already has open. There will be restrictions on
when each kind can be used.

In codepaths where we are handling a monitor command (ie cur_mon != NULL),
we will only support use of named file descriptors as before. Use of FD
numbers is still not permitted for monitor commands.

In codepaths where we are not handling a monitor command (ie cur_mon ==
NULL), we will not support named file descriptors. Instead we can reference
FD numers explicitly. This allows the app spawning QEMU to intentionally
"leak" a pre-opened socket to QEMU and reference that in a SocketAddress
definition, or for code inside QEMU to pass pre-opened FDs around.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2018-03-13 18:06:06 +00:00
Daniel P. Berrange
30bdb3c56d sockets: check that the named file descriptor is a socket
The SocketAddress struct has an "fd" type, which references the name of a
file descriptor passed over the monitor using the "getfd" command. We
currently blindly assume the FD is a socket, which can lead to hard to
diagnose errors later. This adds an explicit check that the FD is actually
a socket to improve the error diagnosis.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2018-03-13 18:06:06 +00:00
Daniel P. Berrange
58dc31f1a7 sockets: move fd_is_socket() into common sockets code
The fd_is_socket() helper method is useful in a few places, so put it in
the common sockets code. Make the code more compact while moving it.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2018-03-13 18:06:06 +00:00
Daniel P. Berrange
abd983c0e0 sockets: strengthen test suite IP protocol availability checks
Instead of just checking whether it is possible to bind() on a socket, also
check that we can successfully connect() to the socket we bound to. This
more closely replicates the level of functionality that tests will actually
use.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2018-03-13 18:06:06 +00:00
Daniel P. Berrange
9b589ffb12 sockets: pull code for testing IP availability out of specific test
The test-io-channel-socket.c file has some useful helper functions for
checking if a specific IP protocol is available. Other tests need to
perform similar kinds of checks to avoid running tests that will fail
due to missing IP protocols.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2018-03-13 18:06:06 +00:00
Daniel P. Berrange
473a2a331e cutils: add qemu_strtoi & qemu_strtoui parsers for int/unsigned int types
There are qemu_strtoNN functions for various sized integers. This adds two
more for plain int & unsigned int types, with suitable range checking.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2018-03-13 18:06:05 +00:00
Alexey Kardashevskiy
fcad0d2121 ppc/spapr, vfio: Turn off MSIX emulation for VFIO devices
This adds a possibility for the platform to tell VFIO not to emulate MSIX
so MMIO memory regions do not get split into chunks in flatview and
the entire page can be registered as a KVM memory slot and make direct
MMIO access possible for the guest.

This enables the entire MSIX BAR mapping to the guest for the pseries
platform in order to achieve the maximum MMIO preformance for certain
devices.

Tested on:
LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:31 -06:00
Alexey Kardashevskiy
ae0215b2bb vfio-pci: Allow mmap of MSIX BAR
At the moment we unconditionally avoid mapping MSIX data of a BAR and
emulate MSIX table in QEMU. However it is 1) not always necessary as
a platform may provide a paravirt interface for MSIX configuration;
2) can affect the speed of MMIO access by emulating them in QEMU when
frequently accessed registers share same system page with MSIX data,
this is particularly a problem for systems with the page size bigger
than 4KB.

A new capability - VFIO_REGION_INFO_CAP_MSIX_MAPPABLE - has been added
to the kernel [1] which tells the userspace that mapping of the MSIX data
is possible now. This makes use of it so from now on QEMU tries mapping
the entire BAR as a whole and emulate MSIX on top of that.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a32295c612c57990d17fb0f41e7134394b2f35f6

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:31 -06:00
Alexey Kardashevskiy
567b5b309a vfio/pci: Relax DMA map errors for MMIO regions
At the moment if vfio_memory_listener is registered in the system memory
address space, it maps/unmaps every RAM memory region for DMA.
It expects system page size aligned memory sections so vfio_dma_map
would not fail and so far this has been the case. A mapping failure
would be fatal. A side effect of such behavior is that some MMIO pages
would not be mapped silently.

However we are going to change MSIX BAR handling so we will end having
non-aligned sections in vfio_memory_listener (more details is in
the next patch) and vfio_dma_map will exit QEMU.

In order to avoid fatal failures on what previously was not a failure and
was just silently ignored, this checks the section alignment to
the smallest supported IOMMU page size and prints an error if not aligned;
it also prints an error if vfio_dma_map failed despite the page size check.
Both errors are not fatal; only MMIO RAM regions are checked
(aka "RAM device" regions).

If the amount of errors printed is overwhelming, the MSIX relocation
could be used to avoid excessive error output.

This is unlikely to cause any behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: Fix Int128 bit ops]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:30 -06:00
Gerd Hoffmann
8b818e059b vfio/display: adding dmabuf support
Wire up dmabuf-based display.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:30 -06:00
Gerd Hoffmann
00195ba710 vfio/display: adding region support
Wire up region-based display.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed By: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:30 -06:00
Gerd Hoffmann
a9994687cb vfio/display: core & wireup
Infrastructure for display support.  Must be enabled
using 'display' property.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed By: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:29 -06:00
Gerd Hoffmann
92f86bff08 vfio/common: cleanup in vfio_region_finalize
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:29 -06:00
Gerd Hoffmann
fc70514ccf secondary-vga: properly close QemuConsole on unplug
Using the new graphic_console_close() function.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:29 -06:00
Gerd Hoffmann
9588d67e72 console: minimal hotplug suport
This patch allows to unbind devices from QemuConsoles, using the new
graphic_console_close() function.  The QemuConsole will show a static
display then, saying the device was unplugged.  When re-plugging a
display later on the QemuConsole will be reused.

Eventually we will allocate and release QemuConsoles dynamically at some
point in the future, that'll need more infrastructure though to notify
user interfaces (gtk, sdl, spice, ...) about QemuConsoles coming and
going.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:29 -06:00
Gerd Hoffmann
a5127bd73f ui/pixman: add qemu_drm_format_to_pixman()
Map drm fourcc codes to pixman formats.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:28 -06:00
Gerd Hoffmann
8e8ee8509a standard-headers: add drm/drm_fourcc.h
So we can use the drm fourcc codes without a dependency on libdrm-devel.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:28 -06:00
Brijesh Singh
9f75079498 sev/i386: add sev_get_capabilities()
The function can be used to get the current SEV capabilities.
The capabilities include platform diffie-hellman key (pdh) and certificate
chain. The key can be provided to the external entities which wants to
establish a trusted channel between SEV firmware and guest owner.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:06 +01:00
Brijesh Singh
31dd67f684 sev/i386: qmp: add query-sev-capabilities command
The command can be used by libvirt to query the SEV capabilities.

Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:05 +01:00
Brijesh Singh
1b6a034f29 sev/i386: qmp: add query-sev-launch-measure command
The command can be used by libvirt to retrieve the measurement of SEV guest.
This measurement is a signature of the memory contents that was encrypted
through the LAUNCH_UPDATE_DATA.

Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:05 +01:00
Brijesh Singh
6303631467 sev/i386: hmp: add 'info sev' command
The command can be used to show the SEV information when memory
encryption is enabled on AMD platform.

Cc: Eric Blake <eblake@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:05 +01:00
Brijesh Singh
6cb8f2a663 cpu/i386: populate CPUID 0x8000_001F when SEV is active
When SEV is enabled, CPUID 0x8000_001F should provide additional
information regarding the feature (such as which page table bit is used
to mark the pages as encrypted etc).

The details for memory encryption CPUID is available in AMD APM
(https://support.amd.com/TechDocs/24594.pdf) Section E.4.17

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:05 +01:00
Brijesh Singh
8fa4466d77 sev/i386: add migration blocker
SEV guest migration is not implemented yet.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:05 +01:00
Brijesh Singh
5dd0df7e74 sev/i386: finalize the SEV guest launch flow
SEV launch flow requires us to issue LAUNCH_FINISH command before guest
is ready to run.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:05 +01:00
Brijesh Singh
c6c89c976d sev/i386: add support to LAUNCH_MEASURE command
During machine creation we encrypted the guest bios image, the
LAUNCH_MEASURE command can be used to retrieve the measurement of
the encrypted memory region. This measurement is a signature of
the memory contents that can be sent to the guest owner as an
attestation that the memory was encrypted correctly by the firmware.
VM management tools like libvirt can query the measurement using
query-sev-launch-measure QMP command.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:05 +01:00
Brijesh Singh
952e0668c4 target/i386: encrypt bios rom
SEV requires that guest bios must be encrypted before booting the guest.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:05 +01:00
Brijesh Singh
b738d6300d sev/i386: add command to encrypt guest memory region
The KVM_SEV_LAUNCH_UPDATE_DATA command is used to encrypt a guest memory
region using the VM Encryption Key created using LAUNCH_START.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:36:00 +01:00
Brijesh Singh
620fd55c24 sev/i386: add command to create launch memory encryption context
The KVM_SEV_LAUNCH_START command creates a new VM encryption key (VEK).
The encryption key created with the command will be used for encrypting
the bootstrap images (such as guest bios).

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:35:59 +01:00
Brijesh Singh
2b308e4431 sev/i386: register the guest memory range which may contain encrypted data
When SEV is enabled, the hardware encryption engine uses a tweak such
that the two identical plaintext at different location will have a
different ciphertexts. So swapping or moving a ciphertexts of two guest
pages will not result in plaintexts being swapped. Hence relocating
a physical backing pages of the SEV guest will require some additional
steps in KVM driver. The KVM_MEMORY_ENCRYPT_{UN,}REG_REGION ioctl can be
used to register/unregister the guest memory region which may contain the
encrypted data. KVM driver will internally handle the relocating physical
backing pages of registered memory regions.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:35:41 +01:00
Brijesh Singh
d8575c6c02 sev/i386: add command to initialize the memory encryption context
When memory encryption is enabled, KVM_SEV_INIT command is used to
initialize the platform. The command loads the SEV related persistent
data from non-volatile storage and initializes the platform context.
This command should be first issued before invoking any other guest
commands provided by the SEV firmware.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 17:35:16 +01:00
Daniel P. Berrange
63bab2b696 char: don't silently skip tn3270 protocol init when TLS is enabled
Even if common tn3270 implementations do not support TLS, it is trivial to
have them proxied over a proxy like stunnel which adds TLS at the sockets
layer. We should thus not silently skip tn3270 protocol initialization
when TLS is enabled.

Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2018-03-13 16:32:15 +00:00
Peter Maydell
026aaf47c0 Merge remote-tracking branch 'remotes/ehabkost/tags/python-next-pull-request' into staging
Python queue, 2018-03-12

# gpg: Signature made Mon 12 Mar 2018 22:10:36 GMT
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/python-next-pull-request:
  device-crash-test: Use 'python' binary
  qmp.py: Encode json data before sending
  qemu.py: Use items() instead of iteritems()
  device-crash-test: New known crashes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-13 16:26:44 +00:00
Alex Bennée
0b438fa627 .travis.yml: add --disable-user with the rest of the disables
As all the disabled features only affect system emulation we might as
well disable user mode to save compile time.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-13 16:22:23 +00:00
Alex Bennée
ad20a090a5 .travis.yml: split default config into system and user
As the build times have risen we keep timing out. Split the default
config into system and user builds.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-13 16:22:20 +00:00
Alex Bennée
52dd196752 .travis.yml: drop setting default log output
The log backend is the default one, we don't need to explicitly set it.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-13 16:22:16 +00:00
Laurent Vivier
02f9124ebe target/m68k: implement fcosh
Using a local m68k  floatx80_cosh()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-12-laurent@vivier.eu>
2018-03-13 16:35:05 +01:00
Laurent Vivier
eee6b892a6 target/m68k: implement fsinh
Using a local m68k floatx80_sinh()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-11-laurent@vivier.eu>
2018-03-13 16:34:58 +01:00
Laurent Vivier
9937b02965 target/m68k: implement ftanh
Using local m68k floatx80_tanh() and floatx80_etoxm1()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-10-laurent@vivier.eu>
2018-03-13 16:34:51 +01:00
Laurent Vivier
e3655afa13 target/m68k: implement fatanh
Using a local m68k floatx80_atanh()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-9-laurent@vivier.eu>
2018-03-13 16:34:42 +01:00
Laurent Vivier
c84813b807 target/m68k: implement facos
Using a local m68k floatx80_acos()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-8-laurent@vivier.eu>
2018-03-13 16:34:33 +01:00
Laurent Vivier
bc20b34e03 target/m68k: implement fasin
Using a local m68k floatx80_asin()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-7-laurent@vivier.eu>
2018-03-13 16:34:25 +01:00
Laurent Vivier
8c992abc89 target/m68k: implement fatan
Using a local m68k floatx80_atan()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-6-laurent@vivier.eu>
2018-03-13 16:34:16 +01:00
Laurent Vivier
47446c9ce3 target/m68k: implement fsincos
using floatx80_sin() and floatx80_cos()

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-5-laurent@vivier.eu>
2018-03-13 16:34:09 +01:00
Laurent Vivier
68d0ed3786 target/m68k: implement fcos
Using a local m68k floatx80_cos()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-4-laurent@vivier.eu>
2018-03-13 16:34:02 +01:00
Laurent Vivier
5add1ac42f target/m68k: implement fsin
Using a local m68k floatx80_sin()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-3-laurent@vivier.eu>
2018-03-13 16:33:54 +01:00
Laurent Vivier
273401809c target/m68k: implement ftan
Using a local m68k floatx80_tan()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180312202728.23790-2-laurent@vivier.eu>
2018-03-13 16:33:34 +01:00
Luke Shumaker
8c17d862b3 linux-user: init_guest_space: Add a comment about search strategy
Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-10-lukeshu@lukeshu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-13 15:04:01 +01:00
Peter Maydell
59667bb167 Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging
x86 queue, 2018-03-12

* Intel Processor Trace support
* KVM_HINTS_DEDICATED

# gpg: Signature made Mon 12 Mar 2018 19:58:39 GMT
# gpg:                using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-next-pull-request:
  i386: Add support to get/set/migrate Intel Processor Trace feature
  i386: Add Intel Processor Trace feature support
  target-i386: add KVM_HINTS_DEDICATED performance hint

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-13 14:02:47 +00:00
Luke Shumaker
aac362e46f linux-user: init_guest_space: Don't try to align if we'll reject it
If the ensure-alignment code gets triggered, then the
"if (host_start && real_start != current_start)" check will always trigger,
so save 2 syscalls and put that check first.

Note that we can't just switch to using MAP_FIXED for that check, because
then we couldn't differentiate between a failure because "there isn't
enough space" and "there isn't enough space *here*".

Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-9-lukeshu@lukeshu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-13 15:02:20 +01:00
Luke Shumaker
7ad75eea86 linux-user: init_guest_space: Clean up control flow a bit
Instead of doing

        if (check1) {
            if (check2) {
               success;
            }
        }

        retry;

Do a clearer

        if (!check1) {
           goto try_again;
        }

        if (!check2) {
           goto try_again;
        }

        success;

    try_again:
        retry;

Besides being clearer, this makes it easier to insert more checks that
need to trigger a retry on check failure, or rearrange them, or anything
like that.

Because some indentation is changing, "ignore space change" may be useful
for viewing this patch.

Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-8-lukeshu@lukeshu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[lv: modified to try again fi valid == 0, not valid == -1 (error case)]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-13 15:01:14 +01:00
Daniel P. Berrangé
44acd46f60 block: include original filename when reporting invalid URIs
Consider passing a JSON based block driver to "qemu-img commit"

$ qemu-img commit 'json:{"driver":"qcow2","file":{"driver":"gluster",\
                  "volume":"gv0","path":"sn1.qcow2",
                  "server":[{"type":\
		  "tcp","host":"10.73.199.197","port":"24007"}]},}'

Currently it will commit the content and then report an incredibly
useless error message when trying to re-open the committed image:

  qemu-img: invalid URI
  Usage: file=gluster[+transport]://[host[:port]]volume/path[?socket=...][,file.debug=N][,file.logfile=/path/filename.log]

With this fix we get:

  qemu-img: invalid URI json:{"server.0.host": "10.73.199.197",
      "driver": "gluster", "path": "luks.qcow2", "server.0.type":
      "tcp", "server.0.port": "24007", "volume": "gv0"}

Of course the root cause problem still exists, but now we know
what actually needs fixing.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180206105204.14817-1-berrange@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2018-03-13 08:06:55 -04:00
Peter Maydell
22ef7ba8e8 Merge remote-tracking branch 'remotes/famz/tags/staging-pull-request' into staging
docker patches

# gpg: Signature made Mon 12 Mar 2018 17:25:57 GMT
# gpg:                using RSA key CA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/staging-pull-request:
  tests: make docker-test-debug@fedora run sanitizers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-13 11:42:45 +00:00
Brijesh Singh
9d8ad11429 include: add psp-sev.h header file
The header file provide the ioctl command and structure to communicate
with /dev/sev device.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
2018-03-13 12:04:03 +01:00
Brijesh Singh
08a161fd35 sev/i386: qmp: add query-sev command
The QMP query command can used to retrieve the SEV information when
memory encryption is enabled on AMD platform.

Cc: Eric Blake <eblake@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 12:04:03 +01:00
Brijesh Singh
a9b4942f48 target/i386: add Secure Encrypted Virtualization (SEV) object
Add a new memory encryption object 'sev-guest'. The object will be used
to create encrypted VMs on AMD EPYC CPU. The object provides the properties
to pass guest owner's public Diffie-hellman key, guest policy and session
information required to create the memory encryption context within the
SEV firmware.

e.g to launch SEV guest
 # $QEMU \
    -object sev-guest,id=sev0 \
    -machine ....,memory-encryption=sev0

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 12:04:03 +01:00
Brijesh Singh
54e8953967 kvm: introduce memory encryption APIs
Inorder to integerate the Secure Encryption Virtualization (SEV) support
add few high-level memory encryption APIs which can be used for encrypting
the guest memory region.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 12:04:03 +01:00
Brijesh Singh
b20e37801f kvm: add memory encryption context
Split from a patch by Brijesh Singh (brijesh.singh@amd.com).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
2018-03-13 12:04:03 +01:00
Brijesh Singh
9b02f7bf85 docs: add AMD Secure Encrypted Virtualization (SEV)
Create a documentation entry to describe the AMD Secure Encrypted
Virtualization (SEV) feature.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 12:04:03 +01:00
Brijesh Singh
db5881949f machine: add memory-encryption option
When CPU supports memory encryption feature, the property can be used to
specify the encryption object to use when launching an encrypted guest.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 12:04:03 +01:00
Liran Alon
e13713db5b KVM: x86: Add support for save/load MSR_SMI_COUNT
This MSR returns the number of #SMIs that occurred on
CPU since boot.

KVM commit 52797bf9a875 ("KVM: x86: Add emulation of MSR_SMI_COUNT")
introduced support for emulating this MSR.

This commit adds support for QEMU to save/load this
MSR for migration purposes.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 12:04:03 +01:00
Paolo Bonzini
9f2d175db5 update Linux headers to 4.16-rc5
Note that VIRTIO_GPU_CAPSET_VIRGL2 was added manually so it has to be added
manually after re-running scripts/update-linux-headers.sh.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-13 12:02:32 +01:00
Peter Maydell
834eddf22e Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Mon 12 Mar 2018 16:01:16 GMT
# gpg:                using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  block: make BDRV_POLL_WHILE() re-entrancy safe

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-13 10:49:02 +00:00
Peter Maydell
73988d529e Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Mon 12 Mar 2018 15:59:54 GMT
# gpg:                using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: only permit standard C types and fixed size integer types
  trace: remove use of QEMU specific types from trace probes
  trace: include filename when printing parser error messages
  simpletrace: fix timestamp argument type
  log-for-trace.h: Split out parts of log.h used by trace.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-13 09:43:44 +00:00
Eduardo Habkost
006cc55835 device-crash-test: Use 'python' binary
Now the script works with Python 3, so we can use the 'python'
binary provided by the system.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180312185503.5746-4-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-12 19:10:16 -03:00
Eduardo Habkost
7e2b34f26e qmp.py: Encode json data before sending
On Python 3, json.dumps() return a str object, which can't be
sent directly through a socket and must be encoded into a bytes
object.  Use .encode('utf-8'), which will work on both Python 2
and Python 3.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180312185503.5746-3-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-12 19:10:16 -03:00
Eduardo Habkost
fb2e1cc6c4 qemu.py: Use items() instead of iteritems()
items() is less efficient on Python 2.x, but makes the code work
on both Python 2 and Python 3.

Cc: Lukáš Doktor <ldoktor@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Cleber Rosa <crosa@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180312185503.5746-2-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-12 19:10:16 -03:00
Eduardo Habkost
44d815e83a device-crash-test: New known crashes
We are not running the script on "make check" yet, and additional
bugs were introduced recently in the tree.

Whitelist the new crashes while we investigate, to allow us to
run device-crash-test on "make check" as soon as possible to
prevent new bugs.

Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Cc: John Snow <jsnow@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180309202827.12085-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-12 19:10:16 -03:00
Peter Maydell
1396156f50 Merge remote-tracking branch 'remotes/kraxel/tags/usb-20180312-pull-request' into staging
usbredir: reorder fields in USBRedirDevice to reduce padding

# gpg: Signature made Mon 12 Mar 2018 11:05:19 GMT
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/usb-20180312-pull-request:
  usbredir: reorder fields in USBRedirDevice to reduce padding

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-12 19:40:44 +00:00
Chao Peng
b77146e9a1 i386: Add support to get/set/migrate Intel Processor Trace feature
Add Intel Processor Trace related definition. It also add
corresponding part to kvm_get/set_msr and vmstate.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1520182116-16485-2-git-send-email-luwei.kang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-12 15:59:46 -03:00
Chao Peng
e37a5c7fa4 i386: Add Intel Processor Trace feature support
Expose Intel Processor Trace feature to guest.

To make Intel PT live migration safe and get same CPUID information
with same CPU model on diffrent host. CPUID[14] is constant in this
patch. Intel PT use EPT is first supported in IceLake, the CPUID[14]
get on this machine as default value. Intel PT would be disabled
if any machine don't support this minial feature list.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1520182116-16485-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-12 15:59:46 -03:00
Wanpeng Li
be7773268d target-i386: add KVM_HINTS_DEDICATED performance hint
Add KVM_HINTS_DEDICATED performance hint, guest checks this feature bit
to determine if they run on dedicated vCPUs, allowing optimizations such
as usage of qspinlocks.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1518185725-69559-1-git-send-email-wanpengli@tencent.com>
[ehabkost: Renamed property to kvm-hint-dedicated]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-03-12 15:57:39 -03:00
Peter Maydell
fb5fff1588 Merge remote-tracking branch 'remotes/kraxel/tags/vga-20180312-pull-request' into staging
7cdc61becd vga: fix region calculation

# gpg: Signature made Mon 12 Mar 2018 10:59:24 GMT
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/vga-20180312-pull-request:
  vga: fix region calculation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-12 18:35:37 +00:00
Marc-André Lureau
02f769b7ee tests: make docker-test-debug@fedora run sanitizers
Since --enable-debug no longer enable sanitizers, we need explicit
--enable-sanitizers.

llvm package is required for llvm-symbolizer, to get symbols in
backtraces.

Add make V=1 to get details about failing tests.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180312120849.20073-1-marcandre.lureau@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-03-13 01:19:56 +08:00
Peter Maydell
6ceb1b51f0 Merge remote-tracking branch 'remotes/kraxel/tags/audio-20180312-pull-request' into staging
modules: use gmodule-export.
audio: add driver registry, enable module builds.

# gpg: Signature made Mon 12 Mar 2018 10:42:19 GMT
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/audio-20180312-pull-request:
  audio/sdl: build as module
  audio/pulseaudio: build as module
  audio/oss: build as module
  audio/alsa: build as module
  build: enable audio modules
  audio: add module loading support
  audio: add driver registry
  modules: use gmodule-export

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-12 16:14:37 +00:00
Pavel Dovgalyuk
0790f86861 tcg: fix cpu_io_recompile
cpu_io_recompile() function was broken by
the commit 9b990ee5a3. Instead of regenerating
the block starting from PC of the original block, it just set the instruction
counter for TCG. In most cases this was unnoticed, but in icount mode
there was an exception for incorrect usage of CF_LAST_IO flag.
This patch recovers recompilation of the original block and also
configures translation for executing single IO instruction which
caused a recompilation.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095338.1060.27385.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 17:10:38 +01:00
Pavel Dovgalyuk
7273db9d28 replay: update documentation
This patch clarifies the description of the record/replay feature
in docs/replay.txt

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095333.1060.1331.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 17:10:38 +01:00
Pavel Dovgalyuk
0b30dc0164 replay: save vmstate of the asynchronous events
This patch fixes saving and loading the snapshots in the replay mode.
It is required for the snapshots created in the moment when the header
of the asynchronous event is read. This information was not saved in
the snapshot. After loading the vmstate replay continued with the file offset
passed the event header. The event header is lost in this case and replay
hangs.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <20180227095322.1060.53929.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 17:10:38 +01:00
Pavel Dovgalyuk
89e46eb477 replay: don't process async events when warping the clock
Virtual clock is warped from iothread and vcpu thread. When the hardware
events associated with warp checkpoint, then interrupt delivering may be
non-deterministic if checkpoint is processed in different threads in record
and replay.
This patch disables event processing for clock warp checkpoint and leaves
all hardware events to other checkpoints (e.g., virtual clock).

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095316.1060.4134.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 17:10:38 +01:00
Alex Bennée
821c113033 scripts/replay-dump.py: replay log dumper
This script is a debugging tool for looking through the contents of a
replay log file. It is incomplete but should fail gracefully at events
it doesn't understand.

It currently understands two different log formats as the audio
record/replay support was merged during since MTTCG. It was written to
help debug what has caused the BQL changes to break replay support.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20180227095310.1060.14500.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 17:10:38 +01:00
Pavel Dovgalyuk
66eb7825d0 replay: avoid recursive call of checkpoints
This patch adds a flag which denies recursive call of replay_checkpoint
function. Checkpoints may be accompanied by the hardware events. When event
is processed, virtual device may invoke timer modification functions that
also invoke the checkpoint function. This leads to infinite loop.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095305.1060.56463.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 17:10:38 +01:00
Pavel Dovgalyuk
6dc0f52963 replay: check return values of fwrite
This patch adds error reporting when fwrite cannot completely
save the buffer to the file.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095259.1060.86410.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 17:10:37 +01:00
Alex Bennée
d759c951f3 replay: push replay_mutex_lock up the call tree
Now instead of using the replay_lock to guard the output of the log we
now use it to protect the whole execution section. This replaces what
the BQL used to do when it was held during TCG execution.

We also introduce some rules for locking order - mainly that you
cannot take the replay_mutex while holding the BQL. This leads to some
slight sophistry during start-up and extending the
replay_mutex_destroy function to unlock the mutex without checking
for the BQL condition so it can be cleanly dropped in the non-replay
case.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Tested-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095248.1060.40374.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2018-03-12 17:10:36 +01:00
Pavel Dovgalyuk
1a423896fa replay: don't destroy mutex at exit
Replay mutex is held by vCPU thread and destroy function is called
from atexit of the main thread. Therefore we cannot destroy it safely.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095254.1060.96971.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 17:09:50 +01:00
Alex Bennée
a36544d34c replay: make locking visible outside replay code
The replay_mutex_lock/unlock/locked functions are now going to be used
for ensuring lock-step behaviour between the two threads. Make them
public API functions and also provide stubs for non-QEMU builds on
common paths.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095242.1060.16601.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:50 +01:00
Alex Bennée
180d30bebe replay/replay-internal.c: track holding of replay_lock
This is modelled after the iothread mutex lock. We keep a TLS flag to
indicate when that thread has acquired the lock and assert we don't
double-lock or release when we shouldn't have.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095237.1060.44661.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:50 +01:00
Alex Bennée
80be169c1f replay/replay.c: bump REPLAY_VERSION again
This time commit 802f045a5f broke the
replay file format. Also add a comment about this to
replay-internal.h.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095231.1060.91180.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:50 +01:00
Pavel Dovgalyuk
4b930d264c replay: save prior value of the host clock
This patch adds saving/restoring of the host clock field 'last'.
It is used in host clock calculation and therefore clock may
become incorrect when using restored vmstate.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095226.1060.50975.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 16:12:50 +01:00
Pavel Dovgalyuk
bb040e006f replay: added replay log format description
This patch adds description of the replay log file format
into the docs/replay.txt.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095220.1060.58759.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:50 +01:00
Pavel Dovgalyuk
377b21ccea replay: fix save/load vm for non-empty queue
This patch does not allows saving/loading vmstate when
replay events queue is not empty. There is no reliable
way to save events queue, because it describes internal
coroutine state. Therefore saving and loading operations
should be deferred to another record/replay step.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095214.1060.32939.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 16:12:50 +01:00
Pavel Dovgalyuk
1652e0c30f replay: fixed replay_enable_events
This patch fixes assignment to internal events_enabled variable.
Now it is set only in record/replay mode. This affects the behavior
of the external functions that check this flag.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095209.1060.45884.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 16:12:50 +01:00
Pavel Dovgalyuk
1a96e3c1e7 replay: fix processing async events
Asynchronous events saved at checkpoints may invoke
callbacks when processed. These callbacks may also generate/read
new events (e.g. clock reads). Therefore event processing flag must be
reset before callback invocation.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095203.1060.70831.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 16:12:50 +01:00
Pavel Dovgalyuk
5f3bdfd4fa cpu-exec: fix exception_index handling
Function cpu_handle_interrupt calls cc->cpu_exec_interrupt to process
pending hardware interrupts. Under the hood cpu_exec_interrupt uses
cpu->exception_index to pass information to the internal function which
is usually common for exception and interrupt processing.
But this value is not reset after return and may be processed again
by cpu_handle_exception. This does not happen due to overwriting
the exception_index at the end of cpu_handle_interrupt.
But this branch may also overwrite the valid exception_index in some cases.
Therefore this patch:
 1. resets exception_index just after the call to cpu_exec_interrupt
 2. prevents overwriting the meaningful value of exception_index

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095140.1060.61357.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
2018-03-12 16:12:50 +01:00
Philippe Mathieu-Daudé
ac64273c66 hw/i386/pc: Factor out the superio code
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-26-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:50 +01:00
Philippe Mathieu-Daudé
a4cb773928 hw/alpha/dp264: Use the TYPE_SMC37C669_SUPERIO
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-25-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
f4564fc0e8 hw/alpha/dp264: Add the ISA DMA controller
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-24-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
7bea0dd434 hw/isa/superio: Add the SMC FDC37C669 Super I/O
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-23-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
b250d04a3b MAINTAINERS: Split the Alpha TCG/machine section
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-22-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
0170a3fcb3 MAINTAINERS: Add entries for the VT82C686B Super I/O
So far, it is only used by the MIPS Fulong 2E mini PC.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-21-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
98cf824b5f hw/isa/vt82c686: Add the TYPE_VT82C686B_SUPERIO
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-20-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
728d891003 hw/isa/vt82c686: Rename vt82c686b_init() -> vt82c686b_isa_init()
This function only initialize the ISA bus.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-19-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
5c961c3fb1 hw/mips/mips_fulong2e: Factor out vt82c686b_southbridge_init()
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-18-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
7313b1f28b hw/isa/superio: Factor out the FDC37M817 Super I/O from mips_malta.c
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-17-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
78f16256c1 hw/mips/malta: Code movement
Move the SouthBridge peripherals first, and keep the Super I/O
peripherals last.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-16-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
c16a4e1bc5 hw/isa/superio: Factor out the IDE code from pc87312.c
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-15-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:49 +01:00
Philippe Mathieu-Daudé
72d3d8f052 hw/isa/superio: Add a keyboard/mouse controller (8042)
Since the PC87312 inherits this abstract model, we remove the I8042
instance in the PREP machine.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20180308223946.26784-14-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
6f6695b136 hw/isa/superio: Factor out the floppy disc controller code from pc87312.c
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-13-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
cd9526ab7c hw/isa/superio: Factor out the serial code from pc87312.c
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-12-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
4c3119a6e3 hw/isa/superio: Factor out the parallel code from pc87312.c
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-11-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
63f01a74ae hw/isa/pc87312: Inherit from the abstract TYPE_ISA_SUPERIO
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-10-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
1854eb287e hw/isa/superio: Add a Super I/O template based on the PC87312 device
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-9-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
818c9d992f hw/isa/pc87312: Use 'unsigned int' for the irq value
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-8-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
4e00105a76 hw/isa/pc87312: Use uint16_t for the ISA I/O base address
This matches the isa_register_ioport() prototype.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-7-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
010d2dc473 hw/isa/pc87312: Rename the device type as TYPE_PC87312_SUPERIO
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (hw/ppc)
Message-Id: <20180308223946.26784-6-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
a48c6b5155 MAINTAINERS: Fix the PC87312 include path
Missed while moving it in 0d09e41a51.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-5-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
47973a2dbf hw/input/i8042: Extract declarations from i386/pc.h into input/i8042.h
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (hw/ppc)
Message-Id: <20180308223946.26784-4-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
55f613ac25 hw/dma/i8257: Rename DMA_init() to i8257_dma_init()
- Move the header from hw/isa/ to hw/dma/
- Remove the old i386/pc dependency
- use a bool type for the high_page_enable argument

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-3-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:48 +01:00
Philippe Mathieu-Daudé
bb3d5ea858 hw/isa: Move parallel_hds_isa_init() to hw/char/parallel-isa.c
Again... (after 07dc788054 and 9157eee1b1).

We now extract the ISA bus specific helpers.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180308223946.26784-2-f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:47 +01:00
Paolo Bonzini
a40161cbe9 membarrier: add --enable-membarrier
Actually enable the global memory barriers if supported by the OS.
Because only recent versions of Linux include the support, they
are disabled by default.  Note that it also has to be disabled
for QEMU to run under Wine.

Before this patch, rcutorture reports 85 ns/read for my machine,
after the patch it reports 12.5 ns/read.  On the other hand updates
go from 50 *micro*seconds to 20 *milli*seconds.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:47 +01:00
Paolo Bonzini
c8d3877e48 membarrier: introduce qemu/sys_membarrier.h
This new header file provides heavy-weight "global" memory barriers that
enforce memory ordering on each running thread belonging to the current
process.  For now, use a dummy implementation that issues memory barriers
on both sides (matching what QEMU has been doing so far).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:47 +01:00
Paolo Bonzini
77a8b8462b rcu: make memory barriers more explicit
Prepare for introducing smp_mb_placeholder() and smp_mb_global().
The new smp_mb() in synchronize_rcu() is not strictly necessary, since
the first atomic_mb_set for rcu_gp_ctr provides the required ordering.
However, synchronize_rcu is not performance critical, and it *will* be
necessary to introduce a smp_mb_global before calling wait_for_readers().

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:47 +01:00
Paolo Bonzini
729c0ddd3c docs: document atomic_load_acquire and atomic_store_release
We will use them in the next patch, document what they do.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:47 +01:00
Paolo Bonzini
b9b7581754 rcutorture: remove synchronize_rcu from readers
This gives much worse numbers for readers, especially if synchronize_rcu
is made more expensive as is the case with --enable-membarrier.  Before:

   $ tests/rcutorture 10 stress 10
   n_reads: 98304  n_updates: 529  n_mberror: 0
   rcu_stress_count: 98302 2 0 0 0 0 0 0 0 0 0

After:

   $ tests/rcutorture 10 stress 10
   n_reads: 165158482  n_updates: 429  n_mberror: 0
   rcu_stress_count: 165154364 4118 0 0 0 0 0 0 0 0 0

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:47 +01:00
Thomas Huth
148b2ba114 hw/mips/jazz: Fix implicit creation of "-drive if=scsi" devices
The global hack for creating SCSI devices has recently been removed,
but this apparently broke SCSI devices on some boards that were not
ready for this change yet. For the pica61 machine you now get:

$ mips64-softmmu/qemu-system-mips64 -M pica61 -cdrom x.iso
qemu-system-mips64: -cdrom x.iso: machine type does not support if=scsi,bus=0,unit=2

Fix it by calling scsi_bus_legacy_handle_cmdline() after creating the
corresponding SCSI controller.

Fixes: 1454509726
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1520414644-11535-1-git-send-email-thuth@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:47 +01:00
Thomas Huth
7e563bfb8a Polish the version strings containing the package version
Since commit 67a1de0d19 there is no space anymore between the
version number and the parentheses when running configure with
--with-pkgversion=foo :

 $ qemu-system-s390x --version
 QEMU emulator version 2.11.50(foo)

But the space is included when building without that option
when building from a git checkout:

 $ qemu-system-s390x --version
 QEMU emulator version 2.11.50 (v2.11.0-1494-gbec9c64-dirty)

The same confusion exists with the "query-version" QMP command.
Let's fix this by introducing a proper QEMU_FULL_VERSION definition
that includes the space and parentheses, while the QEMU_PKGVERSION
should just cleanly contain the package version string itself.
Note that this also changes the behavior of the "query-version" QMP
command (the space and parentheses are not included there anymore),
but that's supposed to be OK since the strings there are not meant
to be parsed by other tools.

Fixes: 67a1de0d19
Buglink: https://bugs.launchpad.net/qemu/+bug/1673373
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1518692807-25859-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:47 +01:00
Paolo Bonzini
8cc436d9c5 hw/i386: make IOMMUs configurable via default-configs/
Allow distributions to disable the Intel and/or AMD IOMMU devices.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Paolo Bonzini
4397a01847 scsi: support NDOB (no data-out buffer) for WRITE SAME commands
A NDOB bit set to one specifies that the disk shall not transfer data
from the data-out buffer and shall process the command as if the data-out
buffer contained user data set to all zeroes.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Peter Xu
05b6cc4ae2 chardev: tcp: let TLS run on chardev context
Now qio_channel_tls_handshake() is ready to receive the context.  Let
socket chardev use it, then the TLS handshake of chardev will always be
with the chardev's context.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180306053320.15401-9-peterx@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Peter Xu
25679e5d58 chardev: tcp: postpone async connection setup
This patch allows the socket chardev async connection be setup with
non-default gcontext.  We do it by postponing the setup to machine done,
since until then we can know which context we should run the async
operation on.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180306053320.15401-8-peterx@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Peter Xu
3e7d4d20d3 chardev: use chardev's gcontext for async connect
Generalize the function to create the async QIO task connection.  Also,
fix the context pointer to use the chardev's gcontext.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180306053320.15401-7-peterx@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Peter Xu
c7278b4355 chardev: introduce chr_machine_done hook
Introduce ChardevClass.chr_machine_done() hook so that chardevs can run
customized procedures after machine init.

There was an existing mux user already that did similar thing but used a
raw machine done notifier.  Generalize it into a framework, and let the
mux chardevs provide such a class-specific hook to achieve the same
thing.  Then we can move the mux related code to the char-mux.c file.

Since at it, replace the mux_realized variable with the global
machine_init_done varible.

This notifier framework will be further leverged by other type of
chardevs soon.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180306053320.15401-6-peterx@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Peter Xu
ce1230c054 chardev: allow telnet gsource to switch gcontext
It was originally created by qio_channel_add_watch() so it's always
assigning the task to main context.  Now we use the new API called
qio_channel_add_watch_source() so that we get the GSource handle rather
than the tag ID.

Meanwhile, caching the gsource and TCPChardevTelnetInit (which holds the
handshake data) in SocketChardev.telnet_source so that we can also do
dynamic context switch when update read handlers.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180306053320.15401-5-peterx@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Peter Xu
3da9de5ce2 chardev: update net listener gcontext
TCP chardevs can be using QIO network listeners working in the
background when in listening mode.  However the network listeners are
always running in main context.  This can race with chardevs that are
running in non-main contexts.

To solve this, we need to re-setup the net listeners in
tcp_chr_update_read_handler() with the newly cached gcontext.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180306053320.15401-4-peterx@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Daniel P. Berrangé
c863fdec6a chardev: fix handling of EAGAIN for TCP chardev
When this commit was applied

  commit 9894dc0cdc
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Tue Jan 19 11:14:29 2016 +0000

    char: convert from GIOChannel to QIOChannel

The tcp_chr_recv() function was changed to return QIO_CHANNEL_ERR_BLOCK
which corresonds to -2. As such the handling for EAGAIN was able to be
removed from tcp_chr_read(). Unfortunately in a later commit:

  commit b6572b4f97
  Author: Marc-André Lureau <marcandre.lureau@redhat.com>
  Date:   Fri Mar 11 18:55:24 2016 +0100

    char: translate from QIOChannel error to errno

The tcp_chr_recv() function was changed back to return -1, with errno
set to EAGAIN, without also re-addding support for this to tcp_chr_read()

Reported-by: Aleksey Kuleshov <rndfax@yandex.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180222121351.26191-1-berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Peter Xu
c8ca2a23a9 vl: export machine_init_done
We have that variable but not exported.  Export that so modules can have
a way to poke on whether machine init has finished.

Meanwhile, set that up even before calling the notifiers, so that
notifiers who may depend on this field will get a correct answer.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180306053320.15401-2-peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Eric Blake
205f31a81a checkpatch: Exempt long URLs
Sometimes, we want to refer to really long URLs, but checkpatch
balks, and we have to manually bypass the check.  URL shorteners
may be nice at reducing long links, but it's hard to guarantee the
shortened link will live as long as the real target, and it is
also nice to see the original target without having to load the
shortened URL through a browser.  So exempt a line containing
only a URL from the long-line syntax check.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180222215838.18223-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Thomas Huth
7eceff5b5a hw: Do not include "sysemu/block-backend.h" if it is not necessary
After reviewing a patch from Philippe that removes block-backend.h
from hw/lm32/milkymist.c, I noticed that this header is included
unnecessarily in a lot of other files, too. Remove those unneeded
includes to speed up the compilation process a little bit.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1518684912-31637-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:46 +01:00
Marc-André Lureau
0decdfe29b build-sys: make help could have 'modules' target
Available when configure --enable-modules.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180306161728.20890-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:45 +01:00
Thomas Huth
e12c8198cf qemu-doc: Add the paragraph about the -no-frame deprecation again
The section has accidentially been removed while resolving a
contextual conflict during a rebase, so add this again.

Fixes: f29d445042
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1520405769-22179-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:45 +01:00
Paolo Bonzini
fb8152181d qemu-doc: update deprecation section to use -nic and -netdev hubport
The deprecated SLIRP options -tftp, -bootp, -redir, -smb provide
sample replacements that use "-net nic".  Suggest "-nic" instead,
since we finally have a path towards getting rid of "-net".

For "-net vlan" the replacement involves hubport network devices,
so mention that too.

Cc: Jason Wang <jasowang@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:45 +01:00
Daniel Henrique Barboza
d082d16a5c scsi-disk.c: consider bl->max_transfer in INQUIRY emulation
The calculation of the max_transfer atribute of BlockDriverState
makes considerations such as max_segments and transfer_length via
the BLKSECTGET ioctl (if available).

However, bl->max_transfer isn't considered when emulating the INQUIRY
'Block Limit' response to the scsi-hd devices. This leads to situations
where the declared max_sectors from the INQUIRY response is inconsistent
with the block limits, which isn't ideal. It can also be misleading to the
user that sets /sys/block/<dev>/queue/max_sectors_kb to a certain
value, then finds a different value in the guest OS for the same disk.

Following the same logic scsi_read_complete from scsi-generic.c does
when patching the response of the Block Limits VPD back to the guest,
change the max_io_sectors value of the emulated Block Limits VPD
response by considering the blk_get_max_transfer of the related
BlockDriverState. Use MIN_NOT_ZERO to be sure that the minimal
value is chosen.

Given that we're changing max_io_sectors, consider that min_io_sectors
and opt_io_sectors can't be greater than the new calculated value.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180306154411.18462-1-danielhb@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:45 +01:00
Paolo Bonzini
4b9c264bd2 q35: change default NIC to e1000e
The e1000 NIC is getting old and is not a very good default for a
PCIe machine type.  Change it to e1000e, which should be supported
by a good number of guests.

In particular, drivers for 82574 were added first to Linux 2.6.27 (2008)
and Windows 2008 R2.  This does mean that Windows 2008 will not work
anymore with Q35 machine types and a default "-net nic -net xxx" network
configuration; it did work before because it does have an AHCI driver.
However, Windows 2008 has been declared out of main stream support
in 2015.  It will get out of extended support in 2020.  Windows 2008
R2 has the same end of support dates and, since the two are basically
Vista vs. Windows 7, R2 probably is more popular.

Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:45 +01:00
Paolo Bonzini
52310c3fa7 net: allow using any PCI NICs in -net or -nic
Remove the hard-coded list of PCI NIC names; instead, fill an array
using all PCI devices listed under DEVICE_CATEGORY_NETWORK. Keep
the old shortcut "virtio" for virtio-net-pci.

Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 16:12:45 +01:00
Paolo Bonzini
47c66009ab qom: introduce object_class_get_list_sorted
Unify half a dozen copies of very similar code (the only difference being
whether comparisons were case-sensitive) and use it also in Tricore,
which did not do any sorting of CPU model names.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-03-12 15:21:10 +01:00
Peter Maydell
b16a54da06 Merge remote-tracking branch 'remotes/kraxel/tags/ui-20180312-pull-request' into staging
gtk,spice: add dmabuf support.
sdl,vnc,gtk: bugfixes.
ui/qapi: add device ID and head parameters to screendump.
build: try improve handling of clang warnings.

# gpg: Signature made Mon 12 Mar 2018 09:13:28 GMT
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/ui-20180312-pull-request:
  qapi: Add device ID and head parameters to screendump
  spice: add cursor_dmabuf support
  spice: add scanout_dmabuf support
  spice: drop dprint() debug logging
  vnc: deal with surface NULL pointers
  ui/gtk-egl: add cursor_dmabuf support
  ui/gtk-egl: add scanout_dmabuf support
  ui/gtk: use GtkGlArea on wayland only
  ui/opengl: Makefile cleanup
  ui/gtk: group gtk.mo declarations in Makefile
  ui/gtk: make GtkGlArea usage a runtime option
  sdl: workaround bug in sdl 2.0.8 headers
  make: switch language file build to be gtk module aware
  build: try improve handling of clang warnings

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-12 14:06:23 +00:00
Peter Maydell
819fd4699c Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20180309a' into staging
Migration pull 2018-03-09

# gpg: Signature made Fri 09 Mar 2018 17:52:46 GMT
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20180309a:
  tests: Silence migration-test 'bad' test
  migration: fix applying wrong capabilities
  migration/block: rename MAX_INFLIGHT_IO to MAX_IO_BUFFERS
  migration/block: reset dirty bitmap before read in bulk phase
  migration: do not transfer ram during bulk storage migration
  migration: fix minor finalize leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-12 13:21:53 +00:00
Peter Maydell
5df089564b Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180309' into staging
target-arm queue:
 * i.MX: Add i.MX7 SOC implementation and i.MX7 Sabre board
 * Report the correct core count in A53 L2CTLR on the ZynqMP board
 * linux-user: preliminary SVE support work (signal handling)
 * hw/arm/boot: fix memory leak in case of error loading ELF file
 * hw/arm/boot: avoid reading off end of buffer if passed very
   small image file
 * hw/arm: Use more CONFIG switches for the object files
 * target/arm: Add "-cpu max" support
 * hw/arm/virt: Support -machine gic-version=max
 * hw/sd: improve debug tracing
 * hw/sd: sdcard: Add the Tuning Command (CMD 19)
 * MAINTAINERS: add Philippe as odd-fixes maintainer for SD

# gpg: Signature made Fri 09 Mar 2018 17:24:23 GMT
# gpg:                using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20180309: (25 commits)
  MAINTAINERS: Add entries for SD (SDHCI, SDBus, SDCard)
  sdhci: Fix a typo in comment
  sdcard: Add the Tuning Command (CMD19)
  sdcard: Display which protocol is used when tracing (SD or SPI)
  sdcard: Display command name when tracing CMD/ACMD
  sdcard: Do not trace CMD55, except when we already expect an ACMD
  hw/arm/virt: Support -machine gic-version=max
  hw/arm/virt: Add "max" to the list of CPU types "virt" supports
  target/arm: Make 'any' CPU just an alias for 'max'
  target/arm: Add "-cpu max" support
  target/arm: Move definition of 'host' cpu type into cpu.c
  target/arm: Query host CPU features on-demand at instance init
  arm: avoid heap-buffer-overflow in load_aarch64_image
  arm: fix load ELF error leak
  hw/arm: Use more CONFIG switches for the object files
  aarch64-linux-user: Add support for SVE signal frame records
  aarch64-linux-user: Add support for EXTRA signal frame records
  aarch64-linux-user: Remove struct target_aux_context
  aarch64-linux-user: Split out helpers for guest signal handling
  linux-user: Implement aarch64 PR_SVE_SET/GET_VL
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-12 11:47:52 +00:00
Daniel P. Berrangé
73ff061032 trace: only permit standard C types and fixed size integer types
Some trace backends will compile code based on the declared trace
events. It should not be assumed that the backends can resolve any QEMU
specific typedefs. So trace events should restrict their argument
types to the standard C types and fixed size integer types. Any complex
pointer types can be declared as "void *" for purposes of trace events,
since nothing will be dereferencing these pointer arguments.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20180308155524.5082-3-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-12 11:10:20 +00:00
Daniel P. Berrangé
8858444850 trace: remove use of QEMU specific types from trace probes
Any compound structs / unions / etc, should always be declared as
'void *' pointers, since it cannot be assumed that trace backends
are able to resolve QEMU typedefs.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20180308155524.5082-2-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-12 11:10:20 +00:00
Daniel P. Berrangé
86b5aacfb9 trace: include filename when printing parser error messages
Improves error messages from:

  ValueError: Error on line 72: need more than 1 value to unpack

To

  ValueError: Error at /home/berrange/src/virt/qemu/trace-events:72:
    need more than 1 value to unpack

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180306154650.24075-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-12 11:10:20 +00:00
Stefan Hajnoczi
e42860ae83 simpletrace: fix timestamp argument type
The timestamp argument to a trace event method is documented as follows:

  The method can also take a timestamp argument before the trace event
  arguments:

    def runstate_set(self, timestamp, new_state):
        ...

  Timestamps have the uint64_t type and are in nanoseconds.

In reality methods with a timestamp argument actually receive a tuple
like (123456789,) as the timestamp argument.  This is due to a bug in
simpletrace.py.

This patch unpacks the tuple so that methods receive the correct
timestamp argument type.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180222163901.14095-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-12 11:10:20 +00:00
Peter Maydell
be0aa7ac89 log-for-trace.h: Split out parts of log.h used by trace.h
A persistent build problem we see is where a source file
accidentally omits the #include of log.h. This slips through
local developer testing because if you configure with the
default (log) trace backend trace.h will pull in log.h for you.
Compilation fails only if some other backend is selected.

To make this error cause a compile failure regardless of
the configured trace backend, split out the parts of log.h
that trace.h requires into a new log-for-trace.h header.
Since almost all manual uses of the log.h functions will
use constants or functions which aren't in log-for-trace.h,
this will let us catch missing #include "qemu/log.h" more
consistently.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180213140029.8308-1-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-12 11:10:20 +00:00
Stefan Hajnoczi
7376eda7c2 block: make BDRV_POLL_WHILE() re-entrancy safe
Nested BDRV_POLL_WHILE() calls can occur.  Currently
assert(!wait_->wakeup) fails in AIO_WAIT_WHILE() when this happens.

This patch converts the bool wait_->need_kick flag to an unsigned
wait_->num_waiters counter.

Nesting works correctly because outer AIO_WAIT_WHILE() callers evaluate
the condition again after the inner caller completes (invoking the inner
caller counts as aio_poll() progress).

Reported-by: "fuweiwei (C)" <fuweiwei2@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180307124619.6218-1-stefanha@redhat.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-12 11:07:37 +00:00
Gerd Hoffmann
7cdc61becd vga: fix region calculation
Typically the scanline length and the line offset are identical.  But
in case they are not our calculation for region_end is incorrect.  Using
line_offset is fine for all scanlines, except the last one where we have
to use the actual scanline length.

Fixes: CVE-2018-7550
Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Message-id: 20180309143704.13420-1-kraxel@redhat.com
2018-03-12 11:45:21 +01:00
zhenwei.pi
c7ac1ab020 usbredir: reorder fields in USBRedirDevice to reduce padding
Changing the current ordering saves 8 bytes per entry in x86_64.

Signed-off-by: zhenwei.pi <zhenwei.pi@youruncloud.com>
Message-id: 1520318781-22644-1-git-send-email-zhenwei.pi@youruncloud.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2018-03-12 11:43:49 +01:00
Gerd Hoffmann
051c7d5c1e audio/sdl: build as module
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306074053.22856-8-kraxel@redhat.com
2018-03-12 11:18:27 +01:00
Gerd Hoffmann
d2f623dad5 audio/pulseaudio: build as module
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306074053.22856-7-kraxel@redhat.com
2018-03-12 11:18:27 +01:00
Gerd Hoffmann
22d8154391 audio/oss: build as module
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306074053.22856-6-kraxel@redhat.com
2018-03-12 11:18:27 +01:00
Gerd Hoffmann
ce3dc033df audio/alsa: build as module
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306074053.22856-5-kraxel@redhat.com
2018-03-12 11:18:27 +01:00
Gerd Hoffmann
08a05b379a build: enable audio modules
Add audio/ to common-obj-m variable.

Also run both audio and ui variables through unnest-vars.
This avoids sdl.mo (exists in both audio/ and ui/) name clashes.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306074053.22856-4-kraxel@redhat.com
2018-03-12 11:18:27 +01:00
Gerd Hoffmann
65ba869661 audio: add module loading support
Make audio_driver_lookup() try load the module in case it doesn't find
the driver in the registry.  Also load all modules for -audio-help, so
the help output includes the help text for modular audio drivers.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180306074053.22856-3-kraxel@redhat.com
2018-03-12 11:18:26 +01:00
Gerd Hoffmann
d3893a39eb audio: add driver registry
Add registry for audio drivers, using the existing audio_driver struct.
Make all drivers register themself.  The old list of audio_driver struct
pointers is now a list of audio driver names, specifying the priority
(aka probe order) in case no driver is explicitly asked for.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180306074053.22856-2-kraxel@redhat.com
2018-03-12 11:18:26 +01:00
Peter Maydell
12c06d6f96 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Fri 09 Mar 2018 15:09:20 GMT
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (56 commits)
  qemu-iotests: fix 203 migration completion race
  iotests: Tweak 030 in order to trigger a race condition with parallel jobs
  iotests: Skip test for ENOMEM error
  iotests: Mark all tests executable
  iotests: Test creating overlay when guest running
  qemu-iotests: Test ssh image creation over QMP
  qemu-iotests: Test qcow2 over file image creation with QMP
  block: Fail bdrv_truncate() with negative size
  file-posix: Fix no-op bdrv_truncate() with falloc preallocation
  ssh: Support .bdrv_co_create
  ssh: Pass BlockdevOptionsSsh to connect_to_ssh()
  ssh: QAPIfy host-key-check option
  ssh: Use QAPI BlockdevOptionsSsh object
  sheepdog: Support .bdrv_co_create
  sheepdog: QAPIfy "redundancy" create option
  nfs: Support .bdrv_co_create
  nfs: Use QAPI options in nfs_client_open()
  rbd: Use qemu_rbd_connect() in qemu_rbd_do_create()
  rbd: Assign s->snap/image_name in qemu_rbd_open()
  rbd: Support .bdrv_co_create
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-12 10:08:09 +00:00
Gerd Hoffmann
a88afc649e modules: use gmodule-export
As we want qemu symbols be exported to modules we should use the
gmodule-export-2.0 pkg-config instead of gmodule-2.0.

Cc: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180308085301.8875-2-kraxel@redhat.com
2018-03-12 10:15:07 +01:00
Thomas Huth
f771c5440e qapi: Add device ID and head parameters to screendump
QEMU's screendump command can only take dumps from the primary display.
When using multiple VGA cards, there is no way to get a dump from a
secondary card or other display heads yet. So let's add a 'device' and
a 'head' parameter to the HMP and QMP commands to be able to specify
alternative devices and heads with the screendump command, too.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1520267868-31778-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2018-03-12 09:03:15 +01:00
Gerd Hoffmann
b153f9019b spice: add cursor_dmabuf support
Add support for cursor dmabufs.  qemu has to render the cursor for
that, so in case a cursor is present qemu allocates a new dmabuf, blits
the scanout, blends in the pointer and passes on the new dmabuf to
spice-server.  Without cursor qemu continues to simply pass on the
scanout dmabuf as-is.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180308090618.30147-4-kraxel@redhat.com
2018-03-12 09:01:56 +01:00
Gerd Hoffmann
cd2452b279 spice: add scanout_dmabuf support
Add support for scanout dmabufs.  Just
pass them through to spice-server.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180308090618.30147-3-kraxel@redhat.com
2018-03-12 09:01:53 +01:00
Gerd Hoffmann
e181f8bbd0 spice: drop dprint() debug logging
Some calls are deleted, some are converted into tracepoints.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180308090618.30147-2-kraxel@redhat.com
2018-03-12 09:01:48 +01:00
Gerd Hoffmann
2e5567c903 vnc: deal with surface NULL pointers
Secondary displays in multihead setups are allowed to have a NULL
DisplaySurface.  Typically user interfaces handle this by hiding the
window which shows the display in question.

This isn't an option for vnc though because it simply hasn't a concept
of windows or outputs.  So handle the situation by showing a placeholder
DisplaySurface instead.  Also check in console_select whenever a surface
is preset in the first place before requesting an update.

This fixes a segfault which can be triggered by switching to an unused
display (via vtrl-alt-<nr>) in a multihead setup, for example using
-device virtio-vga,max_outputs=2.

Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 20180308161803.6152-1-kraxel@redhat.com
2018-03-12 09:00:59 +01:00
Gerd Hoffmann
f1bd313264 ui/gtk-egl: add cursor_dmabuf support
Add support for cursor dmabufs to gtk-egl.  Just blend in the cursor
(if we have one) when rendering the dmabuf.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306090951.22932-7-kraxel@redhat.com
2018-03-12 09:00:34 +01:00
Gerd Hoffmann
70763fea4f ui/gtk-egl: add scanout_dmabuf support
Add support for dmabuf scanouts to gtk-egl.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306090951.22932-6-kraxel@redhat.com
2018-03-12 09:00:34 +01:00
Gerd Hoffmann
4c70280592 ui/gtk: use GtkGlArea on wayland only
For dma-buf support we need a egl context.  The gtk x11 backend uses glx
contexts though.  We can't use the GtkGlArea widget on x11 because of
that, so use our own gtk-egl code instead.  wayland continues to use
the GtkGlArea widget.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306090951.22932-5-kraxel@redhat.com
2018-03-12 09:00:34 +01:00
Gerd Hoffmann
2f92f37c9e ui/opengl: Makefile cleanup
With gtk.mo bits moved away we don't need the ifeq any more.
Also add missing opengl libs for some objects.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306090951.22932-4-kraxel@redhat.com
2018-03-12 09:00:34 +01:00
Gerd Hoffmann
c923cbedb2 ui/gtk: group gtk.mo declarations in Makefile
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306090951.22932-3-kraxel@redhat.com
2018-03-12 09:00:34 +01:00
Gerd Hoffmann
11c82b584a ui/gtk: make GtkGlArea usage a runtime option
Compile in both gtk-egl and gtk-gl-area, then allow to choose at runtime
instead of compile time which opengl variant we want use.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306090951.22932-2-kraxel@redhat.com
2018-03-12 09:00:34 +01:00
Gerd Hoffmann
2ca5c43091 sdl: workaround bug in sdl 2.0.8 headers
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892087

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20180307154258.9313-1-kraxel@redhat.com
2018-03-12 09:00:34 +01:00
Bruce Rogers
722cd74964 make: switch language file build to be gtk module aware
Now that gtk support builds as a module, CONFIG_GTK changed from
y to m. Adjust Makefile correspondingly.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-id: 20180307155517.32570-1-brogers@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2018-03-12 09:00:34 +01:00
Gerd Hoffmann
0e39c4aa7e build: try improve handling of clang warnings
This patch disables the pragma diagnostic -Wunused-but-set-variable for
clang in util/coroutine-ucontext.c.

This in turn allows us to remove it from the configure check, so the
CONFIG_PRAGMA_DIAGNOSTIC_AVAILABLE will succeed for clang.

With that in place clang builds (linux) will use -Werror by default,
which breaks the build due to warning about unaligned struct members.

Just turning off this warning isn't a good idea as it indicates
portability problems.  So make it a warning again, using
-Wno-error=address-of-packed-member.  That way it doesn't break the
build but still shows up in the logs.

Now clang builds qemu without errors.  Well, almost.  There are some
left in the rdma code.  Leaving that to the rdma people.  All others can
use --disable-rdma to workarounds this.

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20180309135945.20436-1-kraxel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-03-12 08:59:03 +01:00
Luke Shumaker
955e304f6f linux-user: init_guest_commpage: Add a comment about size check
Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-7-lukeshu@lukeshu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 21:29:49 +01:00
Luke Shumaker
293f206008 linux-user: init_guest_space: Clarify page alignment logic
There are 3 parts to this change:
 - Add a comment showing the relative sizes and positions of the blocks of
   memory
 - introduce and use new aligned_{start,size} instead of adjusting
   real_{start_size}
 - When we clean up (on failure), munmap(real_start, real_size) instead of
   munmap(aligned_start, aligned_size).  It *shouldn't* make any
   difference, but I will admit that this does mean we are making the
   syscall with different values, so this isn't quite a no-op patch.

Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-6-lukeshu@lukeshu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 21:29:24 +01:00
Luke Shumaker
e7ea02e62a linux-user: init_guest_space: Correctly handle guest_start in commpage initialization
init_guest_commpage  needs to check if the mapped space, which ends at
real_start+real_size overlaps with where it needs to put the commpage,
which is (assuming sane qemu_host_page_size) guest_base + 0xffff000, where
guest_base is real_start - guest_start.

    [guest_base][       0xffff0000      ][commpage]
    [guest_base][guest_start][real_size] [commpage]
    [       real_start      ][real_size] [commpage]
                                        ^
                                 fail if this gap < 0

Since init_guest_commpage wants to do everything relative to guest_base
(rather than real_start), it obviously needs to be comparing 0xffff0000
against guest_start+real_size, not just real_size.

This bug has been present since 806d102141 in
2012, but guest_start is usually 0, and prior to v2.11 real_size was
usually much smaller than 0xffff0000, so it was uncommon for it to have
made a difference.

Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-5-lukeshu@lukeshu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 21:25:32 +01:00
Luke Shumaker
f202481754 linux-user: init_guest_space: Clean up if we can't initialize the commpage
We'll just exit with an error anyway, so it doesn't really matter, but it
is cleaned up in all of the other places were we error out.

Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-4-lukeshu@lukeshu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 21:24:22 +01:00
Luke Shumaker
c3637eaf61 linux-user: Rename validate_guest_space => init_guest_commpage
init_guest_commpage is a much more honest description of what the function
does.  validate_guest_space not only suggests that the function has no
side-effects, but also introduces confusion as to why it is only needed on
32-bit ARM targets.

Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-3-lukeshu@lukeshu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 21:23:03 +01:00
Luke Shumaker
8756e1361d linux-user: Use #if to only call validate_guest_space for 32-bit ARM target
Instead of defining a bogus validate_guest_space that always returns 1 on
targets other than 32-bit ARM, use #if blocks to only call it on 32-bit ARM
targets.  This makes the "normal" flow control clearer.

Signed-off-by: Luke Shumaker <lukeshu@parabola.nu>
Message-Id: <20171228180814.9749-2-lukeshu@lukeshu.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[lv: fix condition to "!= 1" as requested by Peter]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 21:19:03 +01:00
Peter Maydell
20f59d1200 Merge remote-tracking branch 'remotes/vivier/tags/m68k-for-2.12-pull-request' into staging
# gpg: Signature made Fri 09 Mar 2018 14:54:33 GMT
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier/tags/m68k-for-2.12-pull-request:
  target/m68k: implement ftentox
  target/m68k: implement ftwotox
  target/m68k: implement fetox
  target/m68k: implement flog2
  target/m68k: implement flog10
  target/m68k: implement flogn
  target/m68k: implement flognp1
  target/m68k: define floatx80_move()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 18:49:27 +00:00
Max Filippov
d4090306c8 qemu-binfmt-conf.sh: add qemu-xtensa
Register qemu-xtensa and qemu-xtensaeb for transparent linux userspace
emulation.

Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180228221609.11265-11-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 19:26:15 +01:00
Max Filippov
41c97cc022 linux-user: drop unused target_msync function
target_msync is not used, remove its declaration and implementation.

Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180228221609.11265-9-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 19:23:38 +01:00
Max Filippov
78cf339039 linux-user: fix target_mprotect/target_munmap error return values
target_mprotect/target_munmap return value goes through get_errno at the
call site, thus the functions must either set errno to host error code
and return -1 or return negative guest error code. Do the latter.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180228221609.11265-8-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 19:22:47 +01:00
Max Filippov
3c5f6a5f88 linux-user: fix assertion in shmdt
shmdt fails to call mmap_lock/mmap_unlock around page_set_flags,
resulting in the following assertion:
  page_set_flags: Assertion `have_mmap_lock()' failed.

Wrap shmdt internals into mmap_lock/mmap_unlock.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180228221609.11265-7-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 19:22:09 +01:00
Max Filippov
ebf9a3630c linux-user: fix mmap/munmap/mprotect/mremap/shmat
In linux-user QEMU that runs for a target with TARGET_ABI_BITS bigger
than L1_MAP_ADDR_SPACE_BITS an assertion in page_set_flags fires when
mmap, munmap, mprotect, mremap or shmat is called for an address outside
the guest address space. mmap and mprotect should return ENOMEM in such
case.

Change definition of GUEST_ADDR_MAX to always be the last valid guest
address. Account for this change in open_self_maps.
Add macro guest_addr_valid that verifies if the guest address is valid.
Add function guest_range_valid that verifies if address range is within
guest address space and does not wrap around. Use that macro in
mmap/munmap/mprotect/mremap/shmat for error checking.

Cc: qemu-stable@nongnu.org
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180307215010.30706-1-jcmvbkbc@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 19:21:34 +01:00
Shea Levy
d4247ec2d7 linux-user: Support f_flags in statfs when available.
Signed-off-by: Shea Levy <shea@shealevy.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180301111500.15717-1-shea@shealevy.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 19:21:34 +01:00
Laurent Vivier
3ff48453e8 linux-user: allows to use "--systemd ALL" with qemu-binfmt-conf.sh
qemu-binfmt-conf.sh when it is used with systemd
needs to know for which CPU the systemd-binfmt.service
file must be created (i.e. "--systemd ppc").

But sometime, for instance for test purpose, we need to
create an entry for all known architectures.
This patch entroduce the "ALL" parameter for this purpose.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180308104859.3315-1-laurent@vivier.eu>
2018-03-09 19:18:35 +01:00
Peter Maydell
f8b985d65c linux-user: Remove the unused "not implemented" signal handling stubs
Now we've dropped unicore32, all of the architectures we support
for linux-user implement the signal handling routines. The
dummy "just print a message" versions are unimplemented, so we
can drop them entirely.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180308144733.25615-3-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 19:17:27 +01:00
Peter Maydell
daa4374a04 linux-user: Drop unicore32 code
We dropped the unicore32-linux-user target in commit 5e2b40f727
in 2016. Nobody has made any attempt to fix the issues that
caused us to drop it, so remove the associated code.
(The system emulation parts of unicore32 remain.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180308144733.25615-2-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2018-03-09 19:17:27 +01:00
Dr. David Alan Gilbert
f96d6651e4 tests: Silence migration-test 'bad' test
In 2c9bb29703 I added a migration test that purposely fails;
unfortunately it prints a copy of the failure message to stderr
which makes the output a bit messy.

Hide stderr for that test.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180306173042.24572-1-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-09 17:39:25 +00:00
Peter Xu
dd0ee30cae migration: fix applying wrong capabilities
When setting migration capabilities via QMP/HMP, we'll apply them even
if the capability check failed.  Fix it.

Fixes: 4a84214ebe ("migration: provide migrate_caps_check()", 2017-07-18)
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180305094938.31374-1-peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-09 17:39:25 +00:00
Peter Lieven
ef9c5160a1 migration/block: rename MAX_INFLIGHT_IO to MAX_IO_BUFFERS
this actually limits (as the original commit mesage suggests) the
number of I/O buffers that can be allocated and not the number
of parallel (inflight) I/O requests.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1520507908-16743-4-git-send-email-pl@kamp.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-09 17:39:25 +00:00
Peter Lieven
86b124bc76 migration/block: reset dirty bitmap before read in bulk phase
Reset the dirty bitmap before reading to make sure we don't miss
any new data.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1520507908-16743-3-git-send-email-pl@kamp.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-09 17:39:25 +00:00
Peter Lieven
b255734531 migration: do not transfer ram during bulk storage migration
this patch makes the bulk phase of a block migration to take
place before we start transferring ram. As the bulk block migration
can take a long time its pointless to transfer ram during that phase.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1520507908-16743-2-git-send-email-pl@kamp.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-09 17:39:25 +00:00
Marc-André Lureau
ab105cc138 migration: fix minor finalize leak
Spotted thanks to ASAN:
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/migration-test -p /x86_64/migration/bad_dest

==30302==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f60efba1a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7f60eef3cf75 in g_malloc0 ../glib/gmem.c:124
    #2 0x55ca9094702c in error_copy /home/elmarco/src/qemu/util/error.c:203
    #3 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139
    #4 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150
    #5 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411
    #6 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81
    #7 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85
    #8 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142
    #9 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88
    #10 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552
    #11 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182
    #12 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847
    #13 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214
    #14 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261
    #15 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515
    #16 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942
    #17 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724
    #18 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009)

Indirect leak of 45 byte(s) in 1 object(s) allocated from:
    #0 0x7f60efba1850 in malloc (/lib64/libasan.so.4+0xde850)
    #1 0x7f60eef3cf0c in g_malloc ../glib/gmem.c:94
    #2 0x7f60eef3d1cf in g_malloc_n ../glib/gmem.c:331
    #3 0x7f60eef596eb in g_strdup ../glib/gstrfuncs.c:363
    #4 0x55ca90947085 in error_copy /home/elmarco/src/qemu/util/error.c:204
    #5 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139
    #6 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150
    #7 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411
    #8 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81
    #9 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85
    #10 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142
    #11 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88
    #12 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552
    #13 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182
    #14 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847
    #15 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214
    #16 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261
    #17 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515
    #18 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942
    #19 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724
    #20 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180306170959.3921-1-marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-09 17:39:25 +00:00
Peter Maydell
e4ae62b802 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Fri 09 Mar 2018 13:19:02 GMT
# gpg:                using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  vl: introduce vm_shutdown()
  virtio-scsi: fix race between .ioeventfd_stop() and vq handler
  virtio-blk: fix race between .ioeventfd_stop() and vq handler
  block: add aio_wait_bh_oneshot()
  virtio-blk: dataplane: Don't batch notifications if EVENT_IDX is present
  README: Fix typo 'git-publish'
  block: Fix qemu crash when using scsi-block

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:28:16 +00:00
Philippe Mathieu-Daudé
076a0fc32a MAINTAINERS: Add entries for SD (SDHCI, SDBus, SDCard)
After spending months studying all the different SD Specifications
from the SD Association, voluntarily add myself as maintainer
for the SD code.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180309153654.13518-9-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:45 +00:00
Philippe Mathieu-Daudé
08022a916c sdhci: Fix a typo in comment
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180309153654.13518-8-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:45 +00:00
Philippe Mathieu-Daudé
0c3fb03f7e sdcard: Add the Tuning Command (CMD19)
From the "Physical Layer Simplified Specification Version 3.01":

  A known data block ("Tuning block") can be used to tune sampling
  point for tuning required hosts. [...]
  This procedure gives the system optimal timing for each specific
  host and card combination and compensates for static delays in
  the timing budget including process, voltage and different PCB
  loads and skews. [...]
  Data block, carried by DAT[3:0], contains a pattern for tuning
  sampling position to receive data on the CMD and DAT[3:0] line.

[based on a patch from Alistair Francis <alistair.francis@xilinx.com>
 from qemu/xilinx tag xilinx-v2015.2]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20180309153654.13518-5-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:44 +00:00
Philippe Mathieu-Daudé
75a96f5e1c sdcard: Display which protocol is used when tracing (SD or SPI)
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180309153654.13518-4-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:44 +00:00
Philippe Mathieu-Daudé
2ed61fb57b sdcard: Display command name when tracing CMD/ACMD
The SDBus will reuse these functions, so we put them in a new source file.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180309153654.13518-3-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: slight wordsmithing of comments, added note that string
 returned does not need to be freed]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:44 +00:00
Philippe Mathieu-Daudé
586634b9a8 sdcard: Do not trace CMD55, except when we already expect an ACMD
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20180309153654.13518-2-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:44 +00:00
Peter Maydell
dc16538a98 hw/arm/virt: Support -machine gic-version=max
Add support for passing 'max' to -machine gic-version. By analogy
with the -cpu max option, this picks the "best available" GIC version
whether you're using KVM or TCG, so it behaves like 'host' when
using KVM, and gives you GICv3 when using TCG.

Also like '-cpu host', using -machine gic-version=max' means there
is no guarantee of migration compatibility between QEMU versions;
in future 'max' might mean '4'.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180308130626.12393-7-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2018-03-09 17:09:44 +00:00
Peter Maydell
9076ddb3f6 hw/arm/virt: Add "max" to the list of CPU types "virt" supports
Allow the virt board to support '-cpu max' in the same way
it already handles '-cpu host'.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180308130626.12393-6-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-03-09 17:09:44 +00:00
Peter Maydell
a0032cc542 target/arm: Make 'any' CPU just an alias for 'max'
Now we have a working '-cpu max', the linux-user-only
'any' CPU is pretty much the same thing, so implement it
that way.

For the moment we don't add any of the extra feature bits
to the system-emulation "max", because we don't set the
ID register bits we would need to to advertise those
features as present.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180308130626.12393-5-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-03-09 17:09:44 +00:00
Peter Maydell
bab52d4bba target/arm: Add "-cpu max" support
Add support for "-cpu max" for ARM guests. This CPU type behaves
like "-cpu host" when KVM is enabled, and like a system CPU with
the maximum possible feature set otherwise. (Note that this means
it won't be migratable across versions, as we will likely add
features to it in future.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180308130626.12393-4-peter.maydell@linaro.org
2018-03-09 17:09:44 +00:00
Peter Maydell
86f0a186d6 target/arm: Move definition of 'host' cpu type into cpu.c
Move the definition of the 'host' cpu type into cpu.c, where all the
other CPU types are defined.  We can do this now we've decoupled it
from the KVM-specific host feature probing.  This means we now create
the type unconditionally (assuming we were built with KVM support at
all), but if you try to use it without -enable-kvm this will end
up in the "host cpu probe failed and KVM not enabled" path in
arm_cpu_realizefn(), for an appropriate error message.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180308130626.12393-3-peter.maydell@linaro.org
2018-03-09 17:09:44 +00:00
Peter Maydell
c4487d76d5 target/arm: Query host CPU features on-demand at instance init
Currently we query the host CPU features in the class init function
for the TYPE_ARM_HOST_CPU class, so that we can later copy them
from the class object into the instance object in the object
instance init function. This is awkward for implementing "-cpu max",
which should work like "-cpu host" for KVM but like "cpu with all
implemented features" for TCG.

Move the place where we store the information about the host CPU from
a class object to static variables in kvm.c, and then in the instance
init function call a new kvm_arm_set_cpu_features_from_host()
function which will query the host kernel if necessary and then
fill in the CPU instance fields.

This allows us to drop the special class struct and class init
function for TYPE_ARM_HOST_CPU entirely.

We can't delay the probe until realize, because the ARM
instance_post_init hook needs to look at the feature bits we
set, so we need to do it in the initfn. This is safe because
the probing doesn't affect the actual VM state (it creates a
separate scratch VM to do its testing), but the probe might fail.
Because we can't report errors in retrieving the host features
in the initfn, we check this belatedly in the realize function
(the intervening code will be able to cope with the relevant
fields in the CPU structure being zero).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180308130626.12393-2-peter.maydell@linaro.org
2018-03-09 17:09:44 +00:00
Marc-André Lureau
2764040785 arm: avoid heap-buffer-overflow in load_aarch64_image
Spotted by ASAN:

elmarco@boraha:~/src/qemu/build (master *%)$ QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test
/aarch64/boot-serial/virt: ** (process:19740): DEBUG: 18:39:30.275: foo /tmp/qtest-boot-serial-cXaS94D
=================================================================
==19740==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000069648 at pc 0x7f1d2201cc54 bp 0x7fff331f6a40 sp 0x7fff331f61e8
READ of size 4 at 0x603000069648 thread T0
    #0 0x7f1d2201cc53  (/lib64/libasan.so.4+0xafc53)
    #1 0x55bc86685ee3 in load_aarch64_image /home/elmarco/src/qemu/hw/arm/boot.c:894
    #2 0x55bc86687217 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1047
    #3 0x55bc877363b5 in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40
    #4 0x55bc869331ea in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716
    #5 0x55bc8693bc39 in main /home/elmarco/src/qemu/vl.c:4679
    #6 0x7f1d1652c009 in __libc_start_main (/lib64/libc.so.6+0x21009)
    #7 0x55bc86255cc9 in _start (/home/elmarco/src/qemu/build/aarch64-softmmu/qemu-system-aarch64+0x1ae5cc9)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:44 +00:00
Marc-André Lureau
36f876cea4 arm: fix load ELF error leak
Spotted by ASAN:
QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7ff8a9b0ca38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7ff8a8ea7f75 in g_malloc0 ../glib/gmem.c:124
    #2 0x55fef3d99129 in error_setv /home/elmarco/src/qemu/util/error.c:59
    #3 0x55fef3d99738 in error_setg_internal /home/elmarco/src/qemu/util/error.c:95
    #4 0x55fef323acb2 in load_elf_hdr /home/elmarco/src/qemu/hw/core/loader.c:393
    #5 0x55fef2d15776 in arm_load_elf /home/elmarco/src/qemu/hw/arm/boot.c:830
    #6 0x55fef2d16d39 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1022
    #7 0x55fef3dc634d in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40
    #8 0x55fef2fc3182 in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716
    #9 0x55fef2fcbbd1 in main /home/elmarco/src/qemu/vl.c:4679
    #10 0x7ff89dfed009 in __libc_start_main (/lib64/libc.so.6+0x21009)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:44 +00:00
Thomas Huth
0bbabbb414 hw/arm: Use more CONFIG switches for the object files
A lot of ARM object files are linked into the executable unconditionally,
even though we have corresponding CONFIG switches like CONFIG_PXA2XX or
CONFIG_OMAP. We should make sure to use these switches in the Makefile so
that the users can disable certain unwanted boards and devices more easily.
While we're at it, also add some new switches for the boards that do not
have a CONFIG option yet.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1520266949-29817-1-git-send-email-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:44 +00:00
Richard Henderson
8c5931de0a aarch64-linux-user: Add support for SVE signal frame records
Depending on the currently selected size of the SVE vector registers,
we can either store the data within the "standard" allocation, or we
may beedn to allocate additional space with an EXTRA record.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180303143823.27055-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:44 +00:00
Richard Henderson
7a53fb907f aarch64-linux-user: Add support for EXTRA signal frame records
The EXTRA record allows for additional space to be allocated
beyon what is currently reserved.  Add code to emit and read
this record type.

Nothing uses extra space yet.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180303143823.27055-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:43 +00:00
Richard Henderson
e1eecd1d9d aarch64-linux-user: Remove struct target_aux_context
This changes the qemu signal frame layout to be more like the kernel's,
in that the various records are dynamically allocated rather than fixed
in place by a structure.

For now, all of the allocation is out of uc.tuc_mcontext.__reserved,
so the allocation is actually trivial.  That will change with SVE support.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180303143823.27055-4-richard.henderson@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:43 +00:00
Richard Henderson
3b505bbae1 aarch64-linux-user: Split out helpers for guest signal handling
Split out helpers from target_setup_frame and target_restore_sigframe
for dealing with general registers, fpsimd registers, and the end record.

When we add support for sve registers, the relative positions of
these will change.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180303143823.27055-3-richard.henderson@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:43 +00:00
Richard Henderson
85fc716732 linux-user: Implement aarch64 PR_SVE_SET/GET_VL
As an implementation choice, widening VL has zeroed the
previously inaccessible portion of the sve registers.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180303143823.27055-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:43 +00:00
Andrey Smirnov
843361ed04 Implement support for i.MX7 Sabre board
Implement code needed to set up emulation of MCIMX7SABRE board from
NXP. For more info about the HW see:

https://www.nxp.com/support/developer-resources/hardware-development-tools/sabre-development-system/sabre-board-for-smart-devices-based-on-the-i.mx-7dual-applications-processors:MCIMX7SABRE

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:43 +00:00
Andrey Smirnov
757282ada8 i.MX: Add i.MX7 SOC implementation.
The following interfaces are partially or fully emulated:

    * up to 2 Cortex A9 cores (SMP works with PSCI)
    * A7 MPCORE (identical to A15 MPCORE)
    * 4 GPTs modules
    * 7 GPIO controllers
    * 2 IOMUXC controllers
    * 1 CCM module
    * 1 SVNS module
    * 1 SRC module
    * 1 GPCv2 controller
    * 4 eCSPI controllers
    * 4 I2C controllers
    * 7 i.MX UART controllers
    * 2 FlexCAN controllers
    * 2 Ethernet controllers (FEC)
    * 3 SD controllers (USDHC)
    * 4 WDT modules
    * 1 SDMA module
    * 1 GPR module
    * 2 USBMISC modules
    * 2 ADC modules
    * 1 PCIe controller

Tested to boot and work with upstream Linux (4.13+) guest.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
[PMM: folded a couple of long lines]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:43 +00:00
Andrey Smirnov
d64e5eabc4 pci: Add support for Designware IP block
Add code needed to get a functional PCI subsytem when using in
conjunction with upstream Linux guest (4.13+). Tested to work against
"e1000e" (network adapter, using MSI interrupts) as well as
"usb-ehci" (USB controller, using legacy PCI interrupts).

Based on "i.MX6 Applications Processor Reference Manual" (Document
Number: IMX6DQRM Rev. 4) as well as corresponding dirver in Linux
kernel (circa 4.13 - 4.16 found in drivers/pci/dwc/*)

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:43 +00:00
Alistair Francis
8f2ba1f278 hw/arm: Set the core count for Xilinx's ZynqMP
Set the ARM CPU core count property for the A53's attached to the Xilnx
ZynqMP machine.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: fe0dd90b85ac73f9fc9548c253bededa70a07006.1520018138.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:43 +00:00
Alistair Francis
f9a697112e target/arm: Add a core count property
The cortex A53 TRM specifies that bits 24 and 25 of the L2CTLR register
specify the number of cores in the processor, not the total number of
cores in the system. To report this correctly on machines with multiple
CPU clusters (ARM's big.LITTLE or Xilinx's ZynqMP) we need to allow
the machine to overwrite this value. To do this let's add an optional
property.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: ef01d95c0759e88f47f22d11b14c91512a658b4f.1520018138.git.alistair.francis@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 17:09:43 +00:00
Paolo Bonzini
b39b61e410 memory: fix flatview_access_valid RCU read lock/unlock imbalance
Fixes: 11e732a5ed
Reported-by: Cornelia Huck <cohuck@redhat.com>
Reported-by: luigi burdo <intermediadc@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20180307130238.19358-1-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 15:55:20 +00:00
Kevin Wolf
a1be5921e3 Merge remote-tracking branch 'mreitz/tags/pull-block-2018-03-09' into queue-block
Block patches

# gpg: Signature made Fri Mar  9 15:40:32 2018 CET
# gpg:                using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* mreitz/tags/pull-block-2018-03-09:
  qemu-iotests: fix 203 migration completion race
  iotests: Tweak 030 in order to trigger a race condition with parallel jobs
  iotests: Skip test for ENOMEM error
  iotests: Mark all tests executable
  iotests: Test creating overlay when guest running

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 16:09:06 +01:00
Laurent Vivier
6c25be6e30 target/m68k: implement ftentox
Using a local m68k floatx80_tentox()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-9-laurent@vivier.eu>
2018-03-09 15:50:38 +01:00
Laurent Vivier
068f161536 target/m68k: implement ftwotox
Using a local m68k floatx80_twotox()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-8-laurent@vivier.eu>
2018-03-09 15:50:38 +01:00
Laurent Vivier
40ad087330 target/m68k: implement fetox
Using a local m68k floatx80_etox()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-7-laurent@vivier.eu>
2018-03-09 15:50:38 +01:00
Laurent Vivier
67b453ed73 target/m68k: implement flog2
Using a local m68k floatx80_log2()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-6-laurent@vivier.eu>
2018-03-09 15:50:38 +01:00
Laurent Vivier
248efb66fb target/m68k: implement flog10
Using a local m68k floatx80_log10()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-5-laurent@vivier.eu>
2018-03-09 15:50:38 +01:00
Laurent Vivier
50067bd16f target/m68k: implement flogn
Using a local m68k floatx80_logn()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-4-laurent@vivier.eu>
2018-03-09 15:50:38 +01:00
Laurent Vivier
4b5c65b8f0 target/m68k: implement flognp1
Using a local m68k floatx80_lognp1()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-3-laurent@vivier.eu>
2018-03-09 15:50:36 +01:00
Stefan Hajnoczi
21794244d4 qemu-iotests: fix 203 migration completion race
There is a race between the test's 'query-migrate' QMP command after the
QMP 'STOP' event and completing the migration:

The test case invokes 'query-migrate' upon receiving 'STOP'.  At this
point the migration thread may still be in the process of completing.
Therefore 'query-migrate' can return 'status': 'active' for a brief
window of time instead of 'status': 'completed'.  This results in
qemu-iotests 203 hanging.

Solve the race by enabling the 'events' migration capability, which
causes QEMU to emit migration-specific QMP events that do not suffer
from this race condition.  Wait for the QMP 'MIGRATION' event with
'status': 'completed'.

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180305155926.25858-1-stefanha@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:40:07 +01:00
Alberto Garcia
39eaefcedb iotests: Tweak 030 in order to trigger a race condition with parallel jobs
This patch tweaks TestParallelOps in iotest 030 so it allocates data
in smaller regions (256KB/512KB instead of 512KB/1MB) and the
block-stream job in test_stream_commit() only needs to copy data that
is at the very end of the image.

This way when the block-stream job is awakened it will finish right
away without any chance of being stopped by block_job_sleep_ns(). This
triggers the bug that was fixed by 3d5d319e12 and
1a63a90750 and is therefore a more useful test
case for parallel block jobs.

After this patch the aforementiond bug can also be reproduced with the
test_stream_parallel() test case.

Since with this change the stream job in test_stream_commit() finishes
early, this patch introduces a similar test case where both jobs are
slowed down so they can actually run in parallel.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Cc: John Snow <jsnow@redhat.com>
Message-id: 20180306130121.30243-1-berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:40:07 +01:00
Fam Zheng
0bfed484a5 iotests: Skip test for ENOMEM error
The AFL image is to exercise the code validating image size, which
doesn't work on 32 bit or when out of memory (there is a large
allocation before the interesting point). So check that and skip the
test, instead of faking the result.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20180301011413.11531-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:40:07 +01:00
Eric Blake
990dc39cfa iotests: Mark all tests executable
The majority of our iotests have the executable bit set; fix the
few outliers for consistency.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20180305161824.7188-1-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:40:07 +01:00
Fam Zheng
fcca5dce20 iotests: Test creating overlay when guest running
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20171225025107.23985-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:40:07 +01:00
Stef O'Rear
cffad426f5 softfloat: fix crash on int conversion of SNaN
Signed-off-by: Stef O'Rear <sorear2@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 14:30:12 +00:00
Kevin Wolf
56ea7450aa qemu-iotests: Test ssh image creation over QMP
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:48 +01:00
Kevin Wolf
39218a771a qemu-iotests: Test qcow2 over file image creation with QMP
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:48 +01:00
Kevin Wolf
cd8b7aaa07 block: Fail bdrv_truncate() with negative size
Most callers have their own checks, but something like this should also
be checked centrally. As it happens, x-blockdev-create can pass negative
image sizes to format drivers (because there is no QAPI type that would
reject negative numbers) and triggers the check added by this patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:48 +01:00
Kevin Wolf
89b259eeaa file-posix: Fix no-op bdrv_truncate() with falloc preallocation
If bdrv_truncate() is called, but the requested size is the same as
before, don't call posix_fallocate(), which returns -EINVAL for length
zero and would therefore make bdrv_truncate() fail.

The problem can be triggered by creating a zero-sized raw image with
'falloc' preallocation mode.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:48 +01:00
Kevin Wolf
4906da7e4d ssh: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to ssh, which enables
image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:48 +01:00
Kevin Wolf
375f0b9266 ssh: Pass BlockdevOptionsSsh to connect_to_ssh()
Move the parsing of the QDict options up to the callers, in preparation
for the .bdrv_co_create implementation that directly gets a QAPI type.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
ec2f54182c ssh: QAPIfy host-key-check option
This makes the host-key-check option available in blockdev-add.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
16e4bdb1bf ssh: Use QAPI BlockdevOptionsSsh object
Create a BlockdevOptionsSsh object in connect_to_ssh() and take the
options from there. 'host_key_check' is still processed separately
because it's not in the schema yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
63fd65a0a5 sheepdog: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to sheepdog, which enables
image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
a595e4bcca sheepdog: QAPIfy "redundancy" create option
The "redundancy" option for Sheepdog image creation is currently a
string that can encode one or two integers depending on its format,
which at the same time implicitly selects a mode.

This patch turns it into a QAPI union and converts the string into such
a QAPI object before interpreting the values.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
a1a42af422 nfs: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to nfs, which enables
image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
c22a034545 nfs: Use QAPI options in nfs_client_open()
Using the QAPI visitor to turn all options into QAPI BlockdevOptionsNfs
simplifies the code a lot. It will also be useful for implementing the
QAPI based .bdrv_co_create callback.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
aa045c2d37 rbd: Use qemu_rbd_connect() in qemu_rbd_do_create()
This is almost exactly the same code. The differences are that
qemu_rbd_connect() supports BlockdevOptionsRbd.server and that the cache
mode is set explicitly.

Supporting 'server' is a welcome new feature for image creation.
Caching is disabled by default, so leave it that way.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
d41a55885d rbd: Assign s->snap/image_name in qemu_rbd_open()
Now that the options are already available in qemu_rbd_open() and not
only parsed in qemu_rbd_connect(), we can assign s->snap and
s->image_name there instead of passing the fields by reference to
qemu_rbd_connect().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
1bebea37b4 rbd: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to rbd, which enables
image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
4bfb274165 rbd: Pass BlockdevOptionsRbd to qemu_rbd_connect()
With the conversion to a QAPI options object, the function is now
prepared to be used in a .bdrv_co_create implementation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
4ff4504974 rbd: Remove non-schema options from runtime_opts
Instead of the QemuOpts in qemu_rbd_connect(), we want to use QAPI
objects. As a preparation, fetch those options directly from the QDict
that .bdrv_open() supports in the rbd driver and that are not in the
schema.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
3d9136f972 rbd: Factor out qemu_rbd_connect()
The code to establish an RBD connection is duplicated between open and
create. In order to be able to share the code, factor out the code from
qemu_rbd_open() as a first step.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
71c87815f9 rbd: Fix use after free in qemu_rbd_set_keypairs() error path
If we want to include the invalid option name in the error message, we
can't free the string earlier than that.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
ab8bda76a0 gluster: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to gluster, which enables
image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
3766ef579c file-win32: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to file-win32, which
enables image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
927f11e131 file-posix: Support .bdrv_co_create
This adds the .bdrv_co_create driver callback to file, which enables
image creation over QMP.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
b0292b851b block: x-blockdev-create QMP command
This adds a synchronous x-blockdev-create QMP command that can create
qcow2 images on a given node name.

We don't want to block while creating an image, so this is not the final
interface in all aspects, but BlockdevCreateOptionsQcow2 and
.bdrv_co_create() are what they actually might look like in the end. In
any case, this should be good enough to test whether we interpret
BlockdevCreateOptions as we should.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
e8eb863778 block: Make bdrv_is_whitelisted() public
We'll use a separate source file for image creation, and we need to
check there whether the requested driver is whitelisted.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
b76b4f6045 qcow2: Use visitor for options in qcow2_create()
Instead of manually creating the BlockdevCreateOptions object, use a
visitor to parse the given options into the QAPI object.

This involves translation from the old command line syntax to the syntax
mandated by the QAPI schema. Option names are still checked against
qcow2_create_opts, so only the old option names are allowed on the
command line, even if they are translated in qcow2_create().

In contrast, new option values are optionally recognised besides the old
values: 'compat' accepts 'v2'/'v3' as an alias for '0.10'/'1.1', and
'encrypt.format' accepts 'qcow' as an alias for 'aes' now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
bcebf102cc qdict: Introduce qdict_rename_keys()
A few block drivers will need to rename .bdrv_create options for their
QAPIfication, so let's have a helper function for that.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
37974a97d4 test-qemu-opts: Test qemu_opts_to_qdict_filtered()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
575ef8bff6 test-qemu-opts: Test qemu_opts_append()
Basic test for merging two QemuOptsLists.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
72215395b9 util: Add qemu_opts_to_qdict_filtered()
This allows, given a QemuOpts for a QemuOptsList that was merged from
multiple QemuOptsList, to only consider those options that exist in one
specific list. Block drivers need this to separate format-layer create
options from protocol-level options.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
e4b5dad826 qcow2: Handle full/falloc preallocation in qcow2_co_create()
Once qcow2_co_create() can be called directly on an already existing
node, we must provide the 'full' and 'falloc' preallocation modes
outside of creating the image on the protocol layer. Fortunately, we
have preallocated truncate now which can provide this functionality.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
60900b7bd9 qcow2: Use QCryptoBlockCreateOptions in qcow2_co_create()
Instead of passing the encryption format name and the QemuOpts down, use
the QCryptoBlockCreateOptions contained in BlockdevCreateOptions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
e1d74bc6c6 qcow2: Use BlockdevRef in qcow2_co_create()
Instead of passing a separate BlockDriverState* into qcow2_co_create(),
make use of the BlockdevRef that is included in BlockdevCreateOptions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
29ca9e450d qcow2: Pass BlockdevCreateOptions to qcow2_co_create()
All of the simple options are now passed to qcow2_co_create() in a
BlockdevCreateOptions object. Still missing: node-name and the
encryption options.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
cbf2b7c404 qcow2: Let qcow2_create() handle protocol layer
Currently, qcow2_create() only parses the QemuOpts and then calls
qcow2_co_create() for the actual image creation, which includes both the
creation of the actual file on the file system and writing a valid empty
qcow2 image into that file.

The plan is that qcow2_co_create() becomes the function that implements
the functionality for a future 'blockdev-create' QMP command, which only
creates the qcow2 layer on an already opened file node.

This is a first step towards that goal: Let's move out anything that
deals with the protocol layer from qcow2_co_create() into
qcow2_create().  This means that qcow2_co_create() doesn't need a file
name any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
8ca5bfdc46 qcow2: Rename qcow2_co_create2() to qcow2_co_create()
The functions originally known as qcow2_create() and qcow2_create2()
are now called qcow2_co_create_opts() and qcow2_co_create(), which
matches the names of the BlockDriver callbacks that they will implement
at the end of this patch series.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
c2808abaf7 block/qapi: Add qcow2 create options to schema
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Kevin Wolf
5361468974 block/qapi: Introduce BlockdevCreateOptions
This creates a BlockdevCreateOptions union type that will contain all of
the options for image creation. We'll start out with an empty struct
type BlockdevCreateNotSupported for all drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2018-03-09 15:17:47 +01:00
Alberto Garcia
0c2ada8136 qcow2: Make qemu-img check detect corrupted L1 tables in snapshots
'qemu-img check' cannot detect if a snapshot's L1 table is corrupted.
This patch checks the table's offset and size and reports corruption
if the values are not valid.

This patch doesn't add code to fix that corruption yet, only to detect
and report it.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Alberto Garcia
db5794f1f1 qcow2: Check snapshot L1 table in qcow2_snapshot_delete()
This function deletes a snapshot from disk, removing its entry from
the snapshot table, freeing its L1 table and decreasing the refcounts
of all clusters.

The L1 table offset and size are however not validated. If we use
invalid values in this function we'll probably corrupt the image even
more, so we should return an error instead.

We now have a function to take care of this, so let's use it.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Alberto Garcia
a8475d7573 qcow2: Check snapshot L1 table in qcow2_snapshot_goto()
This function copies a snapshot's L1 table into the active one without
validating it first.

We now have a function to take care of this, so let's use it.

Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Alberto Garcia
c7a9d81d70 qcow2: Check snapshot L1 tables in qcow2_check_metadata_overlap()
The inactive-l2 overlap check iterates uses the L1 tables from all
snapshots, but it does not validate them first.

We now have a function to take care of this, so let's use it.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Alberto Garcia
c9a442e450 qcow2: Check L1 table parameters in qcow2_expand_zero_clusters()
This function iterates over all snapshots of a qcow2 file in order to
expand all zero clusters, but it does not validate the snapshots' L1
tables first.

We now have a function to take care of this, so let's use it.

We can also take the opportunity to replace the sector-based
bdrv_read() with bdrv_pread().

Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Alberto Garcia
314e8d3928 qcow2: Check L1 table offset in qcow2_snapshot_load_tmp()
This function checks that the size of a snapshot's L1 table is not too
large, but it doesn't validate the offset.

We now have a function to take care of this, so let's use it.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Alberto Garcia
0cf0e5980b qcow2: Generalize validate_table_offset() into qcow2_validate_table()
This function checks that the offset and size of a table are valid.

While the offset checks are fine, the size check is too generic, since
it only verifies that the total size in bytes fits in a 64-bit
integer. In practice all tables used in qcow2 have much smaller size
limits, so the size needs to be checked again for each table using its
actual limit.

This patch generalizes this function by allowing the caller to specify
the maximum size for that table. In addition to that it allows passing
an Error variable.

The function is also renamed and made public since we're going to use
it in other parts of the code.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Paolo Bonzini
2fd6163884 block: convert bdrv_check callback to coroutine_fn
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1516279431-30424-8-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Paolo Bonzini
2b148f392b block: convert bdrv_invalidate_cache callback to coroutine_fn
QED's bdrv_invalidate_cache implementation would like to reuse functions
that acquire/release the metadata locks.  Call it from coroutine context
to simplify the logic.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1516279431-30424-6-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Paolo Bonzini
9fb4dfc570 qed: make bdrv_qed_do_open a coroutine_fn
It is called from qed_invalidate_cache in coroutine context (incoming
migration runs in a coroutine), so it's cleaner if metadata is always
loaded from a coroutine.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1516279431-30424-5-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Paolo Bonzini
1fafcd9368 qcow2: make qcow2_do_open a coroutine_fn
It is called from qcow2_invalidate_cache in coroutine context (incoming
migration runs in a coroutine), so it's cleaner if metadata is always
loaded from a coroutine.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1516279431-30424-4-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Paolo Bonzini
1a0c2bfb03 qcow2: fix flushing after dirty bitmap metadata writes
update_header_sync itself does not need to flush the caches to disk.
The only paths that allocate clusters are:

- bitmap_list_store with in_place=false, called by update_ext_header_and_dir

- store_bitmap_data, called by store_bitmap

- store_bitmap, called by qcow2_store_persistent_dirty_bitmaps and
  followed by update_ext_header_and_dir

So in the end the central place where we need to flush the caches
is update_ext_header_and_dir.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Paolo Bonzini
8b220eb7c8 qcow2: introduce qcow2_write_caches and qcow2_flush_caches
They will be used to avoid recursively taking s->lock during
bdrv_open or bdrv_check.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1516279431-30424-7-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Daniel P. Berrange
f87e08f9e2 block: implement the bdrv_reopen_prepare helper for LUKS driver
If the bdrv_reopen_prepare helper isn't provided, the qemu-img commit
command fails to re-open the base layer after committing changes into
it. Provide a no-op implementation for the LUKS driver, since there
is not any custom work that needs doing to re-open it.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-09 15:17:47 +01:00
Laurent Vivier
9a069775a8 target/m68k: define floatx80_move()
This functions is needed by upcoming m68k softfloat functions.

Source code copied for WinUAE (tag 3500)
(The WinUAE file has been copied from QEMU and has
the QEMU licensing notice)

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-2-laurent@vivier.eu>
2018-03-09 15:09:57 +01:00
Peter Maydell
d9bbfea646 Merge remote-tracking branch 'remotes/riscv/tags/riscv-qemu-upstream-v8.2' into staging
QEMU RISC-V Emulation Support (RV64GC, RV32GC)

This release renames the SiFive machines to sifive_e and sifive_u
to represent the SiFive Everywhere and SiFive Unleashed platforms.
SiFive has configurable soft-core IP, so it is intended that these
machines will be extended to enable a variety of SiFive IP blocks.
The CPU definition infrastructure has been improved and there are
now vendor CPU modules including the SiFiVe E31, E51, U34 and U54
cores. The emulation accuracy for the E series has been improved
by disabling the MMU for the E series. S mode has been disabled on
cores that only support M mode and U mode. The two Spike machines
that support two privileged ISA versions have been coalesced into
one file. This series has Signed-off-by from the core contributors.

*** Known Issues ***

* Disassembler has some checkpatch warnings for the sake of code brevity
* scripts/qemu-binfmt-conf.sh has checkpatch warnings due to line length
* PMP (Physical Memory Protection) is as-of-yet unused and needs testing

*** Changelog ***

v8.2

* Rebase

v8.1

* Fix missed case of renaming spike_v1.9 to spike_v1.9.1

v8

* Added linux-user/riscv/target_elf.h during rebase
* Make resetvec configurable and clear mpp and mie on reset
* Use SiFive E31, E51, U34 and U54 cores in SiFive machines
* Define SiFive E31, E51, U34 and U54 cores
* Refactor CPU core definition in preparation for vendor cores
* Prevent S or U mode unless S or U extensions are present
* SiFive E Series cores have no MMU
* SiFive E Series cores have U mode
* Make privileged ISA v1.10 implicit in CPU types
* Remove DRAM_BASE and EXT_IO_BASE as they vary by machine
* Correctly handle mtvec and stvec alignment with respect to RVC
* Print more machine mode state in riscv_cpu_dump_state
* Make riscv_isa_string use compact extension order method
* Fix bug introduced in v6 RISCV_CPU_TYPE_NAME macro change
* Parameterize spike v1.9.1 config string
* Coalesce spike_v1.9.1 and spike_v1.10 machines
* Rename sifive_e300 to sifive_e, and sifive_u500 to sifive_u

v7

* Make spike_v1.10 the default machine
* Rename spike_v1.9 to spike_v1.9.1 to match privileged spec version
* Remove empty target/riscv/trace-events file
* Monitor ROM 32-bit reset code needs to be target endian
* Add TARGET_TIOCGPTPEER to linux-user/riscv/termbits.h
* Add -initrd support to the virt board
* Fix naming in spike machine interface header
* Update copyright notice on RISC-V Spike machines
* Update copyright notice on RISC-V HTIF Console device
* Change CPU Core and translator to GPLv2+
* Change RISC-V Disassembler to GPLv2+
* Change SiFive Test Finisher to GPLv2+
* Change SiFive CLINT to GPLv2+
* Change SiFive PRCI to GPLv2+
* Change SiFive PLIC to GPLv2+
* Change RISC-V spike machines to GPLv2+
* Change RISC-V virt machine to GPLv2+
* Change SiFive E300 machine to GPLv2+
* Change SiFive U500 machine to GPLv2+
* Change RISC-V Hart Array to GPLv2+
* Change RISC-V HTIF device to GPLv2+
* Change SiFiveUART device to GPLv2+

v6

* Drop IEEE 754-201x minimumNumber/maximumNumber for fmin/fmax
* Remove some unnecessary commented debug statements
* Change RISCV_CPU_TYPE_NAME to use riscv-cpu suffix
* Define all CPU variants for linux-user
* qemu_log calls require trailing \n
* Replace PLIC printfs with qemu_log
* Tear out unused HTIF code and eliminate shouting debug messages
* Fix illegal instruction when sfence.vma is passed (rs2) arguments
* Make updates to PTE accessed and dirty bits atomic
* Only require atomic PTE updates on MTTCG enabled guests
* Page fault if accessed or dirty bits can't be updated
* Fix get_physical_address PTE reads and writes on riscv32
* Remove erroneous comments from the PLIC
* Default enable MTTCG
* Make WFI less conservative
* Unify local interrupt handling
* Expunge HTIF interrupts
* Always access mstatus.mip under a lock
* Don't implement rdtime/rdtimeh in system mode (bbl emulates them)
* Implement insreth/cycleh for rv32 and always enable user-mode counters
* Add GDB stub support for reading and writing CSRs
* Rename ENABLE_CHARDEV #ifdef from HTIF code
* Replace bad HTIF ELF code with load_elf symbol callback
* Convert chained if else fault handlers to switch statements
* Use RISCV exception codes for linux-user page faults

v5

* Implement NaN-boxing for flw, set high order bits to 1
* Use float_muladd_negate_* flags to floatXX_muladd
* Use IEEE 754-201x minimumNumber/maximumNumber for fmin/fmax
* Fix TARGET_NR_syscalls
* Update linux-user/riscv/syscall_nr.h
* Fix FENCE.I, needs to terminate translation block
* Adjust unusual convention for interruptno >= 0

v4

* Add @riscv: since 2.12 to CpuInfoArch
* Remove misleading little-endian comment from load_kernel
* Rename cpu-model property to cpu-type
* Drop some unnecessary inline function attributes
* Don't allow GDB to set value of x0 register
* Remove unnecessary empty property lists
* Add Test Finisher device to implement poweroff in virt machine
* Implement priv ISA v1.10 trap and sret/mret xPIE/xIE behavior
* Store fflags data in fp_status
* Purge runtime users of helper_raise_exception
* Fix validate_csr
* Tidy gen_jalr
* Tidy immediate shifts
* Add gen_exception_inst_addr_mis
* Add gen_exception_debug
* Add gen_exception_illegal
* Tidy helper_fclass_*
* Split rounding mode setting to a new function
* Enforce MSTATUS_FS via TB flags
* Implement acquire/release barrier semantics
* Use atomic operations as required
* Fix FENCE and FENCE_I
* Remove commented code from spike machines
* PAGE_WRITE permissions can be set on loads if page is already dirty
* The result of format conversion on an NaN must be a quiet NaN
* Add missing process_queued_cpu_work to riscv linux-user
* Remove float(32|64)_classify from cpu.h
* Removed nonsensical unions aliasing the same type
* Use uintN_t instead of uintN_fast_t in fpu_helper.c
* Use macros for FPU exception values in softfloat_flags_to_riscv
* Move code to set round mode into set_fp_round_mode function
* Convert set_fp_exceptions from a macro to an inline function
* Convert round mode helper into an inline function
* Make fpu_helper ieee_rm array static const
* Include cpu_mmu_index in cpu_get_tb_cpu_state flags
* Eliminate MPRV influence on mmu_index
* Remove unrecoverable do_unassigned_access function
* Only update PTE accessed and dirty bits if necessary
* Remove unnecessary tlb_flush in set_mode as mode is in mmu_idx
* Remove buggy support for misa writes. misa writes are optional
  and are not implemented in any known hardware
* Always set PTE read or execute permissions during page walk
* Reorder helper function declarations to match order in helper.c
* Remove redundant variable declaration in get_physical_address
* Remove duplicated code from get_physical_address
* Use mmu_idx instead of mem_idx in riscv_cpu_get_phys_page_debug

v3

* Fix indentation in PMP and HTIF debug macros
* Fix disassembler checkpatch open brace '{' on next line errors
* Fix trailing statements on next line in decode_inst_decompress
* NOTE: the other checkpatch issues have been reviewed previously

v2

* Remove redundant NULL terminators from disassembler register arrays
* Change disassembler register name arrays to const
* Refine disassembler internal function names
* Update dates in disassembler copyright message
* Remove #ifdef CONFIG_USER_ONLY version of cpu_has_work
* Use ULL suffix on 64-bit constants
* Move riscv_cpu_mmu_index from cpu.h to helper.c
* Move riscv_cpu_hw_interrupts_pending from cpu.h to helper.c
* Remove redundant TARGET_HAS_ICE from cpu.h
* Use qemu_irq instead of void* for irq definition in cpu.h
* Remove duplicate typedef from struct CPURISCVState
* Remove redundant g_strdup from cpu_register
* Remove redundant tlb_flush from riscv_cpu_reset
* Remove redundant mode calculation from get_physical_address
* Remove redundant debug mode printf and dcsr comment
* Remove redundant clearing of MSB for bare physical addresses
* Use g_assert_not_reached for invalid mode in get_physical_address
* Use g_assert_not_reached for unreachable checks in get_physical_address
* Use g_assert_not_reached for unreachable type in raise_mmu_exception
* Return exception instead of aborting for misaligned fetches
* Move exception defines from cpu.h to cpu_bits.h
* Remove redundant breakpoint control definitions from cpu_bits.h
* Implement riscv_cpu_unassigned_access exception handling
* Log and raise exceptions for unimplemented CSRs
* Match Spike HTIF exit behavior - don’t print TEST-PASSED
* Make frm,fflags,fcsr writes trap when mstatus.FS is clear
* Use g_assert_not_reached for unreachable invalid mode
* Make hret,uret,dret generate illegal instructions
* Move riscv_cpu_dump_state and int/fpr regnames to cpu.c
* Lift interrupt flag and mask into constants in cpu_bits.h
* Change trap debugging to use qemu_log_mask LOG_TRACE
* Change CSR debugging to use qemu_log_mask LOG_TRACE
* Change PMP debugging to use qemu_log_mask LOG_TRACE
* Remove commented code from pmp.c
* Change CpuInfoRISCV qapi schema docs to Since 2.12
* Change RV feature macro to use target_ulong cast
* Remove riscv_feature and instead use misa extension flags
* Make riscv_flush_icache_syscall a no-op
* Undo checkpatch whitespace fixes in unrelated linux-user code
* Remove redudant constants and tidy up cpu_bits.h
* Make helper_fence_i a no-op
* Move include "exec/cpu-all" to end of cpu.h
* Rename set_privilege to riscv_set_mode
* Move redundant forward declaration for cpu_riscv_translate_address
* Remove TCGV_UNUSED from riscv_translate_init
* Add comment to pmp.c stating the code is untested and currently unused
* Use ctz to simplify decoding of PMP NAPOT address ranges
* Change pmp_is_in_range to use than equal for end addresses
* Fix off by one error in pmp_update_rule
* Rearrange PMP_DEBUG so that formatting is compile-time checked
* Rearrange trap debugging so that formatting is compile-time checked
* Rearrange PLIC debugging so that formatting is compile-time checked
* Use qemu_log/qemu_log_mask for HTIF logging and debugging
* Move exception and interrupt names into cpu.c
* Add Palmer Dabbelt as a RISC-V Maintainer
* Rebase against current qemu master branch

v1

* initial version based on forward port from riscv-qemu repository

*** Background ***

"RISC-V is an open, free ISA enabling a new era of processor innovation
through open standard collaboration. Born in academia and research,
RISC-V ISA delivers a new level of free, extensible software and
hardware freedom on architecture, paving the way for the next 50 years
of computing design and innovation."

The QEMU RISC-V port has been developed and maintained out-of-tree for
several years by Sagar Karandikar and Bastian Koppelmann. The RISC-V
Privileged specification has evolved substantially over this period but
has recently been solidifying. The RISC-V Base ISA has been frozon for
some time and the Privileged ISA, GCC toolchain and Linux ABI are now
quite stable. I have recently joined Sagar and Bastian as a RISC-V QEMU
Maintainer and hope to support upstreaming the port.

There are multiple vendors taping out, preparing to ship, or shipping
silicon that implements the RISC-V Privileged ISA Version 1.10. There
are also several RISC-V Soft-IP cores implementing Privileged ISA
Version 1.10 that run on FPGA such as SiFive's Freedom U500 Platform
and the U54‑MC RISC-V Core IP, among many more implementations from a
variety of vendors. See https://riscv.org/ for more details.

RISC-V support was upstreamed in binutils 2.28 and GCC 7.1 in the first
half of 2016. RISC-V support is now available in LLVM top-of-tree and
the RISC-V Linux port was accepted into Linux 4.15-rc1 late last year
and is available in the Linux 4.15 release. GLIBC 2.27 added support
for the RISC-V ISA running on Linux (requires at least binutils-2.30,
gcc-7.3.0, and linux-4.15). We believe it is timely to submit the
RISC-V QEMU port for upstream review with the goal of incorporating
RISC-V support into the upcoming QEMU 2.12 release.

The RISC-V QEMU port is still under active development, mostly with
respect to device emulation, the addition of Hypervisor support as
specified in the RISC-V Draft Privileged ISA Version 1.11, and Vector
support once the first draft is finalized later this year. We believe
now is the appropriate time for RISC-V QEMU development to be carried
out in the main QEMU repository as the code will benefit from more
rigorous review. The RISC-V QEMU port currently supports all the ISA
extensions that have been finalized and frozen in the Base ISA.

Blog post about recent additions to RISC-V QEMU: https://goo.gl/fJ4zgk

The RISC-V QEMU wiki: https://github.com/riscv/riscv-qemu/wiki

Instructions for building a busybox+dropbear root image, BBL (Berkeley
Boot Loader) and linux kernel image for use with the RISC-V QEMU
'virt' machine: https://github.com/michaeljclark/busybear-linux

*** Overview ***

The RISC-V QEMU port implements the following specifications:

* RISC-V Instruction Set Manual Volume I: User-Level ISA Version 2.2
* RISC-V Instruction Set Manual Volume II: Privileged ISA Version 1.9.1
* RISC-V Instruction Set Manual Volume II: Privileged ISA Version 1.10

The RISC-V QEMU port supports the following instruction set extensions:

* RV32GC with Supervisor-mode and User-mode (RV32IMAFDCSU)
* RV64GC with Supervisor-mode and User-mode (RV64IMAFDCSU)

The RISC-V QEMU port adds the following targets to QEMU:

* riscv32-softmmu
* riscv64-softmmu
* riscv32-linux-user
* riscv64-linux-user

The RISC-V QEMU port supports the following hardware:

* HTIF Console (Host Target Interface)
* SiFive CLINT (Core Local Interruptor) for Timer interrupts and IPIs
* SiFive PLIC (Platform Level Interrupt Controller)
* SiFive Test (Test Finisher) for exiting simulation
* SiFive UART, PRCI, AON, PWM, QSPI support is partially implemented
* VirtIO MMIO (GPEX PCI support will be added in a future patch)
* Generic 16550A UART emulation using 'hw/char/serial.c'
* MTTCG and SMP support (PLIC and CLINT) on the 'virt' machine

The RISC-V QEMU full system emulator supports 5 machines:

* 'spike_v1.9.1', CLINT, PLIC, HTIF console, config-string, Priv v1.9.1
* 'spike_v1.10', CLINT, PLIC, HTIF console, device-tree, Priv v1.10
* 'sifive_e', CLINT, PLIC, SiFive UART, HiFive1 compat, Priv v1.10
* 'sifive_u', CLINT, PLIC, SiFive UART, device-tree, Priv v1.10
* 'virt', CLINT, PLIC, 16550A UART, VirtIO, device-tree, Priv v1.10

This is a list of RISC-V QEMU Port Contributors:

* Alex Suykov
* Andreas Schwab
* Antony Pavlov
* Bastian Koppelmann
* Bruce Hoult
* Chih-Min Chao
* Daire McNamara
* Darius Rad
* David Abdurachmanov
* Hesham Almatary
* Ivan Griffin
* Jim Wilson
* Kito Cheng
* Michael Clark
* Palmer Dabbelt
* Richard Henderson
* Sagar Karandikar
* Shea Levy
* Stefan O'Rear

Notes:

* contributor email addresses available off-list on request.
* checkpatch has been run on all 23 patches.
* checkpatch exceptions are noted in patches that have errors.
* passes "make check" on full build for all targets
* tested riscv-linux-4.6.2 on 'spike_v1.9.1' machine
* tested riscv-linux-4.15 on 'spike_v1.10' and 'virt' machines
* tested SiFive HiFive1 binaries in 'sifive_e' machine
* tested RV64 on 32-bit i386

This patch series includes the following patches:

# gpg: Signature made Thu 08 Mar 2018 19:40:20 GMT
# gpg:                using DSA key 6BF1D7B357EF3E4F
# gpg: Good signature from "Michael Clark <michaeljclark@mac.com>"
# gpg:                 aka "Michael Clark <mjc@sifive.com>"
# gpg:                 aka "Michael Clark <michael@metaparadigm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7C99 930E B17C D8BA 073D  5EFA 6BF1 D7B3 57EF 3E4F

* remotes/riscv/tags/riscv-qemu-upstream-v8.2: (23 commits)
  RISC-V Build Infrastructure
  SiFive Freedom U Series RISC-V Machine
  SiFive Freedom E Series RISC-V Machine
  SiFive RISC-V PRCI Block
  SiFive RISC-V UART Device
  RISC-V VirtIO Machine
  SiFive RISC-V Test Finisher
  RISC-V Spike Machines
  SiFive RISC-V PLIC Block
  SiFive RISC-V CLINT Block
  RISC-V HART Array
  RISC-V HTIF Console
  Add symbol table callback interface to load_elf
  RISC-V Linux User Emulation
  RISC-V Physical Memory Protection
  RISC-V TCG Code Generation
  RISC-V GDB Stub
  RISC-V FPU Support
  RISC-V CPU Helpers
  RISC-V Disassembler
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-09 10:58:57 +00:00
Peter Maydell
9fa673c3e3 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20180308' into staging
Fixes and cleanups for the 2.12 softfreeze.

# gpg: Signature made Thu 08 Mar 2018 17:53:14 GMT
# gpg:                using RSA key DECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg:                 aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg:                 aka "Cornelia Huck <cohuck@kernel.org>"
# gpg:                 aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20180308:
  s390x/virtio: Convert virtio-ccw from *_exit to *_unrealize
  pc-bios/s390-ccw: Move string arrays from bootmap header to .c file
  s390x/sclp: clean up sclp masks
  s390x/sclp: proper support of larger send and receive masks
  vfio-ccw: license text should indicate GPL v2 or later
  s390x/sclpconsole: Remove dead code - remove exit handlers
  numa: we don't implement NUMA for s390x
  hw/s390x: Add the possibility to specify the netboot image on the command line
  target/s390x: Remove leading underscores from #defines
  s390/ipl: only print boot menu error if -boot menu=on was specified
  hw/s390x/ipl: Bail out if the network bootloader can not be found

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-08 18:30:16 +00:00
Stefan Hajnoczi
4486e89c21 vl: introduce vm_shutdown()
Commit 00d09fdbba ("vl: pause vcpus before
stopping iothreads") and commit dce8921b2b
("iothread: Stop threads before main() quits") tried to work around the
fact that emulation was still active during termination by stopping
iothreads.  They suffer from race conditions:
1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the
   virtio_scsi_ctx_check() assertion failure because the BDS AioContext
   has been modified by iothread_stop_all().
2. Guest vq kick racing with main loop termination leaves a readable
   ioeventfd that is handled by the next aio_poll() when external
   clients are enabled again, resulting in unwanted emulation activity.

This patch obsoletes those commits by fully disabling emulation activity
when vcpus are stopped.

Use the new vm_shutdown() function instead of pause_all_vcpus() so that
vm change state handlers are invoked too.  Virtio devices will now stop
their ioeventfds, preventing further emulation activity after vm_stop().

Note that vm_stop(RUN_STATE_SHUTDOWN) cannot be used because it emits a
QMP STOP event that may affect existing clients.

It is no longer necessary to call replay_disable_events() directly since
vm_shutdown() does so already.

Drop iothread_stop_all() since it is no longer used.

Cc: Fam Zheng <famz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-08 17:38:51 +00:00
Stefan Hajnoczi
184b962346 virtio-scsi: fix race between .ioeventfd_stop() and vq handler
If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread then the handler may lose the race for
the AioContext lock.  By the time the vq handler is able to acquire the
AioContext lock the ioeventfd has already been removed and the handler
isn't supposed to run anymore!

Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread.  This way no races with the vq handler are
possible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-08 17:38:51 +00:00
Stefan Hajnoczi
1010cadf62 virtio-blk: fix race between .ioeventfd_stop() and vq handler
If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread then the handler may lose the race for
the AioContext lock.  By the time the vq handler is able to acquire the
AioContext lock the ioeventfd has already been removed and the handler
isn't supposed to run anymore!

Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread.  This way no races with the vq handler are
possible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-08 17:38:51 +00:00
Stefan Hajnoczi
b89d92f3cf block: add aio_wait_bh_oneshot()
Sometimes it's necessary for the main loop thread to run a BH in an
IOThread and wait for its completion.  This primitive is useful during
startup/shutdown to synchronize and avoid race conditions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-08 17:38:51 +00:00
Peter Maydell
f6d81cdec8 Merge remote-tracking branch 'remotes/stsquad/tags/pull-shippable-disable-ppc-080318-1' into staging
One fix to disable broken ppc cross-compile test on shippable

# gpg: Signature made Thu 08 Mar 2018 15:00:23 GMT
# gpg:                using RSA key FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-shippable-disable-ppc-080318-1:
  .shippable.yml: disable powerpc-cross image

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-08 17:27:53 +00:00
Nia Alarie
24118af846 s390x/virtio: Convert virtio-ccw from *_exit to *_unrealize
Signed-off-by: Nia Alarie <nia.alarie@gmail.com>
Message-Id: <20180307162958.11232-1-nia.alarie@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 17:22:20 +01:00
Sergio Lopez
12c1c7d7ce virtio-blk: dataplane: Don't batch notifications if EVENT_IDX is present
Commit 5b2ffbe4d9 ("virtio-blk: dataplane:
notify guest as a batch") deferred guest notification to a BH in order
batch notifications, with purpose of avoiding flooding the guest with
interruptions.

This optimization came with a cost. The average latency perceived in the
guest is increased by a few microseconds, but also when multiple IO
operations finish at the same time, the guest won't be notified until
all completions from each operation has been run. On the contrary,
virtio-scsi issues the notification at the end of each completion.

On the other hand, nowadays we have the EVENT_IDX feature that allows a
better coordination between QEMU and the Guest OS to avoid sending
unnecessary interruptions.

With this change, virtio-blk/dataplane only batches notifications if the
EVENT_IDX feature is not present.

Some numbers obtained with fio (ioengine=sync, iodepth=1, direct=1):
 - Test specs:
   * fio-3.4 (ioengine=sync, iodepth=1, direct=1)
   * qemu master
   * virtio-blk with a dedicated iothread (default poll-max-ns)
   * backend: null_blk nr_devices=1 irqmode=2 completion_nsec=280000
   * 8 vCPUs pinned to isolated physical cores
   * Emulator and iothread also pinned to separate isolated cores
   * variance between runs < 1%

 - Not patched
   * numjobs=1:  lat_avg=327.32  irqs=29998
   * numjobs=4:  lat_avg=337.89  irqs=29073
   * numjobs=8:  lat_avg=342.98  irqs=28643

 - Patched:
   * numjobs=1:  lat_avg=323.92  irqs=30262
   * numjobs=4:  lat_avg=332.65  irqs=29520
   * numjobs=8:  lat_avg=335.54  irqs=29323

Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-id: 20180307114459.26636-1-slp@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-08 15:59:03 +00:00
Fam Zheng
7c9e274829 README: Fix typo 'git-publish'
Reported-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180306024328.19195-1-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-08 15:45:14 +00:00
Deepa Srinivasan
c060332c76 block: Fix qemu crash when using scsi-block
Starting qemu with the following arguments causes qemu to segfault:
... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name=
iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1

This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More
details about the bug follow.

blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().

When blk_aio_ioctl() is executed from within a coroutine context (e.g.
iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
the current coroutine's wakeup queue. blk_aio_ioctl() then returns.

When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
....
    BlkRwCo *rwco = &acb->rwco;

    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
                             rwco->qiov->iov[0].iov_base);  <--- qiov is
                                                                 invalid here
...

In the case when blk_aio_ioctl() is called from a non-coroutine context,
blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
execution is complete, control returns to blk_aio_ioctl_entry() after the call
to blk_co_ioctl(). There is no invalid reference after this point, but the
function is still holding on to invalid pointers.

The fix is to change blk_aio_prwv() to accept a void pointer for the IO buffer
rather than a QEMUIOVector. blk_aio_prwv() passes this through in BlkRwCo and the
coroutine function casts it to QEMUIOVector or uses the void pointer directly.

Signed-off-by: Deepa Srinivasan <deepa.srinivasan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-08 15:43:11 +00:00
Thomas Huth
6af978ae8b pc-bios/s390-ccw: Move string arrays from bootmap header to .c file
bootmap.h can currently only be included once - otherwise the linker
complains about multiple definitions of the "magic" strings. It's a
bad style to define string arrays in header files, so let's better
move these to the bootmap.c file instead where they are used.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1520317081-5341-1-git-send-email-thuth@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
Claudio Imbrenda
1ffed98f24 s390x/sclp: clean up sclp masks
Introduce an sccb_mask_t to be used for SCLP event masks instead of just
unsigned int or uint32_t. This will allow later to extend the mask with
more ease.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Message-Id: <1519407778-23095-3-git-send-email-imbrenda@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
Claudio Imbrenda
bc61c8c626 s390x/sclp: proper support of larger send and receive masks
Until 67915de9f0 ("s390x/event-facility: variable-length event masks")
we only supported sclp event masks with a size of exactly 4 bytes, even
though the architecture allows the guests to set up sclp event masks
from 1 to 1021 bytes in length.
After that patch, the behaviour was almost compliant, but some issues
were still remaining, in particular regarding the handling of selective
reads and migration.

When setting the sclp event mask, a mask size is also specified. Until
now we only considered the size in order to decide which bits to save
in the internal state. On the other hand, when a guest performs a
selective read, it sends a mask, but it does not specify a size; the
implied size is the size of the last mask that has been set.

Specifying bits in the mask of selective read that are not available in
the internal mask should return an error, and bits past the end of the
mask should obviously be ignored. This can only be achieved by keeping
track of the lenght of the mask.

The mask length is thus now part of the internal state that needs to be
migrated.

This patch fixes the handling of selective reads, whose size will now
match the length of the event mask, as per architecture.

While the default behaviour is to be compliant with the architecture,
when using older machine models the old broken behaviour is selected
(allowing only masks of size exactly 4), in order to be able to migrate
toward older versions.

Fixes: 67915de9f0 ("s390x/event-facility: variable-length event masks")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Message-Id: <1519407778-23095-2-git-send-email-imbrenda@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
Cornelia Huck
08b824aa2d vfio-ccw: license text should indicate GPL v2 or later
The license text currently specifies "any version" of the GPL. It
is unlikely that GPL v1 was ever intended; change this to the
standard "or any later version" text.

Cc: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Cc: Xiao Feng Ren <renxiaof@linux.vnet.ibm.com>
Cc: Pierre Morel <pmorel@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
Nia Alarie
2231384e56 s390x/sclpconsole: Remove dead code - remove exit handlers
The other event handlers (quiesce and cpu) do not define these
handlers, and this one does nothing, so it can be removed.

Signed-off-by: Nia Alarie <nia.alarie@gmail.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180306100721.19419-1-nia.alarie@gmail.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
David Hildenbrand
81ce6aa51a numa: we don't implement NUMA for s390x
Right now it is possible to crash QEMU for s390x by providing e.g.
    -numa node,nodeid=0,cpus=0-1

Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
indicator whether NUMA is supported by a machine type. We don't
implement NUMA for s390x ("topology") yet. However we need
mc->cpu_index_to_instance_props for query-cpus.

So let's fix this case by also checking for mc->get_default_cpu_node_id,
which will be needed by machine_set_cpu_numa_node().

qemu-system-s390x: -numa node,nodeid=0,cpus=0-1: NUMA is not supported by
                   this machine-type

While at it, make s390_cpu_index_to_props() look like on other
architectures.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180227110255.20999-1-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
Thomas Huth
3c4e9baacf hw/s390x: Add the possibility to specify the netboot image on the command line
The file name of the netboot binary is currently hard-coded to
"s390-netboot.img", without a possibility for the user to select
an alternative firmware image here. That's unfortunate, especially
since the basics are already there: The filename is a property of
the s390-ipl device. So we just have to add a check whether the user
already provided the property and only set the default if the string
is still empty. Now it is possible to select a different firmware
image with "-global s390-ipl.netboot_fw=/path/to/s390-netboot.img".

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1519731154-3127-1-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
Thomas Huth
adab99be66 target/s390x: Remove leading underscores from #defines
We should not use leading underscores followed by a capital letter
in #defines since such identifiers are reserved by the C standard.

For ASCE_ORIGIN, REGION_ENTRY_ORIGIN and SEGMENT_ENTRY_ORIGIN I also
added parentheses around the value to silence an error message from
checkpatch.pl.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1520227018-4061-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
Collin L. Walling
5600086976 s390/ipl: only print boot menu error if -boot menu=on was specified
It is possible that certain QEMU configurations may not
create an IPLB (such as when -kernel is provided). In
this case, a misleading error message will be printed
stating that the "boot menu is not supported for this
device type".

To amend this, only print this message iff boot menu=on
was provided on the commandline. Otherwise, return silently.

While we're at it, remove trailing periods from error
messages.

Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Message-Id: <1519760121-24594-1-git-send-email-walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
Thomas Huth
c575fa678b hw/s390x/ipl: Bail out if the network bootloader can not be found
If QEMU fails to load 's390-netboot.img', the guest firmware currently
loops forever and just floods the console with "Network boot device
detected" messages. The code in ipl.c apparently already tried to stop
the VM with vm_stop() in this case, but this is in vain since the run
state is later reset due to a call to vm_start() from vl.c again.
To avoid the ugly firmware loop, let's simply exit QEMU directly instead
since it just does not make sense to continue if the required firmware
image can not be loaded. While we're at it, also add the file name of
the netboot binary to the error message, so that the user has a better
hint about what is missing.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1519725913-24852-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-03-08 15:49:23 +01:00
Alex Bennée
b2dba411cf .shippable.yml: disable powerpc-cross image
Something has happened to the old emdebian setup which means it no
longer builds. Let's disable the shippable builds which are always
failing.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-03-08 14:34:41 +00:00
Peter Maydell
83d2e94cba Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging
qemu-sparc update

# gpg: Signature made Thu 08 Mar 2018 07:23:01 GMT
# gpg:                using RSA key 5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-sparc-signed:
  sparc: fix leon3 casa instruction when MMU is disabled
  hw/sparc/sun4m: Fix implicit creation of "-drive if=scsi" devices

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-08 13:42:26 +00:00
Peter Maydell
0ab4537f08 Merge remote-tracking branch 'remotes/stefanberger/tags/pull-tpm-2018-03-07-1' into staging
Merge tpm 2018/03/07

# gpg: Signature made Wed 07 Mar 2018 12:42:13 GMT
# gpg:                using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE  C66B 75AD 6580 2A0B 4211

* remotes/stefanberger/tags/pull-tpm-2018-03-07-1:
  tpm: convert tpm_tis.c to use trace-events
  tpm: convert tpm_emulator.c to use trace-events
  tpm: convert tpm_util.c to use trace-events
  tpm: convert tpm_passthrough.c to use trace-events
  tpm: convert tpm_crb.c to use trace-events

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-08 12:56:39 +00:00
Peter Maydell
3ef91576b9 Merge remote-tracking branch 'remotes/berrange/tags/qio-next-pull-request' into staging
# gpg: Signature made Wed 07 Mar 2018 11:24:41 GMT
# gpg:                using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/qio-next-pull-request:
  qio: non-default context for TLS handshake
  qio: non-default context for async conn
  qio: non-default context for threaded qtask
  qio: store gsources for net listeners
  qio: introduce qio_channel_add_watch_{full|source}
  qio: rename qio_task_thread_result

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-08 11:26:14 +00:00
Peter Maydell
854a4436dd Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Multiboot patches

# gpg: Signature made Wed 07 Mar 2018 11:15:17 GMT
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  multiboot: fprintf(stderr...) -> error_report()
  multiboot: Use header names when displaying fields
  multiboot: Remove unused variables from multiboot.c
  multiboot: bss_end_addr can be zero

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-03-08 10:02:46 +00:00
Jack Schwartz
4b9006a41e multiboot: fprintf(stderr...) -> error_report()
Change all fprintf(stderr...) calls in hw/i386/multiboot.c to call
error_report() instead, including the mb_debug macro.  Remove the "\n"
from strings passed to all modified calls, since error_report() appends
one.

Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-07 11:53:37 +01:00
Jack Schwartz
ce5eb6dc4d multiboot: Use header names when displaying fields
Refer to field names when displaying fields in printf and debug statements.

Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-07 11:53:37 +01:00
Jack Schwartz
7a2e43cc96 multiboot: Remove unused variables from multiboot.c
Remove unused variables: mh_mode_type, mh_width, mh_height, mh_depth

Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-07 11:53:35 +01:00
Jack Schwartz
2a8fcd119e multiboot: bss_end_addr can be zero
The multiboot spec (https://www.gnu.org/software/grub/manual/multiboot/),
section 3.1.3, allows for bss_end_addr to be zero.

A zero bss_end_addr signifies there is no .bss section.

Suggested-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2018-03-07 11:53:26 +01:00
Michael Clark
25fa194b7b RISC-V Build Infrastructure
This adds RISC-V into the build system enabling the following targets:

- riscv32-softmmu
- riscv64-softmmu
- riscv32-linux-user
- riscv64-linux-user

This adds defaults configs for RISC-V, enables the build for the RISC-V
CPU core, hardware, and Linux User Emulation. The 'qemu-binfmt-conf.sh'
script is updated to add the RISC-V ELF magic.

Expected checkpatch errors for consistency reasons:

ERROR: line over 90 characters
FILE: scripts/qemu-binfmt-conf.sh

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
a7240d1e4a SiFive Freedom U Series RISC-V Machine
This provides a RISC-V Board compatible with the the SiFive Freedom U SDK.
The following machine is implemented:

- 'sifive_u'; CLINT, PLIC, UART, device-tree

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
eb637edb12 SiFive Freedom E Series RISC-V Machine
This provides a RISC-V Board compatible with the the SiFive Freedom E SDK.
The following machine is implemented:

- 'sifive_e'; CLINT, PLIC, UART, AON, GPIO, QSPI, PWM

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
e6b8552c65 SiFive RISC-V PRCI Block
Simple model of the PRCI  (Power, Reset, Clock, Interrupt) to emulate
register reads made by the SDK BSP.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
bb72692cbd SiFive RISC-V UART Device
QEMU model of the UART on the SiFive E300 and U500 series SOCs.
BBL supports the SiFive UART for early console access via the SBI
(Supervisor Binary Interface) and the linux kernel SBI console.

The SiFive UART implements the pre qom legacy interface consistent
with the 16550a UART in 'hw/char/serial.c'.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stefan O'Rear <sorear2@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
04331d0b56 RISC-V VirtIO Machine
RISC-V machine with device-tree, 16550a UART and VirtIO MMIO.
The following machine is implemented:

- 'virt'; CLINT, PLIC, 16550A UART, VirtIO MMIO, device-tree

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
88a07990fa SiFive RISC-V Test Finisher
Test finisher memory mapped device used to exit simulation.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
5b4beba124 RISC-V Spike Machines
RISC-V machines compatble with Spike aka riscv-isa-sim, the RISC-V
Instruction Set Simulator. The following machines are implemented:

- 'spike_v1.9.1'; HTIF console, config-string, Privileged ISA Version 1.9.1
- 'spike_v1.10'; HTIF console, device-tree, Privileged ISA Version 1.10

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
1e24429e40 SiFive RISC-V PLIC Block
The PLIC (Platform Level Interrupt Controller) device provides a
parameterizable interrupt controller based on SiFive's PLIC specification.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stefan O'Rear <sorear2@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
1c77c410b6 SiFive RISC-V CLINT Block
The CLINT (Core Local Interruptor) device provides real-time clock, timer
and interprocessor interrupts based on SiFive's CLINT specification.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Stefan O'Rear <sorear2@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
4b50b8d9f2 RISC-V HART Array
Holds the state of a heterogenous array of RISC-V hardware threads.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
5033606780 RISC-V HTIF Console
HTIF (Host Target Interface) provides console emulation for QEMU. HTIF
allows identical copies of BBL (Berkeley Boot Loader) and linux to run
on both Spike and QEMU. BBL provides HTIF console access via the
SBI (Supervisor Binary Interface) and the linux kernel SBI console.

The HTIT chardev implements the pre qom legacy interface consistent
with the 16550a UART in 'hw/char/serial.c'.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Stefan O'Rear <sorear2@gmail.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
a2480ffa88 Add symbol table callback interface to load_elf
The RISC-V HTIF (Host Target Interface) console device requires access
to the symbol table to locate the 'tohost' and 'fromhost' symbols.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
47ae93cdfe RISC-V Linux User Emulation
Implementation of linux user emulation for RISC-V.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
65c5b75c38 RISC-V Physical Memory Protection
Implements the physical memory protection extension as specified in
Privileged ISA Version 1.10.

PMP (Physical Memory Protection) is as-of-yet unused and needs testing.
The SiFive verification team have PMP test cases that will be run.

Nothing currently depends on PMP support. It would be preferable to keep
the code in-tree for folk that are interested in RISC-V PMP support.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daire McNamara <daire.mcnamara@emdalo.com>
Signed-off-by: Ivan Griffin <ivan.griffin@emdalo.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
55c2a12cbc RISC-V TCG Code Generation
TCG code generation for the RV32IMAFDC and RV64IMAFDC. The QEMU
RISC-V code generator has complete coverage for the Base ISA v2.2,
Privileged ISA v1.9.1 and Privileged ISA v1.10:

- RISC-V Instruction Set Manual Volume I: User-Level ISA Version 2.2
- RISC-V Instruction Set Manual Volume II: Privileged ISA Version 1.9.1
- RISC-V Instruction Set Manual Volume II: Privileged ISA Version 1.10

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
9438fe7d7c RISC-V GDB Stub
GDB Register read and write routines.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
f798f1e29b RISC-V FPU Support
Helper routines for FPU instructions and NaN definitions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
0c3e702aca RISC-V CPU Helpers
Privileged control and status register helpers and page fault handling.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
ea10325917 RISC-V Disassembler
The RISC-V disassembler has no dependencies outside of the 'disas'
directory so it can be applied independently. The majority of the
disassembler is machine-generated from instruction set metadata:

- https://github.com/michaeljclark/riscv-meta

Expected checkpatch errors for consistency and brevity reasons:

ERROR: line over 90 characters
ERROR: trailing statements should be on next line
ERROR: space prohibited between function name and open parenthesis '('

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
dc5bd18fa5 RISC-V CPU Core Definition
Add CPU state header, CPU definitions and initialization routines

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
f71a8eaffb RISC-V ELF Machine Definition
Define RISC-V ELF machine EM_RISCV 243

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Michael Clark
4dc62b1532 RISC-V Maintainers
Add Michael Clark, Palmer Dabbelt, Sagar Karandikar and Bastian
Koppelmann as RISC-V Maintainers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
2018-03-07 08:30:28 +13:00
Stefan Berger
fcbed221ff tpm: convert tpm_tis.c to use trace-events
Leave the DEBUG_TIS for more debugging and convert to use if (DEBUG_TIS)
rather than #if DEBUG_TIS where it is being used.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-06 13:00:41 -05:00
Stefan Berger
9d9dcd9602 tpm: convert tpm_emulator.c to use trace-events
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-06 13:00:41 -05:00
Stefan Berger
cc7d320f5d tpm: convert tpm_util.c to use trace-events
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-06 13:00:41 -05:00
Stefan Berger
49d302fe3d tpm: convert tpm_passthrough.c to use trace-events
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-06 13:00:41 -05:00
Stefan Berger
ec427498da tpm: convert tpm_crb.c to use trace-events
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-03-06 13:00:41 -05:00
Peter Xu
1939ccdaa6 qio: non-default context for TLS handshake
A new parameter "context" is added to qio_channel_tls_handshake() is to
allow the TLS to be run on a non-default context.  Still, no functional
change.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-06 10:19:07 +00:00
Peter Xu
8005fdd8fa qio: non-default context for async conn
We have worked on qio_task_run_in_thread() already.  Further, let
all the qio channel APIs use that context.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-06 10:19:06 +00:00
Peter Xu
a17536c594 qio: non-default context for threaded qtask
qio_task_run_in_thread() allows main thread to run blocking operations
in the background. However it has an assumption on that it's always
working with the default context. This patch tries to allow the threaded
QIO task framework to run with non-default gcontext.

Currently no functional change so far, so the QIOTasks are still always
running on main context.

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-06 10:19:05 +00:00
Peter Xu
938c8b79e5 qio: store gsources for net listeners
Originally we were storing the GSources tag IDs.  That'll be not enough
if we are going to support non-default gcontext for QIO code.  Switch to
GSources without changing anything real.  Now we still always pass in
NULL, which means the default gcontext.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-06 10:19:05 +00:00
Peter Xu
315409c711 qio: introduce qio_channel_add_watch_{full|source}
Firstly, introduce an internal qio_channel_add_watch_full(), which
enhances qio_channel_add_watch() that context can be specified.

Then add a new API wrapper qio_channel_add_watch_source() to return a
GSource pointer rather than a tag ID.

Note that the _source() call will keep a reference of GSource so that
callers need to unref them explicitly when finished using the GSource.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-06 10:19:04 +00:00
Peter Xu
7c28768fef qio: rename qio_task_thread_result
It is strange that it was called gio_task_thread_result.  Rename it to
follow the naming rule of the file.

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-06 10:19:03 +00:00
781 changed files with 47884 additions and 7068 deletions

2
.gitmodules vendored
View File

@@ -18,7 +18,7 @@
url = git://git.qemu-project.org/openhackware.git
[submodule "roms/qemu-palcode"]
path = roms/qemu-palcode
url = git://github.com/rth7680/qemu-palcode.git
url = git://git.qemu.org/qemu-palcode.git
[submodule "roms/sgabios"]
path = roms/sgabios
url = git://git.qemu-project.org/sgabios.git

View File

@@ -23,8 +23,6 @@ env:
TARGET_LIST=mips-softmmu,mipsel-linux-user
- IMAGE=debian-mips64el-cross
TARGET_LIST=mips64el-softmmu,mips64el-linux-user
- IMAGE=debian-powerpc-cross
TARGET_LIST=ppc-softmmu,ppcemb-softmmu,ppc-linux-user
- IMAGE=debian-ppc64el-cross
TARGET_LIST=ppc64-softmmu,ppc64-linux-user,ppc64abi32-linux-user
build:

View File

@@ -49,9 +49,10 @@ env:
- TEST_CMD="make check"
- MAKEFLAGS="-j3"
matrix:
- CONFIG=""
- CONFIG="--enable-debug --enable-debug-tcg --enable-trace-backends=log"
- CONFIG="--disable-linux-aio --disable-cap-ng --disable-attr --disable-brlapi --disable-uuid --disable-libusb"
- CONFIG="--disable-system"
- CONFIG="--disable-user"
- CONFIG="--enable-debug --enable-debug-tcg"
- CONFIG="--disable-linux-aio --disable-cap-ng --disable-attr --disable-brlapi --disable-uuid --disable-libusb --disable-user"
- CONFIG="--enable-modules --disable-linux-user"
- CONFIG="--with-coroutine=ucontext --disable-linux-user"
- CONFIG="--with-coroutine=sigaltstack --disable-linux-user"

View File

@@ -127,7 +127,6 @@ Alpha
M: Richard Henderson <rth@twiddle.net>
S: Maintained
F: target/alpha/
F: hw/alpha/
F: tests/tcg/alpha/
F: disas/alpha.c
@@ -233,6 +232,17 @@ F: hw/ppc/
F: include/hw/ppc/
F: disas/ppc.c
RISC-V
M: Michael Clark <mjc@sifive.com>
M: Palmer Dabbelt <palmer@sifive.com>
M: Sagar Karandikar <sagark@eecs.berkeley.edu>
M: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
S: Maintained
F: target/riscv/
F: hw/riscv/
F: include/hw/riscv/
F: disas/riscv.c
S390
M: Richard Henderson <rth@twiddle.net>
M: Alexander Graf <agraf@suse.de>
@@ -279,7 +289,7 @@ T: git git://github.com/ehabkost/qemu.git x86-next
Xtensa
M: Max Filippov <jcmvbkbc@gmail.com>
W: http://wiki.osll.spb.ru/doku.php?id=etc:users:jcmvbkbc:qemu-target-xtensa
W: http://wiki.osll.ru/doku.php?id=etc:users:jcmvbkbc:qemu-target-xtensa
S: Maintained
F: target/xtensa/
F: hw/xtensa/
@@ -402,6 +412,12 @@ F: include/*/*win32*
X: qga/*win32*
F: qemu.nsi
Alpha Machines
M: Richard Henderson <rth@twiddle.net>
S: Maintained
F: hw/alpha/
F: hw/isa/smc37c669-superio.c
ARM Machines
------------
Allwinner-a10
@@ -689,6 +705,8 @@ Fulong 2E
M: Yongbok Kim <yongbok.kim@mips.com>
S: Odd Fixes
F: hw/mips/mips_fulong2e.c
F: hw/isa/vt82c686.c
F: include/hw/isa/vt82c686.h
Boston
M: Paul Burton <paul.burton@mips.com>
@@ -765,9 +783,10 @@ F: hw/ppc/prep_systemio.c
F: hw/ppc/rs6000_mc.c
F: hw/pci-host/prep.[hc]
F: hw/isa/i82378.c
F: hw/isa/pc87312.[hc]
F: hw/isa/pc87312.c
F: hw/dma/i82374.c
F: hw/timer/m48t59-isa.c
F: include/hw/isa/pc87312.h
F: include/hw/timer/m48t59.h
F: pc-bios/ppc_rom.bin
@@ -913,7 +932,7 @@ M: Michael S. Tsirkin <mst@redhat.com>
M: Paolo Bonzini <pbonzini@redhat.com>
S: Supported
F: hw/char/debugcon.c
F: hw/char/parallel.c
F: hw/char/parallel*
F: hw/char/serial*
F: hw/dma/i8257*
F: hw/i2c/pm_smbus.c
@@ -921,6 +940,7 @@ F: hw/input/pckbd.c
F: hw/intc/apic*
F: hw/intc/ioapic*
F: hw/intc/i8259*
F: hw/isa/isa-superio.c
F: hw/misc/debugexit.c
F: hw/misc/pc-testdev.c
F: hw/timer/hpet*
@@ -928,8 +948,11 @@ F: hw/timer/i8254*
F: hw/timer/mc146818rtc*
F: hw/watchdog/wdt_ib700.c
F: include/hw/display/vga.h
F: include/hw/char/parallel.h
F: include/hw/dma/i8257.h
F: include/hw/i2c/pm_smbus.h
F: include/hw/isa/i8257.h
F: include/hw/input/i8042.h
F: include/hw/isa/superio.h
F: include/hw/timer/hpet.h
F: include/hw/timer/i8254*
F: include/hw/timer/mc146818rtc*
@@ -1089,6 +1112,14 @@ M: Peter Crosthwaite <crosthwaite.peter@gmail.com>
S: Maintained
F: hw/ssi/xilinx_*
SD (Secure Card)
M: Philippe Mathieu-Daudé <f4bug@amsat.org>
S: Odd Fixes
F: include/hw/sd/sd*
F: hw/sd/core.c
F: hw/sd/sd*
F: tests/sd*
USB
M: Gerd Hoffmann <kraxel@redhat.com>
S: Maintained
@@ -1320,6 +1351,7 @@ F: util/aio-*.c
F: block/io.c
F: migration/block*
F: include/block/aio.h
F: include/block/aio-wait.h
F: scripts/qemugdb/aio.py
T: git git://github.com/stefanha/qemu.git block

View File

@@ -425,6 +425,10 @@ dummy := $(call unnest-vars,, \
io-obj-y \
common-obj-y \
common-obj-m \
ui-obj-y \
ui-obj-m \
audio-obj-y \
audio-obj-m \
trace-obj-y)
include $(SRC_PATH)/tests/Makefile.include
@@ -434,21 +438,23 @@ all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
qemu-version.h: FORCE
$(call quiet-command, \
(cd $(SRC_PATH); \
printf '#define QEMU_PKGVERSION '; \
if test -n "$(PKGVERSION)"; then \
printf '"$(PKGVERSION)"\n'; \
pkgvers="$(PKGVERSION)"; \
else \
if test -d .git; then \
printf '" ('; \
git describe --match 'v*' 2>/dev/null | tr -d '\n'; \
pkgvers=$$(git describe --match 'v*' 2>/dev/null | tr -d '\n');\
if ! git diff-index --quiet HEAD &>/dev/null; then \
printf -- '-dirty'; \
pkgvers="$${pkgvers}-dirty"; \
fi; \
printf ')"\n'; \
else \
printf '""\n'; \
fi; \
fi) > $@.tmp)
fi; \
printf "#define QEMU_PKGVERSION \"$${pkgvers}\"\n"; \
if test -n "$${pkgvers}"; then \
printf '#define QEMU_FULL_VERSION QEMU_VERSION " (" QEMU_PKGVERSION ")"\n'; \
else \
printf '#define QEMU_FULL_VERSION QEMU_VERSION\n'; \
fi; \
) > $@.tmp)
$(call quiet-command, if ! cmp -s $@ $@.tmp; then \
mv $@.tmp $@; \
else \
@@ -771,7 +777,6 @@ bepo cz
ifdef INSTALL_BLOBS
BLOBS=bios.bin bios-256k.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
vgabios-stdvga.bin vgabios-vmware.bin vgabios-qxl.bin vgabios-virtio.bin \
acpi-dsdt.aml \
ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin QEMU,cgthree.bin \
pxe-e1000.rom pxe-eepro100.rom pxe-ne2k_pci.rom \
pxe-pcnet.rom pxe-rtl8139.rom pxe-virtio.rom \
@@ -851,7 +856,7 @@ ifneq ($(BLOBS),)
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/$$x "$(DESTDIR)$(qemu_datadir)"; \
done
endif
ifeq ($(CONFIG_GTK),y)
ifeq ($(CONFIG_GTK),m)
$(MAKE) -C po $@
endif
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/keymaps"
@@ -1042,10 +1047,16 @@ endif
include $(SRC_PATH)/tests/docker/Makefile.include
include $(SRC_PATH)/tests/vm/Makefile.include
printgen:
@echo $(GENERATED_FILES)
.PHONY: help
help:
@echo 'Generic targets:'
@echo ' all - Build all'
ifdef CONFIG_MODULES
@echo ' modules - Build all modules'
endif
@echo ' dir/file.o - Build specified target only'
@echo ' install - Install QEMU, documentation and tools'
@echo ' ctags/TAGS - Generate tags file for editors'

View File

@@ -104,6 +104,7 @@ common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration/
common-obj-y += audio/
common-obj-m += audio/
common-obj-y += hw/
common-obj-y += replay/
@@ -233,6 +234,7 @@ trace-events-subdirs += hw/alpha
trace-events-subdirs += hw/hppa
trace-events-subdirs += hw/xen
trace-events-subdirs += hw/ide
trace-events-subdirs += hw/tpm
trace-events-subdirs += ui
trace-events-subdirs += audio
trace-events-subdirs += net

View File

@@ -11,9 +11,9 @@ $(call set-vpath, $(SRC_PATH):$(BUILD_DIR))
ifdef CONFIG_LINUX
QEMU_CFLAGS += -I../linux-headers
endif
QEMU_CFLAGS += -I.. -I$(SRC_PATH)/target/$(TARGET_BASE_ARCH) -DNEED_CPU_H
QEMU_CFLAGS += -iquote .. -iquote $(SRC_PATH)/target/$(TARGET_BASE_ARCH) -DNEED_CPU_H
QEMU_CFLAGS+=-I$(SRC_PATH)/include
QEMU_CFLAGS+=-iquote $(SRC_PATH)/include
ifdef CONFIG_USER_ONLY
# user emulator name

2
README
View File

@@ -73,7 +73,7 @@ The QEMU website is also maintained under source control.
git clone git://git.qemu.org/qemu-web.git
https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/
A 'git-profile' utility was created to make above process less
A 'git-publish' utility was created to make above process less
cumbersome, and is highly recommended for making regular contributions,
or even just for sending consecutive patch series revisions. It also
requires a working 'git send-email' setup, and by default doesn't

View File

@@ -1 +1 @@
2.11.50
2.11.92

View File

@@ -1,4 +1,4 @@
obj-$(CONFIG_SOFTMMU) += accel.o
obj-y += kvm/
obj-$(CONFIG_KVM) += kvm/
obj-$(CONFIG_TCG) += tcg/
obj-y += stubs/

View File

@@ -1 +1,2 @@
obj-$(CONFIG_KVM) += kvm-all.o
obj-y += kvm-all.o
obj-$(call lnot,$(CONFIG_SEV)) += sev-stub.o

View File

@@ -38,6 +38,7 @@
#include "qemu/event_notifier.h"
#include "trace.h"
#include "hw/irq.h"
#include "sysemu/sev.h"
#include "hw/boards.h"
@@ -103,6 +104,10 @@ struct KVMState
#endif
KVMMemoryListener memory_listener;
QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
/* memory encryption */
void *memcrypt_handle;
int (*memcrypt_encrypt_data)(void *handle, uint8_t *ptr, uint64_t len);
};
KVMState *kvm_state;
@@ -138,6 +143,26 @@ int kvm_get_max_memslots(void)
return s->nr_slots;
}
bool kvm_memcrypt_enabled(void)
{
if (kvm_state && kvm_state->memcrypt_handle) {
return true;
}
return false;
}
int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
{
if (kvm_state->memcrypt_handle &&
kvm_state->memcrypt_encrypt_data) {
return kvm_state->memcrypt_encrypt_data(kvm_state->memcrypt_handle,
ptr, len);
}
return 1;
}
static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
{
KVMState *s = kvm_state;
@@ -1636,6 +1661,20 @@ static int kvm_init(MachineState *ms)
kvm_state = s;
/*
* if memory encryption object is specified then initialize the memory
* encryption context.
*/
if (ms->memory_encryption) {
kvm_state->memcrypt_handle = sev_guest_init(ms->memory_encryption);
if (!kvm_state->memcrypt_handle) {
ret = -1;
goto err;
}
kvm_state->memcrypt_encrypt_data = sev_encrypt_data;
}
ret = kvm_arch_init(ms, s);
if (ret < 0) {
goto err;

26
accel/kvm/sev-stub.c Normal file
View File

@@ -0,0 +1,26 @@
/*
* QEMU SEV stub
*
* Copyright Advanced Micro Devices 2018
*
* Authors:
* Brijesh Singh <brijesh.singh@amd.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/sev.h"
int sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len)
{
abort();
}
void *sev_guest_init(const char *id)
{
return NULL;
}

View File

@@ -105,6 +105,16 @@ int kvm_on_sigbus(int code, void *addr)
return 1;
}
bool kvm_memcrypt_enabled(void)
{
return false;
}
int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len)
{
return 1;
}
#ifndef CONFIG_USER_ONLY
int kvm_irqchip_add_msi_route(KVMState *s, int vector, PCIDevice *dev)
{

View File

@@ -585,6 +585,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
else {
if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
replay_interrupt();
cpu->exception_index = -1;
*last_tb = NULL;
}
/* The target hook may have updated the 'cpu->interrupt_request';
@@ -606,7 +607,9 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
if (unlikely(atomic_read(&cpu->exit_request)
|| (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 0))) {
atomic_set(&cpu->exit_request, 0);
cpu->exception_index = EXCP_INTERRUPT;
if (cpu->exception_index == -1) {
cpu->exception_index = EXCP_INTERRUPT;
}
return true;
}

View File

@@ -1736,37 +1736,32 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
cpu_abort(cpu, "cpu_io_recompile: could not find TB for pc=%p",
(void *)retaddr);
}
n = cpu->icount_decr.u16.low + tb->icount;
cpu_restore_state_from_tb(cpu, tb, retaddr);
/* Calculate how many instructions had been executed before the fault
occurred. */
n = n - cpu->icount_decr.u16.low;
/* Generate a new TB ending on the I/O insn. */
n++;
/* On MIPS and SH, delay slot instructions can only be restarted if
they were already the first instruction in the TB. If this is not
the first instruction in a TB then re-execute the preceding
branch. */
n = 1;
#if defined(TARGET_MIPS)
if ((env->hflags & MIPS_HFLAG_BMASK) != 0 && n > 1) {
if ((env->hflags & MIPS_HFLAG_BMASK) != 0
&& env->active_tc.PC != tb->pc) {
env->active_tc.PC -= (env->hflags & MIPS_HFLAG_B16 ? 2 : 4);
cpu->icount_decr.u16.low++;
env->hflags &= ~MIPS_HFLAG_BMASK;
n = 2;
}
#elif defined(TARGET_SH4)
if ((env->flags & ((DELAY_SLOT | DELAY_SLOT_CONDITIONAL))) != 0
&& n > 1) {
&& env->pc != tb->pc) {
env->pc -= 2;
cpu->icount_decr.u16.low++;
env->flags &= ~(DELAY_SLOT | DELAY_SLOT_CONDITIONAL);
n = 2;
}
#endif
/* This should never happen. */
if (n > CF_COUNT_MASK) {
cpu_abort(cpu, "TB too big during recompile");
}
/* Adjust the execution state of the next TB. */
/* Generate a new TB executing the I/O insn. */
cpu->cflags_next_tb = curr_cflags() | CF_LAST_IO | n;
if (tb->cflags & CF_NOCACHE) {

View File

@@ -71,6 +71,8 @@ int graphic_depth = 32;
#define QEMU_ARCH QEMU_ARCH_OPENRISC
#elif defined(TARGET_PPC)
#define QEMU_ARCH QEMU_ARCH_PPC
#elif defined(TARGET_RISCV)
#define QEMU_ARCH QEMU_ARCH_RISCV
#elif defined(TARGET_S390X)
#define QEMU_ARCH QEMU_ARCH_S390X
#elif defined(TARGET_SH4)

View File

@@ -1,19 +1,31 @@
common-obj-y = audio.o noaudio.o wavaudio.o mixeng.o
common-obj-$(CONFIG_AUDIO_SDL) += sdlaudio.o
common-obj-$(CONFIG_AUDIO_OSS) += ossaudio.o
common-obj-$(CONFIG_SPICE) += spiceaudio.o
common-obj-$(CONFIG_AUDIO_COREAUDIO) += coreaudio.o
common-obj-$(CONFIG_AUDIO_ALSA) += alsaaudio.o
common-obj-$(CONFIG_AUDIO_DSOUND) += dsoundaudio.o
common-obj-$(CONFIG_AUDIO_PA) += paaudio.o
common-obj-$(CONFIG_AUDIO_PT_INT) += audio_pt_int.o
common-obj-$(CONFIG_AUDIO_WIN_INT) += audio_win_int.o
common-obj-y += wavcapture.o
sdlaudio.o-cflags := $(SDL_CFLAGS)
sdlaudio.o-libs := $(SDL_LIBS)
alsaaudio.o-libs := $(ALSA_LIBS)
paaudio.o-libs := $(PULSE_LIBS)
coreaudio.o-libs := $(COREAUDIO_LIBS)
dsoundaudio.o-libs := $(DSOUND_LIBS)
ossaudio.o-libs := $(OSS_LIBS)
# alsa module
common-obj-$(CONFIG_AUDIO_ALSA) += alsa.mo
alsa.mo-objs = alsaaudio.o
alsa.mo-libs := $(ALSA_LIBS)
# oss module
common-obj-$(CONFIG_AUDIO_OSS) += oss.mo
oss.mo-objs = ossaudio.o
oss.mo-libs := $(OSS_LIBS)
# pulseaudio module
common-obj-$(CONFIG_AUDIO_PA) += pa.mo
pa.mo-objs = paaudio.o
pa.mo-libs := $(PULSE_LIBS)
# sdl module
common-obj-$(CONFIG_AUDIO_SDL) += sdl.mo
sdl.mo-objs = sdlaudio.o
sdl.mo-cflags := $(SDL_CFLAGS)
sdl.mo-libs := $(SDL_LIBS)

View File

@@ -1213,7 +1213,7 @@ static struct audio_pcm_ops alsa_pcm_ops = {
.ctl_in = alsa_ctl_in,
};
struct audio_driver alsa_audio_driver = {
static struct audio_driver alsa_audio_driver = {
.name = "alsa",
.descr = "ALSA http://www.alsa-project.org",
.options = alsa_options,
@@ -1226,3 +1226,9 @@ struct audio_driver alsa_audio_driver = {
.voice_size_out = sizeof (ALSAVoiceOut),
.voice_size_in = sizeof (ALSAVoiceIn)
};
static void register_audio_alsa(void)
{
audio_driver_register(&alsa_audio_driver);
}
type_init(register_audio_alsa);

View File

@@ -45,15 +45,49 @@
The 1st one is the one used by default, that is the reason
that we generate the list.
*/
static struct audio_driver *drvtab[] = {
#ifdef CONFIG_SPICE
&spice_audio_driver,
#endif
static const char *audio_prio_list[] = {
"spice",
CONFIG_AUDIO_DRIVERS
&no_audio_driver,
&wav_audio_driver
"none",
"wav",
};
static QLIST_HEAD(, audio_driver) audio_drivers;
void audio_driver_register(audio_driver *drv)
{
QLIST_INSERT_HEAD(&audio_drivers, drv, next);
}
audio_driver *audio_driver_lookup(const char *name)
{
struct audio_driver *d;
QLIST_FOREACH(d, &audio_drivers, next) {
if (strcmp(name, d->name) == 0) {
return d;
}
}
audio_module_load_one(name);
QLIST_FOREACH(d, &audio_drivers, next) {
if (strcmp(name, d->name) == 0) {
return d;
}
}
return NULL;
}
static void audio_module_load_all(void)
{
int i;
for (i = 0; i < ARRAY_SIZE(audio_prio_list); i++) {
audio_driver_lookup(audio_prio_list[i]);
}
}
struct fixed_settings {
int enabled;
int nb_voices;
@@ -1656,11 +1690,13 @@ static void audio_pp_nb_voices (const char *typ, int nb)
void AUD_help (void)
{
size_t i;
struct audio_driver *d;
/* make sure we print the help text for modular drivers too */
audio_module_load_all();
audio_process_options ("AUDIO", audio_options);
for (i = 0; i < ARRAY_SIZE (drvtab); i++) {
struct audio_driver *d = drvtab[i];
QLIST_FOREACH(d, &audio_drivers, next) {
if (d->options) {
audio_process_options (d->name, d->options);
}
@@ -1672,8 +1708,7 @@ void AUD_help (void)
printf ("Available drivers:\n");
for (i = 0; i < ARRAY_SIZE (drvtab); i++) {
struct audio_driver *d = drvtab[i];
QLIST_FOREACH(d, &audio_drivers, next) {
printf ("Name: %s\n", d->name);
printf ("Description: %s\n", d->descr);
@@ -1807,6 +1842,7 @@ static void audio_init (void)
const char *drvname;
VMChangeStateEntry *e;
AudioState *s = &glob_audio_state;
struct audio_driver *driver;
if (s->drv) {
return;
@@ -1842,32 +1878,27 @@ static void audio_init (void)
}
if (drvname) {
int found = 0;
for (i = 0; i < ARRAY_SIZE (drvtab); i++) {
if (!strcmp (drvname, drvtab[i]->name)) {
done = !audio_driver_init (s, drvtab[i]);
found = 1;
break;
}
}
if (!found) {
driver = audio_driver_lookup(drvname);
if (driver) {
done = !audio_driver_init(s, driver);
} else {
dolog ("Unknown audio driver `%s'\n", drvname);
dolog ("Run with -audio-help to list available drivers\n");
}
}
if (!done) {
for (i = 0; !done && i < ARRAY_SIZE (drvtab); i++) {
if (drvtab[i]->can_be_default) {
done = !audio_driver_init (s, drvtab[i]);
for (i = 0; !done && i < ARRAY_SIZE(audio_prio_list); i++) {
driver = audio_driver_lookup(audio_prio_list[i]);
if (driver && driver->can_be_default) {
done = !audio_driver_init(s, driver);
}
}
}
if (!done) {
done = !audio_driver_init (s, &no_audio_driver);
driver = audio_driver_lookup("none");
done = !audio_driver_init(s, driver);
assert(done);
dolog("warning: Using timer based audio emulation\n");
}

View File

@@ -141,6 +141,7 @@ struct SWVoiceIn {
QLIST_ENTRY (SWVoiceIn) entries;
};
typedef struct audio_driver audio_driver;
struct audio_driver {
const char *name;
const char *descr;
@@ -154,6 +155,7 @@ struct audio_driver {
int voice_size_out;
int voice_size_in;
int ctl_caps;
QLIST_ENTRY(audio_driver) next;
};
struct audio_pcm_ops {
@@ -203,17 +205,11 @@ struct AudioState {
int vm_running;
};
extern struct audio_driver no_audio_driver;
extern struct audio_driver oss_audio_driver;
extern struct audio_driver sdl_audio_driver;
extern struct audio_driver wav_audio_driver;
extern struct audio_driver alsa_audio_driver;
extern struct audio_driver coreaudio_audio_driver;
extern struct audio_driver dsound_audio_driver;
extern struct audio_driver pa_audio_driver;
extern struct audio_driver spice_audio_driver;
extern const struct mixeng_volume nominal_volume;
void audio_driver_register(audio_driver *drv);
audio_driver *audio_driver_lookup(const char *name);
void audio_pcm_init_info (struct audio_pcm_info *info, struct audsettings *as);
void audio_pcm_info_clear_buf (struct audio_pcm_info *info, void *buf, int len);

View File

@@ -722,7 +722,7 @@ static struct audio_pcm_ops coreaudio_pcm_ops = {
.ctl_out = coreaudio_ctl_out
};
struct audio_driver coreaudio_audio_driver = {
static struct audio_driver coreaudio_audio_driver = {
.name = "coreaudio",
.descr = "CoreAudio http://developer.apple.com/audio/coreaudio.html",
.options = coreaudio_options,
@@ -735,3 +735,9 @@ struct audio_driver coreaudio_audio_driver = {
.voice_size_out = sizeof (coreaudioVoiceOut),
.voice_size_in = 0
};
static void register_audio_coreaudio(void)
{
audio_driver_register(&coreaudio_audio_driver);
}
type_init(register_audio_coreaudio);

View File

@@ -890,7 +890,7 @@ static struct audio_pcm_ops dsound_pcm_ops = {
.ctl_in = dsound_ctl_in
};
struct audio_driver dsound_audio_driver = {
static struct audio_driver dsound_audio_driver = {
.name = "dsound",
.descr = "DirectSound http://wikipedia.org/wiki/DirectSound",
.options = dsound_options,
@@ -903,3 +903,9 @@ struct audio_driver dsound_audio_driver = {
.voice_size_out = sizeof (DSoundVoiceOut),
.voice_size_in = sizeof (DSoundVoiceIn)
};
static void register_audio_dsound(void)
{
audio_driver_register(&dsound_audio_driver);
}
type_init(register_audio_dsound);

View File

@@ -160,7 +160,7 @@ static struct audio_pcm_ops no_pcm_ops = {
.ctl_in = no_ctl_in
};
struct audio_driver no_audio_driver = {
static struct audio_driver no_audio_driver = {
.name = "none",
.descr = "Timer based audio emulation",
.options = NULL,
@@ -173,3 +173,9 @@ struct audio_driver no_audio_driver = {
.voice_size_out = sizeof (NoVoiceOut),
.voice_size_in = sizeof (NoVoiceIn)
};
static void register_audio_none(void)
{
audio_driver_register(&no_audio_driver);
}
type_init(register_audio_none);

View File

@@ -922,7 +922,7 @@ static struct audio_pcm_ops oss_pcm_ops = {
.ctl_in = oss_ctl_in
};
struct audio_driver oss_audio_driver = {
static struct audio_driver oss_audio_driver = {
.name = "oss",
.descr = "OSS http://www.opensound.com",
.options = oss_options,
@@ -935,3 +935,9 @@ struct audio_driver oss_audio_driver = {
.voice_size_out = sizeof (OSSVoiceOut),
.voice_size_in = sizeof (OSSVoiceIn)
};
static void register_audio_oss(void)
{
audio_driver_register(&oss_audio_driver);
}
type_init(register_audio_oss);

View File

@@ -937,7 +937,7 @@ static struct audio_pcm_ops qpa_pcm_ops = {
.ctl_in = qpa_ctl_in
};
struct audio_driver pa_audio_driver = {
static struct audio_driver pa_audio_driver = {
.name = "pa",
.descr = "http://www.pulseaudio.org/",
.options = qpa_options,
@@ -951,3 +951,9 @@ struct audio_driver pa_audio_driver = {
.voice_size_in = sizeof (PAVoiceIn),
.ctl_caps = VOICE_VOLUME_CAP
};
static void register_audio_pa(void)
{
audio_driver_register(&pa_audio_driver);
}
type_init(register_audio_pa);

View File

@@ -500,7 +500,7 @@ static struct audio_pcm_ops sdl_pcm_ops = {
.ctl_out = sdl_ctl_out,
};
struct audio_driver sdl_audio_driver = {
static struct audio_driver sdl_audio_driver = {
.name = "sdl",
.descr = "SDL http://www.libsdl.org",
.options = sdl_options,
@@ -513,3 +513,9 @@ struct audio_driver sdl_audio_driver = {
.voice_size_out = sizeof (SDLVoiceOut),
.voice_size_in = 0
};
static void register_audio_sdl(void)
{
audio_driver_register(&sdl_audio_driver);
}
type_init(register_audio_sdl);

View File

@@ -391,7 +391,7 @@ static struct audio_pcm_ops audio_callbacks = {
.ctl_in = line_in_ctl,
};
struct audio_driver spice_audio_driver = {
static struct audio_driver spice_audio_driver = {
.name = "spice",
.descr = "spice audio driver",
.options = audio_options,
@@ -411,3 +411,9 @@ void qemu_spice_audio_init (void)
{
spice_audio_driver.can_be_default = 1;
}
static void register_audio_spice(void)
{
audio_driver_register(&spice_audio_driver);
}
type_init(register_audio_spice);

View File

@@ -278,7 +278,7 @@ static struct audio_pcm_ops wav_pcm_ops = {
.ctl_out = wav_ctl_out,
};
struct audio_driver wav_audio_driver = {
static struct audio_driver wav_audio_driver = {
.name = "wav",
.descr = "WAV renderer http://wikipedia.org/wiki/WAV",
.options = wav_options,
@@ -291,3 +291,9 @@ struct audio_driver wav_audio_driver = {
.voice_size_out = sizeof (WAVVoiceOut),
.voice_size_in = 0
};
static void register_audio_wav(void)
{
audio_driver_register(&wav_audio_driver);
}
type_init(register_audio_wav);

159
block.c
View File

@@ -33,7 +33,10 @@
#include "qapi/error.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qnull.h"
#include "qapi/qmp/qstring.h"
#include "qapi/qobject-output-visitor.h"
#include "qapi/qapi-visit-block-core.h"
#include "sysemu/block-backend.h"
#include "sysemu/sysemu.h"
#include "qemu/notify.h"
@@ -368,7 +371,7 @@ BlockDriver *bdrv_find_format(const char *format_name)
return bdrv_do_find_format(format_name);
}
static int bdrv_is_whitelisted(BlockDriver *drv, bool read_only)
int bdrv_is_whitelisted(BlockDriver *drv, bool read_only)
{
static const char *whitelist_rw[] = {
CONFIG_BDRV_RW_WHITELIST
@@ -1455,7 +1458,7 @@ static QDict *parse_json_filename(const char *filename, Error **errp)
return NULL;
}
options = qobject_to_qdict(options_obj);
options = qobject_to(QDict, options_obj);
if (!options) {
qobject_decref(options_obj);
error_setg(errp, "Invalid JSON object given");
@@ -2406,6 +2409,51 @@ BdrvChild *bdrv_open_child(const char *filename,
return c;
}
/* TODO Future callers may need to specify parent/child_role in order for
* option inheritance to work. Existing callers use it for the root node. */
BlockDriverState *bdrv_open_blockdev_ref(BlockdevRef *ref, Error **errp)
{
BlockDriverState *bs = NULL;
Error *local_err = NULL;
QObject *obj = NULL;
QDict *qdict = NULL;
const char *reference = NULL;
Visitor *v = NULL;
if (ref->type == QTYPE_QSTRING) {
reference = ref->u.reference;
} else {
BlockdevOptions *options = &ref->u.definition;
assert(ref->type == QTYPE_QDICT);
v = qobject_output_visitor_new(&obj);
visit_type_BlockdevOptions(v, NULL, &options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto fail;
}
visit_complete(v, &obj);
qdict = qobject_to(QDict, obj);
qdict_flatten(qdict);
/* bdrv_open_inherit() defaults to the values in bdrv_flags (for
* compatibility with other callers) rather than what we want as the
* real defaults. Apply the defaults here instead. */
qdict_set_default_str(qdict, BDRV_OPT_CACHE_DIRECT, "off");
qdict_set_default_str(qdict, BDRV_OPT_CACHE_NO_FLUSH, "off");
qdict_set_default_str(qdict, BDRV_OPT_READ_ONLY, "off");
}
bs = bdrv_open_inherit(NULL, reference, qdict, 0, NULL, NULL, errp);
obj = NULL;
fail:
qobject_decref(obj);
visit_free(v);
return bs;
}
static BlockDriverState *bdrv_append_temp_snapshot(BlockDriverState *bs,
int flags,
QDict *snapshot_options,
@@ -2598,7 +2646,13 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
/* See cautionary note on accessing @options above */
backing = qdict_get_try_str(options, "backing");
if (backing && *backing == '\0') {
if (qobject_to(QNull, qdict_get(options, "backing")) != NULL ||
(backing && *backing == '\0'))
{
if (backing) {
warn_report("Use of \"backing\": \"\" is deprecated; "
"use \"backing\": null instead");
}
flags |= BDRV_O_NO_BACKING;
qdict_del(options, "backing");
}
@@ -2836,8 +2890,16 @@ static BlockReopenQueue *bdrv_reopen_queue_child(BlockReopenQueue *bs_queue,
/* Inherit from parent node */
if (parent_options) {
QemuOpts *opts;
QDict *options_copy;
assert(!flags);
role->inherit_options(&flags, options, parent_flags, parent_options);
options_copy = qdict_clone_shallow(options);
opts = qemu_opts_create(&bdrv_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options_copy, NULL);
update_flags_from_options(&flags, opts);
qemu_opts_del(opts);
QDECREF(options_copy);
}
/* Old values are used for options that aren't set yet */
@@ -3455,17 +3517,54 @@ static void bdrv_delete(BlockDriverState *bs)
* free of errors) or -errno when an internal error occurred. The results of the
* check are stored in res.
*/
int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix)
static int coroutine_fn bdrv_co_check(BlockDriverState *bs,
BdrvCheckResult *res, BdrvCheckMode fix)
{
if (bs->drv == NULL) {
return -ENOMEDIUM;
}
if (bs->drv->bdrv_check == NULL) {
if (bs->drv->bdrv_co_check == NULL) {
return -ENOTSUP;
}
memset(res, 0, sizeof(*res));
return bs->drv->bdrv_check(bs, res, fix);
return bs->drv->bdrv_co_check(bs, res, fix);
}
typedef struct CheckCo {
BlockDriverState *bs;
BdrvCheckResult *res;
BdrvCheckMode fix;
int ret;
} CheckCo;
static void bdrv_check_co_entry(void *opaque)
{
CheckCo *cco = opaque;
cco->ret = bdrv_co_check(cco->bs, cco->res, cco->fix);
}
int bdrv_check(BlockDriverState *bs,
BdrvCheckResult *res, BdrvCheckMode fix)
{
Coroutine *co;
CheckCo cco = {
.bs = bs,
.res = res,
.ret = -EINPROGRESS,
.fix = fix,
};
if (qemu_in_coroutine()) {
/* Fast-path if already in coroutine context */
bdrv_check_co_entry(&cco);
} else {
co = qemu_coroutine_create(bdrv_check_co_entry, &cco);
qemu_coroutine_enter(co);
BDRV_POLL_WHILE(bs, cco.ret == -EINPROGRESS);
}
return cco.ret;
}
/*
@@ -3587,12 +3686,12 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
GSList *ignore_children = g_slist_prepend(NULL, c);
bdrv_check_update_perm(base, NULL, c->perm, c->shared_perm,
ignore_children, &local_err);
g_slist_free(ignore_children);
if (local_err) {
ret = -EPERM;
error_report_err(local_err);
goto exit;
}
g_slist_free(ignore_children);
/* If so, update the backing file path in the image file */
if (c->role->update_filename) {
@@ -3635,6 +3734,11 @@ int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
error_setg(errp, "No medium inserted");
return -ENOMEDIUM;
}
if (offset < 0) {
error_setg(errp, "Image size cannot be negative");
return -EINVAL;
}
if (!drv->bdrv_truncate) {
if (bs->file && drv->is_filter) {
return bdrv_truncate(bs->file, offset, prealloc, errp);
@@ -4209,7 +4313,8 @@ void bdrv_init_with_whitelist(void)
bdrv_init();
}
void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
static void coroutine_fn bdrv_co_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
BdrvChild *child, *parent;
uint64_t perm, shared_perm;
@@ -4225,7 +4330,7 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
}
QLIST_FOREACH(child, &bs->children, next) {
bdrv_invalidate_cache(child->bs, &local_err);
bdrv_co_invalidate_cache(child->bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
@@ -4255,8 +4360,8 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
}
bdrv_set_perm(bs, perm, shared_perm);
if (bs->drv->bdrv_invalidate_cache) {
bs->drv->bdrv_invalidate_cache(bs, &local_err);
if (bs->drv->bdrv_co_invalidate_cache) {
bs->drv->bdrv_co_invalidate_cache(bs, &local_err);
if (local_err) {
bs->open_flags |= BDRV_O_INACTIVE;
error_propagate(errp, local_err);
@@ -4282,6 +4387,38 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
}
}
typedef struct InvalidateCacheCo {
BlockDriverState *bs;
Error **errp;
bool done;
} InvalidateCacheCo;
static void coroutine_fn bdrv_invalidate_cache_co_entry(void *opaque)
{
InvalidateCacheCo *ico = opaque;
bdrv_co_invalidate_cache(ico->bs, ico->errp);
ico->done = true;
}
void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
{
Coroutine *co;
InvalidateCacheCo ico = {
.bs = bs,
.done = false,
.errp = errp
};
if (qemu_in_coroutine()) {
/* Fast-path if already in coroutine context */
bdrv_invalidate_cache_co_entry(&ico);
} else {
co = qemu_coroutine_create(bdrv_invalidate_cache_co_entry, &ico);
qemu_coroutine_enter(co);
BDRV_POLL_WHILE(bs, !ico.done);
}
}
void bdrv_invalidate_cache_all(Error **errp)
{
BlockDriverState *bs;

View File

@@ -9,7 +9,7 @@ block-obj-y += block-backend.o snapshot.o qapi.o
block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += file-posix.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
block-obj-y += null.o mirror.o commit.o io.o
block-obj-y += null.o mirror.o commit.o io.o create.o
block-obj-y += throttle-groups.o
block-obj-$(CONFIG_LINUX) += nvme.o

View File

@@ -94,6 +94,94 @@ void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
cookie->type = type;
}
/* block_latency_histogram_compare_func:
* Compare @key with interval [@it[0], @it[1]).
* Return: -1 if @key < @it[0]
* 0 if @key in [@it[0], @it[1])
* +1 if @key >= @it[1]
*/
static int block_latency_histogram_compare_func(const void *key, const void *it)
{
uint64_t k = *(uint64_t *)key;
uint64_t a = ((uint64_t *)it)[0];
uint64_t b = ((uint64_t *)it)[1];
return k < a ? -1 : (k < b ? 0 : 1);
}
static void block_latency_histogram_account(BlockLatencyHistogram *hist,
int64_t latency_ns)
{
uint64_t *pos;
if (hist->bins == NULL) {
/* histogram disabled */
return;
}
if (latency_ns < hist->boundaries[0]) {
hist->bins[0]++;
return;
}
if (latency_ns >= hist->boundaries[hist->nbins - 2]) {
hist->bins[hist->nbins - 1]++;
return;
}
pos = bsearch(&latency_ns, hist->boundaries, hist->nbins - 2,
sizeof(hist->boundaries[0]),
block_latency_histogram_compare_func);
assert(pos != NULL);
hist->bins[pos - hist->boundaries + 1]++;
}
int block_latency_histogram_set(BlockAcctStats *stats, enum BlockAcctType type,
uint64List *boundaries)
{
BlockLatencyHistogram *hist = &stats->latency_histogram[type];
uint64List *entry;
uint64_t *ptr;
uint64_t prev = 0;
int new_nbins = 1;
for (entry = boundaries; entry; entry = entry->next) {
if (entry->value <= prev) {
return -EINVAL;
}
new_nbins++;
prev = entry->value;
}
hist->nbins = new_nbins;
g_free(hist->boundaries);
hist->boundaries = g_new(uint64_t, hist->nbins - 1);
for (entry = boundaries, ptr = hist->boundaries; entry;
entry = entry->next, ptr++)
{
*ptr = entry->value;
}
g_free(hist->bins);
hist->bins = g_new0(uint64_t, hist->nbins);
return 0;
}
void block_latency_histograms_clear(BlockAcctStats *stats)
{
int i;
for (i = 0; i < BLOCK_MAX_IOTYPE; i++) {
BlockLatencyHistogram *hist = &stats->latency_histogram[i];
g_free(hist->bins);
g_free(hist->boundaries);
memset(hist, 0, sizeof(*hist));
}
}
static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
bool failed)
{
@@ -116,6 +204,9 @@ static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
stats->nr_ops[cookie->type]++;
}
block_latency_histogram_account(&stats->latency_histogram[cookie->type],
latency_ns);
if (!failed || stats->account_failed) {
stats->total_time_ns[cookie->type] += latency_ns;
stats->last_access_time_ns = time_ns;

View File

@@ -206,7 +206,7 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
BdrvDirtyBitmap *bm;
BlockDriverState *bs = blk_bs(job->common.blk);
if (ret < 0 || block_job_is_cancelled(&job->common)) {
if (ret < 0) {
/* Merge the successor back into the parent, delete nothing. */
bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
assert(bm);
@@ -621,7 +621,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
/* job->common.len is fixed, so we can't allow resize */
job = block_job_create(job_id, &backup_job_driver, bs,
job = block_job_create(job_id, &backup_job_driver, txn, bs,
BLK_PERM_CONSISTENT_READ,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
@@ -677,7 +677,6 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
&error_abort);
job->common.len = len;
block_job_txn_add_job(txn, &job->common);
return &job->common;

View File

@@ -129,10 +129,9 @@ static int coroutine_fn blkreplay_co_flush(BlockDriverState *bs)
static BlockDriver bdrv_blkreplay = {
.format_name = "blkreplay",
.protocol_name = "blkreplay",
.instance_size = 0,
.bdrv_file_open = blkreplay_open,
.bdrv_open = blkreplay_open,
.bdrv_close = blkreplay_close,
.bdrv_child_perm = bdrv_filter_default_perms,
.bdrv_getlength = blkreplay_getlength,

View File

@@ -31,6 +31,13 @@
static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb);
typedef struct BlockBackendAioNotifier {
void (*attached_aio_context)(AioContext *new_context, void *opaque);
void (*detach_aio_context)(void *opaque);
void *opaque;
QLIST_ENTRY(BlockBackendAioNotifier) list;
} BlockBackendAioNotifier;
struct BlockBackend {
char *name;
int refcnt;
@@ -69,6 +76,7 @@ struct BlockBackend {
bool allow_write_beyond_eof;
NotifierList remove_bs_notifiers, insert_bs_notifiers;
QLIST_HEAD(, BlockBackendAioNotifier) aio_notifiers;
int quiesce_counter;
VMChangeStateEntry *vmsh;
@@ -247,6 +255,36 @@ static int blk_root_inactivate(BdrvChild *child)
return 0;
}
static void blk_root_attach(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
BlockBackendAioNotifier *notifier;
trace_blk_root_attach(child, blk, child->bs);
QLIST_FOREACH(notifier, &blk->aio_notifiers, list) {
bdrv_add_aio_context_notifier(child->bs,
notifier->attached_aio_context,
notifier->detach_aio_context,
notifier->opaque);
}
}
static void blk_root_detach(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
BlockBackendAioNotifier *notifier;
trace_blk_root_detach(child, blk, child->bs);
QLIST_FOREACH(notifier, &blk->aio_notifiers, list) {
bdrv_remove_aio_context_notifier(child->bs,
notifier->attached_aio_context,
notifier->detach_aio_context,
notifier->opaque);
}
}
static const BdrvChildRole child_root = {
.inherit_options = blk_root_inherit_options,
@@ -260,6 +298,9 @@ static const BdrvChildRole child_root = {
.activate = blk_root_activate,
.inactivate = blk_root_inactivate,
.attach = blk_root_attach,
.detach = blk_root_detach,
};
/*
@@ -287,6 +328,7 @@ BlockBackend *blk_new(uint64_t perm, uint64_t shared_perm)
notifier_list_init(&blk->remove_bs_notifiers);
notifier_list_init(&blk->insert_bs_notifiers);
QLIST_INIT(&blk->aio_notifiers);
QTAILQ_INSERT_TAIL(&block_backends, blk, link);
return blk;
@@ -364,6 +406,7 @@ static void blk_delete(BlockBackend *blk)
}
assert(QLIST_EMPTY(&blk->remove_bs_notifiers.notifiers));
assert(QLIST_EMPTY(&blk->insert_bs_notifiers.notifiers));
assert(QLIST_EMPTY(&blk->aio_notifiers));
QTAILQ_REMOVE(&block_backends, blk, link);
drive_info_del(blk->legacy_dinfo);
block_acct_cleanup(&blk->stats);
@@ -1150,7 +1193,7 @@ int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
typedef struct BlkRwCo {
BlockBackend *blk;
int64_t offset;
QEMUIOVector *qiov;
void *iobuf;
int ret;
BdrvRequestFlags flags;
} BlkRwCo;
@@ -1158,17 +1201,19 @@ typedef struct BlkRwCo {
static void blk_read_entry(void *opaque)
{
BlkRwCo *rwco = opaque;
QEMUIOVector *qiov = rwco->iobuf;
rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, rwco->qiov->size,
rwco->qiov, rwco->flags);
rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, qiov->size,
qiov, rwco->flags);
}
static void blk_write_entry(void *opaque)
{
BlkRwCo *rwco = opaque;
QEMUIOVector *qiov = rwco->iobuf;
rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, rwco->qiov->size,
rwco->qiov, rwco->flags);
rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, qiov->size,
qiov, rwco->flags);
}
static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
@@ -1188,7 +1233,7 @@ static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
rwco = (BlkRwCo) {
.blk = blk,
.offset = offset,
.qiov = &qiov,
.iobuf = &qiov,
.flags = flags,
.ret = NOT_DONE,
};
@@ -1296,7 +1341,7 @@ static void blk_aio_complete_bh(void *opaque)
}
static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
QEMUIOVector *qiov, CoroutineEntry co_entry,
void *iobuf, CoroutineEntry co_entry,
BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
{
@@ -1308,7 +1353,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
acb->rwco = (BlkRwCo) {
.blk = blk,
.offset = offset,
.qiov = qiov,
.iobuf = iobuf,
.flags = flags,
.ret = NOT_DONE,
};
@@ -1331,10 +1376,11 @@ static void blk_aio_read_entry(void *opaque)
{
BlkAioEmAIOCB *acb = opaque;
BlkRwCo *rwco = &acb->rwco;
QEMUIOVector *qiov = rwco->iobuf;
assert(rwco->qiov->size == acb->bytes);
assert(qiov->size == acb->bytes);
rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, acb->bytes,
rwco->qiov, rwco->flags);
qiov, rwco->flags);
blk_aio_complete(acb);
}
@@ -1342,10 +1388,11 @@ static void blk_aio_write_entry(void *opaque)
{
BlkAioEmAIOCB *acb = opaque;
BlkRwCo *rwco = &acb->rwco;
QEMUIOVector *qiov = rwco->iobuf;
assert(!rwco->qiov || rwco->qiov->size == acb->bytes);
assert(!qiov || qiov->size == acb->bytes);
rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, acb->bytes,
rwco->qiov, rwco->flags);
qiov, rwco->flags);
blk_aio_complete(acb);
}
@@ -1474,8 +1521,10 @@ int blk_co_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
static void blk_ioctl_entry(void *opaque)
{
BlkRwCo *rwco = opaque;
QEMUIOVector *qiov = rwco->iobuf;
rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
rwco->qiov->iov[0].iov_base);
qiov->iov[0].iov_base);
}
int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
@@ -1488,24 +1537,15 @@ static void blk_aio_ioctl_entry(void *opaque)
BlkAioEmAIOCB *acb = opaque;
BlkRwCo *rwco = &acb->rwco;
rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
rwco->qiov->iov[0].iov_base);
rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->iobuf);
blk_aio_complete(acb);
}
BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
BlockCompletionFunc *cb, void *opaque)
{
QEMUIOVector qiov;
struct iovec iov;
iov = (struct iovec) {
.iov_base = buf,
.iov_len = 0,
};
qemu_iovec_init_external(&qiov, &iov, 1);
return blk_aio_prwv(blk, req, 0, &qiov, blk_aio_ioctl_entry, 0, cb, opaque);
return blk_aio_prwv(blk, req, 0, buf, blk_aio_ioctl_entry, 0, cb, opaque);
}
int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
@@ -1860,8 +1900,15 @@ void blk_add_aio_context_notifier(BlockBackend *blk,
void (*attached_aio_context)(AioContext *new_context, void *opaque),
void (*detach_aio_context)(void *opaque), void *opaque)
{
BlockBackendAioNotifier *notifier;
BlockDriverState *bs = blk_bs(blk);
notifier = g_new(BlockBackendAioNotifier, 1);
notifier->attached_aio_context = attached_aio_context;
notifier->detach_aio_context = detach_aio_context;
notifier->opaque = opaque;
QLIST_INSERT_HEAD(&blk->aio_notifiers, notifier, list);
if (bs) {
bdrv_add_aio_context_notifier(bs, attached_aio_context,
detach_aio_context, opaque);
@@ -1874,12 +1921,25 @@ void blk_remove_aio_context_notifier(BlockBackend *blk,
void (*detach_aio_context)(void *),
void *opaque)
{
BlockBackendAioNotifier *notifier;
BlockDriverState *bs = blk_bs(blk);
if (bs) {
bdrv_remove_aio_context_notifier(bs, attached_aio_context,
detach_aio_context, opaque);
}
QLIST_FOREACH(notifier, &blk->aio_notifiers, list) {
if (notifier->attached_aio_context == attached_aio_context &&
notifier->detach_aio_context == detach_aio_context &&
notifier->opaque == opaque) {
QLIST_REMOVE(notifier, list);
g_free(notifier);
return;
}
}
abort();
}
void blk_add_remove_bs_notifier(BlockBackend *blk, Notifier *notify)
@@ -1949,7 +2009,9 @@ int blk_truncate(BlockBackend *blk, int64_t offset, PreallocMode prealloc,
static void blk_pdiscard_entry(void *opaque)
{
BlkRwCo *rwco = opaque;
rwco->ret = blk_co_pdiscard(rwco->blk, rwco->offset, rwco->qiov->size);
QEMUIOVector *qiov = rwco->iobuf;
rwco->ret = blk_co_pdiscard(rwco->blk, rwco->offset, qiov->size);
}
int blk_pdiscard(BlockBackend *blk, int64_t offset, int bytes)

View File

@@ -289,7 +289,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
return;
}
s = block_job_create(job_id, &commit_job_driver, bs, 0, BLK_PERM_ALL,
s = block_job_create(job_id, &commit_job_driver, NULL, bs, 0, BLK_PERM_ALL,
speed, BLOCK_JOB_DEFAULT, NULL, NULL, errp);
if (!s) {
return;

76
block/create.c Normal file
View File

@@ -0,0 +1,76 @@
/*
* Block layer code related to image creation
*
* Copyright (c) 2018 Kevin Wolf <kwolf@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "block/block_int.h"
#include "qapi/qapi-commands-block-core.h"
#include "qapi/error.h"
typedef struct BlockdevCreateCo {
BlockDriver *drv;
BlockdevCreateOptions *opts;
int ret;
Error **errp;
} BlockdevCreateCo;
static void coroutine_fn bdrv_co_create_co_entry(void *opaque)
{
BlockdevCreateCo *cco = opaque;
cco->ret = cco->drv->bdrv_co_create(cco->opts, cco->errp);
}
void qmp_x_blockdev_create(BlockdevCreateOptions *options, Error **errp)
{
const char *fmt = BlockdevDriver_str(options->driver);
BlockDriver *drv = bdrv_find_format(fmt);
Coroutine *co;
BlockdevCreateCo cco;
/* If the driver is in the schema, we know that it exists. But it may not
* be whitelisted. */
assert(drv);
if (bdrv_uses_whitelist() && !bdrv_is_whitelisted(drv, false)) {
error_setg(errp, "Driver is not whitelisted");
return;
}
/* Call callback if it exists */
if (!drv->bdrv_co_create) {
error_setg(errp, "Driver does not support blockdev-create");
return;
}
cco = (BlockdevCreateCo) {
.drv = drv,
.opts = options,
.ret = -EINPROGRESS,
.errp = errp,
};
co = qemu_coroutine_create(bdrv_co_create_co_entry, &cco);
qemu_coroutine_enter(co);
while (cco.ret == -EINPROGRESS) {
aio_poll(qemu_get_aio_context(), true);
}
}

View File

@@ -71,8 +71,6 @@ static ssize_t block_crypto_read_func(QCryptoBlock *block,
struct BlockCryptoCreateData {
const char *filename;
QemuOpts *opts;
BlockBackend *blk;
uint64_t size;
};
@@ -103,27 +101,18 @@ static ssize_t block_crypto_init_func(QCryptoBlock *block,
Error **errp)
{
struct BlockCryptoCreateData *data = opaque;
int ret;
if (data->size > INT64_MAX || headerlen > INT64_MAX - data->size) {
error_setg(errp, "The requested file size is too large");
return -EFBIG;
}
/* User provided size should reflect amount of space made
* available to the guest, so we must take account of that
* which will be used by the crypto header
*/
data->size += headerlen;
qemu_opt_set_number(data->opts, BLOCK_OPT_SIZE, data->size, &error_abort);
ret = bdrv_create_file(data->filename, data->opts, errp);
if (ret < 0) {
return -1;
}
data->blk = blk_new_open(data->filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, errp);
if (!data->blk) {
return -1;
}
return 0;
return blk_truncate(data->blk, data->size + headerlen, PREALLOC_MODE_OFF,
errp);
}
@@ -322,30 +311,29 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
}
static int block_crypto_create_generic(QCryptoBlockFormat format,
const char *filename,
QemuOpts *opts,
Error **errp)
static int block_crypto_co_create_generic(BlockDriverState *bs,
int64_t size,
QCryptoBlockCreateOptions *opts,
Error **errp)
{
int ret = -EINVAL;
QCryptoBlockCreateOptions *create_opts = NULL;
int ret;
BlockBackend *blk;
QCryptoBlock *crypto = NULL;
struct BlockCryptoCreateData data = {
.size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE),
.opts = opts,
.filename = filename,
};
QDict *cryptoopts;
struct BlockCryptoCreateData data;
cryptoopts = qemu_opts_to_qdict(opts, NULL);
blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
create_opts = block_crypto_create_opts_init(format, cryptoopts, errp);
if (!create_opts) {
return -1;
ret = blk_insert_bs(blk, bs, errp);
if (ret < 0) {
goto cleanup;
}
crypto = qcrypto_block_create(create_opts, NULL,
data = (struct BlockCryptoCreateData) {
.blk = blk,
.size = size,
};
crypto = qcrypto_block_create(opts, NULL,
block_crypto_init_func,
block_crypto_write_func,
&data,
@@ -358,10 +346,8 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
ret = 0;
cleanup:
QDECREF(cryptoopts);
qcrypto_block_free(crypto);
blk_unref(data.blk);
qapi_free_QCryptoBlockCreateOptions(create_opts);
blk_unref(blk);
return ret;
}
@@ -371,7 +357,11 @@ static int block_crypto_truncate(BlockDriverState *bs, int64_t offset,
BlockCrypto *crypto = bs->opaque;
uint64_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block);
assert(payload_offset < (INT64_MAX - offset));
if (payload_offset > INT64_MAX - offset) {
error_setg(errp, "The requested file size is too large");
return -EFBIG;
}
offset += payload_offset;
@@ -384,6 +374,12 @@ static void block_crypto_close(BlockDriverState *bs)
qcrypto_block_free(crypto->block);
}
static int block_crypto_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
{
/* nothing needs checking */
return 0;
}
/*
* 1 MB bounce buffer gives good performance / memory tradeoff
@@ -531,7 +527,10 @@ static int64_t block_crypto_getlength(BlockDriverState *bs)
uint64_t offset = qcrypto_block_get_payload_offset(crypto->block);
assert(offset < INT64_MAX);
assert(offset < len);
if (offset > len) {
return -EIO;
}
len -= offset;
@@ -556,12 +555,88 @@ static int block_crypto_open_luks(BlockDriverState *bs,
bs, options, flags, errp);
}
static int coroutine_fn
block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error **errp)
{
BlockdevCreateOptionsLUKS *luks_opts;
BlockDriverState *bs = NULL;
QCryptoBlockCreateOptions create_opts;
int ret;
assert(create_options->driver == BLOCKDEV_DRIVER_LUKS);
luks_opts = &create_options->u.luks;
bs = bdrv_open_blockdev_ref(luks_opts->file, errp);
if (bs == NULL) {
return -EIO;
}
create_opts = (QCryptoBlockCreateOptions) {
.format = Q_CRYPTO_BLOCK_FORMAT_LUKS,
.u.luks = *qapi_BlockdevCreateOptionsLUKS_base(luks_opts),
};
ret = block_crypto_co_create_generic(bs, luks_opts->size, &create_opts,
errp);
if (ret < 0) {
goto fail;
}
ret = 0;
fail:
bdrv_unref(bs);
return ret;
}
static int coroutine_fn block_crypto_co_create_opts_luks(const char *filename,
QemuOpts *opts,
Error **errp)
{
return block_crypto_create_generic(Q_CRYPTO_BLOCK_FORMAT_LUKS,
filename, opts, errp);
QCryptoBlockCreateOptions *create_opts = NULL;
BlockDriverState *bs = NULL;
QDict *cryptoopts;
int64_t size;
int ret;
/* Parse options */
size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
cryptoopts = qemu_opts_to_qdict_filtered(opts, NULL,
&block_crypto_create_opts_luks,
true);
create_opts = block_crypto_create_opts_init(Q_CRYPTO_BLOCK_FORMAT_LUKS,
cryptoopts, errp);
if (!create_opts) {
ret = -EINVAL;
goto fail;
}
/* Create protocol layer */
ret = bdrv_create_file(filename, opts, errp);
if (ret < 0) {
return ret;
}
bs = bdrv_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
if (!bs) {
ret = -EINVAL;
goto fail;
}
/* Create format layer */
ret = block_crypto_co_create_generic(bs, size, create_opts, errp);
if (ret < 0) {
goto fail;
}
ret = 0;
fail:
bdrv_unref(bs);
qapi_free_QCryptoBlockCreateOptions(create_opts);
QDECREF(cryptoopts);
return ret;
}
static int block_crypto_get_info_luks(BlockDriverState *bs,
@@ -617,10 +692,12 @@ BlockDriver bdrv_crypto_luks = {
.bdrv_open = block_crypto_open_luks,
.bdrv_close = block_crypto_close,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_create = block_crypto_co_create_luks,
.bdrv_co_create_opts = block_crypto_co_create_opts_luks,
.bdrv_truncate = block_crypto_truncate,
.create_opts = &block_crypto_create_opts_luks,
.bdrv_reopen_prepare = block_crypto_reopen_prepare,
.bdrv_refresh_limits = block_crypto_refresh_limits,
.bdrv_co_preadv = block_crypto_co_preadv,
.bdrv_co_pwritev = block_crypto_co_pwritev,

View File

@@ -40,6 +40,8 @@ struct BdrvDirtyBitmap {
QemuMutex *mutex;
HBitmap *bitmap; /* Dirty bitmap implementation */
HBitmap *meta; /* Meta dirty bitmap */
bool qmp_locked; /* Bitmap is locked, it can't be modified
through QMP */
BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
char *name; /* Optional non-empty unique ID */
int64_t size; /* Size of the bitmap, in bytes */
@@ -183,6 +185,18 @@ bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
return bitmap->successor;
}
void bdrv_dirty_bitmap_set_qmp_locked(BdrvDirtyBitmap *bitmap, bool qmp_locked)
{
qemu_mutex_lock(bitmap->mutex);
bitmap->qmp_locked = qmp_locked;
qemu_mutex_unlock(bitmap->mutex);
}
bool bdrv_dirty_bitmap_qmp_locked(BdrvDirtyBitmap *bitmap)
{
return bitmap->qmp_locked;
}
/* Called with BQL taken. */
bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
{
@@ -194,6 +208,8 @@ DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
{
if (bdrv_dirty_bitmap_frozen(bitmap)) {
return DIRTY_BITMAP_STATUS_FROZEN;
} else if (bdrv_dirty_bitmap_qmp_locked(bitmap)) {
return DIRTY_BITMAP_STATUS_LOCKED;
} else if (!bdrv_dirty_bitmap_enabled(bitmap)) {
return DIRTY_BITMAP_STATUS_DISABLED;
} else {
@@ -234,6 +250,59 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
return 0;
}
/* Called with BQL taken. */
void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap)
{
qemu_mutex_lock(bitmap->mutex);
bdrv_enable_dirty_bitmap(bitmap->successor);
qemu_mutex_unlock(bitmap->mutex);
}
/* Called within bdrv_dirty_bitmap_lock..unlock */
static void bdrv_do_release_matching_dirty_bitmap_locked(
BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
bool (*cond)(BdrvDirtyBitmap *bitmap))
{
BdrvDirtyBitmap *bm, *next;
QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
if ((!bitmap || bm == bitmap) && (!cond || cond(bm))) {
assert(!bm->active_iterators);
assert(!bdrv_dirty_bitmap_frozen(bm));
assert(!bm->meta);
QLIST_REMOVE(bm, list);
hbitmap_free(bm->bitmap);
g_free(bm->name);
g_free(bm);
if (bitmap) {
return;
}
}
}
if (bitmap) {
abort();
}
}
/* Called with BQL taken. */
static void bdrv_do_release_matching_dirty_bitmap(
BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
bool (*cond)(BdrvDirtyBitmap *bitmap))
{
bdrv_dirty_bitmaps_lock(bs);
bdrv_do_release_matching_dirty_bitmap_locked(bs, bitmap, cond);
bdrv_dirty_bitmaps_unlock(bs);
}
/* Called within bdrv_dirty_bitmap_lock..unlock */
static void bdrv_release_dirty_bitmap_locked(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap)
{
bdrv_do_release_matching_dirty_bitmap_locked(bs, bitmap, NULL);
}
/**
* For a bitmap with a successor, yield our name to the successor,
* delete the old bitmap, and return a handle to the new bitmap.
@@ -267,11 +336,11 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
* In cases of failure where we can no longer safely delete the parent,
* we may wish to re-join the parent and child/successor.
* The merged parent will be un-frozen, but not explicitly re-enabled.
* Called with BQL taken.
* Called within bdrv_dirty_bitmap_lock..unlock and with BQL taken.
*/
BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
Error **errp)
BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap_locked(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
Error **errp)
{
BdrvDirtyBitmap *successor = parent->successor;
@@ -284,12 +353,26 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
error_setg(errp, "Merging of parent and successor bitmap failed");
return NULL;
}
bdrv_release_dirty_bitmap(bs, successor);
bdrv_release_dirty_bitmap_locked(bs, successor);
parent->successor = NULL;
return parent;
}
/* Called with BQL taken. */
BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
Error **errp)
{
BdrvDirtyBitmap *ret;
qemu_mutex_lock(parent->mutex);
ret = bdrv_reclaim_dirty_bitmap_locked(bs, parent, errp);
qemu_mutex_unlock(parent->mutex);
return ret;
}
/**
* Truncates _all_ bitmaps attached to a BDS.
* Called with BQL taken.
@@ -313,36 +396,6 @@ static bool bdrv_dirty_bitmap_has_name(BdrvDirtyBitmap *bitmap)
return !!bdrv_dirty_bitmap_name(bitmap);
}
/* Called with BQL taken. */
static void bdrv_do_release_matching_dirty_bitmap(
BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
bool (*cond)(BdrvDirtyBitmap *bitmap))
{
BdrvDirtyBitmap *bm, *next;
bdrv_dirty_bitmaps_lock(bs);
QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
if ((!bitmap || bm == bitmap) && (!cond || cond(bm))) {
assert(!bm->active_iterators);
assert(!bdrv_dirty_bitmap_frozen(bm));
assert(!bm->meta);
QLIST_REMOVE(bm, list);
hbitmap_free(bm->bitmap);
g_free(bm->name);
g_free(bm);
if (bitmap) {
goto out;
}
}
}
if (bitmap) {
abort();
}
out:
bdrv_dirty_bitmaps_unlock(bs);
}
/* Called with BQL taken. */
void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
{

View File

@@ -1686,17 +1686,22 @@ static int raw_regular_truncate(int fd, int64_t offset, PreallocMode prealloc,
* file systems that do not support fallocate(), trying to check if a
* block is allocated before allocating it, so don't do that here.
*/
result = -posix_fallocate(fd, current_length, offset - current_length);
if (result != 0) {
/* posix_fallocate() doesn't set errno. */
error_setg_errno(errp, -result,
"Could not preallocate new data");
if (offset != current_length) {
result = -posix_fallocate(fd, current_length, offset - current_length);
if (result != 0) {
/* posix_fallocate() doesn't set errno. */
error_setg_errno(errp, -result,
"Could not preallocate new data");
}
} else {
result = 0;
}
goto out;
#endif
case PREALLOC_MODE_FULL:
{
int64_t num = 0, left = offset - current_length;
off_t seek_result;
/*
* Knowing the final size from the beginning could allow the file
@@ -1711,8 +1716,8 @@ static int raw_regular_truncate(int fd, int64_t offset, PreallocMode prealloc,
buf = g_malloc0(65536);
result = lseek(fd, current_length, SEEK_SET);
if (result < 0) {
seek_result = lseek(fd, current_length, SEEK_SET);
if (seek_result < 0) {
result = -errno;
error_setg_errno(errp, -result,
"Failed to seek to the old end of file");
@@ -1982,34 +1987,25 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
return (int64_t)st.st_blocks * 512;
}
static int coroutine_fn raw_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
{
BlockdevCreateOptionsFile *file_opts;
int fd;
int result = 0;
int64_t total_size = 0;
bool nocow = false;
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
strstart(filename, "file:", &filename);
/* Validate options and set default values */
assert(options->driver == BLOCKDEV_DRIVER_FILE);
file_opts = &options->u.file;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
prealloc = qapi_enum_parse(&PreallocMode_lookup, buf,
PREALLOC_MODE_OFF, &local_err);
g_free(buf);
if (local_err) {
error_propagate(errp, local_err);
result = -EINVAL;
goto out;
if (!file_opts->has_nocow) {
file_opts->nocow = false;
}
if (!file_opts->has_preallocation) {
file_opts->preallocation = PREALLOC_MODE_OFF;
}
fd = qemu_open(filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
/* Create file */
fd = qemu_open(file_opts->filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
0644);
if (fd < 0) {
result = -errno;
@@ -2017,7 +2013,7 @@ static int coroutine_fn raw_co_create_opts(const char *filename, QemuOpts *opts,
goto out;
}
if (nocow) {
if (file_opts->nocow) {
#ifdef __linux__
/* Set NOCOW flag to solve performance issue on fs like btrfs.
* This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
@@ -2032,7 +2028,8 @@ static int coroutine_fn raw_co_create_opts(const char *filename, QemuOpts *opts,
#endif
}
result = raw_regular_truncate(fd, total_size, prealloc, errp);
result = raw_regular_truncate(fd, file_opts->size, file_opts->preallocation,
errp);
if (result < 0) {
goto out_close;
}
@@ -2046,6 +2043,46 @@ out:
return result;
}
static int coroutine_fn raw_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions options;
int64_t total_size = 0;
bool nocow = false;
PreallocMode prealloc;
char *buf = NULL;
Error *local_err = NULL;
/* Skip file: protocol prefix */
strstart(filename, "file:", &filename);
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
prealloc = qapi_enum_parse(&PreallocMode_lookup, buf,
PREALLOC_MODE_OFF, &local_err);
g_free(buf);
if (local_err) {
error_propagate(errp, local_err);
return -EINVAL;
}
options = (BlockdevCreateOptions) {
.driver = BLOCKDEV_DRIVER_FILE,
.u.file = {
.filename = (char *) filename,
.size = total_size,
.has_preallocation = true,
.preallocation = prealloc,
.has_nocow = true,
.nocow = nocow,
},
};
return raw_co_create(&options, errp);
}
/*
* Find allocation range in @bs around offset @start.
* May change underlying file descriptor's file offset.
@@ -2078,7 +2115,12 @@ static int find_allocation(BlockDriverState *bs, off_t start,
if (offs < 0) {
return -errno; /* D3 or D4 */
}
assert(offs >= start);
if (offs < start) {
/* This is not a valid return by lseek(). We are safe to just return
* -EIO in this case, and we'll treat it like D4. */
return -EIO;
}
if (offs > start) {
/* D2: in hole, next data at offs */
@@ -2110,7 +2152,12 @@ static int find_allocation(BlockDriverState *bs, off_t start,
if (offs < 0) {
return -errno; /* D1 and (H3 or H4) */
}
assert(offs >= start);
if (offs < start) {
/* This is not a valid return by lseek(). We are safe to just return
* -EIO in this case, and we'll treat it like H4. */
return -EIO;
}
if (offs > start) {
/*
@@ -2277,6 +2324,7 @@ BlockDriver bdrv_file = {
.bdrv_reopen_commit = raw_reopen_commit,
.bdrv_reopen_abort = raw_reopen_abort,
.bdrv_close = raw_close,
.bdrv_co_create = raw_co_create,
.bdrv_co_create_opts = raw_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_block_status = raw_co_block_status,

View File

@@ -553,10 +553,40 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
return st.st_size;
}
static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
{
BlockdevCreateOptionsFile *file_opts;
int fd;
assert(options->driver == BLOCKDEV_DRIVER_FILE);
file_opts = &options->u.file;
if (file_opts->has_preallocation) {
error_setg(errp, "Preallocation is not supported on Windows");
return -EINVAL;
}
if (file_opts->has_nocow) {
error_setg(errp, "nocow is not supported on Windows");
return -EINVAL;
}
fd = qemu_open(file_opts->filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
if (fd < 0) {
error_setg_errno(errp, errno, "Could not create file");
return -EIO;
}
set_sparse(fd);
ftruncate(fd, file_opts->size);
qemu_close(fd);
return 0;
}
static int coroutine_fn raw_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
{
int fd;
BlockdevCreateOptions options;
int64_t total_size = 0;
strstart(filename, "file:", &filename);
@@ -565,19 +595,18 @@ static int coroutine_fn raw_co_create_opts(const char *filename, QemuOpts *opts,
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
if (fd < 0) {
error_setg_errno(errp, errno, "Could not create file");
return -EIO;
}
set_sparse(fd);
ftruncate(fd, total_size);
qemu_close(fd);
return 0;
options = (BlockdevCreateOptions) {
.driver = BLOCKDEV_DRIVER_FILE,
.u.file = {
.filename = (char *) filename,
.size = total_size,
.has_preallocation = false,
.has_nocow = false,
},
};
return raw_co_create(&options, errp);
}
static QemuOptsList raw_create_opts = {
.name = "raw-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),

View File

@@ -167,7 +167,12 @@ static QemuOptsList runtime_unix_opts = {
{
.name = GLUSTER_OPT_SOCKET,
.type = QEMU_OPT_STRING,
.help = "socket file path)",
.help = "socket file path (legacy)",
},
{
.name = GLUSTER_OPT_PATH,
.type = QEMU_OPT_STRING,
.help = "socket file path (QAPI)",
},
{ /* end of list */ }
},
@@ -615,10 +620,18 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
goto out;
}
ptr = qemu_opt_get(opts, GLUSTER_OPT_SOCKET);
ptr = qemu_opt_get(opts, GLUSTER_OPT_PATH);
if (!ptr) {
ptr = qemu_opt_get(opts, GLUSTER_OPT_SOCKET);
} else if (qemu_opt_get(opts, GLUSTER_OPT_SOCKET)) {
error_setg(&local_err,
"Conflicting parameters 'path' and 'socket'");
error_append_hint(&local_err, GERR_INDEX_HINT, i);
goto out;
}
if (!ptr) {
error_setg(&local_err, QERR_MISSING_PARAMETER,
GLUSTER_OPT_SOCKET);
GLUSTER_OPT_PATH);
error_append_hint(&local_err, GERR_INDEX_HINT, i);
goto out;
}
@@ -655,21 +668,22 @@ out:
return -errno;
}
static struct glfs *qemu_gluster_init(BlockdevOptionsGluster *gconf,
const char *filename,
QDict *options, Error **errp)
/* Converts options given in @filename and the @options QDict into the QAPI
* object @gconf. */
static int qemu_gluster_parse(BlockdevOptionsGluster *gconf,
const char *filename,
QDict *options, Error **errp)
{
int ret;
if (filename) {
ret = qemu_gluster_parse_uri(gconf, filename);
if (ret < 0) {
error_setg(errp, "invalid URI");
error_setg(errp, "invalid URI %s", filename);
error_append_hint(errp, "Usage: file=gluster[+transport]://"
"[host[:port]]volume/path[?socket=...]"
"[,file.debug=N]"
"[,file.logfile=/path/filename.log]\n");
errno = -ret;
return NULL;
return ret;
}
} else {
ret = qemu_gluster_parse_json(gconf, options, errp);
@@ -683,12 +697,25 @@ static struct glfs *qemu_gluster_init(BlockdevOptionsGluster *gconf,
"file.server.0.host=1.2.3.4,"
"file.server.0.port=24007,"
"file.server.1.transport=unix,"
"file.server.1.socket=/var/run/glusterd.socket ..."
"file.server.1.path=/var/run/glusterd.socket ..."
"\n");
errno = -ret;
return NULL;
return ret;
}
}
return 0;
}
static struct glfs *qemu_gluster_init(BlockdevOptionsGluster *gconf,
const char *filename,
QDict *options, Error **errp)
{
int ret;
ret = qemu_gluster_parse(gconf, filename, options, errp);
if (ret < 0) {
errno = -ret;
return NULL;
}
return qemu_gluster_glfs_init(gconf, errp);
@@ -1021,20 +1048,72 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, int64_t offset,
return 0;
}
static int qemu_gluster_co_create(BlockdevCreateOptions *options,
Error **errp)
{
BlockdevCreateOptionsGluster *opts = &options->u.gluster;
struct glfs *glfs;
struct glfs_fd *fd = NULL;
int ret = 0;
assert(options->driver == BLOCKDEV_DRIVER_GLUSTER);
glfs = qemu_gluster_glfs_init(opts->location, errp);
if (!glfs) {
ret = -errno;
goto out;
}
fd = glfs_creat(glfs, opts->location->path,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | S_IWUSR);
if (!fd) {
ret = -errno;
goto out;
}
ret = qemu_gluster_do_truncate(fd, opts->size, opts->preallocation, errp);
out:
if (fd) {
if (glfs_close(fd) != 0 && ret == 0) {
ret = -errno;
}
}
glfs_clear_preopened(glfs);
return ret;
}
static int coroutine_fn qemu_gluster_co_create_opts(const char *filename,
QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions *options;
BlockdevCreateOptionsGluster *gopts;
BlockdevOptionsGluster *gconf;
struct glfs *glfs;
struct glfs_fd *fd = NULL;
int ret = 0;
PreallocMode prealloc;
int64_t total_size = 0;
char *tmp = NULL;
Error *local_err = NULL;
int ret;
options = g_new0(BlockdevCreateOptions, 1);
options->driver = BLOCKDEV_DRIVER_GLUSTER;
gopts = &options->u.gluster;
gconf = g_new0(BlockdevOptionsGluster, 1);
gopts->location = gconf;
gopts->size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
gopts->preallocation = qapi_enum_parse(&PreallocMode_lookup, tmp,
PREALLOC_MODE_OFF, &local_err);
g_free(tmp);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
gconf->debug = qemu_opt_get_number_del(opts, GLUSTER_OPT_DEBUG,
GLUSTER_DEBUG_DEFAULT);
if (gconf->debug < 0) {
@@ -1050,42 +1129,19 @@ static int coroutine_fn qemu_gluster_co_create_opts(const char *filename,
}
gconf->has_logfile = true;
glfs = qemu_gluster_init(gconf, filename, NULL, errp);
if (!glfs) {
ret = -errno;
goto out;
ret = qemu_gluster_parse(gconf, filename, NULL, errp);
if (ret < 0) {
goto fail;
}
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
prealloc = qapi_enum_parse(&PreallocMode_lookup, tmp, PREALLOC_MODE_OFF,
&local_err);
g_free(tmp);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
ret = qemu_gluster_co_create(options, errp);
if (ret < 0) {
goto fail;
}
fd = glfs_creat(glfs, gconf->path,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | S_IWUSR);
if (!fd) {
ret = -errno;
goto out;
}
ret = qemu_gluster_do_truncate(fd, total_size, prealloc, errp);
out:
if (fd) {
if (glfs_close(fd) != 0 && ret == 0) {
ret = -errno;
}
}
qapi_free_BlockdevOptionsGluster(gconf);
glfs_clear_preopened(glfs);
ret = 0;
fail:
qapi_free_BlockdevCreateOptions(options);
return ret;
}
@@ -1436,6 +1492,7 @@ static BlockDriver bdrv_gluster = {
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_co_create = qemu_gluster_co_create,
.bdrv_co_create_opts = qemu_gluster_co_create_opts,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
@@ -1464,6 +1521,7 @@ static BlockDriver bdrv_gluster_tcp = {
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_co_create = qemu_gluster_co_create,
.bdrv_co_create_opts = qemu_gluster_co_create_opts,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
@@ -1492,6 +1550,7 @@ static BlockDriver bdrv_gluster_unix = {
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_co_create = qemu_gluster_co_create,
.bdrv_co_create_opts = qemu_gluster_co_create_opts,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
@@ -1526,6 +1585,7 @@ static BlockDriver bdrv_gluster_rdma = {
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_co_create = qemu_gluster_co_create,
.bdrv_co_create_opts = qemu_gluster_co_create_opts,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,

View File

@@ -249,8 +249,7 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
BdrvCoDrainData data;
/* Calling bdrv_drain() from a BH ensures the current coroutine yields and
* other coroutines run if they were queued from
* qemu_co_queue_run_restart(). */
* other coroutines run if they were queued by aio_co_enter(). */
assert(qemu_in_coroutine());
data = (BdrvCoDrainData) {

View File

@@ -2177,8 +2177,8 @@ static int iscsi_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return 0;
}
static void iscsi_invalidate_cache(BlockDriverState *bs,
Error **errp)
static void coroutine_fn iscsi_co_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
IscsiLun *iscsilun = bs->opaque;
iscsi_allocmap_invalidate(iscsilun);
@@ -2209,7 +2209,7 @@ static BlockDriver bdrv_iscsi = {
.create_opts = &iscsi_create_opts,
.bdrv_reopen_prepare = iscsi_reopen_prepare,
.bdrv_reopen_commit = iscsi_reopen_commit,
.bdrv_invalidate_cache = iscsi_invalidate_cache,
.bdrv_co_invalidate_cache = iscsi_co_invalidate_cache,
.bdrv_getlength = iscsi_getlength,
.bdrv_get_info = iscsi_get_info,
@@ -2244,7 +2244,7 @@ static BlockDriver bdrv_iser = {
.create_opts = &iscsi_create_opts,
.bdrv_reopen_prepare = iscsi_reopen_prepare,
.bdrv_reopen_commit = iscsi_reopen_commit,
.bdrv_invalidate_cache = iscsi_invalidate_cache,
.bdrv_co_invalidate_cache = iscsi_co_invalidate_cache,
.bdrv_getlength = iscsi_getlength,
.bdrv_get_info = iscsi_get_info,

View File

@@ -869,11 +869,8 @@ static void coroutine_fn mirror_run(void *opaque)
ret = 0;
trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
if (!s->synced) {
block_job_sleep_ns(&s->common, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
if (block_job_is_cancelled(&s->common) && s->common.force) {
break;
} else if (!should_complete) {
delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
block_job_sleep_ns(&s->common, delay_ns);
@@ -887,7 +884,8 @@ immediate_exit:
* or it was cancelled prematurely so that we do not guarantee that
* the target is a copy of the source.
*/
assert(ret < 0 || (!s->synced && block_job_is_cancelled(&s->common)));
assert(ret < 0 || ((s->common.force || !s->synced) &&
block_job_is_cancelled(&s->common)));
assert(need_drain);
mirror_wait_for_all_io(s);
}
@@ -1166,7 +1164,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
}
/* Make sure that the source is not resized while the job is running */
s = block_job_create(job_id, driver, mirror_top_bs,
s = block_job_create(job_id, driver, NULL, mirror_top_bs,
BLK_PERM_CONSISTENT_READ,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD, speed,

View File

@@ -228,6 +228,48 @@ static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk,
return 0;
}
/* nbd_parse_blockstatus_payload
* support only one extent in reply and only for
* base:allocation context
*/
static int nbd_parse_blockstatus_payload(NBDClientSession *client,
NBDStructuredReplyChunk *chunk,
uint8_t *payload, uint64_t orig_length,
NBDExtent *extent, Error **errp)
{
uint32_t context_id;
if (chunk->length != sizeof(context_id) + sizeof(*extent)) {
error_setg(errp, "Protocol error: invalid payload for "
"NBD_REPLY_TYPE_BLOCK_STATUS");
return -EINVAL;
}
context_id = payload_advance32(&payload);
if (client->info.meta_base_allocation_id != context_id) {
error_setg(errp, "Protocol error: unexpected context id %d for "
"NBD_REPLY_TYPE_BLOCK_STATUS, when negotiated context "
"id is %d", context_id,
client->info.meta_base_allocation_id);
return -EINVAL;
}
extent->length = payload_advance32(&payload);
extent->flags = payload_advance32(&payload);
if (extent->length == 0 ||
(client->info.min_block && !QEMU_IS_ALIGNED(extent->length,
client->info.min_block)) ||
extent->length > orig_length)
{
error_setg(errp, "Protocol error: server sent status chunk with "
"invalid length");
return -EINVAL;
}
return 0;
}
/* nbd_parse_error_payload
* on success @errp contains message describing nbd error reply
*/
@@ -481,6 +523,7 @@ static coroutine_fn int nbd_co_receive_one_chunk(
typedef struct NBDReplyChunkIter {
int ret;
bool fatal;
Error *err;
bool done, only_structured;
} NBDReplyChunkIter;
@@ -490,11 +533,12 @@ static void nbd_iter_error(NBDReplyChunkIter *iter, bool fatal,
{
assert(ret < 0);
if (fatal || iter->ret == 0) {
if ((fatal && !iter->fatal) || iter->ret == 0) {
if (iter->ret != 0) {
error_free(iter->err);
iter->err = NULL;
}
iter->fatal = fatal;
iter->ret = ret;
error_propagate(&iter->err, *local_err);
} else {
@@ -640,6 +684,68 @@ static int nbd_co_receive_cmdread_reply(NBDClientSession *s, uint64_t handle,
return iter.ret;
}
static int nbd_co_receive_blockstatus_reply(NBDClientSession *s,
uint64_t handle, uint64_t length,
NBDExtent *extent, Error **errp)
{
NBDReplyChunkIter iter;
NBDReply reply;
void *payload = NULL;
Error *local_err = NULL;
bool received = false;
assert(!extent->length);
NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply,
NULL, &reply, &payload)
{
int ret;
NBDStructuredReplyChunk *chunk = &reply.structured;
assert(nbd_reply_is_structured(&reply));
switch (chunk->type) {
case NBD_REPLY_TYPE_BLOCK_STATUS:
if (received) {
s->quit = true;
error_setg(&local_err, "Several BLOCK_STATUS chunks in reply");
nbd_iter_error(&iter, true, -EINVAL, &local_err);
}
received = true;
ret = nbd_parse_blockstatus_payload(s, &reply.structured,
payload, length, extent,
&local_err);
if (ret < 0) {
s->quit = true;
nbd_iter_error(&iter, true, ret, &local_err);
}
break;
default:
if (!nbd_reply_type_is_error(chunk->type)) {
s->quit = true;
error_setg(&local_err,
"Unexpected reply type: %d (%s) "
"for CMD_BLOCK_STATUS",
chunk->type, nbd_reply_type_lookup(chunk->type));
nbd_iter_error(&iter, true, -EINVAL, &local_err);
}
}
g_free(payload);
payload = NULL;
}
if (!extent->length && !iter.err) {
error_setg(&iter.err,
"Server did not reply with any status extents");
if (!iter.ret) {
iter.ret = -EIO;
}
}
error_propagate(errp, iter.err);
return iter.ret;
}
static int nbd_co_request(BlockDriverState *bs, NBDRequest *request,
QEMUIOVector *write_qiov)
{
@@ -782,6 +888,51 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
return nbd_co_request(bs, &request, NULL);
}
int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs,
bool want_zero,
int64_t offset, int64_t bytes,
int64_t *pnum, int64_t *map,
BlockDriverState **file)
{
int64_t ret;
NBDExtent extent = { 0 };
NBDClientSession *client = nbd_get_client_session(bs);
Error *local_err = NULL;
NBDRequest request = {
.type = NBD_CMD_BLOCK_STATUS,
.from = offset,
.len = MIN(MIN_NON_ZERO(QEMU_ALIGN_DOWN(INT_MAX,
bs->bl.request_alignment),
client->info.max_block), bytes),
.flags = NBD_CMD_FLAG_REQ_ONE,
};
if (!client->info.base_allocation) {
*pnum = bytes;
return BDRV_BLOCK_DATA;
}
ret = nbd_co_send_request(bs, &request, NULL);
if (ret < 0) {
return ret;
}
ret = nbd_co_receive_blockstatus_reply(client, request.handle, bytes,
&extent, &local_err);
if (local_err) {
error_report_err(local_err);
}
if (ret < 0) {
return ret;
}
assert(extent.length);
*pnum = extent.length;
return (extent.flags & NBD_STATE_HOLE ? 0 : BDRV_BLOCK_DATA) |
(extent.flags & NBD_STATE_ZERO ? BDRV_BLOCK_ZERO : 0);
}
void nbd_client_detach_aio_context(BlockDriverState *bs)
{
NBDClientSession *client = nbd_get_client_session(bs);
@@ -826,6 +977,7 @@ int nbd_client_init(BlockDriverState *bs,
client->info.request_sizes = true;
client->info.structured_reply = true;
client->info.base_allocation = true;
ret = nbd_receive_negotiate(QIO_CHANNEL(sioc), export,
tlscreds, hostname,
&client->ioc, &client->info, errp);

View File

@@ -61,4 +61,10 @@ void nbd_client_detach_aio_context(BlockDriverState *bs);
void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context);
int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs,
bool want_zero,
int64_t offset, int64_t bytes,
int64_t *pnum, int64_t *map,
BlockDriverState **file);
#endif /* NBD_CLIENT_H */

View File

@@ -585,6 +585,7 @@ static BlockDriver bdrv_nbd = {
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
.bdrv_co_block_status = nbd_client_co_block_status,
};
static BlockDriver bdrv_nbd_tcp = {
@@ -604,6 +605,7 @@ static BlockDriver bdrv_nbd_tcp = {
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
.bdrv_co_block_status = nbd_client_co_block_status,
};
static BlockDriver bdrv_nbd_unix = {
@@ -623,6 +625,7 @@ static BlockDriver bdrv_nbd_unix = {
.bdrv_detach_aio_context = nbd_detach_aio_context,
.bdrv_attach_aio_context = nbd_attach_aio_context,
.bdrv_refresh_filename = nbd_refresh_filename,
.bdrv_co_block_status = nbd_client_co_block_status,
};
static void bdrv_nbd_init(void)

View File

@@ -367,49 +367,6 @@ static int coroutine_fn nfs_co_flush(BlockDriverState *bs)
return task.ret;
}
static QemuOptsList runtime_opts = {
.name = "nfs",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "path",
.type = QEMU_OPT_STRING,
.help = "Path of the image on the host",
},
{
.name = "user",
.type = QEMU_OPT_NUMBER,
.help = "UID value to use when talking to the server",
},
{
.name = "group",
.type = QEMU_OPT_NUMBER,
.help = "GID value to use when talking to the server",
},
{
.name = "tcp-syn-count",
.type = QEMU_OPT_NUMBER,
.help = "Number of SYNs to send during the session establish",
},
{
.name = "readahead-size",
.type = QEMU_OPT_NUMBER,
.help = "Set the readahead size in bytes",
},
{
.name = "page-cache-size",
.type = QEMU_OPT_NUMBER,
.help = "Set the pagecache size in bytes",
},
{
.name = "debug",
.type = QEMU_OPT_NUMBER,
.help = "Set the NFS debug level (max 2)",
},
{ /* end of list */ }
},
};
static void nfs_detach_aio_context(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
@@ -452,71 +409,16 @@ static void nfs_file_close(BlockDriverState *bs)
nfs_client_close(client);
}
static NFSServer *nfs_config(QDict *options, Error **errp)
{
NFSServer *server = NULL;
QDict *addr = NULL;
QObject *crumpled_addr = NULL;
Visitor *iv = NULL;
Error *local_error = NULL;
qdict_extract_subqdict(options, &addr, "server.");
if (!qdict_size(addr)) {
error_setg(errp, "NFS server address missing");
goto out;
}
crumpled_addr = qdict_crumple(addr, errp);
if (!crumpled_addr) {
goto out;
}
/*
* Caution: this works only because all scalar members of
* NFSServer are QString in @crumpled_addr. The visitor expects
* @crumpled_addr to be typed according to the QAPI schema. It
* is when @options come from -blockdev or blockdev_add. But when
* they come from -drive, they're all QString.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_NFSServer(iv, NULL, &server, &local_error);
if (local_error) {
error_propagate(errp, local_error);
goto out;
}
out:
QDECREF(addr);
qobject_decref(crumpled_addr);
visit_free(iv);
return server;
}
static int64_t nfs_client_open(NFSClient *client, QDict *options,
static int64_t nfs_client_open(NFSClient *client, BlockdevOptionsNfs *opts,
int flags, int open_flags, Error **errp)
{
int64_t ret = -EINVAL;
QemuOpts *opts = NULL;
Error *local_err = NULL;
struct stat st;
char *file = NULL, *strp = NULL;
qemu_mutex_init(&client->mutex);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
client->path = g_strdup(qemu_opt_get(opts, "path"));
if (!client->path) {
ret = -EINVAL;
error_setg(errp, "No path was specified");
goto fail;
}
client->path = g_strdup(opts->path);
strp = strrchr(client->path, '/');
if (strp == NULL) {
@@ -526,12 +428,10 @@ static int64_t nfs_client_open(NFSClient *client, QDict *options,
file = g_strdup(strp);
*strp = 0;
/* Pop the config into our state object, Exit if invalid */
client->server = nfs_config(options, errp);
if (!client->server) {
ret = -EINVAL;
goto fail;
}
/* Steal the NFSServer object from opts; set the original pointer to NULL
* to avoid use after free and double free. */
client->server = opts->server;
opts->server = NULL;
client->context = nfs_init_context();
if (client->context == NULL) {
@@ -539,29 +439,29 @@ static int64_t nfs_client_open(NFSClient *client, QDict *options,
goto fail;
}
if (qemu_opt_get(opts, "user")) {
client->uid = qemu_opt_get_number(opts, "user", 0);
if (opts->has_user) {
client->uid = opts->user;
nfs_set_uid(client->context, client->uid);
}
if (qemu_opt_get(opts, "group")) {
client->gid = qemu_opt_get_number(opts, "group", 0);
if (opts->has_group) {
client->gid = opts->group;
nfs_set_gid(client->context, client->gid);
}
if (qemu_opt_get(opts, "tcp-syn-count")) {
client->tcp_syncnt = qemu_opt_get_number(opts, "tcp-syn-count", 0);
if (opts->has_tcp_syn_count) {
client->tcp_syncnt = opts->tcp_syn_count;
nfs_set_tcp_syncnt(client->context, client->tcp_syncnt);
}
#ifdef LIBNFS_FEATURE_READAHEAD
if (qemu_opt_get(opts, "readahead-size")) {
if (opts->has_readahead_size) {
if (open_flags & BDRV_O_NOCACHE) {
error_setg(errp, "Cannot enable NFS readahead "
"if cache.direct = on");
goto fail;
}
client->readahead = qemu_opt_get_number(opts, "readahead-size", 0);
client->readahead = opts->readahead_size;
if (client->readahead > QEMU_NFS_MAX_READAHEAD_SIZE) {
warn_report("Truncating NFS readahead size to %d",
QEMU_NFS_MAX_READAHEAD_SIZE);
@@ -576,13 +476,13 @@ static int64_t nfs_client_open(NFSClient *client, QDict *options,
#endif
#ifdef LIBNFS_FEATURE_PAGECACHE
if (qemu_opt_get(opts, "page-cache-size")) {
if (opts->has_page_cache_size) {
if (open_flags & BDRV_O_NOCACHE) {
error_setg(errp, "Cannot enable NFS pagecache "
"if cache.direct = on");
goto fail;
}
client->pagecache = qemu_opt_get_number(opts, "page-cache-size", 0);
client->pagecache = opts->page_cache_size;
if (client->pagecache > QEMU_NFS_MAX_PAGECACHE_SIZE) {
warn_report("Truncating NFS pagecache size to %d pages",
QEMU_NFS_MAX_PAGECACHE_SIZE);
@@ -595,8 +495,8 @@ static int64_t nfs_client_open(NFSClient *client, QDict *options,
#endif
#ifdef LIBNFS_FEATURE_DEBUG
if (qemu_opt_get(opts, "debug")) {
client->debug = qemu_opt_get_number(opts, "debug", 0);
if (opts->has_debug) {
client->debug = opts->debug;
/* limit the maximum debug level to avoid potential flooding
* of our log files. */
if (client->debug > QEMU_NFS_MAX_DEBUG_LEVEL) {
@@ -647,11 +547,53 @@ static int64_t nfs_client_open(NFSClient *client, QDict *options,
fail:
nfs_client_close(client);
out:
qemu_opts_del(opts);
g_free(file);
return ret;
}
static BlockdevOptionsNfs *nfs_options_qdict_to_qapi(QDict *options,
Error **errp)
{
BlockdevOptionsNfs *opts = NULL;
QObject *crumpled = NULL;
Visitor *v;
Error *local_err = NULL;
crumpled = qdict_crumple(options, errp);
if (crumpled == NULL) {
return NULL;
}
v = qobject_input_visitor_new_keyval(crumpled);
visit_type_BlockdevOptionsNfs(v, NULL, &opts, &local_err);
visit_free(v);
qobject_decref(crumpled);
if (local_err) {
return NULL;
}
return opts;
}
static int64_t nfs_client_open_qdict(NFSClient *client, QDict *options,
int flags, int open_flags, Error **errp)
{
BlockdevOptionsNfs *opts;
int ret;
opts = nfs_options_qdict_to_qapi(options, errp);
if (opts == NULL) {
ret = -EINVAL;
goto fail;
}
ret = nfs_client_open(client, opts, flags, open_flags, errp);
fail:
qapi_free_BlockdevOptionsNfs(opts);
return ret;
}
static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp) {
NFSClient *client = bs->opaque;
@@ -659,9 +601,9 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
client->aio_context = bdrv_get_aio_context(bs);
ret = nfs_client_open(client, options,
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
bs->open_flags, errp);
ret = nfs_client_open_qdict(client, options,
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
bs->open_flags, errp);
if (ret < 0) {
return ret;
}
@@ -684,18 +626,43 @@ static QemuOptsList nfs_create_opts = {
}
};
static int coroutine_fn nfs_file_co_create_opts(const char *url, QemuOpts *opts,
Error **errp)
static int nfs_file_co_create(BlockdevCreateOptions *options, Error **errp)
{
int64_t ret, total_size;
BlockdevCreateOptionsNfs *opts = &options->u.nfs;
NFSClient *client = g_new0(NFSClient, 1);
QDict *options = NULL;
int ret;
assert(options->driver == BLOCKDEV_DRIVER_NFS);
client->aio_context = qemu_get_aio_context();
ret = nfs_client_open(client, opts->location, O_CREAT, 0, errp);
if (ret < 0) {
goto out;
}
ret = nfs_ftruncate(client->context, client->fh, opts->size);
nfs_client_close(client);
out:
g_free(client);
return ret;
}
static int coroutine_fn nfs_file_co_create_opts(const char *url, QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions *create_options;
BlockdevCreateOptionsNfs *nfs_opts;
QDict *options;
int ret;
create_options = g_new0(BlockdevCreateOptions, 1);
create_options->driver = BLOCKDEV_DRIVER_NFS;
nfs_opts = &create_options->u.nfs;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
nfs_opts->size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
options = qdict_new();
ret = nfs_parse_uri(url, options, errp);
@@ -703,15 +670,21 @@ static int coroutine_fn nfs_file_co_create_opts(const char *url, QemuOpts *opts,
goto out;
}
ret = nfs_client_open(client, options, O_CREAT, 0, errp);
nfs_opts->location = nfs_options_qdict_to_qapi(options, errp);
if (nfs_opts->location == NULL) {
ret = -EINVAL;
goto out;
}
ret = nfs_file_co_create(create_options, errp);
if (ret < 0) {
goto out;
}
ret = nfs_ftruncate(client->context, client->fh, total_size);
nfs_client_close(client);
ret = 0;
out:
QDECREF(options);
g_free(client);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
@@ -876,8 +849,8 @@ static void nfs_refresh_filename(BlockDriverState *bs, QDict *options)
}
#ifdef LIBNFS_FEATURE_PAGECACHE
static void nfs_invalidate_cache(BlockDriverState *bs,
Error **errp)
static void coroutine_fn nfs_co_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
NFSClient *client = bs->opaque;
nfs_pagecache_invalidate(client->context, client->fh);
@@ -898,6 +871,7 @@ static BlockDriver bdrv_nfs = {
.bdrv_file_open = nfs_file_open,
.bdrv_close = nfs_file_close,
.bdrv_co_create = nfs_file_co_create,
.bdrv_co_create_opts = nfs_file_co_create_opts,
.bdrv_reopen_prepare = nfs_reopen_prepare,
@@ -910,7 +884,7 @@ static BlockDriver bdrv_nfs = {
.bdrv_refresh_filename = nfs_refresh_filename,
#ifdef LIBNFS_FEATURE_PAGECACHE
.bdrv_invalidate_cache = nfs_invalidate_cache,
.bdrv_co_invalidate_cache = nfs_co_invalidate_cache,
#endif
};

View File

@@ -695,12 +695,11 @@ static void nvme_parse_filename(const char *filename, QDict *options,
unsigned long ns;
const char *slash = strchr(tmp, '/');
if (!slash) {
qdict_put(options, NVME_BLOCK_OPT_DEVICE,
qstring_from_str(tmp));
qdict_put_str(options, NVME_BLOCK_OPT_DEVICE, tmp);
return;
}
device = g_strndup(tmp, slash - tmp);
qdict_put(options, NVME_BLOCK_OPT_DEVICE, qstring_from_str(device));
qdict_put_str(options, NVME_BLOCK_OPT_DEVICE, device);
g_free(device);
namespace = slash + 1;
if (*namespace && qemu_strtoul(namespace, NULL, 10, &ns)) {
@@ -708,8 +707,8 @@ static void nvme_parse_filename(const char *filename, QDict *options,
namespace);
return;
}
qdict_put(options, NVME_BLOCK_OPT_NAMESPACE,
qstring_from_str(*namespace ? namespace : "1"));
qdict_put_str(options, NVME_BLOCK_OPT_NAMESPACE,
*namespace ? namespace : "1");
}
}
@@ -1082,7 +1081,7 @@ static void nvme_refresh_filename(BlockDriverState *bs, QDict *opts)
bs->drv->format_name);
}
qdict_put(opts, "driver", qstring_from_str(bs->drv->format_name));
qdict_put_str(opts, "driver", bs->drv->format_name);
bs->full_open_options = opts;
}

View File

@@ -34,6 +34,9 @@
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "qemu/option.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"
#include "qemu/bswap.h"
#include "qemu/bitmap.h"
#include "migration/blocker.h"
@@ -79,6 +82,25 @@ static QemuOptsList parallels_runtime_opts = {
},
};
static QemuOptsList parallels_create_opts = {
.name = "parallels-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(parallels_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size",
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Parallels image cluster size",
.def_value_str = stringify(DEFAULT_CLUSTER_SIZE),
},
{ /* end of list */ }
}
};
static int64_t bat2sect(BDRVParallelsState *s, uint32_t idx)
{
@@ -378,8 +400,9 @@ static coroutine_fn int parallels_co_readv(BlockDriverState *bs,
}
static int parallels_check(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix)
static int coroutine_fn parallels_co_check(BlockDriverState *bs,
BdrvCheckResult *res,
BdrvCheckMode fix)
{
BDRVParallelsState *s = bs->opaque;
int64_t size, prev_off, high_off;
@@ -394,6 +417,7 @@ static int parallels_check(BlockDriverState *bs, BdrvCheckResult *res,
return size;
}
qemu_co_mutex_lock(&s->lock);
if (s->header_unclean) {
fprintf(stderr, "%s image was not closed correctly\n",
fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR");
@@ -442,11 +466,12 @@ static int parallels_check(BlockDriverState *bs, BdrvCheckResult *res,
prev_off = off;
}
ret = 0;
if (flush_bat) {
ret = bdrv_pwrite_sync(bs->file, 0, s->header, s->header_size);
if (ret < 0) {
res->check_errors++;
return ret;
goto out;
}
}
@@ -465,56 +490,79 @@ static int parallels_check(BlockDriverState *bs, BdrvCheckResult *res,
if (ret < 0) {
error_report_err(local_err);
res->check_errors++;
return ret;
goto out;
}
res->leaks_fixed += count;
}
}
return 0;
out:
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int coroutine_fn parallels_co_create_opts(const char *filename,
QemuOpts *opts,
Error **errp)
static int coroutine_fn parallels_co_create(BlockdevCreateOptions* opts,
Error **errp)
{
BlockdevCreateOptionsParallels *parallels_opts;
BlockDriverState *bs;
BlockBackend *blk;
int64_t total_size, cl_size;
uint8_t tmp[BDRV_SECTOR_SIZE];
Error *local_err = NULL;
BlockBackend *file;
uint32_t bat_entries, bat_sectors;
ParallelsHeader header;
uint8_t tmp[BDRV_SECTOR_SIZE];
int ret;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
cl_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE), BDRV_SECTOR_SIZE);
assert(opts->driver == BLOCKDEV_DRIVER_PARALLELS);
parallels_opts = &opts->u.parallels;
/* Sanity checks */
total_size = parallels_opts->size;
if (parallels_opts->has_cluster_size) {
cl_size = parallels_opts->cluster_size;
} else {
cl_size = DEFAULT_CLUSTER_SIZE;
}
/* XXX What is the real limit here? This is an insanely large maximum. */
if (cl_size >= INT64_MAX / MAX_PARALLELS_IMAGE_FACTOR) {
error_setg(errp, "Cluster size is too large");
return -EINVAL;
}
if (total_size >= MAX_PARALLELS_IMAGE_FACTOR * cl_size) {
error_propagate(errp, local_err);
error_setg(errp, "Image size is too large for this cluster size");
return -E2BIG;
}
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
if (!QEMU_IS_ALIGNED(total_size, BDRV_SECTOR_SIZE)) {
error_setg(errp, "Image size must be a multiple of 512 bytes");
return -EINVAL;
}
file = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
&local_err);
if (file == NULL) {
error_propagate(errp, local_err);
if (!QEMU_IS_ALIGNED(cl_size, BDRV_SECTOR_SIZE)) {
error_setg(errp, "Cluster size must be a multiple of 512 bytes");
return -EINVAL;
}
/* Create BlockBackend to write to the image */
bs = bdrv_open_blockdev_ref(parallels_opts->file, errp);
if (bs == NULL) {
return -EIO;
}
blk_set_allow_write_beyond_eof(file, true);
ret = blk_truncate(file, 0, PREALLOC_MODE_OFF, errp);
blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
ret = blk_insert_bs(blk, bs, errp);
if (ret < 0) {
goto exit;
goto out;
}
blk_set_allow_write_beyond_eof(blk, true);
/* Create image format */
ret = blk_truncate(blk, 0, PREALLOC_MODE_OFF, errp);
if (ret < 0) {
goto out;
}
bat_entries = DIV_ROUND_UP(total_size, cl_size);
@@ -537,24 +585,107 @@ static int coroutine_fn parallels_co_create_opts(const char *filename,
memset(tmp, 0, sizeof(tmp));
memcpy(tmp, &header, sizeof(header));
ret = blk_pwrite(file, 0, tmp, BDRV_SECTOR_SIZE, 0);
ret = blk_pwrite(blk, 0, tmp, BDRV_SECTOR_SIZE, 0);
if (ret < 0) {
goto exit;
}
ret = blk_pwrite_zeroes(file, BDRV_SECTOR_SIZE,
ret = blk_pwrite_zeroes(blk, BDRV_SECTOR_SIZE,
(bat_sectors - 1) << BDRV_SECTOR_BITS, 0);
if (ret < 0) {
goto exit;
}
ret = 0;
done:
blk_unref(file);
ret = 0;
out:
blk_unref(blk);
bdrv_unref(bs);
return ret;
exit:
error_setg_errno(errp, -ret, "Failed to create Parallels image");
goto done;
goto out;
}
static int coroutine_fn parallels_co_create_opts(const char *filename,
QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions *create_options = NULL;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
QDict *qdict = NULL;
QObject *qobj;
Visitor *v;
int ret;
static const QDictRenames opt_renames[] = {
{ BLOCK_OPT_CLUSTER_SIZE, "cluster-size" },
{ NULL, NULL },
};
/* Parse options and convert legacy syntax */
qdict = qemu_opts_to_qdict_filtered(opts, NULL, &parallels_create_opts,
true);
if (!qdict_rename_keys(qdict, opt_renames, errp)) {
ret = -EINVAL;
goto done;
}
/* Create and open the file (protocol layer) */
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto done;
}
bs = bdrv_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
if (bs == NULL) {
ret = -EIO;
goto done;
}
/* Now get the QAPI type BlockdevCreateOptions */
qdict_put_str(qdict, "driver", "parallels");
qdict_put_str(qdict, "file", bs->node_name);
qobj = qdict_crumple(qdict, errp);
QDECREF(qdict);
qdict = qobject_to(QDict, qobj);
if (qdict == NULL) {
ret = -EINVAL;
goto done;
}
v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
visit_free(v);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto done;
}
/* Silently round up sizes */
create_options->u.parallels.size =
ROUND_UP(create_options->u.parallels.size, BDRV_SECTOR_SIZE);
create_options->u.parallels.cluster_size =
ROUND_UP(create_options->u.parallels.cluster_size, BDRV_SECTOR_SIZE);
/* Create the Parallels image (format layer) */
ret = parallels_co_create(create_options, errp);
if (ret < 0) {
goto done;
}
ret = 0;
done:
QDECREF(qdict);
bdrv_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
@@ -766,25 +897,6 @@ static void parallels_close(BlockDriverState *bs)
error_free(s->migration_blocker);
}
static QemuOptsList parallels_create_opts = {
.name = "parallels-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(parallels_create_opts.head),
.desc = {
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size",
},
{
.name = BLOCK_OPT_CLUSTER_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Parallels image cluster size",
.def_value_str = stringify(DEFAULT_CLUSTER_SIZE),
},
{ /* end of list */ }
}
};
static BlockDriver bdrv_parallels = {
.format_name = "parallels",
.instance_size = sizeof(BDRVParallelsState),
@@ -798,8 +910,9 @@ static BlockDriver bdrv_parallels = {
.bdrv_co_readv = parallels_co_readv,
.bdrv_co_writev = parallels_co_writev,
.supports_backing = true,
.bdrv_co_create = parallels_co_create,
.bdrv_co_create_opts = parallels_co_create_opts,
.bdrv_check = parallels_check,
.bdrv_co_check = parallels_co_check,
.create_opts = &parallels_create_opts,
};

View File

@@ -394,6 +394,37 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
qapi_free_BlockInfo(info);
}
static uint64List *uint64_list(uint64_t *list, int size)
{
int i;
uint64List *out_list = NULL;
uint64List **pout_list = &out_list;
for (i = 0; i < size; i++) {
uint64List *entry = g_new(uint64List, 1);
entry->value = list[i];
*pout_list = entry;
pout_list = &entry->next;
}
*pout_list = NULL;
return out_list;
}
static void bdrv_latency_histogram_stats(BlockLatencyHistogram *hist,
bool *not_null,
BlockLatencyHistogramInfo **info)
{
*not_null = hist->bins != NULL;
if (*not_null) {
*info = g_new0(BlockLatencyHistogramInfo, 1);
(*info)->boundaries = uint64_list(hist->boundaries, hist->nbins - 1);
(*info)->bins = uint64_list(hist->bins, hist->nbins);
}
}
static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
{
BlockAcctStats *stats = blk_get_stats(blk);
@@ -459,6 +490,16 @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
dev_stats->avg_wr_queue_depth =
block_acct_queue_depth(ts, BLOCK_ACCT_WRITE);
}
bdrv_latency_histogram_stats(&stats->latency_histogram[BLOCK_ACCT_READ],
&ds->has_x_rd_latency_histogram,
&ds->x_rd_latency_histogram);
bdrv_latency_histogram_stats(&stats->latency_histogram[BLOCK_ACCT_WRITE],
&ds->has_x_wr_latency_histogram,
&ds->x_wr_latency_histogram);
bdrv_latency_histogram_stats(&stats->latency_histogram[BLOCK_ACCT_FLUSH],
&ds->has_x_flush_latency_histogram,
&ds->x_flush_latency_histogram);
}
static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
@@ -647,29 +688,29 @@ static void dump_qobject(fprintf_function func_fprintf, void *f,
{
switch (qobject_type(obj)) {
case QTYPE_QNUM: {
QNum *value = qobject_to_qnum(obj);
QNum *value = qobject_to(QNum, obj);
char *tmp = qnum_to_string(value);
func_fprintf(f, "%s", tmp);
g_free(tmp);
break;
}
case QTYPE_QSTRING: {
QString *value = qobject_to_qstring(obj);
QString *value = qobject_to(QString, obj);
func_fprintf(f, "%s", qstring_get_str(value));
break;
}
case QTYPE_QDICT: {
QDict *value = qobject_to_qdict(obj);
QDict *value = qobject_to(QDict, obj);
dump_qdict(func_fprintf, f, comp_indent, value);
break;
}
case QTYPE_QLIST: {
QList *value = qobject_to_qlist(obj);
QList *value = qobject_to(QList, obj);
dump_qlist(func_fprintf, f, comp_indent, value);
break;
}
case QTYPE_QBOOL: {
QBool *value = qobject_to_qbool(obj);
QBool *value = qobject_to(QBool, obj);
func_fprintf(f, "%s", qbool_get_bool(value) ? "true" : "false");
break;
}
@@ -730,7 +771,7 @@ void bdrv_image_info_specific_dump(fprintf_function func_fprintf, void *f,
visit_type_ImageInfoSpecific(v, NULL, &info_spec, &error_abort);
visit_complete(v, &obj);
data = qdict_get(qobject_to_qdict(obj), "data");
data = qdict_get(qobject_to(QDict, obj), "data");
dump_qobject(func_fprintf, f, 1, data);
qobject_decref(obj);
visit_free(v);

View File

@@ -33,6 +33,8 @@
#include <zlib.h>
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"
#include "crypto/block.h"
#include "migration/blocker.h"
#include "block/crypto.h"
@@ -86,6 +88,8 @@ typedef struct BDRVQcowState {
Error *migration_blocker;
} BDRVQcowState;
static QemuOptsList qcow_create_opts;
static int decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
static int qcow_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -810,62 +814,50 @@ static void qcow_close(BlockDriverState *bs)
error_free(s->migration_blocker);
}
static int coroutine_fn qcow_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
static int coroutine_fn qcow_co_create(BlockdevCreateOptions *opts,
Error **errp)
{
BlockdevCreateOptionsQcow *qcow_opts;
int header_size, backing_filename_len, l1_size, shift, i;
QCowHeader header;
uint8_t *tmp;
int64_t total_size = 0;
char *backing_file = NULL;
Error *local_err = NULL;
int ret;
BlockDriverState *bs;
BlockBackend *qcow_blk;
char *encryptfmt = NULL;
QDict *options;
QDict *encryptopts = NULL;
QCryptoBlockCreateOptions *crypto_opts = NULL;
QCryptoBlock *crypto = NULL;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
assert(opts->driver == BLOCKDEV_DRIVER_QCOW);
qcow_opts = &opts->u.qcow;
/* Sanity checks */
total_size = qcow_opts->size;
if (total_size == 0) {
error_setg(errp, "Image size is too small, cannot be zero length");
ret = -EINVAL;
goto cleanup;
return -EINVAL;
}
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
encryptfmt = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT_FORMAT);
if (encryptfmt) {
if (qemu_opt_get(opts, BLOCK_OPT_ENCRYPT)) {
error_setg(errp, "Options " BLOCK_OPT_ENCRYPT " and "
BLOCK_OPT_ENCRYPT_FORMAT " are mutually exclusive");
ret = -EINVAL;
goto cleanup;
}
} else if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
encryptfmt = g_strdup("aes");
if (qcow_opts->has_encrypt &&
qcow_opts->encrypt->format != Q_CRYPTO_BLOCK_FORMAT_QCOW)
{
error_setg(errp, "Unsupported encryption format");
return -EINVAL;
}
ret = bdrv_create_file(filename, opts, &local_err);
/* Create BlockBackend to write to the image */
bs = bdrv_open_blockdev_ref(qcow_opts->file, errp);
if (bs == NULL) {
return -EIO;
}
qcow_blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
ret = blk_insert_bs(qcow_blk, bs, errp);
if (ret < 0) {
error_propagate(errp, local_err);
goto cleanup;
goto exit;
}
qcow_blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
&local_err);
if (qcow_blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto cleanup;
}
blk_set_allow_write_beyond_eof(qcow_blk, true);
/* Create image format */
ret = blk_truncate(qcow_blk, 0, PREALLOC_MODE_OFF, errp);
if (ret < 0) {
goto exit;
@@ -877,16 +869,15 @@ static int coroutine_fn qcow_co_create_opts(const char *filename, QemuOpts *opts
header.size = cpu_to_be64(total_size);
header_size = sizeof(header);
backing_filename_len = 0;
if (backing_file) {
if (strcmp(backing_file, "fat:")) {
if (qcow_opts->has_backing_file) {
if (strcmp(qcow_opts->backing_file, "fat:")) {
header.backing_file_offset = cpu_to_be64(header_size);
backing_filename_len = strlen(backing_file);
backing_filename_len = strlen(qcow_opts->backing_file);
header.backing_file_size = cpu_to_be32(backing_filename_len);
header_size += backing_filename_len;
} else {
/* special backing file for vvfat */
g_free(backing_file);
backing_file = NULL;
qcow_opts->has_backing_file = false;
}
header.cluster_bits = 9; /* 512 byte cluster to avoid copying
unmodified sectors */
@@ -901,26 +892,10 @@ static int coroutine_fn qcow_co_create_opts(const char *filename, QemuOpts *opts
header.l1_table_offset = cpu_to_be64(header_size);
options = qemu_opts_to_qdict(opts, NULL);
qdict_extract_subqdict(options, &encryptopts, "encrypt.");
QDECREF(options);
if (encryptfmt) {
if (!g_str_equal(encryptfmt, "aes")) {
error_setg(errp, "Unknown encryption format '%s', expected 'aes'",
encryptfmt);
ret = -EINVAL;
goto exit;
}
if (qcow_opts->has_encrypt) {
header.crypt_method = cpu_to_be32(QCOW_CRYPT_AES);
crypto_opts = block_crypto_create_opts_init(
Q_CRYPTO_BLOCK_FORMAT_QCOW, encryptopts, errp);
if (!crypto_opts) {
ret = -EINVAL;
goto exit;
}
crypto = qcrypto_block_create(crypto_opts, "encrypt.",
crypto = qcrypto_block_create(qcow_opts->encrypt, "encrypt.",
NULL, NULL, NULL, errp);
if (!crypto) {
ret = -EINVAL;
@@ -936,9 +911,9 @@ static int coroutine_fn qcow_co_create_opts(const char *filename, QemuOpts *opts
goto exit;
}
if (backing_file) {
if (qcow_opts->has_backing_file) {
ret = blk_pwrite(qcow_blk, sizeof(header),
backing_file, backing_filename_len, 0);
qcow_opts->backing_file, backing_filename_len, 0);
if (ret != backing_filename_len) {
goto exit;
}
@@ -959,12 +934,100 @@ static int coroutine_fn qcow_co_create_opts(const char *filename, QemuOpts *opts
ret = 0;
exit:
blk_unref(qcow_blk);
cleanup:
QDECREF(encryptopts);
g_free(encryptfmt);
qcrypto_block_free(crypto);
qapi_free_QCryptoBlockCreateOptions(crypto_opts);
g_free(backing_file);
return ret;
}
static int coroutine_fn qcow_co_create_opts(const char *filename,
QemuOpts *opts, Error **errp)
{
BlockdevCreateOptions *create_options = NULL;
BlockDriverState *bs = NULL;
QDict *qdict = NULL;
QObject *qobj;
Visitor *v;
const char *val;
Error *local_err = NULL;
int ret;
static const QDictRenames opt_renames[] = {
{ BLOCK_OPT_BACKING_FILE, "backing-file" },
{ BLOCK_OPT_ENCRYPT, BLOCK_OPT_ENCRYPT_FORMAT },
{ NULL, NULL },
};
/* Parse options and convert legacy syntax */
qdict = qemu_opts_to_qdict_filtered(opts, NULL, &qcow_create_opts, true);
val = qdict_get_try_str(qdict, BLOCK_OPT_ENCRYPT);
if (val && !strcmp(val, "on")) {
qdict_put_str(qdict, BLOCK_OPT_ENCRYPT, "qcow");
} else if (val && !strcmp(val, "off")) {
qdict_del(qdict, BLOCK_OPT_ENCRYPT);
}
val = qdict_get_try_str(qdict, BLOCK_OPT_ENCRYPT_FORMAT);
if (val && !strcmp(val, "aes")) {
qdict_put_str(qdict, BLOCK_OPT_ENCRYPT_FORMAT, "qcow");
}
if (!qdict_rename_keys(qdict, opt_renames, errp)) {
ret = -EINVAL;
goto fail;
}
/* Create and open the file (protocol layer) */
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
bs = bdrv_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
if (bs == NULL) {
ret = -EIO;
goto fail;
}
/* Now get the QAPI type BlockdevCreateOptions */
qdict_put_str(qdict, "driver", "qcow");
qdict_put_str(qdict, "file", bs->node_name);
qobj = qdict_crumple(qdict, errp);
QDECREF(qdict);
qdict = qobject_to(QDict, qobj);
if (qdict == NULL) {
ret = -EINVAL;
goto fail;
}
v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
visit_free(v);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
/* Silently round up size */
assert(create_options->driver == BLOCKDEV_DRIVER_QCOW);
create_options->u.qcow.size =
ROUND_UP(create_options->u.qcow.size, BDRV_SECTOR_SIZE);
/* Create the qcow image (format layer) */
ret = qcow_co_create(create_options, errp);
if (ret < 0) {
goto fail;
}
ret = 0;
fail:
QDECREF(qdict);
bdrv_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
@@ -1128,6 +1191,7 @@ static BlockDriver bdrv_qcow = {
.bdrv_close = qcow_close,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_reopen_prepare = qcow_reopen_prepare,
.bdrv_co_create = qcow_co_create,
.bdrv_co_create_opts = qcow_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.supports_backing = true,

View File

@@ -110,7 +110,7 @@ static int update_header_sync(BlockDriverState *bs)
return ret;
}
return bdrv_flush(bs);
return bdrv_flush(bs->file->bs);
}
static inline void bitmap_table_to_be(uint64_t *bitmap_table, size_t size)
@@ -882,7 +882,7 @@ static int update_ext_header_and_dir(BlockDriverState *bs,
return ret;
}
ret = bdrv_flush(bs->file->bs);
ret = qcow2_flush_caches(bs);
if (ret < 0) {
goto fail;
}
@@ -1004,7 +1004,8 @@ fail:
return false;
}
int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp)
int qcow2_reopen_bitmaps_rw_hint(BlockDriverState *bs, bool *header_updated,
Error **errp)
{
BDRVQcow2State *s = bs->opaque;
Qcow2BitmapList *bm_list;
@@ -1012,6 +1013,10 @@ int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp)
GSList *ro_dirty_bitmaps = NULL;
int ret = 0;
if (header_updated != NULL) {
*header_updated = false;
}
if (s->nb_bitmaps == 0) {
/* No bitmaps - nothing to do */
return 0;
@@ -1055,6 +1060,9 @@ int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp)
error_setg_errno(errp, -ret, "Can't update bitmap directory");
goto out;
}
if (header_updated != NULL) {
*header_updated = true;
}
g_slist_foreach(ro_dirty_bitmaps, set_readonly_helper, false);
}
@@ -1065,6 +1073,11 @@ out:
return ret;
}
int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp)
{
return qcow2_reopen_bitmaps_rw_hint(bs, NULL, errp);
}
/* store_bitmap_data()
* Store bitmap to image, filling bitmap table accordingly.
*/

View File

@@ -25,6 +25,7 @@
#include "qemu/osdep.h"
#include <zlib.h>
#include "qapi/error.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "block/qcow2.h"
@@ -2092,11 +2093,21 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,
}
for (i = 0; i < s->nb_snapshots; i++) {
int l1_sectors = DIV_ROUND_UP(s->snapshots[i].l1_size *
sizeof(uint64_t), BDRV_SECTOR_SIZE);
int l1_size2;
uint64_t *new_l1_table;
Error *local_err = NULL;
uint64_t *new_l1_table =
g_try_realloc(l1_table, l1_sectors * BDRV_SECTOR_SIZE);
ret = qcow2_validate_table(bs, s->snapshots[i].l1_table_offset,
s->snapshots[i].l1_size, sizeof(uint64_t),
QCOW_MAX_L1_SIZE, "Snapshot L1 table",
&local_err);
if (ret < 0) {
error_report_err(local_err);
goto fail;
}
l1_size2 = s->snapshots[i].l1_size * sizeof(uint64_t);
new_l1_table = g_try_realloc(l1_table, l1_size2);
if (!new_l1_table) {
ret = -ENOMEM;
@@ -2105,9 +2116,8 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,
l1_table = new_l1_table;
ret = bdrv_read(bs->file,
s->snapshots[i].l1_table_offset / BDRV_SECTOR_SIZE,
(void *)l1_table, l1_sectors);
ret = bdrv_pread(bs->file, s->snapshots[i].l1_table_offset,
l1_table, l1_size2);
if (ret < 0) {
goto fail;
}

View File

@@ -839,6 +839,13 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
qcow2_cache_put(s->refcount_block_cache, &refcount_block);
}
ret = alloc_refcount_block(bs, cluster_index, &refcount_block);
/* If the caller needs to restart the search for free clusters,
* try the same ones first to see if they're still free. */
if (ret == -EAGAIN) {
if (s->free_cluster_index > (start >> s->cluster_bits)) {
s->free_cluster_index = (start >> s->cluster_bits);
}
}
if (ret < 0) {
goto fail;
}
@@ -1171,7 +1178,35 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
}
}
int coroutine_fn qcow2_write_caches(BlockDriverState *bs)
{
BDRVQcow2State *s = bs->opaque;
int ret;
ret = qcow2_cache_write(bs, s->l2_table_cache);
if (ret < 0) {
return ret;
}
if (qcow2_need_accurate_refcounts(s)) {
ret = qcow2_cache_write(bs, s->refcount_block_cache);
if (ret < 0) {
return ret;
}
}
return 0;
}
int coroutine_fn qcow2_flush_caches(BlockDriverState *bs)
{
int ret = qcow2_write_caches(bs);
if (ret < 0) {
return ret;
}
return bdrv_flush(bs->file->bs);
}
/*********************************************************/
/* snapshots and image creation */
@@ -2019,6 +2054,20 @@ static int calculate_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
/* snapshots */
for (i = 0; i < s->nb_snapshots; i++) {
sn = s->snapshots + i;
if (offset_into_cluster(s, sn->l1_table_offset)) {
fprintf(stderr, "ERROR snapshot %s (%s) l1_offset=%#" PRIx64 ": "
"L1 table is not cluster aligned; snapshot table entry "
"corrupted\n", sn->id_str, sn->name, sn->l1_table_offset);
res->corruptions++;
continue;
}
if (sn->l1_size > QCOW_MAX_L1_SIZE / sizeof(uint64_t)) {
fprintf(stderr, "ERROR snapshot %s (%s) l1_size=%#" PRIx32 ": "
"L1 table is too large; snapshot table entry corrupted\n",
sn->id_str, sn->name, sn->l1_size);
res->corruptions++;
continue;
}
ret = check_refcounts_l1(bs, res, refcount_table, nb_clusters,
sn->l1_table_offset, sn->l1_size, 0, fix);
if (ret < 0) {
@@ -2614,9 +2663,17 @@ int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
uint64_t l1_ofs = s->snapshots[i].l1_table_offset;
uint32_t l1_sz = s->snapshots[i].l1_size;
uint64_t l1_sz2 = l1_sz * sizeof(uint64_t);
uint64_t *l1 = g_try_malloc(l1_sz2);
uint64_t *l1;
int ret;
ret = qcow2_validate_table(bs, l1_ofs, l1_sz, sizeof(uint64_t),
QCOW_MAX_L1_SIZE, "", NULL);
if (ret < 0) {
return ret;
}
l1 = g_try_malloc(l1_sz2);
if (l1_sz2 && l1 == NULL) {
return -ENOMEM;
}

View File

@@ -465,6 +465,7 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
{
BDRVQcow2State *s = bs->opaque;
QCowSnapshot *sn;
Error *local_err = NULL;
int i, snapshot_index;
int cur_l1_bytes, sn_l1_bytes;
int ret;
@@ -477,6 +478,14 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
}
sn = &s->snapshots[snapshot_index];
ret = qcow2_validate_table(bs, sn->l1_table_offset, sn->l1_size,
sizeof(uint64_t), QCOW_MAX_L1_SIZE,
"Snapshot L1 table", &local_err);
if (ret < 0) {
error_report_err(local_err);
goto fail;
}
if (sn->disk_size != bs->total_sectors * BDRV_SECTOR_SIZE) {
error_report("qcow2: Loading snapshots with different disk "
"size is not implemented");
@@ -602,6 +611,13 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
}
sn = s->snapshots[snapshot_index];
ret = qcow2_validate_table(bs, sn.l1_table_offset, sn.l1_size,
sizeof(uint64_t), QCOW_MAX_L1_SIZE,
"Snapshot L1 table", errp);
if (ret < 0) {
return ret;
}
/* Remove it from the snapshot list */
memmove(s->snapshots + snapshot_index,
s->snapshots + snapshot_index + 1,
@@ -704,9 +720,11 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
sn = &s->snapshots[snapshot_index];
/* Allocate and read in the snapshot's L1 table */
if (sn->l1_size > QCOW_MAX_L1_SIZE / sizeof(uint64_t)) {
error_setg(errp, "Snapshot L1 table too large");
return -EFBIG;
ret = qcow2_validate_table(bs, sn->l1_table_offset, sn->l1_size,
sizeof(uint64_t), QCOW_MAX_L1_SIZE,
"Snapshot L1 table", errp);
if (ret < 0) {
return ret;
}
new_l1_bytes = sn->l1_size * sizeof(uint64_t);
new_l1_table = qemu_try_blockalign(bs->file->bs,

View File

@@ -37,7 +37,8 @@
#include "qemu/option_int.h"
#include "qemu/cutils.h"
#include "qemu/bswap.h"
#include "qapi/opts-visitor.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"
#include "block/crypto.h"
/*
@@ -500,7 +501,7 @@ static int qcow2_mark_clean(BlockDriverState *bs)
s->incompatible_features &= ~QCOW2_INCOMPAT_DIRTY;
ret = bdrv_flush(bs);
ret = qcow2_flush_caches(bs);
if (ret < 0) {
return ret;
}
@@ -530,7 +531,7 @@ int qcow2_mark_consistent(BlockDriverState *bs)
BDRVQcow2State *s = bs->opaque;
if (s->incompatible_features & QCOW2_INCOMPAT_CORRUPT) {
int ret = bdrv_flush(bs);
int ret = qcow2_flush_caches(bs);
if (ret < 0) {
return ret;
}
@@ -541,8 +542,9 @@ int qcow2_mark_consistent(BlockDriverState *bs)
return 0;
}
static int qcow2_check(BlockDriverState *bs, BdrvCheckResult *result,
BdrvCheckMode fix)
static int coroutine_fn qcow2_co_check_locked(BlockDriverState *bs,
BdrvCheckResult *result,
BdrvCheckMode fix)
{
int ret = qcow2_check_refcounts(bs, result, fix);
if (ret < 0) {
@@ -559,26 +561,36 @@ static int qcow2_check(BlockDriverState *bs, BdrvCheckResult *result,
return ret;
}
static int validate_table_offset(BlockDriverState *bs, uint64_t offset,
uint64_t entries, size_t entry_len)
static int coroutine_fn qcow2_co_check(BlockDriverState *bs,
BdrvCheckResult *result,
BdrvCheckMode fix)
{
BDRVQcow2State *s = bs->opaque;
uint64_t size;
int ret;
qemu_co_mutex_lock(&s->lock);
ret = qcow2_co_check_locked(bs, result, fix);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
int qcow2_validate_table(BlockDriverState *bs, uint64_t offset,
uint64_t entries, size_t entry_len,
int64_t max_size_bytes, const char *table_name,
Error **errp)
{
BDRVQcow2State *s = bs->opaque;
if (entries > max_size_bytes / entry_len) {
error_setg(errp, "%s too large", table_name);
return -EFBIG;
}
/* Use signed INT64_MAX as the maximum even for uint64_t header fields,
* because values will be passed to qemu functions taking int64_t. */
if (entries > INT64_MAX / entry_len) {
return -EINVAL;
}
size = entries * entry_len;
if (INT64_MAX - size < offset) {
return -EINVAL;
}
/* Tables must be cluster aligned */
if (offset_into_cluster(s, offset) != 0) {
if ((INT64_MAX - entries * entry_len < offset) ||
(offset_into_cluster(s, offset) != 0)) {
error_setg(errp, "%s offset invalid", table_name);
return -EINVAL;
}
@@ -1118,8 +1130,9 @@ static int qcow2_update_options(BlockDriverState *bs, QDict *options,
return ret;
}
static int qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
/* Called with s->lock held. */
static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
int flags, Error **errp)
{
BDRVQcow2State *s = bs->opaque;
unsigned int len, i;
@@ -1308,47 +1321,42 @@ static int qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
s->refcount_table_size =
header.refcount_table_clusters << (s->cluster_bits - 3);
if (header.refcount_table_clusters > qcow2_max_refcount_clusters(s)) {
error_setg(errp, "Reference count table too large");
ret = -EINVAL;
goto fail;
}
if (header.refcount_table_clusters == 0 && !(flags & BDRV_O_CHECK)) {
error_setg(errp, "Image does not contain a reference count table");
ret = -EINVAL;
goto fail;
}
ret = validate_table_offset(bs, s->refcount_table_offset,
s->refcount_table_size, sizeof(uint64_t));
ret = qcow2_validate_table(bs, s->refcount_table_offset,
header.refcount_table_clusters,
s->cluster_size, QCOW_MAX_REFTABLE_SIZE,
"Reference count table", errp);
if (ret < 0) {
error_setg(errp, "Invalid reference count table offset");
goto fail;
}
/* Snapshot table offset/length */
if (header.nb_snapshots > QCOW_MAX_SNAPSHOTS) {
error_setg(errp, "Too many snapshots");
ret = -EINVAL;
goto fail;
}
ret = validate_table_offset(bs, header.snapshots_offset,
header.nb_snapshots,
sizeof(QCowSnapshotHeader));
/* The total size in bytes of the snapshot table is checked in
* qcow2_read_snapshots() because the size of each snapshot is
* variable and we don't know it yet.
* Here we only check the offset and number of snapshots. */
ret = qcow2_validate_table(bs, header.snapshots_offset,
header.nb_snapshots,
sizeof(QCowSnapshotHeader),
sizeof(QCowSnapshotHeader) * QCOW_MAX_SNAPSHOTS,
"Snapshot table", errp);
if (ret < 0) {
error_setg(errp, "Invalid snapshot table offset");
goto fail;
}
/* read the level 1 table */
if (header.l1_size > QCOW_MAX_L1_SIZE / sizeof(uint64_t)) {
error_setg(errp, "Active L1 table too large");
ret = -EFBIG;
ret = qcow2_validate_table(bs, header.l1_table_offset,
header.l1_size, sizeof(uint64_t),
QCOW_MAX_L1_SIZE, "Active L1 table", errp);
if (ret < 0) {
goto fail;
}
s->l1_size = header.l1_size;
s->l1_table_offset = header.l1_table_offset;
l1_vm_state_index = size_to_l1(s, header.size);
if (l1_vm_state_index > INT_MAX) {
@@ -1366,15 +1374,6 @@ static int qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
ret = validate_table_offset(bs, header.l1_table_offset,
header.l1_size, sizeof(uint64_t));
if (ret < 0) {
error_setg(errp, "Invalid L1 table offset");
goto fail;
}
s->l1_table_offset = header.l1_table_offset;
if (s->l1_size > 0) {
s->l1_table = qemu_try_blockalign(bs->file->bs,
ROUND_UP(s->l1_size * sizeof(uint64_t), 512));
@@ -1481,7 +1480,22 @@ static int qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
s->autoclear_features &= QCOW2_AUTOCLEAR_MASK;
}
if (qcow2_load_dirty_bitmaps(bs, &local_err)) {
if (bdrv_dirty_bitmap_next(bs, NULL)) {
/* It's some kind of reopen with already existing dirty bitmaps. There
* are no known cases where we need loading bitmaps in such situation,
* so it's safer don't load them.
*
* Moreover, if we have some readonly bitmaps and we are reopening for
* rw we should reopen bitmaps correspondingly.
*/
if (bdrv_has_readonly_bitmaps(bs) &&
!bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & BDRV_O_INACTIVE))
{
bool header_updated = false;
qcow2_reopen_bitmaps_rw_hint(bs, &header_updated, &local_err);
update_header = update_header && !header_updated;
}
} else if (qcow2_load_dirty_bitmaps(bs, &local_err)) {
update_header = false;
}
if (local_err != NULL) {
@@ -1498,8 +1512,6 @@ static int qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
}
}
/* Initialise locks */
qemu_co_mutex_init(&s->lock);
bs->supported_zero_flags = header.version >= 3 ? BDRV_REQ_MAY_UNMAP : 0;
/* Repair image if dirty */
@@ -1507,7 +1519,8 @@ static int qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
(s->incompatible_features & QCOW2_INCOMPAT_DIRTY)) {
BdrvCheckResult result = {0};
ret = qcow2_check(bs, &result, BDRV_FIX_ERRORS | BDRV_FIX_LEAKS);
ret = qcow2_co_check_locked(bs, &result,
BDRV_FIX_ERRORS | BDRV_FIX_LEAKS);
if (ret < 0 || result.check_errors) {
if (ret >= 0) {
ret = -EIO;
@@ -1545,16 +1558,53 @@ static int qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
return ret;
}
typedef struct QCow2OpenCo {
BlockDriverState *bs;
QDict *options;
int flags;
Error **errp;
int ret;
} QCow2OpenCo;
static void coroutine_fn qcow2_open_entry(void *opaque)
{
QCow2OpenCo *qoc = opaque;
BDRVQcow2State *s = qoc->bs->opaque;
qemu_co_mutex_lock(&s->lock);
qoc->ret = qcow2_do_open(qoc->bs, qoc->options, qoc->flags, qoc->errp);
qemu_co_mutex_unlock(&s->lock);
}
static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVQcow2State *s = bs->opaque;
QCow2OpenCo qoc = {
.bs = bs,
.options = options,
.flags = flags,
.errp = errp,
.ret = -EINPROGRESS
};
bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file,
false, errp);
if (!bs->file) {
return -EINVAL;
}
return qcow2_do_open(bs, options, flags, errp);
/* Initialise locks */
qemu_co_mutex_init(&s->lock);
if (qemu_in_coroutine()) {
/* From bdrv_co_create. */
qcow2_open_entry(&qoc);
} else {
qemu_coroutine_enter(qemu_coroutine_create(qcow2_open_entry, &qoc));
BDRV_POLL_WHILE(bs, qoc.ret == -EINPROGRESS);
}
return qoc.ret;
}
static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
@@ -2106,7 +2156,8 @@ static void qcow2_close(BlockDriverState *bs)
qcow2_free_snapshots(bs);
}
static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
static void coroutine_fn qcow2_co_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
BDRVQcow2State *s = bs->opaque;
int flags = s->flags;
@@ -2129,7 +2180,9 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
options = qdict_clone_shallow(bs->options);
flags &= ~BDRV_O_INACTIVE;
qemu_co_mutex_lock(&s->lock);
ret = qcow2_do_open(bs, options, flags, &local_err);
qemu_co_mutex_unlock(&s->lock);
QDECREF(options);
if (local_err) {
error_propagate(errp, local_err);
@@ -2412,39 +2465,26 @@ static int qcow2_crypt_method_from_format(const char *encryptfmt)
}
}
static int qcow2_set_up_encryption(BlockDriverState *bs, const char *encryptfmt,
QemuOpts *opts, Error **errp)
static int qcow2_set_up_encryption(BlockDriverState *bs,
QCryptoBlockCreateOptions *cryptoopts,
Error **errp)
{
BDRVQcow2State *s = bs->opaque;
QCryptoBlockCreateOptions *cryptoopts = NULL;
QCryptoBlock *crypto = NULL;
int ret = -EINVAL;
QDict *options, *encryptopts;
int fmt;
int fmt, ret;
options = qemu_opts_to_qdict(opts, NULL);
qdict_extract_subqdict(options, &encryptopts, "encrypt.");
QDECREF(options);
fmt = qcow2_crypt_method_from_format(encryptfmt);
switch (fmt) {
case QCOW_CRYPT_LUKS:
cryptoopts = block_crypto_create_opts_init(
Q_CRYPTO_BLOCK_FORMAT_LUKS, encryptopts, errp);
switch (cryptoopts->format) {
case Q_CRYPTO_BLOCK_FORMAT_LUKS:
fmt = QCOW_CRYPT_LUKS;
break;
case QCOW_CRYPT_AES:
cryptoopts = block_crypto_create_opts_init(
Q_CRYPTO_BLOCK_FORMAT_QCOW, encryptopts, errp);
case Q_CRYPTO_BLOCK_FORMAT_QCOW:
fmt = QCOW_CRYPT_AES;
break;
default:
error_setg(errp, "Unknown encryption format '%s'", encryptfmt);
break;
}
if (!cryptoopts) {
ret = -EINVAL;
goto out;
error_setg(errp, "Crypto format not supported in qcow2");
return -EINVAL;
}
s->crypt_method_header = fmt;
crypto = qcrypto_block_create(cryptoopts, "encrypt.",
@@ -2452,8 +2492,7 @@ static int qcow2_set_up_encryption(BlockDriverState *bs, const char *encryptfmt,
qcow2_crypto_hdr_write_func,
bs, errp);
if (!crypto) {
ret = -EINVAL;
goto out;
return -EINVAL;
}
ret = qcow2_update_header(bs);
@@ -2462,10 +2501,9 @@ static int qcow2_set_up_encryption(BlockDriverState *bs, const char *encryptfmt,
goto out;
}
ret = 0;
out:
QDECREF(encryptopts);
qcrypto_block_free(crypto);
qapi_free_QCryptoBlockCreateOptions(cryptoopts);
return ret;
}
@@ -2663,19 +2701,26 @@ static int64_t qcow2_calc_prealloc_size(int64_t total_size,
return meta_size + aligned_total_size;
}
static size_t qcow2_opt_get_cluster_size_del(QemuOpts *opts, Error **errp)
static bool validate_cluster_size(size_t cluster_size, Error **errp)
{
size_t cluster_size;
int cluster_bits;
cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE);
cluster_bits = ctz32(cluster_size);
int cluster_bits = ctz32(cluster_size);
if (cluster_bits < MIN_CLUSTER_BITS || cluster_bits > MAX_CLUSTER_BITS ||
(1 << cluster_bits) != cluster_size)
{
error_setg(errp, "Cluster size must be a power of two between %d and "
"%dk", 1 << MIN_CLUSTER_BITS, 1 << (MAX_CLUSTER_BITS - 10));
return false;
}
return true;
}
static size_t qcow2_opt_get_cluster_size_del(QemuOpts *opts, Error **errp)
{
size_t cluster_size;
cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE);
if (!validate_cluster_size(cluster_size, errp)) {
return 0;
}
return cluster_size;
@@ -2724,12 +2769,9 @@ static uint64_t qcow2_opt_get_refcount_bits_del(QemuOpts *opts, int version,
}
static int coroutine_fn
qcow2_co_create2(const char *filename, int64_t total_size,
const char *backing_file, const char *backing_format,
int flags, size_t cluster_size, PreallocMode prealloc,
QemuOpts *opts, int version, int refcount_order,
const char *encryptfmt, Error **errp)
qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
{
BlockdevCreateOptionsQcow2 *qcow2_opts;
QDict *options;
/*
@@ -2744,36 +2786,132 @@ qcow2_co_create2(const char *filename, int64_t total_size,
* 2 GB for 64k clusters, and we don't want to have a 2 GB initial file
* size for any qcow2 image.
*/
BlockBackend *blk;
BlockBackend *blk = NULL;
BlockDriverState *bs = NULL;
QCowHeader *header;
size_t cluster_size;
int version;
int refcount_order;
uint64_t* refcount_table;
Error *local_err = NULL;
int ret;
if (prealloc == PREALLOC_MODE_FULL || prealloc == PREALLOC_MODE_FALLOC) {
int64_t prealloc_size =
qcow2_calc_prealloc_size(total_size, cluster_size, refcount_order);
qemu_opt_set_number(opts, BLOCK_OPT_SIZE, prealloc_size, &error_abort);
qemu_opt_set(opts, BLOCK_OPT_PREALLOC, PreallocMode_str(prealloc),
&error_abort);
}
assert(create_options->driver == BLOCKDEV_DRIVER_QCOW2);
qcow2_opts = &create_options->u.qcow2;
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
}
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
bs = bdrv_open_blockdev_ref(qcow2_opts->file, errp);
if (bs == NULL) {
return -EIO;
}
/* Validate options and set default values */
if (!QEMU_IS_ALIGNED(qcow2_opts->size, BDRV_SECTOR_SIZE)) {
error_setg(errp, "Image size must be a multiple of 512 bytes");
ret = -EINVAL;
goto out;
}
if (qcow2_opts->has_version) {
switch (qcow2_opts->version) {
case BLOCKDEV_QCOW2_VERSION_V2:
version = 2;
break;
case BLOCKDEV_QCOW2_VERSION_V3:
version = 3;
break;
default:
g_assert_not_reached();
}
} else {
version = 3;
}
if (qcow2_opts->has_cluster_size) {
cluster_size = qcow2_opts->cluster_size;
} else {
cluster_size = DEFAULT_CLUSTER_SIZE;
}
if (!validate_cluster_size(cluster_size, errp)) {
ret = -EINVAL;
goto out;
}
if (!qcow2_opts->has_preallocation) {
qcow2_opts->preallocation = PREALLOC_MODE_OFF;
}
if (qcow2_opts->has_backing_file &&
qcow2_opts->preallocation != PREALLOC_MODE_OFF)
{
error_setg(errp, "Backing file and preallocation cannot be used at "
"the same time");
ret = -EINVAL;
goto out;
}
if (qcow2_opts->has_backing_fmt && !qcow2_opts->has_backing_file) {
error_setg(errp, "Backing format cannot be used without backing file");
ret = -EINVAL;
goto out;
}
if (!qcow2_opts->has_lazy_refcounts) {
qcow2_opts->lazy_refcounts = false;
}
if (version < 3 && qcow2_opts->lazy_refcounts) {
error_setg(errp, "Lazy refcounts only supported with compatibility "
"level 1.1 and above (use version=v3 or greater)");
ret = -EINVAL;
goto out;
}
if (!qcow2_opts->has_refcount_bits) {
qcow2_opts->refcount_bits = 16;
}
if (qcow2_opts->refcount_bits > 64 ||
!is_power_of_2(qcow2_opts->refcount_bits))
{
error_setg(errp, "Refcount width must be a power of two and may not "
"exceed 64 bits");
ret = -EINVAL;
goto out;
}
if (version < 3 && qcow2_opts->refcount_bits != 16) {
error_setg(errp, "Different refcount widths than 16 bits require "
"compatibility level 1.1 or above (use version=v3 or "
"greater)");
ret = -EINVAL;
goto out;
}
refcount_order = ctz32(qcow2_opts->refcount_bits);
/* Create BlockBackend to write to the image */
blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
ret = blk_insert_bs(blk, bs, errp);
if (ret < 0) {
goto out;
}
blk_set_allow_write_beyond_eof(blk, true);
/* Clear the protocol layer and preallocate it if necessary */
ret = blk_truncate(blk, 0, PREALLOC_MODE_OFF, errp);
if (ret < 0) {
goto out;
}
if (qcow2_opts->preallocation == PREALLOC_MODE_FULL ||
qcow2_opts->preallocation == PREALLOC_MODE_FALLOC)
{
int64_t prealloc_size =
qcow2_calc_prealloc_size(qcow2_opts->size, cluster_size,
refcount_order);
ret = blk_truncate(blk, prealloc_size, qcow2_opts->preallocation, errp);
if (ret < 0) {
goto out;
}
}
/* Write the header */
QEMU_BUILD_BUG_ON((1 << MIN_CLUSTER_BITS) < sizeof(*header));
header = g_malloc0(cluster_size);
@@ -2793,7 +2931,7 @@ qcow2_co_create2(const char *filename, int64_t total_size,
/* We'll update this to correct value later */
header->crypt_method = cpu_to_be32(QCOW_CRYPT_NONE);
if (flags & BLOCK_FLAG_LAZY_REFCOUNTS) {
if (qcow2_opts->lazy_refcounts) {
header->compatible_features |=
cpu_to_be64(QCOW2_COMPAT_LAZY_REFCOUNTS);
}
@@ -2826,7 +2964,8 @@ qcow2_co_create2(const char *filename, int64_t total_size,
*/
options = qdict_new();
qdict_put_str(options, "driver", "qcow2");
blk = blk_new_open(filename, NULL, options,
qdict_put_str(options, "file", bs->node_name);
blk = blk_new_open(NULL, NULL, options,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_NO_FLUSH,
&local_err);
if (blk == NULL) {
@@ -2854,33 +2993,41 @@ qcow2_co_create2(const char *filename, int64_t total_size,
}
/* Okay, now that we have a valid image, let's give it the right size */
ret = blk_truncate(blk, total_size, PREALLOC_MODE_OFF, errp);
ret = blk_truncate(blk, qcow2_opts->size, PREALLOC_MODE_OFF, errp);
if (ret < 0) {
error_prepend(errp, "Could not resize image: ");
goto out;
}
/* Want a backing file? There you go.*/
if (backing_file) {
ret = bdrv_change_backing_file(blk_bs(blk), backing_file, backing_format);
if (qcow2_opts->has_backing_file) {
const char *backing_format = NULL;
if (qcow2_opts->has_backing_fmt) {
backing_format = BlockdevDriver_str(qcow2_opts->backing_fmt);
}
ret = bdrv_change_backing_file(blk_bs(blk), qcow2_opts->backing_file,
backing_format);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not assign backing file '%s' "
"with format '%s'", backing_file, backing_format);
"with format '%s'", qcow2_opts->backing_file,
backing_format);
goto out;
}
}
/* Want encryption? There you go. */
if (encryptfmt) {
ret = qcow2_set_up_encryption(blk_bs(blk), encryptfmt, opts, errp);
if (qcow2_opts->has_encrypt) {
ret = qcow2_set_up_encryption(blk_bs(blk), qcow2_opts->encrypt, errp);
if (ret < 0) {
goto out;
}
}
/* And if we're supposed to preallocate metadata, do that now */
if (prealloc != PREALLOC_MODE_OFF) {
ret = preallocate(blk_bs(blk), 0, total_size);
if (qcow2_opts->preallocation != PREALLOC_MODE_OFF) {
ret = preallocate(blk_bs(blk), 0, qcow2_opts->size);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not preallocate metadata");
goto out;
@@ -2898,7 +3045,8 @@ qcow2_co_create2(const char *filename, int64_t total_size,
*/
options = qdict_new();
qdict_put_str(options, "driver", "qcow2");
blk = blk_new_open(filename, NULL, options,
qdict_put_str(options, "file", bs->node_name);
blk = blk_new_open(NULL, NULL, options,
BDRV_O_RDWR | BDRV_O_NO_BACKING | BDRV_O_NO_IO,
&local_err);
if (blk == NULL) {
@@ -2909,104 +3057,120 @@ qcow2_co_create2(const char *filename, int64_t total_size,
ret = 0;
out:
if (blk) {
blk_unref(blk);
}
blk_unref(blk);
bdrv_unref(bs);
return ret;
}
static int coroutine_fn qcow2_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
{
char *backing_file = NULL;
char *backing_fmt = NULL;
char *buf = NULL;
uint64_t size = 0;
int flags = 0;
size_t cluster_size = DEFAULT_CLUSTER_SIZE;
PreallocMode prealloc;
int version;
uint64_t refcount_bits;
int refcount_order;
char *encryptfmt = NULL;
BlockdevCreateOptions *create_options = NULL;
QDict *qdict = NULL;
QObject *qobj;
Visitor *v;
BlockDriverState *bs = NULL;
Error *local_err = NULL;
const char *val;
int ret;
/* Read out options */
size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
encryptfmt = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT_FORMAT);
if (encryptfmt) {
if (qemu_opt_get(opts, BLOCK_OPT_ENCRYPT)) {
error_setg(errp, "Options " BLOCK_OPT_ENCRYPT " and "
BLOCK_OPT_ENCRYPT_FORMAT " are mutually exclusive");
ret = -EINVAL;
goto finish;
}
} else if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ENCRYPT, false)) {
encryptfmt = g_strdup("aes");
/* Only the keyval visitor supports the dotted syntax needed for
* encryption, so go through a QDict before getting a QAPI type. Ignore
* options meant for the protocol layer so that the visitor doesn't
* complain. */
qdict = qemu_opts_to_qdict_filtered(opts, NULL, bdrv_qcow2.create_opts,
true);
/* Handle encryption options */
val = qdict_get_try_str(qdict, BLOCK_OPT_ENCRYPT);
if (val && !strcmp(val, "on")) {
qdict_put_str(qdict, BLOCK_OPT_ENCRYPT, "qcow");
} else if (val && !strcmp(val, "off")) {
qdict_del(qdict, BLOCK_OPT_ENCRYPT);
}
cluster_size = qcow2_opt_get_cluster_size_del(opts, &local_err);
if (local_err) {
error_propagate(errp, local_err);
val = qdict_get_try_str(qdict, BLOCK_OPT_ENCRYPT_FORMAT);
if (val && !strcmp(val, "aes")) {
qdict_put_str(qdict, BLOCK_OPT_ENCRYPT_FORMAT, "qcow");
}
/* Convert compat=0.10/1.1 into compat=v2/v3, to be renamed into
* version=v2/v3 below. */
val = qdict_get_try_str(qdict, BLOCK_OPT_COMPAT_LEVEL);
if (val && !strcmp(val, "0.10")) {
qdict_put_str(qdict, BLOCK_OPT_COMPAT_LEVEL, "v2");
} else if (val && !strcmp(val, "1.1")) {
qdict_put_str(qdict, BLOCK_OPT_COMPAT_LEVEL, "v3");
}
/* Change legacy command line options into QMP ones */
static const QDictRenames opt_renames[] = {
{ BLOCK_OPT_BACKING_FILE, "backing-file" },
{ BLOCK_OPT_BACKING_FMT, "backing-fmt" },
{ BLOCK_OPT_CLUSTER_SIZE, "cluster-size" },
{ BLOCK_OPT_LAZY_REFCOUNTS, "lazy-refcounts" },
{ BLOCK_OPT_REFCOUNT_BITS, "refcount-bits" },
{ BLOCK_OPT_ENCRYPT, BLOCK_OPT_ENCRYPT_FORMAT },
{ BLOCK_OPT_COMPAT_LEVEL, "version" },
{ NULL, NULL },
};
if (!qdict_rename_keys(qdict, opt_renames, errp)) {
ret = -EINVAL;
goto finish;
}
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
prealloc = qapi_enum_parse(&PreallocMode_lookup, buf,
PREALLOC_MODE_OFF, &local_err);
/* Create and open the file (protocol layer) */
ret = bdrv_create_file(filename, opts, errp);
if (ret < 0) {
goto finish;
}
bs = bdrv_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
if (bs == NULL) {
ret = -EIO;
goto finish;
}
/* Set 'driver' and 'node' options */
qdict_put_str(qdict, "driver", "qcow2");
qdict_put_str(qdict, "file", bs->node_name);
/* Now get the QAPI type BlockdevCreateOptions */
qobj = qdict_crumple(qdict, errp);
QDECREF(qdict);
qdict = qobject_to(QDict, qobj);
if (qdict == NULL) {
ret = -EINVAL;
goto finish;
}
v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
visit_free(v);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto finish;
}
version = qcow2_opt_get_version_del(opts, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
/* Silently round up size */
create_options->u.qcow2.size = ROUND_UP(create_options->u.qcow2.size,
BDRV_SECTOR_SIZE);
/* Create the qcow2 image (format layer) */
ret = qcow2_co_create(create_options, errp);
if (ret < 0) {
goto finish;
}
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_LAZY_REFCOUNTS, false)) {
flags |= BLOCK_FLAG_LAZY_REFCOUNTS;
}
if (backing_file && prealloc != PREALLOC_MODE_OFF) {
error_setg(errp, "Backing file and preallocation cannot be used at "
"the same time");
ret = -EINVAL;
goto finish;
}
if (version < 3 && (flags & BLOCK_FLAG_LAZY_REFCOUNTS)) {
error_setg(errp, "Lazy refcounts only supported with compatibility "
"level 1.1 and above (use compat=1.1 or greater)");
ret = -EINVAL;
goto finish;
}
refcount_bits = qcow2_opt_get_refcount_bits_del(opts, version, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto finish;
}
refcount_order = ctz32(refcount_bits);
ret = qcow2_co_create2(filename, size, backing_file, backing_fmt, flags,
cluster_size, prealloc, opts, version, refcount_order,
encryptfmt, &local_err);
error_propagate(errp, local_err);
ret = 0;
finish:
g_free(backing_file);
g_free(backing_fmt);
g_free(encryptfmt);
g_free(buf);
QDECREF(qdict);
bdrv_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
@@ -3647,22 +3811,10 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
int ret;
qemu_co_mutex_lock(&s->lock);
ret = qcow2_cache_write(bs, s->l2_table_cache);
if (ret < 0) {
qemu_co_mutex_unlock(&s->lock);
return ret;
}
if (qcow2_need_accurate_refcounts(s)) {
ret = qcow2_cache_write(bs, s->refcount_block_cache);
if (ret < 0) {
qemu_co_mutex_unlock(&s->lock);
return ret;
}
}
ret = qcow2_write_caches(bs);
qemu_co_mutex_unlock(&s->lock);
return 0;
return ret;
}
static BlockMeasureInfo *qcow2_measure(QemuOpts *opts, BlockDriverState *in_bs,
@@ -4353,6 +4505,7 @@ BlockDriver bdrv_qcow2 = {
.bdrv_join_options = qcow2_join_options,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_create_opts = qcow2_co_create_opts,
.bdrv_co_create = qcow2_co_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_block_status = qcow2_co_block_status,
@@ -4382,11 +4535,11 @@ BlockDriver bdrv_qcow2 = {
.bdrv_change_backing_file = qcow2_change_backing_file,
.bdrv_refresh_limits = qcow2_refresh_limits,
.bdrv_invalidate_cache = qcow2_invalidate_cache,
.bdrv_co_invalidate_cache = qcow2_co_invalidate_cache,
.bdrv_inactivate = qcow2_inactivate,
.create_opts = &qcow2_create_opts,
.bdrv_check = qcow2_check,
.bdrv_co_check = qcow2_co_check,
.bdrv_amend_options = qcow2_amend_options,
.bdrv_detach_aio_context = qcow2_detach_aio_context,

View File

@@ -485,11 +485,6 @@ static inline int64_t qcow2_vm_state_offset(BDRVQcow2State *s)
return (int64_t)s->l1_vm_state_index << (s->cluster_bits + s->l2_bits);
}
static inline uint64_t qcow2_max_refcount_clusters(BDRVQcow2State *s)
{
return QCOW_MAX_REFTABLE_SIZE >> s->cluster_bits;
}
static inline QCow2ClusterType qcow2_get_cluster_type(uint64_t l2_entry)
{
if (l2_entry & QCOW_OFLAG_COMPRESSED) {
@@ -547,6 +542,11 @@ void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
int64_t size, const char *message_format, ...)
GCC_FMT_ATTR(5, 6);
int qcow2_validate_table(BlockDriverState *bs, uint64_t offset,
uint64_t entries, size_t entry_len,
int64_t max_size_bytes, const char *table_name,
Error **errp);
/* qcow2-refcount.c functions */
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);
@@ -576,6 +576,8 @@ void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
int qcow2_update_snapshot_refcount(BlockDriverState *bs,
int64_t l1_table_offset, int l1_size, int addend);
int coroutine_fn qcow2_flush_caches(BlockDriverState *bs);
int coroutine_fn qcow2_write_caches(BlockDriverState *bs);
int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix);
@@ -669,6 +671,8 @@ int qcow2_check_bitmaps_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
void **refcount_table,
int64_t *refcount_table_size);
bool qcow2_load_dirty_bitmaps(BlockDriverState *bs, Error **errp);
int qcow2_reopen_bitmaps_rw_hint(BlockDriverState *bs, bool *header_updated,
Error **errp);
int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp);
void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp);
int qcow2_reopen_bitmaps_ro(BlockDriverState *bs, Error **errp);

View File

@@ -217,6 +217,7 @@ static void qed_check_mark_clean(BDRVQEDState *s, BdrvCheckResult *result)
qed_write_header_sync(s);
}
/* Called with table_lock held. */
int qed_check(BDRVQEDState *s, BdrvCheckResult *result, bool fix)
{
QEDCheck check = {

View File

@@ -18,7 +18,7 @@
#include "qed.h"
#include "qemu/bswap.h"
/* Called either from qed_check or with table_lock held. */
/* Called with table_lock held. */
static int qed_read_table(BDRVQEDState *s, uint64_t offset, QEDTable *table)
{
QEMUIOVector qiov;
@@ -33,13 +33,9 @@ static int qed_read_table(BDRVQEDState *s, uint64_t offset, QEDTable *table)
trace_qed_read_table(s, offset, table);
if (qemu_in_coroutine()) {
qemu_co_mutex_unlock(&s->table_lock);
}
qemu_co_mutex_unlock(&s->table_lock);
ret = bdrv_preadv(s->bs->file, offset, &qiov);
if (qemu_in_coroutine()) {
qemu_co_mutex_lock(&s->table_lock);
}
qemu_co_mutex_lock(&s->table_lock);
if (ret < 0) {
goto out;
}
@@ -67,7 +63,7 @@ out:
* @n: Number of elements
* @flush: Whether or not to sync to disk
*
* Called either from qed_check or with table_lock held.
* Called with table_lock held.
*/
static int qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
unsigned int index, unsigned int n, bool flush)
@@ -104,13 +100,9 @@ static int qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
/* Adjust for offset into table */
offset += start * sizeof(uint64_t);
if (qemu_in_coroutine()) {
qemu_co_mutex_unlock(&s->table_lock);
}
qemu_co_mutex_unlock(&s->table_lock);
ret = bdrv_pwritev(s->bs->file, offset, &qiov);
if (qemu_in_coroutine()) {
qemu_co_mutex_lock(&s->table_lock);
}
qemu_co_mutex_lock(&s->table_lock);
trace_qed_write_table_cb(s, table, flush, ret);
if (ret < 0) {
goto out;
@@ -134,7 +126,7 @@ int qed_read_l1_table_sync(BDRVQEDState *s)
return qed_read_table(s, s->header.l1_table_offset, s->l1_table);
}
/* Called either from qed_check or with table_lock held. */
/* Called with table_lock held. */
int qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n)
{
BLKDBG_EVENT(s->bs->file, BLKDBG_L1_UPDATE);
@@ -148,7 +140,7 @@ int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
return qed_write_l1_table(s, index, n);
}
/* Called either from qed_check or with table_lock held. */
/* Called with table_lock held. */
int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
{
int ret;
@@ -191,7 +183,7 @@ int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset
return qed_read_l2_table(s, request, offset);
}
/* Called either from qed_check or with table_lock held. */
/* Called with table_lock held. */
int qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
unsigned int index, unsigned int n, bool flush)
{

View File

@@ -20,6 +20,11 @@
#include "trace.h"
#include "qed.h"
#include "sysemu/block-backend.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"
static QemuOptsList qed_create_opts;
static int bdrv_qed_probe(const uint8_t *buf, int buf_size,
const char *filename)
@@ -381,8 +386,9 @@ static void bdrv_qed_init_state(BlockDriverState *bs)
qemu_co_queue_init(&s->allocating_write_reqs);
}
static int bdrv_qed_do_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
/* Called with table_lock held. */
static int coroutine_fn bdrv_qed_do_open(BlockDriverState *bs, QDict *options,
int flags, Error **errp)
{
BDRVQEDState *s = bs->opaque;
QEDHeader le_header;
@@ -513,9 +519,35 @@ out:
return ret;
}
typedef struct QEDOpenCo {
BlockDriverState *bs;
QDict *options;
int flags;
Error **errp;
int ret;
} QEDOpenCo;
static void coroutine_fn bdrv_qed_open_entry(void *opaque)
{
QEDOpenCo *qoc = opaque;
BDRVQEDState *s = qoc->bs->opaque;
qemu_co_mutex_lock(&s->table_lock);
qoc->ret = bdrv_qed_do_open(qoc->bs, qoc->options, qoc->flags, qoc->errp);
qemu_co_mutex_unlock(&s->table_lock);
}
static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
QEDOpenCo qoc = {
.bs = bs,
.options = options,
.flags = flags,
.errp = errp,
.ret = -EINPROGRESS
};
bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file,
false, errp);
if (!bs->file) {
@@ -523,7 +555,14 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
}
bdrv_qed_init_state(bs);
return bdrv_qed_do_open(bs, options, flags, errp);
if (qemu_in_coroutine()) {
bdrv_qed_open_entry(&qoc);
} else {
qemu_coroutine_enter(qemu_coroutine_create(bdrv_qed_open_entry, &qoc));
BDRV_POLL_WHILE(bs, qoc.ret == -EINPROGRESS);
}
BDRV_POLL_WHILE(bs, qoc.ret == -EINPROGRESS);
return qoc.ret;
}
static void bdrv_qed_refresh_limits(BlockDriverState *bs, Error **errp)
@@ -560,57 +599,95 @@ static void bdrv_qed_close(BlockDriverState *bs)
qemu_vfree(s->l1_table);
}
static int qed_create(const char *filename, uint32_t cluster_size,
uint64_t image_size, uint32_t table_size,
const char *backing_file, const char *backing_fmt,
QemuOpts *opts, Error **errp)
static int coroutine_fn bdrv_qed_co_create(BlockdevCreateOptions *opts,
Error **errp)
{
QEDHeader header = {
.magic = QED_MAGIC,
.cluster_size = cluster_size,
.table_size = table_size,
.header_size = 1,
.features = 0,
.compat_features = 0,
.l1_table_offset = cluster_size,
.image_size = image_size,
};
BlockdevCreateOptionsQed *qed_opts;
BlockBackend *blk = NULL;
BlockDriverState *bs = NULL;
QEDHeader header;
QEDHeader le_header;
uint8_t *l1_table = NULL;
size_t l1_size = header.cluster_size * header.table_size;
Error *local_err = NULL;
size_t l1_size;
int ret = 0;
BlockBackend *blk;
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return ret;
assert(opts->driver == BLOCKDEV_DRIVER_QED);
qed_opts = &opts->u.qed;
/* Validate options and set default values */
if (!qed_opts->has_cluster_size) {
qed_opts->cluster_size = QED_DEFAULT_CLUSTER_SIZE;
}
if (!qed_opts->has_table_size) {
qed_opts->table_size = QED_DEFAULT_TABLE_SIZE;
}
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
if (!qed_is_cluster_size_valid(qed_opts->cluster_size)) {
error_setg(errp, "QED cluster size must be within range [%u, %u] "
"and power of 2",
QED_MIN_CLUSTER_SIZE, QED_MAX_CLUSTER_SIZE);
return -EINVAL;
}
if (!qed_is_table_size_valid(qed_opts->table_size)) {
error_setg(errp, "QED table size must be within range [%u, %u] "
"and power of 2",
QED_MIN_TABLE_SIZE, QED_MAX_TABLE_SIZE);
return -EINVAL;
}
if (!qed_is_image_size_valid(qed_opts->size, qed_opts->cluster_size,
qed_opts->table_size))
{
error_setg(errp, "QED image size must be a non-zero multiple of "
"cluster size and less than %" PRIu64 " bytes",
qed_max_image_size(qed_opts->cluster_size,
qed_opts->table_size));
return -EINVAL;
}
/* Create BlockBackend to write to the image */
bs = bdrv_open_blockdev_ref(qed_opts->file, errp);
if (bs == NULL) {
return -EIO;
}
blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
ret = blk_insert_bs(blk, bs, errp);
if (ret < 0) {
goto out;
}
blk_set_allow_write_beyond_eof(blk, true);
/* Prepare image format */
header = (QEDHeader) {
.magic = QED_MAGIC,
.cluster_size = qed_opts->cluster_size,
.table_size = qed_opts->table_size,
.header_size = 1,
.features = 0,
.compat_features = 0,
.l1_table_offset = qed_opts->cluster_size,
.image_size = qed_opts->size,
};
l1_size = header.cluster_size * header.table_size;
/* File must start empty and grow, check truncate is supported */
ret = blk_truncate(blk, 0, PREALLOC_MODE_OFF, errp);
if (ret < 0) {
goto out;
}
if (backing_file) {
if (qed_opts->has_backing_file) {
header.features |= QED_F_BACKING_FILE;
header.backing_filename_offset = sizeof(le_header);
header.backing_filename_size = strlen(backing_file);
header.backing_filename_size = strlen(qed_opts->backing_file);
if (qed_fmt_is_raw(backing_fmt)) {
header.features |= QED_F_BACKING_FORMAT_NO_PROBE;
if (qed_opts->has_backing_fmt) {
const char *backing_fmt = BlockdevDriver_str(qed_opts->backing_fmt);
if (qed_fmt_is_raw(backing_fmt)) {
header.features |= QED_F_BACKING_FORMAT_NO_PROBE;
}
}
}
@@ -619,7 +696,7 @@ static int qed_create(const char *filename, uint32_t cluster_size,
if (ret < 0) {
goto out;
}
ret = blk_pwrite(blk, sizeof(le_header), backing_file,
ret = blk_pwrite(blk, sizeof(le_header), qed_opts->backing_file,
header.backing_filename_size, 0);
if (ret < 0) {
goto out;
@@ -635,6 +712,7 @@ static int qed_create(const char *filename, uint32_t cluster_size,
out:
g_free(l1_table);
blk_unref(blk);
bdrv_unref(bs);
return ret;
}
@@ -642,51 +720,78 @@ static int coroutine_fn bdrv_qed_co_create_opts(const char *filename,
QemuOpts *opts,
Error **errp)
{
uint64_t image_size = 0;
uint32_t cluster_size = QED_DEFAULT_CLUSTER_SIZE;
uint32_t table_size = QED_DEFAULT_TABLE_SIZE;
char *backing_file = NULL;
char *backing_fmt = NULL;
BlockdevCreateOptions *create_options = NULL;
QDict *qdict = NULL;
QObject *qobj;
Visitor *v;
BlockDriverState *bs = NULL;
Error *local_err = NULL;
int ret;
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
cluster_size = qemu_opt_get_size_del(opts,
BLOCK_OPT_CLUSTER_SIZE,
QED_DEFAULT_CLUSTER_SIZE);
table_size = qemu_opt_get_size_del(opts, BLOCK_OPT_TABLE_SIZE,
QED_DEFAULT_TABLE_SIZE);
static const QDictRenames opt_renames[] = {
{ BLOCK_OPT_BACKING_FILE, "backing-file" },
{ BLOCK_OPT_BACKING_FMT, "backing-fmt" },
{ BLOCK_OPT_CLUSTER_SIZE, "cluster-size" },
{ BLOCK_OPT_TABLE_SIZE, "table-size" },
{ NULL, NULL },
};
if (!qed_is_cluster_size_valid(cluster_size)) {
error_setg(errp, "QED cluster size must be within range [%u, %u] "
"and power of 2",
QED_MIN_CLUSTER_SIZE, QED_MAX_CLUSTER_SIZE);
/* Parse options and convert legacy syntax */
qdict = qemu_opts_to_qdict_filtered(opts, NULL, &qed_create_opts, true);
if (!qdict_rename_keys(qdict, opt_renames, errp)) {
ret = -EINVAL;
goto finish;
}
if (!qed_is_table_size_valid(table_size)) {
error_setg(errp, "QED table size must be within range [%u, %u] "
"and power of 2",
QED_MIN_TABLE_SIZE, QED_MAX_TABLE_SIZE);
ret = -EINVAL;
goto finish;
}
if (!qed_is_image_size_valid(image_size, cluster_size, table_size)) {
error_setg(errp, "QED image size must be a non-zero multiple of "
"cluster size and less than %" PRIu64 " bytes",
qed_max_image_size(cluster_size, table_size));
ret = -EINVAL;
goto finish;
goto fail;
}
ret = qed_create(filename, cluster_size, image_size, table_size,
backing_file, backing_fmt, opts, errp);
/* Create and open the file (protocol layer) */
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
finish:
g_free(backing_file);
g_free(backing_fmt);
bs = bdrv_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
if (bs == NULL) {
ret = -EIO;
goto fail;
}
/* Now get the QAPI type BlockdevCreateOptions */
qdict_put_str(qdict, "driver", "qed");
qdict_put_str(qdict, "file", bs->node_name);
qobj = qdict_crumple(qdict, errp);
QDECREF(qdict);
qdict = qobject_to(QDict, qobj);
if (qdict == NULL) {
ret = -EINVAL;
goto fail;
}
v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
visit_free(v);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
/* Silently round up size */
assert(create_options->driver == BLOCKDEV_DRIVER_QED);
create_options->u.qed.size =
ROUND_UP(create_options->u.qed.size, BDRV_SECTOR_SIZE);
/* Create the qed image (format layer) */
ret = bdrv_qed_co_create(create_options, errp);
fail:
QDECREF(qdict);
bdrv_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
@@ -1487,7 +1592,8 @@ static int bdrv_qed_change_backing_file(BlockDriverState *bs,
return ret;
}
static void bdrv_qed_invalidate_cache(BlockDriverState *bs, Error **errp)
static void coroutine_fn bdrv_qed_co_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
BDRVQEDState *s = bs->opaque;
Error *local_err = NULL;
@@ -1496,13 +1602,9 @@ static void bdrv_qed_invalidate_cache(BlockDriverState *bs, Error **errp)
bdrv_qed_close(bs);
bdrv_qed_init_state(bs);
if (qemu_in_coroutine()) {
qemu_co_mutex_lock(&s->table_lock);
}
qemu_co_mutex_lock(&s->table_lock);
ret = bdrv_qed_do_open(bs, NULL, bs->open_flags, &local_err);
if (qemu_in_coroutine()) {
qemu_co_mutex_unlock(&s->table_lock);
}
qemu_co_mutex_unlock(&s->table_lock);
if (local_err) {
error_propagate(errp, local_err);
error_prepend(errp, "Could not reopen qed layer: ");
@@ -1513,12 +1615,17 @@ static void bdrv_qed_invalidate_cache(BlockDriverState *bs, Error **errp)
}
}
static int bdrv_qed_check(BlockDriverState *bs, BdrvCheckResult *result,
BdrvCheckMode fix)
static int bdrv_qed_co_check(BlockDriverState *bs, BdrvCheckResult *result,
BdrvCheckMode fix)
{
BDRVQEDState *s = bs->opaque;
int ret;
return qed_check(s, result, !!fix);
qemu_co_mutex_lock(&s->table_lock);
ret = qed_check(s, result, !!fix);
qemu_co_mutex_unlock(&s->table_lock);
return ret;
}
static QemuOptsList qed_create_opts = {
@@ -1566,6 +1673,7 @@ static BlockDriver bdrv_qed = {
.bdrv_close = bdrv_qed_close,
.bdrv_reopen_prepare = bdrv_qed_reopen_prepare,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_create = bdrv_qed_co_create,
.bdrv_co_create_opts = bdrv_qed_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_block_status = bdrv_qed_co_block_status,
@@ -1577,8 +1685,8 @@ static BlockDriver bdrv_qed = {
.bdrv_get_info = bdrv_qed_get_info,
.bdrv_refresh_limits = bdrv_qed_refresh_limits,
.bdrv_change_backing_file = bdrv_qed_change_backing_file,
.bdrv_invalidate_cache = bdrv_qed_invalidate_cache,
.bdrv_check = bdrv_qed_check,
.bdrv_co_invalidate_cache = bdrv_qed_co_invalidate_cache,
.bdrv_co_check = bdrv_qed_co_check,
.bdrv_detach_aio_context = bdrv_qed_detach_aio_context,
.bdrv_attach_aio_context = bdrv_qed_attach_aio_context,
.bdrv_co_drain_begin = bdrv_qed_co_drain_begin,

View File

@@ -1098,11 +1098,10 @@ static void quorum_refresh_filename(BlockDriverState *bs, QDict *options)
static BlockDriver bdrv_quorum = {
.format_name = "quorum",
.protocol_name = "quorum",
.instance_size = sizeof(BDRVQuorumState),
.bdrv_file_open = quorum_open,
.bdrv_open = quorum_open,
.bdrv_close = quorum_close,
.bdrv_refresh_filename = quorum_refresh_filename,

View File

@@ -24,6 +24,8 @@
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qlist.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"
/*
* When specifying the image filename use:
@@ -101,6 +103,11 @@ typedef struct BDRVRBDState {
char *snap;
} BDRVRBDState;
static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
BlockdevOptionsRbd *opts, bool cache,
const char *keypairs, const char *secretid,
Error **errp);
static char *qemu_rbd_next_tok(char *src, char delim, char **p)
{
char *end;
@@ -256,25 +263,26 @@ static int qemu_rbd_set_keypairs(rados_t cluster, const char *keypairs_json,
if (!keypairs_json) {
return ret;
}
keypairs = qobject_to_qlist(qobject_from_json(keypairs_json,
&error_abort));
keypairs = qobject_to(QList,
qobject_from_json(keypairs_json, &error_abort));
remaining = qlist_size(keypairs) / 2;
assert(remaining);
while (remaining--) {
name = qobject_to_qstring(qlist_pop(keypairs));
value = qobject_to_qstring(qlist_pop(keypairs));
name = qobject_to(QString, qlist_pop(keypairs));
value = qobject_to(QString, qlist_pop(keypairs));
assert(name && value);
key = qstring_get_str(name);
ret = rados_conf_set(cluster, key, qstring_get_str(value));
QDECREF(name);
QDECREF(value);
if (ret < 0) {
error_setg_errno(errp, -ret, "invalid conf option %s", key);
QDECREF(name);
ret = -EINVAL;
break;
}
QDECREF(name);
}
QDECREF(keypairs);
@@ -325,66 +333,92 @@ static QemuOptsList runtime_opts = {
/*
* server.* extracted manually, see qemu_rbd_mon_host()
*/
{
.name = "password-secret",
.type = QEMU_OPT_STRING,
.help = "ID of secret providing the password",
},
/*
* Keys for qemu_rbd_parse_filename(), not in the QAPI schema
*/
{
/*
* HACK: name starts with '=' so that qemu_opts_parse()
* can't set it
*/
.name = "=keyvalue-pairs",
.type = QEMU_OPT_STRING,
.help = "Legacy rados key/value option parameters",
},
{
.name = "filename",
.type = QEMU_OPT_STRING,
},
{ /* end of list */ }
},
};
/* FIXME Deprecate and remove keypairs or make it available in QMP.
* password_secret should eventually be configurable in opts->location. Support
* for it in .bdrv_open will make it work here as well. */
static int qemu_rbd_do_create(BlockdevCreateOptions *options,
const char *keypairs, const char *password_secret,
Error **errp)
{
BlockdevCreateOptionsRbd *opts = &options->u.rbd;
rados_t cluster;
rados_ioctx_t io_ctx;
int obj_order = 0;
int ret;
assert(options->driver == BLOCKDEV_DRIVER_RBD);
if (opts->location->has_snapshot) {
error_setg(errp, "Can't use snapshot name for image creation");
return -EINVAL;
}
if (opts->has_cluster_size) {
int64_t objsize = opts->cluster_size;
if ((objsize - 1) & objsize) { /* not a power of 2? */
error_setg(errp, "obj size needs to be power of 2");
return -EINVAL;
}
if (objsize < 4096) {
error_setg(errp, "obj size too small");
return -EINVAL;
}
obj_order = ctz32(objsize);
}
ret = qemu_rbd_connect(&cluster, &io_ctx, opts->location, false, keypairs,
password_secret, errp);
if (ret < 0) {
return ret;
}
ret = rbd_create(io_ctx, opts->location->image, opts->size, &obj_order);
if (ret < 0) {
error_setg_errno(errp, -ret, "error rbd create");
goto out;
}
ret = 0;
out:
rados_ioctx_destroy(io_ctx);
rados_shutdown(cluster);
return ret;
}
static int qemu_rbd_co_create(BlockdevCreateOptions *options, Error **errp)
{
return qemu_rbd_do_create(options, NULL, NULL, errp);
}
static int coroutine_fn qemu_rbd_co_create_opts(const char *filename,
QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions *create_options;
BlockdevCreateOptionsRbd *rbd_opts;
BlockdevOptionsRbd *loc;
Error *local_err = NULL;
int64_t bytes = 0;
int64_t objsize;
int obj_order = 0;
const char *pool, *image_name, *conf, *user, *keypairs;
const char *secretid;
rados_t cluster;
rados_ioctx_t io_ctx;
const char *keypairs, *password_secret;
QDict *options = NULL;
int ret = 0;
secretid = qemu_opt_get(opts, "password-secret");
create_options = g_new0(BlockdevCreateOptions, 1);
create_options->driver = BLOCKDEV_DRIVER_RBD;
rbd_opts = &create_options->u.rbd;
rbd_opts->location = g_new0(BlockdevOptionsRbd, 1);
password_secret = qemu_opt_get(opts, "password-secret");
/* Read out options */
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
objsize = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE, 0);
if (objsize) {
if ((objsize - 1) & objsize) { /* not a power of 2? */
error_setg(errp, "obj size needs to be power of 2");
ret = -EINVAL;
goto exit;
}
if (objsize < 4096) {
error_setg(errp, "obj size too small");
ret = -EINVAL;
goto exit;
}
obj_order = ctz32(objsize);
}
rbd_opts->size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
rbd_opts->cluster_size = qemu_opt_get_size_del(opts,
BLOCK_OPT_CLUSTER_SIZE, 0);
rbd_opts->has_cluster_size = (rbd_opts->cluster_size != 0);
options = qdict_new();
qemu_rbd_parse_filename(filename, options, &local_err);
@@ -400,61 +434,23 @@ static int coroutine_fn qemu_rbd_co_create_opts(const char *filename,
* or blockdev_add, its members are typed according to the QAPI
* schema, but when they come from -drive, they're all QString.
*/
pool = qdict_get_try_str(options, "pool");
conf = qdict_get_try_str(options, "conf");
user = qdict_get_try_str(options, "user");
image_name = qdict_get_try_str(options, "image");
keypairs = qdict_get_try_str(options, "=keyvalue-pairs");
loc = rbd_opts->location;
loc->pool = g_strdup(qdict_get_try_str(options, "pool"));
loc->conf = g_strdup(qdict_get_try_str(options, "conf"));
loc->has_conf = !!loc->conf;
loc->user = g_strdup(qdict_get_try_str(options, "user"));
loc->has_user = !!loc->user;
loc->image = g_strdup(qdict_get_try_str(options, "image"));
keypairs = qdict_get_try_str(options, "=keyvalue-pairs");
ret = rados_create(&cluster, user);
ret = qemu_rbd_do_create(create_options, keypairs, password_secret, errp);
if (ret < 0) {
error_setg_errno(errp, -ret, "error initializing");
goto exit;
}
/* try default location when conf=NULL, but ignore failure */
ret = rados_conf_read_file(cluster, conf);
if (conf && ret < 0) {
error_setg_errno(errp, -ret, "error reading conf file %s", conf);
ret = -EIO;
goto shutdown;
}
ret = qemu_rbd_set_keypairs(cluster, keypairs, errp);
if (ret < 0) {
ret = -EIO;
goto shutdown;
}
if (qemu_rbd_set_auth(cluster, secretid, errp) < 0) {
ret = -EIO;
goto shutdown;
}
ret = rados_connect(cluster);
if (ret < 0) {
error_setg_errno(errp, -ret, "error connecting");
goto shutdown;
}
ret = rados_ioctx_create(cluster, pool, &io_ctx);
if (ret < 0) {
error_setg_errno(errp, -ret, "error opening pool %s", pool);
goto shutdown;
}
ret = rbd_create(io_ctx, image_name, bytes, &obj_order);
if (ret < 0) {
error_setg_errno(errp, -ret, "error rbd create");
}
rados_ioctx_destroy(io_ctx);
shutdown:
rados_shutdown(cluster);
exit:
QDECREF(options);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
@@ -505,131 +501,83 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
qemu_aio_unref(acb);
}
static char *qemu_rbd_mon_host(QDict *options, Error **errp)
static char *qemu_rbd_mon_host(BlockdevOptionsRbd *opts, Error **errp)
{
const char **vals = g_new(const char *, qdict_size(options) + 1);
char keybuf[32];
const char **vals;
const char *host, *port;
char *rados_str;
int i;
InetSocketAddressBaseList *p;
int i, cnt;
for (i = 0;; i++) {
sprintf(keybuf, "server.%d.host", i);
host = qdict_get_try_str(options, keybuf);
qdict_del(options, keybuf);
sprintf(keybuf, "server.%d.port", i);
port = qdict_get_try_str(options, keybuf);
qdict_del(options, keybuf);
if (!host && !port) {
break;
}
if (!host) {
error_setg(errp, "Parameter server.%d.host is missing", i);
rados_str = NULL;
goto out;
}
if (!opts->has_server) {
return NULL;
}
for (cnt = 0, p = opts->server; p; p = p->next) {
cnt++;
}
vals = g_new(const char *, cnt + 1);
for (i = 0, p = opts->server; p; p = p->next, i++) {
host = p->value->host;
port = p->value->port;
if (strchr(host, ':')) {
vals[i] = port ? g_strdup_printf("[%s]:%s", host, port)
: g_strdup_printf("[%s]", host);
vals[i] = g_strdup_printf("[%s]:%s", host, port);
} else {
vals[i] = port ? g_strdup_printf("%s:%s", host, port)
: g_strdup(host);
vals[i] = g_strdup_printf("%s:%s", host, port);
}
}
vals[i] = NULL;
rados_str = i ? g_strjoinv(";", (char **)vals) : NULL;
out:
g_strfreev((char **)vals);
return rados_str;
}
static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
BlockdevOptionsRbd *opts, bool cache,
const char *keypairs, const char *secretid,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
const char *pool, *snap, *conf, *user, *image_name, *keypairs;
const char *secretid, *filename;
QemuOpts *opts;
Error *local_err = NULL;
char *mon_host = NULL;
Error *local_err = NULL;
int r;
/* If we are given a filename, parse the filename, with precedence given to
* filename encoded options */
filename = qdict_get_try_str(options, "filename");
if (filename) {
warn_report("'filename' option specified. "
"This is an unsupported option, and may be deprecated "
"in the future");
qemu_rbd_parse_filename(filename, options, &local_err);
if (local_err) {
r = -EINVAL;
error_propagate(errp, local_err);
goto exit;
}
}
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
mon_host = qemu_rbd_mon_host(opts, &local_err);
if (local_err) {
error_propagate(errp, local_err);
r = -EINVAL;
goto failed_opts;
}
mon_host = qemu_rbd_mon_host(options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
r = -EINVAL;
goto failed_opts;
}
secretid = qemu_opt_get(opts, "password-secret");
pool = qemu_opt_get(opts, "pool");
conf = qemu_opt_get(opts, "conf");
snap = qemu_opt_get(opts, "snapshot");
user = qemu_opt_get(opts, "user");
image_name = qemu_opt_get(opts, "image");
keypairs = qemu_opt_get(opts, "=keyvalue-pairs");
if (!pool || !image_name) {
error_setg(errp, "Parameters 'pool' and 'image' are required");
r = -EINVAL;
goto failed_opts;
}
r = rados_create(&s->cluster, user);
r = rados_create(cluster, opts->user);
if (r < 0) {
error_setg_errno(errp, -r, "error initializing");
goto failed_opts;
}
s->snap = g_strdup(snap);
s->image_name = g_strdup(image_name);
/* try default location when conf=NULL, but ignore failure */
r = rados_conf_read_file(s->cluster, conf);
if (conf && r < 0) {
error_setg_errno(errp, -r, "error reading conf file %s", conf);
r = rados_conf_read_file(*cluster, opts->conf);
if (opts->has_conf && r < 0) {
error_setg_errno(errp, -r, "error reading conf file %s", opts->conf);
goto failed_shutdown;
}
r = qemu_rbd_set_keypairs(s->cluster, keypairs, errp);
r = qemu_rbd_set_keypairs(*cluster, keypairs, errp);
if (r < 0) {
goto failed_shutdown;
}
if (mon_host) {
r = rados_conf_set(s->cluster, "mon_host", mon_host);
r = rados_conf_set(*cluster, "mon_host", mon_host);
if (r < 0) {
goto failed_shutdown;
}
}
if (qemu_rbd_set_auth(s->cluster, secretid, errp) < 0) {
if (qemu_rbd_set_auth(*cluster, secretid, errp) < 0) {
r = -EIO;
goto failed_shutdown;
}
@@ -641,24 +589,104 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
* librbd defaults to no caching. If write through caching cannot
* be set up, fall back to no caching.
*/
if (flags & BDRV_O_NOCACHE) {
rados_conf_set(s->cluster, "rbd_cache", "false");
if (cache) {
rados_conf_set(*cluster, "rbd_cache", "true");
} else {
rados_conf_set(s->cluster, "rbd_cache", "true");
rados_conf_set(*cluster, "rbd_cache", "false");
}
r = rados_connect(s->cluster);
r = rados_connect(*cluster);
if (r < 0) {
error_setg_errno(errp, -r, "error connecting");
goto failed_shutdown;
}
r = rados_ioctx_create(s->cluster, pool, &s->io_ctx);
r = rados_ioctx_create(*cluster, opts->pool, io_ctx);
if (r < 0) {
error_setg_errno(errp, -r, "error opening pool %s", pool);
error_setg_errno(errp, -r, "error opening pool %s", opts->pool);
goto failed_shutdown;
}
return 0;
failed_shutdown:
rados_shutdown(*cluster);
failed_opts:
g_free(mon_host);
return r;
}
static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
BlockdevOptionsRbd *opts = NULL;
Visitor *v;
QObject *crumpled = NULL;
const QDictEntry *e;
Error *local_err = NULL;
const char *filename;
char *keypairs, *secretid;
int r;
/* If we are given a filename, parse the filename, with precedence given to
* filename encoded options */
filename = qdict_get_try_str(options, "filename");
if (filename) {
warn_report("'filename' option specified. "
"This is an unsupported option, and may be deprecated "
"in the future");
qemu_rbd_parse_filename(filename, options, &local_err);
qdict_del(options, "filename");
if (local_err) {
error_propagate(errp, local_err);
return -EINVAL;
}
}
keypairs = g_strdup(qdict_get_try_str(options, "=keyvalue-pairs"));
if (keypairs) {
qdict_del(options, "=keyvalue-pairs");
}
secretid = g_strdup(qdict_get_try_str(options, "password-secret"));
if (secretid) {
qdict_del(options, "password-secret");
}
/* Convert the remaining options into a QAPI object */
crumpled = qdict_crumple(options, errp);
if (crumpled == NULL) {
r = -EINVAL;
goto out;
}
v = qobject_input_visitor_new_keyval(crumpled);
visit_type_BlockdevOptionsRbd(v, NULL, &opts, &local_err);
visit_free(v);
qobject_decref(crumpled);
if (local_err) {
error_propagate(errp, local_err);
r = -EINVAL;
goto out;
}
/* Remove the processed options from the QDict (the visitor processes
* _all_ options in the QDict) */
while ((e = qdict_first(options))) {
qdict_del(options, e->key);
}
r = qemu_rbd_connect(&s->cluster, &s->io_ctx, opts,
!(flags & BDRV_O_NOCACHE), keypairs, secretid, errp);
if (r < 0) {
goto out;
}
s->snap = g_strdup(opts->snapshot);
s->image_name = g_strdup(opts->image);
/* rbd_open is always r/w */
r = rbd_open(s->io_ctx, s->image_name, &s->image, s->snap);
if (r < 0) {
@@ -683,19 +711,18 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
}
}
qemu_opts_del(opts);
return 0;
r = 0;
goto out;
failed_open:
rados_ioctx_destroy(s->io_ctx);
failed_shutdown:
rados_shutdown(s->cluster);
g_free(s->snap);
g_free(s->image_name);
failed_opts:
qemu_opts_del(opts);
g_free(mon_host);
exit:
rados_shutdown(s->cluster);
out:
qapi_free_BlockdevOptionsRbd(opts);
g_free(keypairs);
g_free(secretid);
return r;
}
@@ -1093,8 +1120,8 @@ static BlockAIOCB *qemu_rbd_aio_pdiscard(BlockDriverState *bs,
#endif
#ifdef LIBRBD_SUPPORTS_INVALIDATE
static void qemu_rbd_invalidate_cache(BlockDriverState *bs,
Error **errp)
static void coroutine_fn qemu_rbd_co_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
int r = rbd_invalidate_cache(s->image);
@@ -1134,6 +1161,7 @@ static BlockDriver bdrv_rbd = {
.bdrv_file_open = qemu_rbd_open,
.bdrv_close = qemu_rbd_close,
.bdrv_reopen_prepare = qemu_rbd_reopen_prepare,
.bdrv_co_create = qemu_rbd_co_create,
.bdrv_co_create_opts = qemu_rbd_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_get_info = qemu_rbd_getinfo,
@@ -1160,7 +1188,7 @@ static BlockDriver bdrv_rbd = {
.bdrv_snapshot_list = qemu_rbd_snap_list,
.bdrv_snapshot_goto = qemu_rbd_snap_rollback,
#ifdef LIBRBD_SUPPORTS_INVALIDATE
.bdrv_invalidate_cache = qemu_rbd_invalidate_cache,
.bdrv_co_invalidate_cache = qemu_rbd_co_invalidate_cache,
#endif
};

View File

@@ -703,7 +703,6 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
BlockDriver bdrv_replication = {
.format_name = "replication",
.protocol_name = "replication",
.instance_size = sizeof(BDRVReplicationState),
.bdrv_open = replication_open,

View File

@@ -15,8 +15,10 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qapi/qapi-visit-sockets.h"
#include "qapi/qapi-visit-block-core.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qobject-output-visitor.h"
#include "qemu/uri.h"
#include "qemu/error-report.h"
#include "qemu/option.h"
@@ -533,23 +535,6 @@ static void sd_aio_setup(SheepdogAIOCB *acb, BDRVSheepdogState *s,
qemu_co_mutex_unlock(&s->queue_lock);
}
static SocketAddress *sd_socket_address(const char *path,
const char *host, const char *port)
{
SocketAddress *addr = g_new0(SocketAddress, 1);
if (path) {
addr->type = SOCKET_ADDRESS_TYPE_UNIX;
addr->u.q_unix.path = g_strdup(path);
} else {
addr->type = SOCKET_ADDRESS_TYPE_INET;
addr->u.inet.host = g_strdup(host ?: SD_DEFAULT_ADDR);
addr->u.inet.port = g_strdup(port ?: stringify(SD_DEFAULT_PORT));
}
return addr;
}
static SocketAddress *sd_server_config(QDict *options, Error **errp)
{
QDict *server = NULL;
@@ -1051,7 +1036,7 @@ static void sd_parse_uri(SheepdogConfig *cfg, const char *filename,
cfg->uri = uri = uri_parse(filename);
if (!uri) {
error_setg(&err, "invalid URI");
error_setg(&err, "invalid URI '%s'", filename);
goto out;
}
@@ -1882,6 +1867,86 @@ out_with_err_set:
return ret;
}
static int sd_create_prealloc(BlockdevOptionsSheepdog *location, int64_t size,
Error **errp)
{
BlockDriverState *bs;
Visitor *v;
QObject *obj = NULL;
QDict *qdict;
Error *local_err = NULL;
int ret;
v = qobject_output_visitor_new(&obj);
visit_type_BlockdevOptionsSheepdog(v, NULL, &location, &local_err);
visit_free(v);
if (local_err) {
error_propagate(errp, local_err);
qobject_decref(obj);
return -EINVAL;
}
qdict = qobject_to(QDict, obj);
qdict_flatten(qdict);
qdict_put_str(qdict, "driver", "sheepdog");
bs = bdrv_open(NULL, NULL, qdict, BDRV_O_PROTOCOL | BDRV_O_RDWR, errp);
if (bs == NULL) {
ret = -EIO;
goto fail;
}
ret = sd_prealloc(bs, 0, size, errp);
fail:
bdrv_unref(bs);
QDECREF(qdict);
return ret;
}
static int parse_redundancy(BDRVSheepdogState *s, SheepdogRedundancy *opt)
{
struct SheepdogInode *inode = &s->inode;
switch (opt->type) {
case SHEEPDOG_REDUNDANCY_TYPE_FULL:
if (opt->u.full.copies > SD_MAX_COPIES || opt->u.full.copies < 1) {
return -EINVAL;
}
inode->copy_policy = 0;
inode->nr_copies = opt->u.full.copies;
return 0;
case SHEEPDOG_REDUNDANCY_TYPE_ERASURE_CODED:
{
int64_t copy = opt->u.erasure_coded.data_strips;
int64_t parity = opt->u.erasure_coded.parity_strips;
if (copy != 2 && copy != 4 && copy != 8 && copy != 16) {
return -EINVAL;
}
if (parity >= SD_EC_MAX_STRIP || parity < 1) {
return -EINVAL;
}
/*
* 4 bits for parity and 4 bits for data.
* We have to compress upper data bits because it can't represent 16
*/
inode->copy_policy = ((copy / 2) << 4) + parity;
inode->nr_copies = copy + parity;
return 0;
}
default:
g_assert_not_reached();
}
return -EINVAL;
}
/*
* Sheepdog support two kinds of redundancy, full replication and erasure
* coding.
@@ -1892,60 +1957,61 @@ out_with_err_set:
* # create a erasure coded vdi with x data strips and y parity strips
* -o redundancy=x:y (x must be one of {2,4,8,16} and 1 <= y < SD_EC_MAX_STRIP)
*/
static int parse_redundancy(BDRVSheepdogState *s, const char *opt)
static SheepdogRedundancy *parse_redundancy_str(const char *opt)
{
struct SheepdogInode *inode = &s->inode;
SheepdogRedundancy *redundancy;
const char *n1, *n2;
long copy, parity;
char p[10];
int ret;
pstrcpy(p, sizeof(p), opt);
n1 = strtok(p, ":");
n2 = strtok(NULL, ":");
if (!n1) {
return -EINVAL;
return NULL;
}
copy = strtol(n1, NULL, 10);
/* FIXME fix error checking by switching to qemu_strtol() */
if (copy > SD_MAX_COPIES || copy < 1) {
return -EINVAL;
ret = qemu_strtol(n1, NULL, 10, &copy);
if (ret < 0) {
return NULL;
}
redundancy = g_new0(SheepdogRedundancy, 1);
if (!n2) {
inode->copy_policy = 0;
inode->nr_copies = copy;
return 0;
*redundancy = (SheepdogRedundancy) {
.type = SHEEPDOG_REDUNDANCY_TYPE_FULL,
.u.full.copies = copy,
};
} else {
ret = qemu_strtol(n2, NULL, 10, &parity);
if (ret < 0) {
return NULL;
}
*redundancy = (SheepdogRedundancy) {
.type = SHEEPDOG_REDUNDANCY_TYPE_ERASURE_CODED,
.u.erasure_coded = {
.data_strips = copy,
.parity_strips = parity,
},
};
}
if (copy != 2 && copy != 4 && copy != 8 && copy != 16) {
return -EINVAL;
}
parity = strtol(n2, NULL, 10);
/* FIXME fix error checking by switching to qemu_strtol() */
if (parity >= SD_EC_MAX_STRIP || parity < 1) {
return -EINVAL;
}
/*
* 4 bits for parity and 4 bits for data.
* We have to compress upper data bits because it can't represent 16
*/
inode->copy_policy = ((copy / 2) << 4) + parity;
inode->nr_copies = copy + parity;
return 0;
return redundancy;
}
static int parse_block_size_shift(BDRVSheepdogState *s, QemuOpts *opt)
static int parse_block_size_shift(BDRVSheepdogState *s,
BlockdevCreateOptionsSheepdog *opts)
{
struct SheepdogInode *inode = &s->inode;
uint64_t object_size;
int obj_order;
object_size = qemu_opt_get_size_del(opt, BLOCK_OPT_OBJECT_SIZE, 0);
if (object_size) {
if (opts->has_object_size) {
object_size = opts->object_size;
if ((object_size - 1) & object_size) { /* not a power of 2? */
return -EINVAL;
}
@@ -1959,57 +2025,55 @@ static int parse_block_size_shift(BDRVSheepdogState *s, QemuOpts *opt)
return 0;
}
static int coroutine_fn sd_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
static int sd_co_create(BlockdevCreateOptions *options, Error **errp)
{
Error *err = NULL;
BlockdevCreateOptionsSheepdog *opts = &options->u.sheepdog;
int ret = 0;
uint32_t vid = 0;
char *backing_file = NULL;
char *buf = NULL;
BDRVSheepdogState *s;
SheepdogConfig cfg;
uint64_t max_vdi_size;
bool prealloc = false;
assert(options->driver == BLOCKDEV_DRIVER_SHEEPDOG);
s = g_new0(BDRVSheepdogState, 1);
if (strstr(filename, "://")) {
sd_parse_uri(&cfg, filename, &err);
} else {
parse_vdiname(&cfg, filename, &err);
}
if (err) {
error_propagate(errp, err);
/* Steal SocketAddress from QAPI, set NULL to prevent double free */
s->addr = opts->location->server;
opts->location->server = NULL;
if (strlen(opts->location->vdi) >= sizeof(s->name)) {
error_setg(errp, "'vdi' string too long");
ret = -EINVAL;
goto out;
}
pstrcpy(s->name, sizeof(s->name), opts->location->vdi);
buf = cfg.port ? g_strdup_printf("%d", cfg.port) : NULL;
s->addr = sd_socket_address(cfg.path, cfg.host, buf);
g_free(buf);
strcpy(s->name, cfg.vdi);
sd_config_done(&cfg);
s->inode.vdi_size = opts->size;
backing_file = opts->backing_file;
s->inode.vdi_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!buf || !strcmp(buf, "off")) {
if (!opts->has_preallocation) {
opts->preallocation = PREALLOC_MODE_OFF;
}
switch (opts->preallocation) {
case PREALLOC_MODE_OFF:
prealloc = false;
} else if (!strcmp(buf, "full")) {
break;
case PREALLOC_MODE_FULL:
prealloc = true;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'", buf);
break;
default:
error_setg(errp, "Preallocation mode not supported for Sheepdog");
ret = -EINVAL;
goto out;
}
g_free(buf);
buf = qemu_opt_get_del(opts, BLOCK_OPT_REDUNDANCY);
if (buf) {
ret = parse_redundancy(s, buf);
if (opts->has_redundancy) {
ret = parse_redundancy(s, opts->redundancy);
if (ret < 0) {
error_setg(errp, "Invalid redundancy mode: '%s'", buf);
error_setg(errp, "Invalid redundancy mode");
goto out;
}
}
@@ -2021,20 +2085,20 @@ static int coroutine_fn sd_co_create_opts(const char *filename, QemuOpts *opts,
goto out;
}
if (backing_file) {
if (opts->has_backing_file) {
BlockBackend *blk;
BDRVSheepdogState *base;
BlockDriver *drv;
/* Currently, only Sheepdog backing image is supported. */
drv = bdrv_find_protocol(backing_file, true, NULL);
drv = bdrv_find_protocol(opts->backing_file, true, NULL);
if (!drv || strcmp(drv->protocol_name, "sheepdog") != 0) {
error_setg(errp, "backing_file must be a sheepdog image");
ret = -EINVAL;
goto out;
}
blk = blk_new_open(backing_file, NULL, NULL,
blk = blk_new_open(opts->backing_file, NULL, NULL,
BDRV_O_PROTOCOL, errp);
if (blk == NULL) {
ret = -EIO;
@@ -2102,28 +2166,96 @@ static int coroutine_fn sd_co_create_opts(const char *filename, QemuOpts *opts,
}
if (prealloc) {
BlockDriverState *bs;
QDict *opts;
opts = qdict_new();
qdict_put_str(opts, "driver", "sheepdog");
bs = bdrv_open(filename, NULL, opts, BDRV_O_PROTOCOL | BDRV_O_RDWR,
errp);
if (!bs) {
goto out;
}
ret = sd_prealloc(bs, 0, s->inode.vdi_size, errp);
bdrv_unref(bs);
ret = sd_create_prealloc(opts->location, opts->size, errp);
}
out:
g_free(backing_file);
g_free(buf);
g_free(s->addr);
g_free(s);
return ret;
}
static int coroutine_fn sd_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions *create_options = NULL;
QDict *qdict, *location_qdict;
QObject *crumpled;
Visitor *v;
const char *redundancy;
Error *local_err = NULL;
int ret;
redundancy = qemu_opt_get_del(opts, BLOCK_OPT_REDUNDANCY);
qdict = qemu_opts_to_qdict(opts, NULL);
qdict_put_str(qdict, "driver", "sheepdog");
location_qdict = qdict_new();
qdict_put(qdict, "location", location_qdict);
sd_parse_filename(filename, location_qdict, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
qdict_flatten(qdict);
/* Change legacy command line options into QMP ones */
static const QDictRenames opt_renames[] = {
{ BLOCK_OPT_BACKING_FILE, "backing-file" },
{ BLOCK_OPT_OBJECT_SIZE, "object-size" },
{ NULL, NULL },
};
if (!qdict_rename_keys(qdict, opt_renames, errp)) {
ret = -EINVAL;
goto fail;
}
/* Get the QAPI object */
crumpled = qdict_crumple(qdict, errp);
if (crumpled == NULL) {
ret = -EINVAL;
goto fail;
}
v = qobject_input_visitor_new_keyval(crumpled);
visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
visit_free(v);
qobject_decref(crumpled);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
assert(create_options->driver == BLOCKDEV_DRIVER_SHEEPDOG);
create_options->u.sheepdog.size =
ROUND_UP(create_options->u.sheepdog.size, BDRV_SECTOR_SIZE);
if (redundancy) {
create_options->u.sheepdog.has_redundancy = true;
create_options->u.sheepdog.redundancy =
parse_redundancy_str(redundancy);
if (create_options->u.sheepdog.redundancy == NULL) {
error_setg(errp, "Invalid redundancy mode");
ret = -EINVAL;
goto fail;
}
}
ret = sd_co_create(create_options, errp);
fail:
qapi_free_BlockdevCreateOptions(create_options);
QDECREF(qdict);
return ret;
}
static void sd_close(BlockDriverState *bs)
{
Error *local_err = NULL;
@@ -3103,6 +3235,7 @@ static BlockDriver bdrv_sheepdog = {
.bdrv_reopen_commit = sd_reopen_commit,
.bdrv_reopen_abort = sd_reopen_abort,
.bdrv_close = sd_close,
.bdrv_co_create = sd_co_create,
.bdrv_co_create_opts = sd_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_getlength = sd_getlength,
@@ -3139,6 +3272,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
.bdrv_reopen_commit = sd_reopen_commit,
.bdrv_reopen_abort = sd_reopen_abort,
.bdrv_close = sd_close,
.bdrv_co_create = sd_co_create,
.bdrv_co_create_opts = sd_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_getlength = sd_getlength,
@@ -3175,6 +3309,7 @@ static BlockDriver bdrv_sheepdog_unix = {
.bdrv_reopen_commit = sd_reopen_commit,
.bdrv_reopen_abort = sd_reopen_abort,
.bdrv_close = sd_close,
.bdrv_co_create = sd_co_create,
.bdrv_co_create_opts = sd_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_getlength = sd_getlength,

View File

@@ -35,6 +35,7 @@
#include "qemu/sockets.h"
#include "qemu/uri.h"
#include "qapi/qapi-visit-sockets.h"
#include "qapi/qapi-visit-block-core.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "qapi/qobject-input-visitor.h"
@@ -430,31 +431,35 @@ check_host_key_hash(BDRVSSHState *s, const char *hash,
}
static int check_host_key(BDRVSSHState *s, const char *host, int port,
const char *host_key_check, Error **errp)
SshHostKeyCheck *hkc, Error **errp)
{
/* host_key_check=no */
if (strcmp(host_key_check, "no") == 0) {
SshHostKeyCheckMode mode;
if (hkc) {
mode = hkc->mode;
} else {
mode = SSH_HOST_KEY_CHECK_MODE_KNOWN_HOSTS;
}
switch (mode) {
case SSH_HOST_KEY_CHECK_MODE_NONE:
return 0;
}
/* host_key_check=md5:xx:yy:zz:... */
if (strncmp(host_key_check, "md5:", 4) == 0) {
return check_host_key_hash(s, &host_key_check[4],
LIBSSH2_HOSTKEY_HASH_MD5, 16, errp);
}
/* host_key_check=sha1:xx:yy:zz:... */
if (strncmp(host_key_check, "sha1:", 5) == 0) {
return check_host_key_hash(s, &host_key_check[5],
LIBSSH2_HOSTKEY_HASH_SHA1, 20, errp);
}
/* host_key_check=yes */
if (strcmp(host_key_check, "yes") == 0) {
case SSH_HOST_KEY_CHECK_MODE_HASH:
if (hkc->u.hash.type == SSH_HOST_KEY_CHECK_HASH_TYPE_MD5) {
return check_host_key_hash(s, hkc->u.hash.hash,
LIBSSH2_HOSTKEY_HASH_MD5, 16, errp);
} else if (hkc->u.hash.type == SSH_HOST_KEY_CHECK_HASH_TYPE_SHA1) {
return check_host_key_hash(s, hkc->u.hash.hash,
LIBSSH2_HOSTKEY_HASH_SHA1, 20, errp);
}
g_assert_not_reached();
break;
case SSH_HOST_KEY_CHECK_MODE_KNOWN_HOSTS:
return check_host_key_knownhosts(s, host, port, errp);
default:
g_assert_not_reached();
}
error_setg(errp, "unknown host_key_check setting (%s)", host_key_check);
return -EINVAL;
}
@@ -543,16 +548,6 @@ static QemuOptsList ssh_runtime_opts = {
.type = QEMU_OPT_NUMBER,
.help = "Port to connect to",
},
{
.name = "path",
.type = QEMU_OPT_STRING,
.help = "Path of the image on the host",
},
{
.name = "user",
.type = QEMU_OPT_STRING,
.help = "User as which to connect",
},
{
.name = "host_key_check",
.type = QEMU_OPT_STRING,
@@ -562,12 +557,13 @@ static QemuOptsList ssh_runtime_opts = {
},
};
static bool ssh_process_legacy_socket_options(QDict *output_opts,
QemuOpts *legacy_opts,
Error **errp)
static bool ssh_process_legacy_options(QDict *output_opts,
QemuOpts *legacy_opts,
Error **errp)
{
const char *host = qemu_opt_get(legacy_opts, "host");
const char *port = qemu_opt_get(legacy_opts, "port");
const char *host_key_check = qemu_opt_get(legacy_opts, "host_key_check");
if (!host && port) {
error_setg(errp, "port may not be used without host");
@@ -579,26 +575,56 @@ static bool ssh_process_legacy_socket_options(QDict *output_opts,
qdict_put_str(output_opts, "server.port", port ?: stringify(22));
}
if (host_key_check) {
if (strcmp(host_key_check, "no") == 0) {
qdict_put_str(output_opts, "host-key-check.mode", "none");
} else if (strncmp(host_key_check, "md5:", 4) == 0) {
qdict_put_str(output_opts, "host-key-check.mode", "hash");
qdict_put_str(output_opts, "host-key-check.type", "md5");
qdict_put_str(output_opts, "host-key-check.hash",
&host_key_check[4]);
} else if (strncmp(host_key_check, "sha1:", 5) == 0) {
qdict_put_str(output_opts, "host-key-check.mode", "hash");
qdict_put_str(output_opts, "host-key-check.type", "sha1");
qdict_put_str(output_opts, "host-key-check.hash",
&host_key_check[5]);
} else if (strcmp(host_key_check, "yes") == 0) {
qdict_put_str(output_opts, "host-key-check.mode", "known_hosts");
} else {
error_setg(errp, "unknown host_key_check setting (%s)",
host_key_check);
return false;
}
}
return true;
}
static InetSocketAddress *ssh_config(QDict *options, Error **errp)
static BlockdevOptionsSsh *ssh_parse_options(QDict *options, Error **errp)
{
InetSocketAddress *inet = NULL;
QDict *addr = NULL;
QObject *crumpled_addr = NULL;
Visitor *iv = NULL;
Error *local_error = NULL;
BlockdevOptionsSsh *result = NULL;
QemuOpts *opts = NULL;
Error *local_err = NULL;
QObject *crumpled;
const QDictEntry *e;
Visitor *v;
qdict_extract_subqdict(options, &addr, "server.");
if (!qdict_size(addr)) {
error_setg(errp, "SSH server address missing");
goto out;
/* Translate legacy options */
opts = qemu_opts_create(&ssh_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto fail;
}
crumpled_addr = qdict_crumple(addr, errp);
if (!crumpled_addr) {
goto out;
if (!ssh_process_legacy_options(options, opts, errp)) {
goto fail;
}
/* Create the QAPI object */
crumpled = qdict_crumple(options, errp);
if (crumpled == NULL) {
goto fail;
}
/*
@@ -609,51 +635,37 @@ static InetSocketAddress *ssh_config(QDict *options, Error **errp)
* but when they come from -drive, they're all QString. The
* visitor expects the former.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_InetSocketAddress(iv, NULL, &inet, &local_error);
if (local_error) {
error_propagate(errp, local_error);
goto out;
v = qobject_input_visitor_new(crumpled);
visit_type_BlockdevOptionsSsh(v, NULL, &result, &local_err);
visit_free(v);
qobject_decref(crumpled);
if (local_err) {
error_propagate(errp, local_err);
goto fail;
}
out:
QDECREF(addr);
qobject_decref(crumpled_addr);
visit_free(iv);
return inet;
/* Remove the processed options from the QDict (the visitor processes
* _all_ options in the QDict) */
while ((e = qdict_first(options))) {
qdict_del(options, e->key);
}
fail:
qemu_opts_del(opts);
return result;
}
static int connect_to_ssh(BDRVSSHState *s, QDict *options,
static int connect_to_ssh(BDRVSSHState *s, BlockdevOptionsSsh *opts,
int ssh_flags, int creat_mode, Error **errp)
{
int r, ret;
QemuOpts *opts = NULL;
Error *local_err = NULL;
const char *user, *path, *host_key_check;
const char *user;
long port = 0;
opts = qemu_opts_create(&ssh_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
ret = -EINVAL;
error_propagate(errp, local_err);
goto err;
}
if (!ssh_process_legacy_socket_options(options, opts, errp)) {
ret = -EINVAL;
goto err;
}
path = qemu_opt_get(opts, "path");
if (!path) {
ret = -EINVAL;
error_setg(errp, "No path was specified");
goto err;
}
user = qemu_opt_get(opts, "user");
if (!user) {
if (opts->has_user) {
user = opts->user;
} else {
user = g_get_user_name();
if (!user) {
error_setg_errno(errp, errno, "Can't get user name");
@@ -662,17 +674,9 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
}
}
host_key_check = qemu_opt_get(opts, "host_key_check");
if (!host_key_check) {
host_key_check = "yes";
}
/* Pop the config into our state object, Exit if invalid */
s->inet = ssh_config(options, errp);
if (!s->inet) {
ret = -EINVAL;
goto err;
}
s->inet = opts->server;
opts->server = NULL;
if (qemu_strtol(s->inet->port, NULL, 10, &port) < 0) {
error_setg(errp, "Use only numeric port value");
@@ -707,8 +711,7 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
}
/* Check the remote host's key against known_hosts. */
ret = check_host_key(s, s->inet->host, port, host_key_check,
errp);
ret = check_host_key(s, s->inet->host, port, opts->host_key_check, errp);
if (ret < 0) {
goto err;
}
@@ -729,16 +732,16 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
/* Open the remote file. */
DPRINTF("opening file %s flags=0x%x creat_mode=0%o",
path, ssh_flags, creat_mode);
s->sftp_handle = libssh2_sftp_open(s->sftp, path, ssh_flags, creat_mode);
opts->path, ssh_flags, creat_mode);
s->sftp_handle = libssh2_sftp_open(s->sftp, opts->path, ssh_flags,
creat_mode);
if (!s->sftp_handle) {
session_error_setg(errp, s, "failed to open remote file '%s'", path);
session_error_setg(errp, s, "failed to open remote file '%s'",
opts->path);
ret = -EINVAL;
goto err;
}
qemu_opts_del(opts);
r = libssh2_sftp_fstat(s->sftp_handle, &s->attrs);
if (r < 0) {
sftp_error_setg(errp, s, "failed to read file attributes");
@@ -764,8 +767,6 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
}
s->session = NULL;
qemu_opts_del(opts);
return ret;
}
@@ -773,6 +774,7 @@ static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
Error **errp)
{
BDRVSSHState *s = bs->opaque;
BlockdevOptionsSsh *opts;
int ret;
int ssh_flags;
@@ -783,8 +785,13 @@ static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
ssh_flags |= LIBSSH2_FXF_WRITE;
}
opts = ssh_parse_options(options, errp);
if (opts == NULL) {
return -EINVAL;
}
/* Start up SSH. */
ret = connect_to_ssh(s, options, ssh_flags, 0, errp);
ret = connect_to_ssh(s, opts, ssh_flags, 0, errp);
if (ret < 0) {
goto err;
}
@@ -792,6 +799,8 @@ static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
/* Go non-blocking. */
libssh2_session_set_blocking(s->session, 0);
qapi_free_BlockdevOptionsSsh(opts);
return 0;
err:
@@ -800,6 +809,8 @@ static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
}
s->sock = -1;
qapi_free_BlockdevOptionsSsh(opts);
return ret;
}
@@ -843,51 +854,71 @@ static QemuOptsList ssh_create_opts = {
}
};
static int coroutine_fn ssh_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
static int ssh_co_create(BlockdevCreateOptions *options, Error **errp)
{
int r, ret;
int64_t total_size = 0;
QDict *uri_options = NULL;
BlockdevCreateOptionsSsh *opts = &options->u.ssh;
BDRVSSHState s;
int ret;
assert(options->driver == BLOCKDEV_DRIVER_SSH);
ssh_state_init(&s);
/* Get desired file size. */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
DPRINTF("total_size=%" PRIi64, total_size);
uri_options = qdict_new();
r = parse_uri(filename, uri_options, errp);
if (r < 0) {
ret = r;
goto out;
ret = connect_to_ssh(&s, opts->location,
LIBSSH2_FXF_READ|LIBSSH2_FXF_WRITE|
LIBSSH2_FXF_CREAT|LIBSSH2_FXF_TRUNC,
0644, errp);
if (ret < 0) {
goto fail;
}
r = connect_to_ssh(&s, uri_options,
LIBSSH2_FXF_READ|LIBSSH2_FXF_WRITE|
LIBSSH2_FXF_CREAT|LIBSSH2_FXF_TRUNC,
0644, errp);
if (r < 0) {
ret = r;
goto out;
}
if (total_size > 0) {
ret = ssh_grow_file(&s, total_size, errp);
if (opts->size > 0) {
ret = ssh_grow_file(&s, opts->size, errp);
if (ret < 0) {
goto out;
goto fail;
}
}
ret = 0;
fail:
ssh_state_free(&s);
return ret;
}
static int coroutine_fn ssh_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions *create_options;
BlockdevCreateOptionsSsh *ssh_opts;
int ret;
QDict *uri_options = NULL;
create_options = g_new0(BlockdevCreateOptions, 1);
create_options->driver = BLOCKDEV_DRIVER_SSH;
ssh_opts = &create_options->u.ssh;
/* Get desired file size. */
ssh_opts->size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
DPRINTF("total_size=%" PRIi64, ssh_opts->size);
uri_options = qdict_new();
ret = parse_uri(filename, uri_options, errp);
if (ret < 0) {
goto out;
}
ssh_opts->location = ssh_parse_options(uri_options, errp);
if (ssh_opts->location == NULL) {
ret = -EINVAL;
goto out;
}
ret = ssh_co_create(create_options, errp);
out:
ssh_state_free(&s);
if (uri_options != NULL) {
QDECREF(uri_options);
}
QDECREF(uri_options);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
@@ -1249,6 +1280,7 @@ static BlockDriver bdrv_ssh = {
.instance_size = sizeof(BDRVSSHState),
.bdrv_parse_filename = ssh_parse_filename,
.bdrv_file_open = ssh_file_open,
.bdrv_co_create = ssh_co_create,
.bdrv_co_create_opts = ssh_co_create_opts,
.bdrv_close = ssh_close,
.bdrv_has_zero_init = ssh_has_zero_init,

View File

@@ -244,7 +244,7 @@ void stream_start(const char *job_id, BlockDriverState *bs,
/* Prevent concurrent jobs trying to modify the graph structure here, we
* already have our own plans. Also don't allow resize as the image size is
* queried only at the job start and then cached. */
s = block_job_create(job_id, &stream_job_driver, bs,
s = block_job_create(job_id, &stream_job_driver, NULL, bs,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
BLK_PERM_GRAPH_MOD,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |

View File

@@ -215,10 +215,9 @@ static void coroutine_fn throttle_co_drain_end(BlockDriverState *bs)
static BlockDriver bdrv_throttle = {
.format_name = "throttle",
.protocol_name = "throttle",
.instance_size = sizeof(ThrottleGroupMember),
.bdrv_file_open = throttle_open,
.bdrv_open = throttle_open,
.bdrv_close = throttle_close,
.bdrv_co_flush = throttle_co_flush,

View File

@@ -4,9 +4,16 @@
bdrv_open_common(void *bs, const char *filename, int flags, const char *format_name) "bs %p filename \"%s\" flags 0x%x format_name \"%s\""
bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
# blockjob.c
block_job_completed(void *job, int ret, int jret) "job %p ret %d corrected ret %d"
block_job_state_transition(void *job, int ret, const char *legal, const char *s0, const char *s1) "job %p (ret: %d) attempting %s transition (%s-->%s)"
block_job_apply_verb(void *job, const char *state, const char *verb, const char *legal) "job %p in state %s; applying verb %s (%s)"
# block/block-backend.c
blk_co_preadv(void *blk, void *bs, int64_t offset, unsigned int bytes, int flags) "blk %p bs %p offset %"PRId64" bytes %u flags 0x%x"
blk_co_pwritev(void *blk, void *bs, int64_t offset, unsigned int bytes, int flags) "blk %p bs %p offset %"PRId64" bytes %u flags 0x%x"
blk_root_attach(void *child, void *blk, void *bs) "child %p blk %p bs %p"
blk_root_detach(void *child, void *blk, void *bs) "child %p blk %p bs %p"
# block/io.c
bdrv_co_preadv(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) "bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x"
@@ -46,6 +53,8 @@ qmp_block_job_cancel(void *job) "job %p"
qmp_block_job_pause(void *job) "job %p"
qmp_block_job_resume(void *job) "job %p"
qmp_block_job_complete(void *job) "job %p"
qmp_block_job_finalize(void *job) "job %p"
qmp_block_job_dismiss(void *job) "job %p"
qmp_block_stream(void *bs, void *job) "bs %p job %p"
# block/file-win32.c

View File

@@ -51,6 +51,9 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
@@ -140,6 +143,8 @@
#define VDI_DISK_SIZE_MAX ((uint64_t)VDI_BLOCKS_IN_IMAGE_MAX * \
(uint64_t)DEFAULT_CLUSTER_SIZE)
static QemuOptsList vdi_create_opts;
typedef struct {
char text[0x40];
uint32_t signature;
@@ -230,7 +235,6 @@ static void vdi_header_to_le(VdiHeader *header)
qemu_uuid_bswap(&header->uuid_parent);
}
#if defined(CONFIG_VDI_DEBUG)
static void vdi_header_print(VdiHeader *header)
{
char uuid[37];
@@ -252,19 +256,18 @@ static void vdi_header_print(VdiHeader *header)
logout("block extra 0x%04x\n", header->block_extra);
logout("blocks tot. 0x%04x\n", header->blocks_in_image);
logout("blocks all. 0x%04x\n", header->blocks_allocated);
uuid_unparse(header->uuid_image, uuid);
qemu_uuid_unparse(&header->uuid_image, uuid);
logout("uuid image %s\n", uuid);
uuid_unparse(header->uuid_last_snap, uuid);
qemu_uuid_unparse(&header->uuid_last_snap, uuid);
logout("uuid snap %s\n", uuid);
uuid_unparse(header->uuid_link, uuid);
qemu_uuid_unparse(&header->uuid_link, uuid);
logout("uuid link %s\n", uuid);
uuid_unparse(header->uuid_parent, uuid);
qemu_uuid_unparse(&header->uuid_parent, uuid);
logout("uuid parent %s\n", uuid);
}
#endif
static int vdi_check(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix)
static int coroutine_fn vdi_co_check(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix)
{
/* TODO: additional checks possible. */
BDRVVdiState *s = (BDRVVdiState *)bs->opaque;
@@ -382,9 +385,9 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
}
vdi_header_to_cpu(&header);
#if defined(CONFIG_VDI_DEBUG)
vdi_header_print(&header);
#endif
if (VDI_DEBUG) {
vdi_header_print(&header);
}
if (header.disk_size > VDI_DISK_SIZE_MAX) {
error_setg(errp, "Unsupported VDI image size (size is 0x%" PRIx64
@@ -716,36 +719,59 @@ nonallocating_write:
return ret;
}
static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
static int coroutine_fn vdi_co_do_create(BlockdevCreateOptions *create_options,
size_t block_size, Error **errp)
{
BlockdevCreateOptionsVdi *vdi_opts;
int ret = 0;
uint64_t bytes = 0;
uint32_t blocks;
size_t block_size = DEFAULT_CLUSTER_SIZE;
uint32_t image_type = VDI_TYPE_DYNAMIC;
uint32_t image_type;
VdiHeader header;
size_t i;
size_t bmap_size;
int64_t offset = 0;
Error *local_err = NULL;
BlockDriverState *bs_file = NULL;
BlockBackend *blk = NULL;
uint32_t *bmap = NULL;
assert(create_options->driver == BLOCKDEV_DRIVER_VDI);
vdi_opts = &create_options->u.vdi;
logout("\n");
/* Read out options. */
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
#if defined(CONFIG_VDI_BLOCK_SIZE)
/* TODO: Additional checks (SECTOR_SIZE * 2^n, ...). */
block_size = qemu_opt_get_size_del(opts,
BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE);
#endif
#if defined(CONFIG_VDI_STATIC_IMAGE)
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_STATIC, false)) {
/* Validate options and set default values */
bytes = vdi_opts->size;
if (!vdi_opts->has_preallocation) {
vdi_opts->preallocation = PREALLOC_MODE_OFF;
}
switch (vdi_opts->preallocation) {
case PREALLOC_MODE_OFF:
image_type = VDI_TYPE_DYNAMIC;
break;
case PREALLOC_MODE_METADATA:
image_type = VDI_TYPE_STATIC;
break;
default:
error_setg(errp, "Preallocation mode not supported for vdi");
return -EINVAL;
}
#ifndef CONFIG_VDI_STATIC_IMAGE
if (image_type == VDI_TYPE_STATIC) {
ret = -ENOTSUP;
error_setg(errp, "Statically allocated images cannot be created in "
"this build");
goto exit;
}
#endif
#ifndef CONFIG_VDI_BLOCK_SIZE
if (block_size != DEFAULT_CLUSTER_SIZE) {
ret = -ENOTSUP;
error_setg(errp,
"A non-default cluster size is not supported in this build");
goto exit;
}
#endif
@@ -757,18 +783,16 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
goto exit;
}
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
/* Create BlockBackend to write to the image */
bs_file = bdrv_open_blockdev_ref(vdi_opts->file, errp);
if (!bs_file) {
ret = -EIO;
goto exit;
}
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
ret = blk_insert_bs(blk, bs_file, errp);
if (ret < 0) {
goto exit;
}
@@ -799,13 +823,13 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
qemu_uuid_generate(&header.uuid_image);
qemu_uuid_generate(&header.uuid_last_snap);
/* There is no need to set header.uuid_link or header.uuid_parent here. */
#if defined(CONFIG_VDI_DEBUG)
vdi_header_print(&header);
#endif
if (VDI_DEBUG) {
vdi_header_print(&header);
}
vdi_header_to_le(&header);
ret = blk_pwrite(blk, offset, &header, sizeof(header), 0);
if (ret < 0) {
error_setg(errp, "Error writing header to %s", filename);
error_setg(errp, "Error writing header");
goto exit;
}
offset += sizeof(header);
@@ -826,7 +850,7 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
}
ret = blk_pwrite(blk, offset, bmap, bmap_size, 0);
if (ret < 0) {
error_setg(errp, "Error writing bmap to %s", filename);
error_setg(errp, "Error writing bmap");
goto exit;
}
offset += bmap_size;
@@ -836,17 +860,103 @@ static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
ret = blk_truncate(blk, offset + blocks * block_size,
PREALLOC_MODE_OFF, errp);
if (ret < 0) {
error_prepend(errp, "Failed to statically allocate %s", filename);
error_prepend(errp, "Failed to statically allocate file");
goto exit;
}
}
exit:
blk_unref(blk);
bdrv_unref(bs_file);
g_free(bmap);
return ret;
}
static int coroutine_fn vdi_co_create(BlockdevCreateOptions *create_options,
Error **errp)
{
return vdi_co_do_create(create_options, DEFAULT_CLUSTER_SIZE, errp);
}
static int coroutine_fn vdi_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
{
QDict *qdict = NULL;
BlockdevCreateOptions *create_options = NULL;
BlockDriverState *bs_file = NULL;
uint64_t block_size = DEFAULT_CLUSTER_SIZE;
bool is_static = false;
Visitor *v;
Error *local_err = NULL;
int ret;
/* Parse options and convert legacy syntax.
*
* Since CONFIG_VDI_BLOCK_SIZE is disabled by default,
* cluster-size is not part of the QAPI schema; therefore we have
* to parse it before creating the QAPI object. */
#if defined(CONFIG_VDI_BLOCK_SIZE)
block_size = qemu_opt_get_size_del(opts,
BLOCK_OPT_CLUSTER_SIZE,
DEFAULT_CLUSTER_SIZE);
if (block_size < BDRV_SECTOR_SIZE || block_size > UINT32_MAX ||
!is_power_of_2(block_size))
{
error_setg(errp, "Invalid cluster size");
ret = -EINVAL;
goto done;
}
#endif
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_STATIC, false)) {
is_static = true;
}
qdict = qemu_opts_to_qdict_filtered(opts, NULL, &vdi_create_opts, true);
/* Create and open the file (protocol layer) */
ret = bdrv_create_file(filename, opts, errp);
if (ret < 0) {
goto done;
}
bs_file = bdrv_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
if (!bs_file) {
ret = -EIO;
goto done;
}
qdict_put_str(qdict, "driver", "vdi");
qdict_put_str(qdict, "file", bs_file->node_name);
if (is_static) {
qdict_put_str(qdict, "preallocation", "metadata");
}
/* Get the QAPI object */
v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
visit_free(v);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto done;
}
/* Silently round up size */
assert(create_options->driver == BLOCKDEV_DRIVER_VDI);
create_options->u.vdi.size = ROUND_UP(create_options->u.vdi.size,
BDRV_SECTOR_SIZE);
/* Create the vdi image (format layer) */
ret = vdi_co_do_create(create_options, block_size, errp);
done:
QDECREF(qdict);
qapi_free_BlockdevCreateOptions(create_options);
bdrv_unref(bs_file);
return ret;
}
static void vdi_close(BlockDriverState *bs)
{
BDRVVdiState *s = bs->opaque;
@@ -895,6 +1005,7 @@ static BlockDriver bdrv_vdi = {
.bdrv_close = vdi_close,
.bdrv_reopen_prepare = vdi_reopen_prepare,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_create = vdi_co_create,
.bdrv_co_create_opts = vdi_co_create_opts,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_block_status = vdi_co_block_status,
@@ -908,7 +1019,7 @@ static BlockDriver bdrv_vdi = {
.bdrv_get_info = vdi_get_info,
.create_opts = &vdi_create_opts,
.bdrv_check = vdi_check,
.bdrv_co_check = vdi_co_check,
};
static void bdrv_vdi_init(void)

View File

@@ -26,6 +26,9 @@
#include "block/vhdx.h"
#include "migration/blocker.h"
#include "qemu/uuid.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"
/* Options for VHDX creation */
@@ -39,6 +42,8 @@ typedef enum VHDXImageType {
VHDX_TYPE_DIFFERENCING, /* Currently unsupported */
} VHDXImageType;
static QemuOptsList vhdx_create_opts;
/* Several metadata and region table data entries are identified by
* guids in a MS-specific GUID format. */
@@ -1792,59 +1797,75 @@ exit:
* .---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
* 1MB
*/
static int coroutine_fn vhdx_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
static int coroutine_fn vhdx_co_create(BlockdevCreateOptions *opts,
Error **errp)
{
BlockdevCreateOptionsVhdx *vhdx_opts;
BlockBackend *blk = NULL;
BlockDriverState *bs = NULL;
int ret = 0;
uint64_t image_size = (uint64_t) 2 * GiB;
uint32_t log_size = 1 * MiB;
uint32_t block_size = 0;
uint64_t image_size;
uint32_t log_size;
uint32_t block_size;
uint64_t signature;
uint64_t metadata_offset;
bool use_zero_blocks = false;
gunichar2 *creator = NULL;
glong creator_items;
BlockBackend *blk;
char *type = NULL;
VHDXImageType image_type;
Error *local_err = NULL;
image_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
log_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_LOG_SIZE, 0);
block_size = qemu_opt_get_size_del(opts, VHDX_BLOCK_OPT_BLOCK_SIZE, 0);
type = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
use_zero_blocks = qemu_opt_get_bool_del(opts, VHDX_BLOCK_OPT_ZERO, true);
assert(opts->driver == BLOCKDEV_DRIVER_VHDX);
vhdx_opts = &opts->u.vhdx;
/* Validate options and set default values */
image_size = vhdx_opts->size;
if (image_size > VHDX_MAX_IMAGE_SIZE) {
error_setg_errno(errp, EINVAL, "Image size too large; max of 64TB");
ret = -EINVAL;
goto exit;
error_setg(errp, "Image size too large; max of 64TB");
return -EINVAL;
}
if (type == NULL) {
type = g_strdup("dynamic");
}
if (!strcmp(type, "dynamic")) {
image_type = VHDX_TYPE_DYNAMIC;
} else if (!strcmp(type, "fixed")) {
image_type = VHDX_TYPE_FIXED;
} else if (!strcmp(type, "differencing")) {
error_setg_errno(errp, ENOTSUP,
"Differencing files not yet supported");
ret = -ENOTSUP;
goto exit;
if (!vhdx_opts->has_log_size) {
log_size = DEFAULT_LOG_SIZE;
} else {
error_setg(errp, "Invalid subformat '%s'", type);
ret = -EINVAL;
goto exit;
if (vhdx_opts->log_size > UINT32_MAX) {
error_setg(errp, "Log size must be smaller than 4 GB");
return -EINVAL;
}
log_size = vhdx_opts->log_size;
}
if (log_size < MiB || (log_size % MiB) != 0) {
error_setg(errp, "Log size must be a multiple of 1 MB");
return -EINVAL;
}
if (!vhdx_opts->has_block_state_zero) {
use_zero_blocks = true;
} else {
use_zero_blocks = vhdx_opts->block_state_zero;
}
if (!vhdx_opts->has_subformat) {
vhdx_opts->subformat = BLOCKDEV_VHDX_SUBFORMAT_DYNAMIC;
}
switch (vhdx_opts->subformat) {
case BLOCKDEV_VHDX_SUBFORMAT_DYNAMIC:
image_type = VHDX_TYPE_DYNAMIC;
break;
case BLOCKDEV_VHDX_SUBFORMAT_FIXED:
image_type = VHDX_TYPE_FIXED;
break;
default:
g_assert_not_reached();
}
/* These are pretty arbitrary, and mainly designed to keep the BAT
* size reasonable to load into RAM */
if (block_size == 0) {
if (vhdx_opts->has_block_size) {
block_size = vhdx_opts->block_size;
} else {
if (image_size > 32 * TiB) {
block_size = 64 * MiB;
} else if (image_size > (uint64_t) 100 * GiB) {
@@ -1856,30 +1877,30 @@ static int coroutine_fn vhdx_co_create_opts(const char *filename, QemuOpts *opts
}
}
if (block_size < MiB || (block_size % MiB) != 0) {
error_setg(errp, "Block size must be a multiple of 1 MB");
return -EINVAL;
}
if (!is_power_of_2(block_size)) {
error_setg(errp, "Block size must be a power of two");
return -EINVAL;
}
if (block_size > VHDX_BLOCK_SIZE_MAX) {
error_setg(errp, "Block size must not exceed %d", VHDX_BLOCK_SIZE_MAX);
return -EINVAL;
}
/* make the log size close to what was specified, but must be
* min 1MB, and multiple of 1MB */
log_size = ROUND_UP(log_size, MiB);
/* Create BlockBackend to write to the image */
bs = bdrv_open_blockdev_ref(vhdx_opts->file, errp);
if (bs == NULL) {
return -EIO;
}
block_size = ROUND_UP(block_size, MiB);
block_size = block_size > VHDX_BLOCK_SIZE_MAX ? VHDX_BLOCK_SIZE_MAX :
block_size;
ret = bdrv_create_file(filename, opts, &local_err);
blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
ret = blk_insert_bs(blk, bs, errp);
if (ret < 0) {
error_propagate(errp, local_err);
goto exit;
goto delete_and_exit;
}
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto exit;
}
blk_set_allow_write_beyond_eof(blk, true);
/* Create (A) */
@@ -1931,12 +1952,109 @@ static int coroutine_fn vhdx_co_create_opts(const char *filename, QemuOpts *opts
delete_and_exit:
blk_unref(blk);
exit:
g_free(type);
bdrv_unref(bs);
g_free(creator);
return ret;
}
static int coroutine_fn vhdx_co_create_opts(const char *filename,
QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions *create_options = NULL;
QDict *qdict = NULL;
QObject *qobj;
Visitor *v;
BlockDriverState *bs = NULL;
Error *local_err = NULL;
int ret;
static const QDictRenames opt_renames[] = {
{ VHDX_BLOCK_OPT_LOG_SIZE, "log-size" },
{ VHDX_BLOCK_OPT_BLOCK_SIZE, "block-size" },
{ VHDX_BLOCK_OPT_ZERO, "block-state-zero" },
{ NULL, NULL },
};
/* Parse options and convert legacy syntax */
qdict = qemu_opts_to_qdict_filtered(opts, NULL, &vhdx_create_opts, true);
if (!qdict_rename_keys(qdict, opt_renames, errp)) {
ret = -EINVAL;
goto fail;
}
/* Create and open the file (protocol layer) */
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
bs = bdrv_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
if (bs == NULL) {
ret = -EIO;
goto fail;
}
/* Now get the QAPI type BlockdevCreateOptions */
qdict_put_str(qdict, "driver", "vhdx");
qdict_put_str(qdict, "file", bs->node_name);
qobj = qdict_crumple(qdict, errp);
QDECREF(qdict);
qdict = qobject_to(QDict, qobj);
if (qdict == NULL) {
ret = -EINVAL;
goto fail;
}
v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
visit_free(v);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
/* Silently round up sizes:
* The image size is rounded to 512 bytes. Make the block and log size
* close to what was specified, but must be at least 1MB, and a multiple of
* 1 MB. Also respect VHDX_BLOCK_SIZE_MAX for block sizes. block_size = 0
* means auto, which is represented by a missing key in QAPI. */
assert(create_options->driver == BLOCKDEV_DRIVER_VHDX);
create_options->u.vhdx.size =
ROUND_UP(create_options->u.vhdx.size, BDRV_SECTOR_SIZE);
if (create_options->u.vhdx.has_log_size) {
create_options->u.vhdx.log_size =
ROUND_UP(create_options->u.vhdx.log_size, MiB);
}
if (create_options->u.vhdx.has_block_size) {
create_options->u.vhdx.block_size =
ROUND_UP(create_options->u.vhdx.block_size, MiB);
if (create_options->u.vhdx.block_size == 0) {
create_options->u.vhdx.has_block_size = false;
}
if (create_options->u.vhdx.block_size > VHDX_BLOCK_SIZE_MAX) {
create_options->u.vhdx.block_size = VHDX_BLOCK_SIZE_MAX;
}
}
/* Create the vhdx image (format layer) */
ret = vhdx_co_create(create_options, errp);
fail:
QDECREF(qdict);
bdrv_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
/* If opened r/w, the VHDX driver will automatically replay the log,
* if one is present, inside the vhdx_open() call.
*
@@ -1944,8 +2062,9 @@ exit:
* r/w and any log has already been replayed, so there is nothing (currently)
* for us to do here
*/
static int vhdx_check(BlockDriverState *bs, BdrvCheckResult *result,
BdrvCheckMode fix)
static int coroutine_fn vhdx_co_check(BlockDriverState *bs,
BdrvCheckResult *result,
BdrvCheckMode fix)
{
BDRVVHDXState *s = bs->opaque;
@@ -2004,9 +2123,10 @@ static BlockDriver bdrv_vhdx = {
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_readv = vhdx_co_readv,
.bdrv_co_writev = vhdx_co_writev,
.bdrv_co_create = vhdx_co_create,
.bdrv_co_create_opts = vhdx_co_create_opts,
.bdrv_get_info = vhdx_get_info,
.bdrv_check = vhdx_check,
.bdrv_co_check = vhdx_co_check,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.create_opts = &vhdx_create_opts,

View File

@@ -47,6 +47,8 @@
#define VMDK4_FLAG_MARKER (1 << 17)
#define VMDK4_GD_AT_END 0xffffffffffffffffULL
#define VMDK_EXTENT_MAX_SECTORS (1ULL << 32)
#define VMDK_GTE_ZEROED 0x1
/* VMDK internal error codes */
@@ -1250,6 +1252,10 @@ static int get_cluster_offset(BlockDriverState *bs,
return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
}
if (extent->next_cluster_sector >= VMDK_EXTENT_MAX_SECTORS) {
return VMDK_ERROR;
}
cluster_sector = extent->next_cluster_sector;
extent->next_cluster_sector += extent->cluster_sectors;
@@ -2221,8 +2227,9 @@ static ImageInfo *vmdk_get_extent_info(VmdkExtent *extent)
return info;
}
static int vmdk_check(BlockDriverState *bs, BdrvCheckResult *result,
BdrvCheckMode fix)
static int coroutine_fn vmdk_co_check(BlockDriverState *bs,
BdrvCheckResult *result,
BdrvCheckMode fix)
{
BDRVVmdkState *s = bs->opaque;
VmdkExtent *extent = NULL;
@@ -2391,7 +2398,7 @@ static BlockDriver bdrv_vmdk = {
.instance_size = sizeof(BDRVVmdkState),
.bdrv_probe = vmdk_probe,
.bdrv_open = vmdk_open,
.bdrv_check = vmdk_check,
.bdrv_co_check = vmdk_co_check,
.bdrv_reopen_prepare = vmdk_reopen_prepare,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_preadv = vmdk_co_preadv,

View File

@@ -32,6 +32,9 @@
#include "migration/blocker.h"
#include "qemu/bswap.h"
#include "qemu/uuid.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qapi-visit-block-core.h"
/**************************************************************/
@@ -166,6 +169,8 @@ static QemuOptsList vpc_runtime_opts = {
}
};
static QemuOptsList vpc_create_opts;
static uint32_t vpc_checksum(uint8_t* buf, size_t size)
{
uint32_t res = 0;
@@ -897,60 +902,19 @@ static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
return ret;
}
static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
static int calculate_rounded_image_size(BlockdevCreateOptionsVpc *vpc_opts,
uint16_t *out_cyls,
uint8_t *out_heads,
uint8_t *out_secs_per_cyl,
int64_t *out_total_sectors,
Error **errp)
{
uint8_t buf[1024];
VHDFooter *footer = (VHDFooter *) buf;
char *disk_type_param;
int i;
int64_t total_size = vpc_opts->size;
uint16_t cyls = 0;
uint8_t heads = 0;
uint8_t secs_per_cyl = 0;
int64_t total_sectors;
int64_t total_size;
int disk_type;
int ret = -EIO;
bool force_size;
Error *local_err = NULL;
BlockBackend *blk = NULL;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
disk_type_param = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
if (disk_type_param) {
if (!strcmp(disk_type_param, "dynamic")) {
disk_type = VHD_DYNAMIC;
} else if (!strcmp(disk_type_param, "fixed")) {
disk_type = VHD_FIXED;
} else {
error_setg(errp, "Invalid disk type, %s", disk_type_param);
ret = -EINVAL;
goto out;
}
} else {
disk_type = VHD_DYNAMIC;
}
force_size = qemu_opt_get_bool_del(opts, VPC_OPT_FORCE_SIZE, false);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
}
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto out;
}
blk_set_allow_write_beyond_eof(blk, true);
int i;
/*
* Calculate matching total_size and geometry. Increase the number of
@@ -961,7 +925,7 @@ static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
* we set the geometry to 65535 x 16 x 255 (CxHxS) sectors and use
* the image size from the VHD footer to calculate total_sectors.
*/
if (force_size) {
if (vpc_opts->force_size) {
/* This will force the use of total_size for sector count, below */
cyls = VHD_CHS_MAX_C;
heads = VHD_CHS_MAX_H;
@@ -978,19 +942,95 @@ static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
/* Allow a maximum disk size of 2040 GiB */
if (total_sectors > VHD_MAX_SECTORS) {
error_setg(errp, "Disk size is too large, max size is 2040 GiB");
ret = -EFBIG;
goto out;
return -EFBIG;
}
} else {
total_sectors = (int64_t)cyls * heads * secs_per_cyl;
total_size = total_sectors * BDRV_SECTOR_SIZE;
total_sectors = (int64_t) cyls * heads * secs_per_cyl;
}
*out_total_sectors = total_sectors;
if (out_cyls) {
*out_cyls = cyls;
*out_heads = heads;
*out_secs_per_cyl = secs_per_cyl;
}
return 0;
}
static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
Error **errp)
{
BlockdevCreateOptionsVpc *vpc_opts;
BlockBackend *blk = NULL;
BlockDriverState *bs = NULL;
uint8_t buf[1024];
VHDFooter *footer = (VHDFooter *) buf;
uint16_t cyls = 0;
uint8_t heads = 0;
uint8_t secs_per_cyl = 0;
int64_t total_sectors;
int64_t total_size;
int disk_type;
int ret = -EIO;
assert(opts->driver == BLOCKDEV_DRIVER_VPC);
vpc_opts = &opts->u.vpc;
/* Validate options and set default values */
total_size = vpc_opts->size;
if (!vpc_opts->has_subformat) {
vpc_opts->subformat = BLOCKDEV_VPC_SUBFORMAT_DYNAMIC;
}
switch (vpc_opts->subformat) {
case BLOCKDEV_VPC_SUBFORMAT_DYNAMIC:
disk_type = VHD_DYNAMIC;
break;
case BLOCKDEV_VPC_SUBFORMAT_FIXED:
disk_type = VHD_FIXED;
break;
default:
g_assert_not_reached();
}
/* Create BlockBackend to write to the image */
bs = bdrv_open_blockdev_ref(vpc_opts->file, errp);
if (bs == NULL) {
return -EIO;
}
blk = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
ret = blk_insert_bs(blk, bs, errp);
if (ret < 0) {
goto out;
}
blk_set_allow_write_beyond_eof(blk, true);
/* Get geometry and check that it matches the image size*/
ret = calculate_rounded_image_size(vpc_opts, &cyls, &heads, &secs_per_cyl,
&total_sectors, errp);
if (ret < 0) {
goto out;
}
if (total_size != total_sectors * BDRV_SECTOR_SIZE) {
error_setg(errp, "The requested image size cannot be represented in "
"CHS geometry");
error_append_hint(errp, "Try size=%llu or force-size=on (the "
"latter makes the image incompatible with "
"Virtual PC)",
total_sectors * BDRV_SECTOR_SIZE);
ret = -EINVAL;
goto out;
}
/* Prepare the Hard Disk Footer */
memset(buf, 0, 1024);
memcpy(footer->creator, "conectix", 8);
if (force_size) {
if (vpc_opts->force_size) {
memcpy(footer->creator_app, "qem2", 4);
} else {
memcpy(footer->creator_app, "qemu", 4);
@@ -1032,10 +1072,98 @@ static int coroutine_fn vpc_co_create_opts(const char *filename, QemuOpts *opts,
out:
blk_unref(blk);
g_free(disk_type_param);
bdrv_unref(bs);
return ret;
}
static int coroutine_fn vpc_co_create_opts(const char *filename,
QemuOpts *opts, Error **errp)
{
BlockdevCreateOptions *create_options = NULL;
QDict *qdict = NULL;
QObject *qobj;
Visitor *v;
BlockDriverState *bs = NULL;
Error *local_err = NULL;
int ret;
static const QDictRenames opt_renames[] = {
{ VPC_OPT_FORCE_SIZE, "force-size" },
{ NULL, NULL },
};
/* Parse options and convert legacy syntax */
qdict = qemu_opts_to_qdict_filtered(opts, NULL, &vpc_create_opts, true);
if (!qdict_rename_keys(qdict, opt_renames, errp)) {
ret = -EINVAL;
goto fail;
}
/* Create and open the file (protocol layer) */
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
bs = bdrv_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_RESIZE | BDRV_O_PROTOCOL, errp);
if (bs == NULL) {
ret = -EIO;
goto fail;
}
/* Now get the QAPI type BlockdevCreateOptions */
qdict_put_str(qdict, "driver", "vpc");
qdict_put_str(qdict, "file", bs->node_name);
qobj = qdict_crumple(qdict, errp);
QDECREF(qdict);
qdict = qobject_to(QDict, qobj);
if (qdict == NULL) {
ret = -EINVAL;
goto fail;
}
v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
visit_type_BlockdevCreateOptions(v, NULL, &create_options, &local_err);
visit_free(v);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
/* Silently round up size */
assert(create_options->driver == BLOCKDEV_DRIVER_VPC);
create_options->u.vpc.size =
ROUND_UP(create_options->u.vpc.size, BDRV_SECTOR_SIZE);
if (!create_options->u.vpc.force_size) {
int64_t total_sectors;
ret = calculate_rounded_image_size(&create_options->u.vpc, NULL, NULL,
NULL, &total_sectors, errp);
if (ret < 0) {
goto fail;
}
create_options->u.vpc.size = total_sectors * BDRV_SECTOR_SIZE;
}
/* Create the vpc image (format layer) */
ret = vpc_co_create(create_options, errp);
fail:
QDECREF(qdict);
bdrv_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
static int vpc_has_zero_init(BlockDriverState *bs)
{
BDRVVPCState *s = bs->opaque;
@@ -1096,6 +1224,7 @@ static BlockDriver bdrv_vpc = {
.bdrv_close = vpc_close,
.bdrv_reopen_prepare = vpc_reopen_prepare,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_create = vpc_co_create,
.bdrv_co_create_opts = vpc_co_create_opts,
.bdrv_co_preadv = vpc_co_preadv,

View File

@@ -3129,7 +3129,7 @@ static void vvfat_qcow_options(int *child_flags, QDict *child_options,
int parent_flags, QDict *parent_options)
{
qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "off");
*child_flags = BDRV_O_NO_FLUSH;
qdict_set_default_str(child_options, BDRV_OPT_CACHE_NO_FLUSH, "on");
}
static const BdrvChildRole child_vvfat_qcow = {

View File

@@ -150,7 +150,7 @@ void blockdev_mark_auto_del(BlockBackend *blk)
aio_context_acquire(aio_context);
if (bs->job) {
block_job_cancel(bs->job);
block_job_cancel(bs->job, false);
}
aio_context_release(aio_context);
@@ -334,7 +334,8 @@ static bool parse_stats_intervals(BlockAcctStats *stats, QList *intervals,
case QTYPE_QSTRING: {
unsigned long long length;
const char *str = qstring_get_str(qobject_to_qstring(entry->value));
const char *str = qstring_get_str(qobject_to(QString,
entry->value));
if (parse_uint_full(str, &length, 10) == 0 &&
length > 0 && length <= UINT_MAX) {
block_acct_add_interval(stats, (unsigned) length);
@@ -346,7 +347,7 @@ static bool parse_stats_intervals(BlockAcctStats *stats, QList *intervals,
}
case QTYPE_QNUM: {
int64_t length = qnum_get_int(qobject_to_qnum(entry->value));
int64_t length = qnum_get_int(qobject_to(QNum, entry->value));
if (length > 0 && length <= UINT_MAX) {
block_acct_add_interval(stats, (unsigned) length);
@@ -2118,6 +2119,9 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
if (bdrv_dirty_bitmap_frozen(state->bitmap)) {
error_setg(errp, "Cannot modify a frozen bitmap");
return;
} else if (bdrv_dirty_bitmap_qmp_locked(state->bitmap)) {
error_setg(errp, "Cannot modify a locked bitmap");
return;
} else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
error_setg(errp, "Cannot clear a disabled bitmap");
return;
@@ -2862,6 +2866,11 @@ void qmp_block_dirty_bitmap_remove(const char *node, const char *name,
"Bitmap '%s' is currently frozen and cannot be removed",
name);
return;
} else if (bdrv_dirty_bitmap_qmp_locked(bitmap)) {
error_setg(errp,
"Bitmap '%s' is currently locked and cannot be removed",
name);
return;
}
if (bdrv_dirty_bitmap_get_persistance(bitmap)) {
@@ -2896,6 +2905,11 @@ void qmp_block_dirty_bitmap_clear(const char *node, const char *name,
"Bitmap '%s' is currently frozen and cannot be modified",
name);
return;
} else if (bdrv_dirty_bitmap_qmp_locked(bitmap)) {
error_setg(errp,
"Bitmap '%s' is currently locked and cannot be modified",
name);
return;
} else if (!bdrv_dirty_bitmap_enabled(bitmap)) {
error_setg(errp,
"Bitmap '%s' is currently disabled and cannot be cleared",
@@ -3261,7 +3275,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
AioContext *aio_context;
QDict *options = NULL;
Error *local_err = NULL;
int flags;
int flags, job_flags = BLOCK_JOB_DEFAULT;
int64_t size;
bool set_backing_hd = false;
@@ -3280,6 +3294,12 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
if (!backup->has_job_id) {
backup->job_id = NULL;
}
if (!backup->has_auto_finalize) {
backup->auto_finalize = true;
}
if (!backup->has_auto_dismiss) {
backup->auto_dismiss = true;
}
if (!backup->has_compress) {
backup->compress = false;
}
@@ -3370,12 +3390,24 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
bdrv_unref(target_bs);
goto out;
}
if (bdrv_dirty_bitmap_qmp_locked(bmap)) {
error_setg(errp,
"Bitmap '%s' is currently locked and cannot be used for "
"backup", backup->bitmap);
goto out;
}
}
if (!backup->auto_finalize) {
job_flags |= BLOCK_JOB_MANUAL_FINALIZE;
}
if (!backup->auto_dismiss) {
job_flags |= BLOCK_JOB_MANUAL_DISMISS;
}
job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
backup->sync, bmap, backup->compress,
backup->on_source_error, backup->on_target_error,
BLOCK_JOB_DEFAULT, NULL, NULL, txn, &local_err);
job_flags, NULL, NULL, txn, &local_err);
bdrv_unref(target_bs);
if (local_err != NULL) {
error_propagate(errp, local_err);
@@ -3410,6 +3442,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn,
Error *local_err = NULL;
AioContext *aio_context;
BlockJob *job = NULL;
int job_flags = BLOCK_JOB_DEFAULT;
if (!backup->has_speed) {
backup->speed = 0;
@@ -3423,6 +3456,12 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn,
if (!backup->has_job_id) {
backup->job_id = NULL;
}
if (!backup->has_auto_finalize) {
backup->auto_finalize = true;
}
if (!backup->has_auto_dismiss) {
backup->auto_dismiss = true;
}
if (!backup->has_compress) {
backup->compress = false;
}
@@ -3451,10 +3490,16 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, BlockJobTxn *txn,
goto out;
}
}
if (!backup->auto_finalize) {
job_flags |= BLOCK_JOB_MANUAL_FINALIZE;
}
if (!backup->auto_dismiss) {
job_flags |= BLOCK_JOB_MANUAL_DISMISS;
}
job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
backup->sync, NULL, backup->compress,
backup->on_source_error, backup->on_target_error,
BLOCK_JOB_DEFAULT, NULL, NULL, txn, &local_err);
job_flags, NULL, NULL, txn, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
}
@@ -3806,7 +3851,7 @@ void qmp_block_job_cancel(const char *device,
}
trace_qmp_block_job_cancel(job);
block_job_cancel(job);
block_job_user_cancel(job, force, errp);
out:
aio_context_release(aio_context);
}
@@ -3816,12 +3861,12 @@ void qmp_block_job_pause(const char *device, Error **errp)
AioContext *aio_context;
BlockJob *job = find_block_job(device, &aio_context, errp);
if (!job || block_job_user_paused(job)) {
if (!job) {
return;
}
trace_qmp_block_job_pause(job);
block_job_user_pause(job);
block_job_user_pause(job, errp);
aio_context_release(aio_context);
}
@@ -3830,12 +3875,12 @@ void qmp_block_job_resume(const char *device, Error **errp)
AioContext *aio_context;
BlockJob *job = find_block_job(device, &aio_context, errp);
if (!job || !block_job_user_paused(job)) {
if (!job) {
return;
}
trace_qmp_block_job_resume(job);
block_job_user_resume(job);
block_job_user_resume(job, errp);
aio_context_release(aio_context);
}
@@ -3853,6 +3898,34 @@ void qmp_block_job_complete(const char *device, Error **errp)
aio_context_release(aio_context);
}
void qmp_block_job_finalize(const char *id, Error **errp)
{
AioContext *aio_context;
BlockJob *job = find_block_job(id, &aio_context, errp);
if (!job) {
return;
}
trace_qmp_block_job_finalize(job);
block_job_finalize(job, errp);
aio_context_release(aio_context);
}
void qmp_block_job_dismiss(const char *id, Error **errp)
{
AioContext *aio_context;
BlockJob *job = find_block_job(id, &aio_context, errp);
if (!job) {
return;
}
trace_qmp_block_job_dismiss(job);
block_job_dismiss(&job, errp);
aio_context_release(aio_context);
}
void qmp_change_backing_file(const char *device,
const char *image_node_name,
const char *backing_file,
@@ -3972,7 +4045,6 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
QObject *obj;
Visitor *v = qobject_output_visitor_new(&obj);
QDict *qdict;
const QDictEntry *ent;
Error *local_err = NULL;
visit_type_BlockdevOptions(v, NULL, &options, &local_err);
@@ -3982,23 +4054,10 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
}
visit_complete(v, &obj);
qdict = qobject_to_qdict(obj);
qdict = qobject_to(QDict, obj);
qdict_flatten(qdict);
/*
* Rewrite "backing": null to "backing": ""
* TODO Rewrite "" to null instead, and perhaps not even here
*/
for (ent = qdict_first(qdict); ent; ent = qdict_next(qdict, ent)) {
char *dot = strrchr(ent->key, '.');
if (!strcmp(dot ? dot + 1 : ent->key, "backing")
&& qobject_type(ent->value) == QTYPE_QNULL) {
qdict_put(qdict, ent->key, qstring_new());
}
}
if (!qdict_get_try_str(qdict, "node-name")) {
error_setg(errp, "'node-name' must be specified for the root node");
goto fail;
@@ -4180,6 +4239,49 @@ void qmp_x_blockdev_set_iothread(const char *node_name, StrOrNull *iothread,
aio_context_release(old_context);
}
void qmp_x_block_latency_histogram_set(
const char *device,
bool has_boundaries, uint64List *boundaries,
bool has_boundaries_read, uint64List *boundaries_read,
bool has_boundaries_write, uint64List *boundaries_write,
bool has_boundaries_flush, uint64List *boundaries_flush,
Error **errp)
{
BlockBackend *blk = blk_by_name(device);
BlockAcctStats *stats;
if (!blk) {
error_setg(errp, "Device '%s' not found", device);
return;
}
stats = blk_get_stats(blk);
if (!has_boundaries && !has_boundaries_read && !has_boundaries_write &&
!has_boundaries_flush)
{
block_latency_histograms_clear(stats);
return;
}
if (has_boundaries || has_boundaries_read) {
block_latency_histogram_set(
stats, BLOCK_ACCT_READ,
has_boundaries_read ? boundaries_read : boundaries);
}
if (has_boundaries || has_boundaries_write) {
block_latency_histogram_set(
stats, BLOCK_ACCT_WRITE,
has_boundaries_write ? boundaries_write : boundaries);
}
if (has_boundaries || has_boundaries_flush) {
block_latency_histogram_set(
stats, BLOCK_ACCT_FLUSH,
has_boundaries_flush ? boundaries_flush : boundaries);
}
}
QemuOptsList qemu_common_drive_opts = {
.name = "drive",
.head = QTAILQ_HEAD_INITIALIZER(qemu_common_drive_opts.head),

View File

@@ -28,6 +28,7 @@
#include "block/block.h"
#include "block/blockjob_int.h"
#include "block/block_int.h"
#include "block/trace.h"
#include "sysemu/block-backend.h"
#include "qapi/error.h"
#include "qapi/qapi-events-block-core.h"
@@ -41,6 +42,60 @@
* block_job_enter. */
static QemuMutex block_job_mutex;
/* BlockJob State Transition Table */
bool BlockJobSTT[BLOCK_JOB_STATUS__MAX][BLOCK_JOB_STATUS__MAX] = {
/* U, C, R, P, Y, S, W, D, X, E, N */
/* U: */ [BLOCK_JOB_STATUS_UNDEFINED] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0},
/* C: */ [BLOCK_JOB_STATUS_CREATED] = {0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1},
/* R: */ [BLOCK_JOB_STATUS_RUNNING] = {0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0},
/* P: */ [BLOCK_JOB_STATUS_PAUSED] = {0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0},
/* Y: */ [BLOCK_JOB_STATUS_READY] = {0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0},
/* S: */ [BLOCK_JOB_STATUS_STANDBY] = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0},
/* W: */ [BLOCK_JOB_STATUS_WAITING] = {0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0},
/* D: */ [BLOCK_JOB_STATUS_PENDING] = {0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
/* X: */ [BLOCK_JOB_STATUS_ABORTING] = {0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0},
/* E: */ [BLOCK_JOB_STATUS_CONCLUDED] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1},
/* N: */ [BLOCK_JOB_STATUS_NULL] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
};
bool BlockJobVerbTable[BLOCK_JOB_VERB__MAX][BLOCK_JOB_STATUS__MAX] = {
/* U, C, R, P, Y, S, W, D, X, E, N */
[BLOCK_JOB_VERB_CANCEL] = {0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0},
[BLOCK_JOB_VERB_PAUSE] = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
[BLOCK_JOB_VERB_RESUME] = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
[BLOCK_JOB_VERB_SET_SPEED] = {0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0},
[BLOCK_JOB_VERB_COMPLETE] = {0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0},
[BLOCK_JOB_VERB_FINALIZE] = {0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0},
[BLOCK_JOB_VERB_DISMISS] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
};
static void block_job_state_transition(BlockJob *job, BlockJobStatus s1)
{
BlockJobStatus s0 = job->status;
assert(s1 >= 0 && s1 <= BLOCK_JOB_STATUS__MAX);
trace_block_job_state_transition(job, job->ret, BlockJobSTT[s0][s1] ?
"allowed" : "disallowed",
BlockJobStatus_str(s0),
BlockJobStatus_str(s1));
assert(BlockJobSTT[s0][s1]);
job->status = s1;
}
static int block_job_apply_verb(BlockJob *job, BlockJobVerb bv, Error **errp)
{
assert(bv >= 0 && bv <= BLOCK_JOB_VERB__MAX);
trace_block_job_apply_verb(job, BlockJobStatus_str(job->status),
BlockJobVerb_str(bv),
BlockJobVerbTable[bv][job->status] ?
"allowed" : "prohibited");
if (BlockJobVerbTable[bv][job->status]) {
return 0;
}
error_setg(errp, "Job '%s' in state '%s' cannot accept command verb '%s'",
job->id, BlockJobStatus_str(job->status), BlockJobVerb_str(bv));
return -EPERM;
}
static void block_job_lock(void)
{
qemu_mutex_lock(&block_job_mutex);
@@ -58,6 +113,7 @@ static void __attribute__((__constructor__)) block_job_init(void)
static void block_job_event_cancelled(BlockJob *job);
static void block_job_event_completed(BlockJob *job, const char *msg);
static int block_job_event_pending(BlockJob *job);
static void block_job_enter_cond(BlockJob *job, bool(*fn)(BlockJob *job));
/* Transactional group of block jobs */
@@ -144,6 +200,15 @@ void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
block_job_txn_ref(txn);
}
static void block_job_txn_del_job(BlockJob *job)
{
if (job->txn) {
QLIST_REMOVE(job, txn_list);
block_job_txn_unref(job->txn);
job->txn = NULL;
}
}
static void block_job_pause(BlockJob *job)
{
job->pause_count++;
@@ -171,6 +236,8 @@ static void block_job_detach_aio_context(void *opaque);
void block_job_unref(BlockJob *job)
{
if (--job->refcnt == 0) {
assert(job->status == BLOCK_JOB_STATUS_NULL);
assert(!job->txn);
BlockDriverState *bs = blk_bs(job->blk);
QLIST_REMOVE(job, job_list);
bs->job = NULL;
@@ -320,25 +387,89 @@ void block_job_start(BlockJob *job)
job->pause_count--;
job->busy = true;
job->paused = false;
block_job_state_transition(job, BLOCK_JOB_STATUS_RUNNING);
bdrv_coroutine_enter(blk_bs(job->blk), job->co);
}
static void block_job_completed_single(BlockJob *job)
static void block_job_decommission(BlockJob *job)
{
assert(job->completed);
assert(job);
job->completed = true;
job->busy = false;
job->paused = false;
job->deferred_to_main_loop = true;
block_job_txn_del_job(job);
block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
block_job_unref(job);
}
if (!job->ret) {
if (job->driver->commit) {
job->driver->commit(job);
}
} else {
if (job->driver->abort) {
job->driver->abort(job);
}
static void block_job_do_dismiss(BlockJob *job)
{
block_job_decommission(job);
}
static void block_job_conclude(BlockJob *job)
{
block_job_state_transition(job, BLOCK_JOB_STATUS_CONCLUDED);
if (job->auto_dismiss || !block_job_started(job)) {
block_job_do_dismiss(job);
}
}
static void block_job_update_rc(BlockJob *job)
{
if (!job->ret && block_job_is_cancelled(job)) {
job->ret = -ECANCELED;
}
if (job->ret) {
block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
}
}
static int block_job_prepare(BlockJob *job)
{
if (job->ret == 0 && job->driver->prepare) {
job->ret = job->driver->prepare(job);
}
return job->ret;
}
static void block_job_commit(BlockJob *job)
{
assert(!job->ret);
if (job->driver->commit) {
job->driver->commit(job);
}
}
static void block_job_abort(BlockJob *job)
{
assert(job->ret);
if (job->driver->abort) {
job->driver->abort(job);
}
}
static void block_job_clean(BlockJob *job)
{
if (job->driver->clean) {
job->driver->clean(job);
}
}
static int block_job_finalize_single(BlockJob *job)
{
assert(job->completed);
/* Ensure abort is called for late-transactional failures */
block_job_update_rc(job);
if (!job->ret) {
block_job_commit(job);
} else {
block_job_abort(job);
}
block_job_clean(job);
if (job->cb) {
job->cb(job->opaque, job->ret);
@@ -357,14 +488,12 @@ static void block_job_completed_single(BlockJob *job)
}
}
if (job->txn) {
QLIST_REMOVE(job, txn_list);
block_job_txn_unref(job->txn);
}
block_job_unref(job);
block_job_txn_del_job(job);
block_job_conclude(job);
return 0;
}
static void block_job_cancel_async(BlockJob *job)
static void block_job_cancel_async(BlockJob *job, bool force)
{
if (job->iostatus != BLOCK_DEVICE_IO_STATUS_OK) {
block_job_iostatus_reset(job);
@@ -375,6 +504,30 @@ static void block_job_cancel_async(BlockJob *job)
job->pause_count--;
}
job->cancelled = true;
/* To prevent 'force == false' overriding a previous 'force == true' */
job->force |= force;
}
static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *), bool lock)
{
AioContext *ctx;
BlockJob *job, *next;
int rc = 0;
QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
if (lock) {
ctx = blk_get_aio_context(job->blk);
aio_context_acquire(ctx);
}
rc = fn(job);
if (lock) {
aio_context_release(ctx);
}
if (rc) {
break;
}
}
return rc;
}
static int block_job_finish_sync(BlockJob *job,
@@ -436,7 +589,7 @@ static void block_job_completed_txn_abort(BlockJob *job)
* on the caller, so leave it. */
QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
if (other_job != job) {
block_job_cancel_async(other_job);
block_job_cancel_async(other_job, false);
}
}
while (!QLIST_EMPTY(&txn->jobs)) {
@@ -446,18 +599,39 @@ static void block_job_completed_txn_abort(BlockJob *job)
assert(other_job->cancelled);
block_job_finish_sync(other_job, NULL, NULL);
}
block_job_completed_single(other_job);
block_job_finalize_single(other_job);
aio_context_release(ctx);
}
block_job_txn_unref(txn);
}
static int block_job_needs_finalize(BlockJob *job)
{
return !job->auto_finalize;
}
static void block_job_do_finalize(BlockJob *job)
{
int rc;
assert(job && job->txn);
/* prepare the transaction to complete */
rc = block_job_txn_apply(job->txn, block_job_prepare, true);
if (rc) {
block_job_completed_txn_abort(job);
} else {
block_job_txn_apply(job->txn, block_job_finalize_single, true);
}
}
static void block_job_completed_txn_success(BlockJob *job)
{
AioContext *ctx;
BlockJobTxn *txn = job->txn;
BlockJob *other_job, *next;
BlockJob *other_job;
block_job_state_transition(job, BLOCK_JOB_STATUS_WAITING);
/*
* Successful completion, see if there are other running jobs in this
* txn.
@@ -466,14 +640,14 @@ static void block_job_completed_txn_success(BlockJob *job)
if (!other_job->completed) {
return;
}
}
/* We are the last completed job, commit the transaction. */
QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
ctx = blk_get_aio_context(other_job->blk);
aio_context_acquire(ctx);
assert(other_job->ret == 0);
block_job_completed_single(other_job);
aio_context_release(ctx);
}
block_job_txn_apply(txn, block_job_event_pending, false);
/* If no jobs need manual finalization, automatically do so */
if (block_job_txn_apply(txn, block_job_needs_finalize, false) == 0) {
block_job_do_finalize(job);
}
}
@@ -492,6 +666,9 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
error_setg(errp, QERR_UNSUPPORTED);
return;
}
if (block_job_apply_verb(job, BLOCK_JOB_VERB_SET_SPEED, errp)) {
return;
}
job->driver->set_speed(job, speed, &local_err);
if (local_err) {
error_propagate(errp, local_err);
@@ -499,7 +676,7 @@ void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
}
job->speed = speed;
if (speed <= old_speed) {
if (speed && speed <= old_speed) {
return;
}
@@ -511,8 +688,10 @@ void block_job_complete(BlockJob *job, Error **errp)
{
/* Should not be reachable via external interface for internal jobs */
assert(job->id);
if (job->pause_count || job->cancelled ||
!block_job_started(job) || !job->driver->complete) {
if (block_job_apply_verb(job, BLOCK_JOB_VERB_COMPLETE, errp)) {
return;
}
if (job->pause_count || job->cancelled || !job->driver->complete) {
error_setg(errp, "The active block job '%s' cannot be completed",
job->id);
return;
@@ -521,8 +700,37 @@ void block_job_complete(BlockJob *job, Error **errp)
job->driver->complete(job, errp);
}
void block_job_user_pause(BlockJob *job)
void block_job_finalize(BlockJob *job, Error **errp)
{
assert(job && job->id && job->txn);
if (block_job_apply_verb(job, BLOCK_JOB_VERB_FINALIZE, errp)) {
return;
}
block_job_do_finalize(job);
}
void block_job_dismiss(BlockJob **jobptr, Error **errp)
{
BlockJob *job = *jobptr;
/* similarly to _complete, this is QMP-interface only. */
assert(job->id);
if (block_job_apply_verb(job, BLOCK_JOB_VERB_DISMISS, errp)) {
return;
}
block_job_do_dismiss(job);
*jobptr = NULL;
}
void block_job_user_pause(BlockJob *job, Error **errp)
{
if (block_job_apply_verb(job, BLOCK_JOB_VERB_PAUSE, errp)) {
return;
}
if (job->user_paused) {
error_setg(errp, "Job is already paused");
return;
}
job->user_paused = true;
block_job_pause(job);
}
@@ -532,23 +740,43 @@ bool block_job_user_paused(BlockJob *job)
return job->user_paused;
}
void block_job_user_resume(BlockJob *job)
void block_job_user_resume(BlockJob *job, Error **errp)
{
if (job && job->user_paused && job->pause_count > 0) {
block_job_iostatus_reset(job);
job->user_paused = false;
block_job_resume(job);
assert(job);
if (!job->user_paused || job->pause_count <= 0) {
error_setg(errp, "Can't resume a job that was not paused");
return;
}
if (block_job_apply_verb(job, BLOCK_JOB_VERB_RESUME, errp)) {
return;
}
block_job_iostatus_reset(job);
job->user_paused = false;
block_job_resume(job);
}
void block_job_cancel(BlockJob *job, bool force)
{
if (job->status == BLOCK_JOB_STATUS_CONCLUDED) {
block_job_do_dismiss(job);
return;
}
block_job_cancel_async(job, force);
if (!block_job_started(job)) {
block_job_completed(job, -ECANCELED);
} else if (job->deferred_to_main_loop) {
block_job_completed_txn_abort(job);
} else {
block_job_enter(job);
}
}
void block_job_cancel(BlockJob *job)
void block_job_user_cancel(BlockJob *job, bool force, Error **errp)
{
if (block_job_started(job)) {
block_job_cancel_async(job);
block_job_enter(job);
} else {
block_job_completed(job, -ECANCELED);
if (block_job_apply_verb(job, BLOCK_JOB_VERB_CANCEL, errp)) {
return;
}
block_job_cancel(job, force);
}
/* A wrapper around block_job_cancel() taking an Error ** parameter so it may be
@@ -556,7 +784,7 @@ void block_job_cancel(BlockJob *job)
* function pointer casts there. */
static void block_job_cancel_err(BlockJob *job, Error **errp)
{
block_job_cancel(job);
block_job_cancel(job, false);
}
int block_job_cancel_sync(BlockJob *job)
@@ -600,6 +828,9 @@ BlockJobInfo *block_job_query(BlockJob *job, Error **errp)
info->speed = job->speed;
info->io_status = job->iostatus;
info->ready = job->ready;
info->status = job->status;
info->auto_finalize = job->auto_finalize;
info->auto_dismiss = job->auto_dismiss;
return info;
}
@@ -641,13 +872,24 @@ static void block_job_event_completed(BlockJob *job, const char *msg)
&error_abort);
}
static int block_job_event_pending(BlockJob *job)
{
block_job_state_transition(job, BLOCK_JOB_STATUS_PENDING);
if (!job->auto_finalize && !block_job_is_internal(job)) {
qapi_event_send_block_job_pending(job->driver->job_type,
job->id,
&error_abort);
}
return 0;
}
/*
* API for block job drivers and the block layer. These functions are
* declared in blockjob_int.h.
*/
void *block_job_create(const char *job_id, const BlockJobDriver *driver,
BlockDriverState *bs, uint64_t perm,
BlockJobTxn *txn, BlockDriverState *bs, uint64_t perm,
uint64_t shared_perm, int64_t speed, int flags,
BlockCompletionFunc *cb, void *opaque, Error **errp)
{
@@ -702,6 +944,9 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
job->paused = true;
job->pause_count = 1;
job->refcnt = 1;
job->auto_finalize = !(flags & BLOCK_JOB_MANUAL_FINALIZE);
job->auto_dismiss = !(flags & BLOCK_JOB_MANUAL_DISMISS);
block_job_state_transition(job, BLOCK_JOB_STATUS_CREATED);
aio_timer_init(qemu_get_aio_context(), &job->sleep_timer,
QEMU_CLOCK_REALTIME, SCALE_NS,
block_job_sleep_timer_cb, job);
@@ -724,11 +969,22 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
block_job_set_speed(job, speed, &local_err);
if (local_err) {
block_job_unref(job);
block_job_early_fail(job);
error_propagate(errp, local_err);
return NULL;
}
}
/* Single jobs are modeled as single-job transactions for sake of
* consolidating the job management logic */
if (!txn) {
txn = block_job_txn_new();
block_job_txn_add_job(txn, job);
block_job_txn_unref(txn);
} else {
block_job_txn_add_job(txn, job);
}
return job;
}
@@ -747,18 +1003,19 @@ void block_job_pause_all(void)
void block_job_early_fail(BlockJob *job)
{
block_job_unref(job);
assert(job->status == BLOCK_JOB_STATUS_CREATED);
block_job_decommission(job);
}
void block_job_completed(BlockJob *job, int ret)
{
assert(job && job->txn && !job->completed);
assert(blk_bs(job->blk)->job == job);
assert(!job->completed);
job->completed = true;
job->ret = ret;
if (!job->txn) {
block_job_completed_single(job);
} else if (ret < 0 || block_job_is_cancelled(job)) {
block_job_update_rc(job);
trace_block_job_completed(job, ret, job->ret);
if (job->ret) {
block_job_completed_txn_abort(job);
} else {
block_job_completed_txn_success(job);
@@ -806,9 +1063,14 @@ void coroutine_fn block_job_pause_point(BlockJob *job)
}
if (block_job_should_pause(job) && !block_job_is_cancelled(job)) {
BlockJobStatus status = job->status;
block_job_state_transition(job, status == BLOCK_JOB_STATUS_READY ? \
BLOCK_JOB_STATUS_STANDBY : \
BLOCK_JOB_STATUS_PAUSED);
job->paused = true;
block_job_do_yield(job, -1);
job->paused = false;
block_job_state_transition(job, status);
}
if (job->driver->resume) {
@@ -914,6 +1176,7 @@ void block_job_iostatus_reset(BlockJob *job)
void block_job_event_ready(BlockJob *job)
{
block_job_state_transition(job, BLOCK_JOB_STATUS_READY);
job->ready = true;
if (block_job_is_internal(job)) {
@@ -957,8 +1220,9 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
action, &error_abort);
}
if (action == BLOCK_ERROR_ACTION_STOP) {
block_job_pause(job);
/* make the pause user visible, which will be resumed from QMP. */
block_job_user_pause(job);
job->user_paused = true;
block_job_iostatus_set_err(job, error);
}
return action;

View File

@@ -649,7 +649,7 @@ void cpu_loop(CPUSPARCState *env)
static void usage(void)
{
printf("qemu-" TARGET_NAME " version " QEMU_VERSION QEMU_PKGVERSION
printf("qemu-" TARGET_NAME " version " QEMU_FULL_VERSION
"\n" QEMU_COPYRIGHT "\n"
"usage: qemu-" TARGET_NAME " [options] program [arguments...]\n"
"BSD CPU emulator (compiled for %s emulation)\n"
@@ -723,6 +723,7 @@ int main(int argc, char **argv)
{
const char *filename;
const char *cpu_model;
const char *cpu_type;
const char *log_file = NULL;
const char *log_mask = NULL;
struct target_pt_regs regs1, *regs = &regs1;
@@ -900,7 +901,8 @@ int main(int argc, char **argv)
tcg_exec_init(0);
/* NOTE: we need to init the CPU at this stage to get
qemu_host_page_size */
cpu = cpu_init(cpu_model);
cpu_type = parse_cpu_model(cpu_model);
cpu = cpu_create(cpu_type);
env = cpu->env_ptr;
#if defined(TARGET_SPARC) || defined(TARGET_PPC)
cpu_reset(cpu);

View File

@@ -198,19 +198,21 @@ bool qemu_chr_fe_init(CharBackend *b, Chardev *s, Error **errp)
{
int tag = 0;
if (CHARDEV_IS_MUX(s)) {
MuxChardev *d = MUX_CHARDEV(s);
if (s) {
if (CHARDEV_IS_MUX(s)) {
MuxChardev *d = MUX_CHARDEV(s);
if (d->mux_cnt >= MAX_MUX) {
if (d->mux_cnt >= MAX_MUX) {
goto unavailable;
}
d->backends[d->mux_cnt] = b;
tag = d->mux_cnt++;
} else if (s->be) {
goto unavailable;
} else {
s->be = b;
}
d->backends[d->mux_cnt] = b;
tag = d->mux_cnt++;
} else if (s->be) {
goto unavailable;
} else {
s->be = b;
}
b->fe_open = false;

View File

@@ -27,6 +27,7 @@
#include "qemu/option.h"
#include "chardev/char.h"
#include "sysemu/block-backend.h"
#include "sysemu/sysemu.h"
#include "chardev/char-mux.h"
/* MUX driver for serial I/O splitting */
@@ -230,14 +231,12 @@ static void mux_chr_read(void *opaque, const uint8_t *buf, int size)
}
}
bool muxes_realized;
void mux_chr_send_all_event(Chardev *chr, int event)
{
MuxChardev *d = MUX_CHARDEV(chr);
int i;
if (!muxes_realized) {
if (!machine_init_done) {
return;
}
@@ -327,7 +326,7 @@ static void qemu_chr_open_mux(Chardev *chr,
/* only default to opened state if we've realized the initial
* set of muxes
*/
*be_opened = muxes_realized;
*be_opened = machine_init_done;
qemu_chr_fe_init(&d->chr, drv, errp);
}
@@ -347,6 +346,31 @@ static void qemu_chr_parse_mux(QemuOpts *opts, ChardevBackend *backend,
mux->chardev = g_strdup(chardev);
}
/**
* Called after processing of default and command-line-specified
* chardevs to deliver CHR_EVENT_OPENED events to any FEs attached
* to a mux chardev. This is done here to ensure that
* output/prompts/banners are only displayed for the FE that has
* focus when initial command-line processing/machine init is
* completed.
*
* After this point, any new FE attached to any new or existing
* mux will receive CHR_EVENT_OPENED notifications for the BE
* immediately.
*/
static int open_muxes(Chardev *chr)
{
/* send OPENED to all already-attached FEs */
mux_chr_send_all_event(chr, CHR_EVENT_OPENED);
/*
* mark mux as OPENED so any new FEs will immediately receive
* OPENED event
*/
qemu_chr_be_event(chr, CHR_EVENT_OPENED);
return 0;
}
static void char_mux_class_init(ObjectClass *oc, void *data)
{
ChardevClass *cc = CHARDEV_CLASS(oc);
@@ -357,6 +381,7 @@ static void char_mux_class_init(ObjectClass *oc, void *data)
cc->chr_accept_input = mux_chr_accept_input;
cc->chr_add_watch = mux_chr_add_watch;
cc->chr_be_event = mux_chr_be_event;
cc->chr_machine_done = open_muxes;
}
static const TypeInfo char_mux_type_info = {

View File

@@ -32,6 +32,7 @@
#include "qapi/error.h"
#include "qapi/clone-visitor.h"
#include "qapi/qapi-visit-sockets.h"
#include "sysemu/sysemu.h"
#include "chardev/char-io.h"
@@ -40,6 +41,11 @@
#define TCP_MAX_FDS 16
typedef struct {
char buf[21];
size_t buflen;
} TCPChardevTelnetInit;
typedef struct {
Chardev parent;
QIOChannel *ioc; /* Client I/O channel */
@@ -60,6 +66,8 @@ typedef struct {
bool is_listen;
bool is_telnet;
bool is_tn3270;
GSource *telnet_source;
TCPChardevTelnetInit *telnet_init;
GSource *reconnect_timer;
int64_t reconnect_time;
@@ -70,6 +78,7 @@ typedef struct {
OBJECT_CHECK(SocketChardev, (obj), TYPE_CHARDEV_SOCKET)
static gboolean socket_reconnect_timeout(gpointer opaque);
static void tcp_chr_telnet_init(Chardev *chr);
static void tcp_chr_reconn_timer_cancel(SocketChardev *s)
{
@@ -423,8 +432,8 @@ static void tcp_chr_disconnect(Chardev *chr)
tcp_chr_free_connection(chr);
if (s->listener) {
qio_net_listener_set_client_func(s->listener, tcp_chr_accept,
chr, NULL);
qio_net_listener_set_client_func_full(s->listener, tcp_chr_accept,
chr, NULL, chr->gcontext);
}
update_disconnected_filename(s);
if (emit_close) {
@@ -450,7 +459,7 @@ static gboolean tcp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
len = s->max_size;
}
size = tcp_chr_recv(chr, (void *)buf, len);
if (size == 0 || size == -1) {
if (size == 0 || (size == -1 && errno != EAGAIN)) {
/* connection closed */
tcp_chr_disconnect(chr);
} else if (size > 0) {
@@ -541,12 +550,10 @@ static void tcp_chr_connect(void *opaque)
s->is_listen, s->is_telnet);
s->connected = 1;
if (s->ioc) {
chr->gsource = io_add_watch_poll(chr, s->ioc,
tcp_chr_read_poll,
tcp_chr_read,
chr, chr->gcontext);
}
chr->gsource = io_add_watch_poll(chr, s->ioc,
tcp_chr_read_poll,
tcp_chr_read,
chr, chr->gcontext);
s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
@@ -556,10 +563,33 @@ static void tcp_chr_connect(void *opaque)
qemu_chr_be_event(chr, CHR_EVENT_OPENED);
}
static void tcp_chr_telnet_destroy(SocketChardev *s)
{
if (s->telnet_source) {
g_source_destroy(s->telnet_source);
g_source_unref(s->telnet_source);
s->telnet_source = NULL;
}
}
static void tcp_chr_update_read_handler(Chardev *chr)
{
SocketChardev *s = SOCKET_CHARDEV(chr);
if (s->listener) {
/*
* It's possible that chardev context is changed in
* qemu_chr_be_update_read_handlers(). Reset it for QIO net
* listener if there is.
*/
qio_net_listener_set_client_func_full(s->listener, tcp_chr_accept,
chr, NULL, chr->gcontext);
}
if (s->telnet_source) {
tcp_chr_telnet_init(CHARDEV(s));
}
if (!s->connected) {
return;
}
@@ -573,32 +603,30 @@ static void tcp_chr_update_read_handler(Chardev *chr)
}
}
typedef struct {
Chardev *chr;
char buf[21];
size_t buflen;
} TCPChardevTelnetInit;
static gboolean tcp_chr_telnet_init_io(QIOChannel *ioc,
GIOCondition cond G_GNUC_UNUSED,
gpointer user_data)
{
TCPChardevTelnetInit *init = user_data;
SocketChardev *s = user_data;
Chardev *chr = CHARDEV(s);
TCPChardevTelnetInit *init = s->telnet_init;
ssize_t ret;
assert(init);
ret = qio_channel_write(ioc, init->buf, init->buflen, NULL);
if (ret < 0) {
if (ret == QIO_CHANNEL_ERR_BLOCK) {
ret = 0;
} else {
tcp_chr_disconnect(init->chr);
tcp_chr_disconnect(chr);
goto end;
}
}
init->buflen -= ret;
if (init->buflen == 0) {
tcp_chr_connect(init->chr);
tcp_chr_connect(chr);
goto end;
}
@@ -607,16 +635,30 @@ static gboolean tcp_chr_telnet_init_io(QIOChannel *ioc,
return G_SOURCE_CONTINUE;
end:
g_free(init);
g_free(s->telnet_init);
s->telnet_init = NULL;
g_source_unref(s->telnet_source);
s->telnet_source = NULL;
return G_SOURCE_REMOVE;
}
static void tcp_chr_telnet_init(Chardev *chr)
{
SocketChardev *s = SOCKET_CHARDEV(chr);
TCPChardevTelnetInit *init = g_new0(TCPChardevTelnetInit, 1);
TCPChardevTelnetInit *init;
size_t n = 0;
/* Destroy existing task */
tcp_chr_telnet_destroy(s);
if (s->telnet_init) {
/* We are possibly during a handshake already */
goto cont;
}
s->telnet_init = g_new0(TCPChardevTelnetInit, 1);
init = s->telnet_init;
#define IACSET(x, a, b, c) \
do { \
x[n++] = a; \
@@ -624,7 +666,6 @@ static void tcp_chr_telnet_init(Chardev *chr)
x[n++] = c; \
} while (0)
init->chr = chr;
if (!s->is_tn3270) {
init->buflen = 12;
/* Prep the telnet negotion to put telnet in binary,
@@ -647,10 +688,11 @@ static void tcp_chr_telnet_init(Chardev *chr)
#undef IACSET
qio_channel_add_watch(
s->ioc, G_IO_OUT,
tcp_chr_telnet_init_io,
init, NULL);
cont:
s->telnet_source = qio_channel_add_watch_source(s->ioc, G_IO_OUT,
tcp_chr_telnet_init_io,
s, NULL,
chr->gcontext);
}
@@ -663,8 +705,7 @@ static void tcp_chr_tls_handshake(QIOTask *task,
if (qio_task_propagate_error(task, NULL)) {
tcp_chr_disconnect(chr);
} else {
/* tn3270 does not support TLS yet */
if (s->do_telnetopt && !s->is_tn3270) {
if (s->do_telnetopt) {
tcp_chr_telnet_init(chr);
} else {
tcp_chr_connect(chr);
@@ -680,6 +721,11 @@ static void tcp_chr_tls_init(Chardev *chr)
Error *err = NULL;
gchar *name;
if (!machine_init_done) {
/* This will be postponed to machine_done notifier */
return;
}
if (s->is_listen) {
tioc = qio_channel_tls_new_server(
s->ioc, s->tls_creds,
@@ -707,7 +753,8 @@ static void tcp_chr_tls_init(Chardev *chr)
qio_channel_tls_handshake(tioc,
tcp_chr_tls_handshake,
chr,
NULL);
NULL,
chr->gcontext);
}
@@ -743,7 +790,8 @@ static int tcp_chr_new_client(Chardev *chr, QIOChannelSocket *sioc)
qio_channel_set_delay(s->ioc, false);
}
if (s->listener) {
qio_net_listener_set_client_func(s->listener, NULL, NULL, NULL);
qio_net_listener_set_client_func_full(s->listener, NULL, NULL,
NULL, chr->gcontext);
}
if (s->tls_creds) {
@@ -823,8 +871,11 @@ static void char_socket_finalize(Object *obj)
tcp_chr_free_connection(chr);
tcp_chr_reconn_timer_cancel(s);
qapi_free_SocketAddress(s->addr);
tcp_chr_telnet_destroy(s);
g_free(s->telnet_init);
if (s->listener) {
qio_net_listener_set_client_func(s->listener, NULL, NULL, NULL);
qio_net_listener_set_client_func_full(s->listener, NULL, NULL,
NULL, chr->gcontext);
object_unref(OBJECT(s->listener));
}
if (s->tls_creds) {
@@ -854,11 +905,22 @@ cleanup:
object_unref(OBJECT(sioc));
}
static void tcp_chr_connect_async(Chardev *chr)
{
SocketChardev *s = SOCKET_CHARDEV(chr);
QIOChannelSocket *sioc;
sioc = qio_channel_socket_new();
tcp_chr_set_client_ioc_name(chr, sioc);
qio_channel_socket_connect_async(sioc, s->addr,
qemu_chr_socket_connected,
chr, NULL, chr->gcontext);
}
static gboolean socket_reconnect_timeout(gpointer opaque)
{
Chardev *chr = CHARDEV(opaque);
SocketChardev *s = SOCKET_CHARDEV(opaque);
QIOChannelSocket *sioc;
g_source_unref(s->reconnect_timer);
s->reconnect_timer = NULL;
@@ -867,11 +929,7 @@ static gboolean socket_reconnect_timeout(gpointer opaque)
return false;
}
sioc = qio_channel_socket_new();
tcp_chr_set_client_ioc_name(chr, sioc);
qio_channel_socket_connect_async(sioc, s->addr,
qemu_chr_socket_connected,
chr, NULL);
tcp_chr_connect_async(chr);
return false;
}
@@ -950,13 +1008,8 @@ static void qmp_chardev_open_socket(Chardev *chr,
s->reconnect_time = reconnect;
}
if (s->reconnect_time) {
sioc = qio_channel_socket_new();
tcp_chr_set_client_ioc_name(chr, sioc);
qio_channel_socket_connect_async(sioc, s->addr,
qemu_chr_socket_connected,
chr, NULL);
} else {
/* If reconnect_time is set, will do that in chr_machine_done. */
if (!s->reconnect_time) {
if (s->is_listen) {
char *name;
s->listener = qio_net_listener_new();
@@ -980,8 +1033,10 @@ static void qmp_chardev_open_socket(Chardev *chr,
return;
}
if (!s->ioc) {
qio_net_listener_set_client_func(s->listener, tcp_chr_accept,
chr, NULL);
qio_net_listener_set_client_func_full(s->listener,
tcp_chr_accept,
chr, NULL,
chr->gcontext);
}
} else if (qemu_chr_wait_connected(chr, errp) < 0) {
goto error;
@@ -1008,25 +1063,36 @@ static void qemu_chr_parse_socket(QemuOpts *opts, ChardevBackend *backend,
const char *path = qemu_opt_get(opts, "path");
const char *host = qemu_opt_get(opts, "host");
const char *port = qemu_opt_get(opts, "port");
const char *fd = qemu_opt_get(opts, "fd");
const char *tls_creds = qemu_opt_get(opts, "tls-creds");
SocketAddressLegacy *addr;
ChardevSocket *sock;
if ((!!path + !!fd + !!host) != 1) {
error_setg(errp,
"Exactly one of 'path', 'fd' or 'host' required");
return;
}
backend->type = CHARDEV_BACKEND_KIND_SOCKET;
if (!path) {
if (!host) {
error_setg(errp, "chardev: socket: no host given");
return;
}
if (!port) {
error_setg(errp, "chardev: socket: no port given");
return;
}
} else {
if (path) {
if (tls_creds) {
error_setg(errp, "TLS can only be used over TCP socket");
return;
}
} else if (host) {
if (!port) {
error_setg(errp, "chardev: socket: no port given");
return;
}
} else if (fd) {
/* We don't know what host to validate against when in client mode */
if (tls_creds && !is_listen) {
error_setg(errp, "TLS can not be used with pre-opened client FD");
return;
}
} else {
g_assert_not_reached();
}
sock = backend->u.socket.data = g_new0(ChardevSocket, 1);
@@ -1052,7 +1118,7 @@ static void qemu_chr_parse_socket(QemuOpts *opts, ChardevBackend *backend,
addr->type = SOCKET_ADDRESS_LEGACY_KIND_UNIX;
q_unix = addr->u.q_unix.data = g_new0(UnixSocketAddress, 1);
q_unix->path = g_strdup(path);
} else {
} else if (host) {
addr->type = SOCKET_ADDRESS_LEGACY_KIND_INET;
addr->u.inet.data = g_new(InetSocketAddress, 1);
*addr->u.inet.data = (InetSocketAddress) {
@@ -1065,6 +1131,12 @@ static void qemu_chr_parse_socket(QemuOpts *opts, ChardevBackend *backend,
.has_ipv6 = qemu_opt_get(opts, "ipv6"),
.ipv6 = qemu_opt_get_bool(opts, "ipv6", 0),
};
} else if (fd) {
addr->type = SOCKET_ADDRESS_LEGACY_KIND_FD;
addr->u.fd.data = g_new(String, 1);
addr->u.fd.data->str = g_strdup(fd);
} else {
g_assert_not_reached();
}
sock->addr = addr;
}
@@ -1086,6 +1158,21 @@ char_socket_get_connected(Object *obj, Error **errp)
return s->connected;
}
static int tcp_chr_machine_done_hook(Chardev *chr)
{
SocketChardev *s = SOCKET_CHARDEV(chr);
if (s->reconnect_time) {
tcp_chr_connect_async(chr);
}
if (s->ioc && s->tls_creds) {
tcp_chr_tls_init(chr);
}
return 0;
}
static void char_socket_class_init(ObjectClass *oc, void *data)
{
ChardevClass *cc = CHARDEV_CLASS(oc);
@@ -1101,6 +1188,7 @@ static void char_socket_class_init(ObjectClass *oc, void *data)
cc->chr_add_client = tcp_chr_add_client;
cc->chr_add_watch = tcp_chr_add_watch;
cc->chr_update_read_handler = tcp_chr_update_read_handler;
cc->chr_machine_done = tcp_chr_machine_done_hook;
object_class_property_add(oc, "addr", "SocketAddress",
char_socket_get_addr, NULL,

View File

@@ -281,40 +281,31 @@ static const TypeInfo char_type_info = {
.class_init = char_class_init,
};
/**
* Called after processing of default and command-line-specified
* chardevs to deliver CHR_EVENT_OPENED events to any FEs attached
* to a mux chardev. This is done here to ensure that
* output/prompts/banners are only displayed for the FE that has
* focus when initial command-line processing/machine init is
* completed.
*
* After this point, any new FE attached to any new or existing
* mux will receive CHR_EVENT_OPENED notifications for the BE
* immediately.
*/
static int open_muxes(Object *child, void *opaque)
static int chardev_machine_done_notify_one(Object *child, void *opaque)
{
if (CHARDEV_IS_MUX(child)) {
/* send OPENED to all already-attached FEs */
mux_chr_send_all_event(CHARDEV(child), CHR_EVENT_OPENED);
/* mark mux as OPENED so any new FEs will immediately receive
* OPENED event
*/
qemu_chr_be_event(CHARDEV(child), CHR_EVENT_OPENED);
Chardev *chr = (Chardev *)child;
ChardevClass *class = CHARDEV_GET_CLASS(chr);
if (class->chr_machine_done) {
return class->chr_machine_done(chr);
}
return 0;
}
static void muxes_realize_done(Notifier *notifier, void *unused)
static void chardev_machine_done_hook(Notifier *notifier, void *unused)
{
muxes_realized = true;
object_child_foreach(get_chardevs_root(), open_muxes, NULL);
int ret = object_child_foreach(get_chardevs_root(),
chardev_machine_done_notify_one, NULL);
if (ret) {
error_report("Failed to call chardev machine_done hooks");
exit(1);
}
}
static Notifier muxes_realize_notify = {
.notify = muxes_realize_done,
static Notifier chardev_machine_done_notify = {
.notify = chardev_machine_done_hook,
};
static bool qemu_chr_is_busy(Chardev *s)
@@ -807,6 +798,9 @@ QemuOptsList qemu_chardev_opts = {
},{
.name = "port",
.type = QEMU_OPT_STRING,
},{
.name = "fd",
.type = QEMU_OPT_STRING,
},{
.name = "localaddr",
.type = QEMU_OPT_STRING,
@@ -1118,7 +1112,7 @@ static void register_types(void)
* as part of realize functions like serial_isa_realizefn when -nographic
* is specified
*/
qemu_add_machine_init_done_notifier(&muxes_realize_notify);
qemu_add_machine_init_done_notifier(&chardev_machine_done_notify);
}
type_init(register_types);

98
configure vendored
View File

@@ -342,7 +342,7 @@ attr=""
libattr=""
xfs=""
tcg="yes"
membarrier=""
vhost_net="no"
vhost_crypto="no"
vhost_scsi="no"
@@ -534,7 +534,7 @@ QEMU_CFLAGS="-fno-strict-aliasing -fno-common -fwrapv $QEMU_CFLAGS"
QEMU_CFLAGS="-Wall -Wundef -Wwrite-strings -Wmissing-prototypes $QEMU_CFLAGS"
QEMU_CFLAGS="-Wstrict-prototypes -Wredundant-decls $QEMU_CFLAGS"
QEMU_CFLAGS="-D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE $QEMU_CFLAGS"
QEMU_INCLUDES="-I. -I\$(SRC_PATH) -I\$(SRC_PATH)/accel/tcg -I\$(SRC_PATH)/include"
QEMU_INCLUDES="-iquote . -iquote \$(SRC_PATH) -iquote \$(SRC_PATH)/accel/tcg -iquote \$(SRC_PATH)/include"
if test "$debug_info" = "yes"; then
CFLAGS="-g $CFLAGS"
LDFLAGS="-g $LDFLAGS"
@@ -1161,9 +1161,13 @@ for opt do
;;
--enable-attr) attr="yes"
;;
--disable-membarrier) membarrier="no"
;;
--enable-membarrier) membarrier="yes"
;;
--disable-blobs) blobs="no"
;;
--with-pkgversion=*) pkgversion=" ($optarg)"
--with-pkgversion=*) pkgversion="$optarg"
;;
--with-coroutine=*) coroutine="$optarg"
;;
@@ -1577,6 +1581,7 @@ disabled with --disable-FEATURE, default is enabled if available:
xen-pci-passthrough
brlapi BrlAPI (Braile)
curl curl connectivity
membarrier membarrier system call (for Linux 4.14+ or Windows)
fdt fdt device tree
bluez bluez stack connectivity
kvm KVM acceleration support
@@ -1692,6 +1697,7 @@ gcc_flags="-Wno-missing-include-dirs -Wempty-body -Wnested-externs $gcc_flags"
gcc_flags="-Wendif-labels -Wno-shift-negative-value $gcc_flags"
gcc_flags="-Wno-initializer-overrides -Wexpansion-to-defined $gcc_flags"
gcc_flags="-Wno-string-plus-int $gcc_flags"
gcc_flags="-Wno-error=address-of-packed-member $gcc_flags"
# Note that we do not add -Werror to gcc_flags here, because that would
# enable it for all configure tests. If a configure test failed due
# to -Werror this would just silently disable some features,
@@ -2490,7 +2496,9 @@ if test "$whpx" != "no" ; then
#include <WinHvEmulation.h>
int main(void) {
WHV_CAPABILITY whpx_cap;
WHvGetCapability(WHvCapabilityCodeFeatures, &whpx_cap, sizeof(whpx_cap));
UINT32 writtenSize;
WHvGetCapability(WHvCapabilityCodeFeatures, &whpx_cap, sizeof(whpx_cap),
&writtenSize);
return 0;
}
EOF
@@ -2874,6 +2882,7 @@ if test "$sdl" != "no" ; then
int main( void ) { return SDL_Init (SDL_INIT_VIDEO); }
EOF
sdl_cflags=$($sdlconfig --cflags 2>/dev/null)
sdl_cflags="$sdl_cflags -Wno-undef" # workaround 2.0.8 bug
if test "$static" = "yes" ; then
if $pkg_config $sdlname --exists; then
sdl_libs=$($pkg_config $sdlname --static --libs 2>/dev/null)
@@ -3352,7 +3361,7 @@ else
fi
glib_modules=gthread-2.0
if test "$modules" = yes; then
glib_modules="$glib_modules gmodule-2.0"
glib_modules="$glib_modules gmodule-export-2.0"
fi
# This workaround is required due to a bug in pkg-config file for glib as it
@@ -4860,7 +4869,6 @@ fi
pragma_disable_unused_but_set=no
cat > $TMPC << EOF
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-but-set-variable"
#pragma GCC diagnostic ignored "-Wstrict-prototypes"
#pragma GCC diagnostic pop
@@ -5046,6 +5054,14 @@ static S2 c2;
static S4 c4;
static S8 c8;
static int i;
void helper(void *d, void *a, int shift, int i);
void helper(void *d, void *a, int shift, int i)
{
*(U1 *)(d + i) = *(U1 *)(a + i) << shift;
*(U2 *)(d + i) = *(U2 *)(a + i) << shift;
*(U4 *)(d + i) = *(U4 *)(a + i) << shift;
*(U8 *)(d + i) = *(U8 *)(a + i) << shift;
}
int main(void)
{
a1 += b1; a2 += b2; a4 += b4; a8 += b8;
@@ -5137,6 +5153,37 @@ if compile_prog "" "" ; then
have_fsxattr=yes
fi
##########################################
# check for usable membarrier system call
if test "$membarrier" = "yes"; then
have_membarrier=no
if test "$mingw32" = "yes" ; then
have_membarrier=yes
elif test "$linux" = "yes" ; then
cat > $TMPC << EOF
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdlib.h>
int main(void) {
syscall(__NR_membarrier, MEMBARRIER_CMD_QUERY, 0);
syscall(__NR_membarrier, MEMBARRIER_CMD_SHARED, 0);
exit(0);
}
EOF
if compile_prog "" "" ; then
have_membarrier=yes
fi
fi
if test "$have_membarrier" = "no"; then
feature_not_found "membarrier" "membarrier system call not available"
fi
else
# Do not enable it by default even for Mingw32, because it doesn't
# work on Wine.
membarrier=no
fi
##########################################
# check if rtnetlink.h exists and is useful
have_rtnetlink=no
@@ -5763,6 +5810,7 @@ fi
echo "malloc trim support $malloc_trim"
echo "RDMA support $rdma"
echo "fdt support $fdt"
echo "membarrier $membarrier"
echo "preadv support $preadv"
echo "fdatasync $fdatasync"
echo "madvise $madvise"
@@ -5973,7 +6021,12 @@ fi
echo "CONFIG_AUDIO_DRIVERS=$audio_drv_list" >> $config_host_mak
for drv in $audio_drv_list; do
def=CONFIG_AUDIO_$(echo $drv | LC_ALL=C tr '[a-z]' '[A-Z]')
echo "$def=y" >> $config_host_mak
case "$drv" in
alsa | oss | pa | sdl)
echo "$def=m" >> $config_host_mak ;;
*)
echo "$def=y" >> $config_host_mak ;;
esac
done
echo "ALSA_LIBS=$alsa_libs" >> $config_host_mak
echo "PULSE_LIBS=$pulse_libs" >> $config_host_mak
@@ -6245,6 +6298,9 @@ fi
if test "$fdt" = "yes" ; then
echo "CONFIG_FDT=y" >> $config_host_mak
fi
if test "$membarrier" = "yes" ; then
echo "CONFIG_MEMBARRIER=y" >> $config_host_mak
fi
if test "$signalfd" = "yes" ; then
echo "CONFIG_SIGNALFD=y" >> $config_host_mak
fi
@@ -6554,19 +6610,19 @@ if test "$vxhs" = "yes" ; then
fi
if test "$tcg_interpreter" = "yes"; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
elif test "$ARCH" = "sparc64" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/sparc $QEMU_INCLUDES"
QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/sparc $QEMU_INCLUDES"
elif test "$ARCH" = "s390x" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/s390 $QEMU_INCLUDES"
QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/s390 $QEMU_INCLUDES"
elif test "$ARCH" = "x86_64" -o "$ARCH" = "x32" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/i386 $QEMU_INCLUDES"
QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/i386 $QEMU_INCLUDES"
elif test "$ARCH" = "ppc64" ; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/ppc $QEMU_INCLUDES"
QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/ppc $QEMU_INCLUDES"
else
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/\$(ARCH) $QEMU_INCLUDES"
QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/\$(ARCH) $QEMU_INCLUDES"
fi
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg $QEMU_INCLUDES"
QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg $QEMU_INCLUDES"
echo "TOOLS=$tools" >> $config_host_mak
echo "ROMS=$roms" >> $config_host_mak
@@ -6797,6 +6853,16 @@ case "$target_name" in
echo "TARGET_ABI32=y" >> $config_target_mak
gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml power-spe.xml power-vsx.xml"
;;
riscv32)
TARGET_BASE_ARCH=riscv
TARGET_ABI_DIR=riscv
mttcg=yes
;;
riscv64)
TARGET_BASE_ARCH=riscv
TARGET_ABI_DIR=riscv
mttcg=yes
;;
sh4|sh4eb)
TARGET_ARCH=sh4
bflt="yes"
@@ -6824,6 +6890,7 @@ case "$target_name" in
;;
xtensa|xtensaeb)
TARGET_ARCH=xtensa
mttcg="yes"
;;
*)
error_exit "Unsupported target CPU"
@@ -6966,6 +7033,9 @@ for i in $ARCH $TARGET_BASE_ARCH ; do
ppc*)
disas_config "PPC"
;;
riscv)
disas_config "RISCV"
;;
s390*)
disas_config "S390"
;;

View File

@@ -26,9 +26,20 @@
#include <sys/socket.h>
#include <sys/eventfd.h>
#include <sys/mman.h>
#include "qemu/compiler.h"
#if defined(__linux__)
#include <sys/syscall.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>
#include "qemu/compiler.h"
#ifdef __NR_userfaultfd
#include <linux/userfaultfd.h>
#endif
#endif
#include "qemu/atomic.h"
#include "libvhost-user.h"
@@ -86,6 +97,9 @@ vu_request_to_string(unsigned int req)
REQ(VHOST_USER_SET_VRING_ENDIAN),
REQ(VHOST_USER_GET_CONFIG),
REQ(VHOST_USER_SET_CONFIG),
REQ(VHOST_USER_POSTCOPY_ADVISE),
REQ(VHOST_USER_POSTCOPY_LISTEN),
REQ(VHOST_USER_POSTCOPY_END),
REQ(VHOST_USER_MAX),
};
#undef REQ
@@ -171,6 +185,35 @@ vmsg_close_fds(VhostUserMsg *vmsg)
}
}
/* A test to see if we have userfault available */
static bool
have_userfault(void)
{
#if defined(__linux__) && defined(__NR_userfaultfd) &&\
defined(UFFD_FEATURE_MISSING_SHMEM) &&\
defined(UFFD_FEATURE_MISSING_HUGETLBFS)
/* Now test the kernel we're running on really has the features */
int ufd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
struct uffdio_api api_struct;
if (ufd < 0) {
return false;
}
api_struct.api = UFFD_API;
api_struct.features = UFFD_FEATURE_MISSING_SHMEM |
UFFD_FEATURE_MISSING_HUGETLBFS;
if (ioctl(ufd, UFFDIO_API, &api_struct)) {
close(ufd);
return false;
}
close(ufd);
return true;
#else
return false;
#endif
}
static bool
vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
{
@@ -245,6 +288,31 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
{
int rc;
uint8_t *p = (uint8_t *)vmsg;
char control[CMSG_SPACE(VHOST_MEMORY_MAX_NREGIONS * sizeof(int))] = { };
struct iovec iov = {
.iov_base = (char *)vmsg,
.iov_len = VHOST_USER_HDR_SIZE,
};
struct msghdr msg = {
.msg_iov = &iov,
.msg_iovlen = 1,
.msg_control = control,
};
struct cmsghdr *cmsg;
memset(control, 0, sizeof(control));
assert(vmsg->fd_num <= VHOST_MEMORY_MAX_NREGIONS);
if (vmsg->fd_num > 0) {
size_t fdsize = vmsg->fd_num * sizeof(int);
msg.msg_controllen = CMSG_SPACE(fdsize);
cmsg = CMSG_FIRSTHDR(&msg);
cmsg->cmsg_len = CMSG_LEN(fdsize);
cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_RIGHTS;
memcpy(CMSG_DATA(cmsg), vmsg->fds, fdsize);
} else {
msg.msg_controllen = 0;
}
/* Set the version in the flags when sending the reply */
vmsg->flags &= ~VHOST_USER_VERSION_MASK;
@@ -252,7 +320,7 @@ vu_message_write(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
vmsg->flags |= VHOST_USER_REPLY_MASK;
do {
rc = write(conn_fd, p, VHOST_USER_HDR_SIZE);
rc = sendmsg(conn_fd, &msg, 0);
} while (rc < 0 && (errno == EINTR || errno == EAGAIN));
do {
@@ -345,6 +413,7 @@ vu_get_features_exec(VuDev *dev, VhostUserMsg *vmsg)
}
vmsg->size = sizeof(vmsg->payload.u64);
vmsg->fd_num = 0;
DPRINT("Sending back to guest u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
@@ -409,6 +478,148 @@ vu_reset_device_exec(VuDev *dev, VhostUserMsg *vmsg)
return false;
}
static bool
vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
{
int i;
VhostUserMemory *memory = &vmsg->payload.memory;
dev->nregions = memory->nregions;
DPRINT("Nregions: %d\n", memory->nregions);
for (i = 0; i < dev->nregions; i++) {
void *mmap_addr;
VhostUserMemoryRegion *msg_region = &memory->regions[i];
VuDevRegion *dev_region = &dev->regions[i];
DPRINT("Region %d\n", i);
DPRINT(" guest_phys_addr: 0x%016"PRIx64"\n",
msg_region->guest_phys_addr);
DPRINT(" memory_size: 0x%016"PRIx64"\n",
msg_region->memory_size);
DPRINT(" userspace_addr 0x%016"PRIx64"\n",
msg_region->userspace_addr);
DPRINT(" mmap_offset 0x%016"PRIx64"\n",
msg_region->mmap_offset);
dev_region->gpa = msg_region->guest_phys_addr;
dev_region->size = msg_region->memory_size;
dev_region->qva = msg_region->userspace_addr;
dev_region->mmap_offset = msg_region->mmap_offset;
/* We don't use offset argument of mmap() since the
* mapped address has to be page aligned, and we use huge
* pages.
* In postcopy we're using PROT_NONE here to catch anyone
* accessing it before we userfault
*/
mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset,
PROT_NONE, MAP_SHARED,
vmsg->fds[i], 0);
if (mmap_addr == MAP_FAILED) {
vu_panic(dev, "region mmap error: %s", strerror(errno));
} else {
dev_region->mmap_addr = (uint64_t)(uintptr_t)mmap_addr;
DPRINT(" mmap_addr: 0x%016"PRIx64"\n",
dev_region->mmap_addr);
}
/* Return the address to QEMU so that it can translate the ufd
* fault addresses back.
*/
msg_region->userspace_addr = (uintptr_t)(mmap_addr +
dev_region->mmap_offset);
close(vmsg->fds[i]);
}
/* Send the message back to qemu with the addresses filled in */
vmsg->fd_num = 0;
if (!vu_message_write(dev, dev->sock, vmsg)) {
vu_panic(dev, "failed to respond to set-mem-table for postcopy");
return false;
}
/* Wait for QEMU to confirm that it's registered the handler for the
* faults.
*/
if (!vu_message_read(dev, dev->sock, vmsg) ||
vmsg->size != sizeof(vmsg->payload.u64) ||
vmsg->payload.u64 != 0) {
vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table");
return false;
}
/* OK, now we can go and register the memory and generate faults */
for (i = 0; i < dev->nregions; i++) {
VuDevRegion *dev_region = &dev->regions[i];
int ret;
#ifdef UFFDIO_REGISTER
/* We should already have an open ufd. Mark each memory
* range as ufd.
* Discard any mapping we have here; note I can't use MADV_REMOVE
* or fallocate to make the hole since I don't want to lose
* data that's already arrived in the shared process.
* TODO: How to do hugepage
*/
ret = madvise((void *)dev_region->mmap_addr,
dev_region->size + dev_region->mmap_offset,
MADV_DONTNEED);
if (ret) {
fprintf(stderr,
"%s: Failed to madvise(DONTNEED) region %d: %s\n",
__func__, i, strerror(errno));
}
/* Turn off transparent hugepages so we dont get lose wakeups
* in neighbouring pages.
* TODO: Turn this backon later.
*/
ret = madvise((void *)dev_region->mmap_addr,
dev_region->size + dev_region->mmap_offset,
MADV_NOHUGEPAGE);
if (ret) {
/* Note: This can happen legally on kernels that are configured
* without madvise'able hugepages
*/
fprintf(stderr,
"%s: Failed to madvise(NOHUGEPAGE) region %d: %s\n",
__func__, i, strerror(errno));
}
struct uffdio_register reg_struct;
reg_struct.range.start = (uintptr_t)dev_region->mmap_addr;
reg_struct.range.len = dev_region->size + dev_region->mmap_offset;
reg_struct.mode = UFFDIO_REGISTER_MODE_MISSING;
if (ioctl(dev->postcopy_ufd, UFFDIO_REGISTER, &reg_struct)) {
vu_panic(dev, "%s: Failed to userfault region %d "
"@%p + size:%zx offset: %zx: (ufd=%d)%s\n",
__func__, i,
dev_region->mmap_addr,
dev_region->size, dev_region->mmap_offset,
dev->postcopy_ufd, strerror(errno));
return false;
}
if (!(reg_struct.ioctls & ((__u64)1 << _UFFDIO_COPY))) {
vu_panic(dev, "%s Region (%d) doesn't support COPY",
__func__, i);
return false;
}
DPRINT("%s: region %d: Registered userfault for %llx + %llx\n",
__func__, i, reg_struct.range.start, reg_struct.range.len);
/* Now it's registered we can let the client at it */
if (mprotect((void *)dev_region->mmap_addr,
dev_region->size + dev_region->mmap_offset,
PROT_READ | PROT_WRITE)) {
vu_panic(dev, "failed to mprotect region %d for postcopy (%s)",
i, strerror(errno));
return false;
}
/* TODO: Stash 'zero' support flags somewhere */
#endif
}
return false;
}
static bool
vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg)
{
@@ -425,6 +636,10 @@ vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg)
}
dev->nregions = memory->nregions;
if (dev->postcopy_listening) {
return vu_set_mem_table_exec_postcopy(dev, vmsg);
}
DPRINT("Nregions: %d\n", memory->nregions);
for (i = 0; i < dev->nregions; i++) {
void *mmap_addr;
@@ -500,6 +715,7 @@ vu_set_log_base_exec(VuDev *dev, VhostUserMsg *vmsg)
dev->log_size = log_mmap_size;
vmsg->size = sizeof(vmsg->payload.u64);
vmsg->fd_num = 0;
return true;
}
@@ -752,12 +968,17 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg *vmsg)
uint64_t features = 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD |
1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ;
if (have_userfault()) {
features |= 1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT;
}
if (dev->iface->get_protocol_features) {
features |= dev->iface->get_protocol_features(dev);
}
vmsg->payload.u64 = features;
vmsg->size = sizeof(vmsg->payload.u64);
vmsg->fd_num = 0;
return true;
}
@@ -856,6 +1077,77 @@ vu_set_config(VuDev *dev, VhostUserMsg *vmsg)
return false;
}
static bool
vu_set_postcopy_advise(VuDev *dev, VhostUserMsg *vmsg)
{
dev->postcopy_ufd = -1;
#ifdef UFFDIO_API
struct uffdio_api api_struct;
dev->postcopy_ufd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
vmsg->size = 0;
#endif
if (dev->postcopy_ufd == -1) {
vu_panic(dev, "Userfaultfd not available: %s", strerror(errno));
goto out;
}
#ifdef UFFDIO_API
api_struct.api = UFFD_API;
api_struct.features = 0;
if (ioctl(dev->postcopy_ufd, UFFDIO_API, &api_struct)) {
vu_panic(dev, "Failed UFFDIO_API: %s", strerror(errno));
close(dev->postcopy_ufd);
dev->postcopy_ufd = -1;
goto out;
}
/* TODO: Stash feature flags somewhere */
#endif
out:
/* Return a ufd to the QEMU */
vmsg->fd_num = 1;
vmsg->fds[0] = dev->postcopy_ufd;
return true; /* = send a reply */
}
static bool
vu_set_postcopy_listen(VuDev *dev, VhostUserMsg *vmsg)
{
vmsg->payload.u64 = -1;
vmsg->size = sizeof(vmsg->payload.u64);
if (dev->nregions) {
vu_panic(dev, "Regions already registered at postcopy-listen");
return true;
}
dev->postcopy_listening = true;
vmsg->flags = VHOST_USER_VERSION | VHOST_USER_REPLY_MASK;
vmsg->payload.u64 = 0; /* Success */
return true;
}
static bool
vu_set_postcopy_end(VuDev *dev, VhostUserMsg *vmsg)
{
DPRINT("%s: Entry\n", __func__);
dev->postcopy_listening = false;
if (dev->postcopy_ufd > 0) {
close(dev->postcopy_ufd);
dev->postcopy_ufd = -1;
DPRINT("%s: Done close\n", __func__);
}
vmsg->fd_num = 0;
vmsg->payload.u64 = 0;
vmsg->size = sizeof(vmsg->payload.u64);
vmsg->flags = VHOST_USER_VERSION | VHOST_USER_REPLY_MASK;
DPRINT("%s: exit\n", __func__);
return true;
}
static bool
vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
{
@@ -927,6 +1219,12 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
return vu_set_config(dev, vmsg);
case VHOST_USER_NONE:
break;
case VHOST_USER_POSTCOPY_ADVISE:
return vu_set_postcopy_advise(dev, vmsg);
case VHOST_USER_POSTCOPY_LISTEN:
return vu_set_postcopy_listen(dev, vmsg);
case VHOST_USER_POSTCOPY_END:
return vu_set_postcopy_end(dev, vmsg);
default:
vmsg_close_fds(vmsg);
vu_panic(dev, "Unhandled request: %d", vmsg->request);

View File

@@ -48,6 +48,8 @@ enum VhostUserProtocolFeature {
VHOST_USER_PROTOCOL_F_NET_MTU = 4,
VHOST_USER_PROTOCOL_F_SLAVE_REQ = 5,
VHOST_USER_PROTOCOL_F_CROSS_ENDIAN = 6,
VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
VHOST_USER_PROTOCOL_F_MAX
};
@@ -81,6 +83,11 @@ typedef enum VhostUserRequest {
VHOST_USER_SET_VRING_ENDIAN = 23,
VHOST_USER_GET_CONFIG = 24,
VHOST_USER_SET_CONFIG = 25,
VHOST_USER_CREATE_CRYPTO_SESSION = 26,
VHOST_USER_CLOSE_CRYPTO_SESSION = 27,
VHOST_USER_POSTCOPY_ADVISE = 28,
VHOST_USER_POSTCOPY_LISTEN = 29,
VHOST_USER_POSTCOPY_END = 30,
VHOST_USER_MAX
} VhostUserRequest;
@@ -277,6 +284,10 @@ struct VuDev {
* re-initialize */
vu_panic_cb panic;
const VuDevIface *iface;
/* Postcopy data */
int postcopy_ufd;
bool postcopy_listening;
};
typedef struct VuVirtqElement {

46
cpus.c
View File

@@ -993,7 +993,7 @@ void cpu_synchronize_all_pre_loadvm(void)
}
}
static int do_vm_stop(RunState state)
static int do_vm_stop(RunState state, bool send_stop)
{
int ret = 0;
@@ -1002,7 +1002,9 @@ static int do_vm_stop(RunState state)
pause_all_vcpus();
runstate_set(state);
vm_state_notify(0, state);
qapi_event_send_stop(&error_abort);
if (send_stop) {
qapi_event_send_stop(&error_abort);
}
}
bdrv_drain_all();
@@ -1012,6 +1014,14 @@ static int do_vm_stop(RunState state)
return ret;
}
/* Special vm_stop() variant for terminating the process. Historically clients
* did not expect a QMP STOP event and so we need to retain compatibility.
*/
int vm_shutdown(void)
{
return do_vm_stop(RUN_STATE_SHUTDOWN, false);
}
static bool cpu_can_run(CPUState *cpu)
{
if (cpu->stop) {
@@ -1307,6 +1317,8 @@ static void prepare_icount_for_run(CPUState *cpu)
insns_left = MIN(0xffff, cpu->icount_budget);
cpu->icount_decr.u16.low = insns_left;
cpu->icount_extra = cpu->icount_budget - insns_left;
replay_mutex_lock();
}
}
@@ -1322,6 +1334,8 @@ static void process_icount_data(CPUState *cpu)
cpu->icount_budget = 0;
replay_account_executed_instructions();
replay_mutex_unlock();
}
}
@@ -1336,11 +1350,9 @@ static int tcg_cpu_exec(CPUState *cpu)
#ifdef CONFIG_PROFILER
ti = profile_getclock();
#endif
qemu_mutex_unlock_iothread();
cpu_exec_start(cpu);
ret = cpu_exec(cpu);
cpu_exec_end(cpu);
qemu_mutex_lock_iothread();
#ifdef CONFIG_PROFILER
tcg_time += profile_getclock() - ti;
#endif
@@ -1407,6 +1419,9 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
cpu->exit_request = 1;
while (1) {
qemu_mutex_unlock_iothread();
replay_mutex_lock();
qemu_mutex_lock_iothread();
/* Account partial waits to QEMU_CLOCK_VIRTUAL. */
qemu_account_warp_timer();
@@ -1415,6 +1430,8 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
*/
handle_icount_deadline();
replay_mutex_unlock();
if (!cpu) {
cpu = first_cpu;
}
@@ -1430,11 +1447,13 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
if (cpu_can_run(cpu)) {
int r;
qemu_mutex_unlock_iothread();
prepare_icount_for_run(cpu);
r = tcg_cpu_exec(cpu);
process_icount_data(cpu);
qemu_mutex_lock_iothread();
if (r == EXCP_DEBUG) {
cpu_handle_guest_debug(cpu);
@@ -1624,7 +1643,9 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
while (1) {
if (cpu_can_run(cpu)) {
int r;
qemu_mutex_unlock_iothread();
r = tcg_cpu_exec(cpu);
qemu_mutex_lock_iothread();
switch (r) {
case EXCP_DEBUG:
cpu_handle_guest_debug(cpu);
@@ -1771,12 +1792,21 @@ void pause_all_vcpus(void)
}
}
/* We need to drop the replay_lock so any vCPU threads woken up
* can finish their replay tasks
*/
replay_mutex_unlock();
while (!all_vcpus_paused()) {
qemu_cond_wait(&qemu_pause_cond, &qemu_global_mutex);
CPU_FOREACH(cpu) {
qemu_cpu_kick(cpu);
}
}
qemu_mutex_unlock_iothread();
replay_mutex_lock();
qemu_mutex_lock_iothread();
}
void cpu_resume(CPUState *cpu)
@@ -1994,7 +2024,7 @@ int vm_stop(RunState state)
return 0;
}
return do_vm_stop(state);
return do_vm_stop(state, true);
}
/**
@@ -2081,6 +2111,9 @@ CpuInfoList *qmp_query_cpus(Error **errp)
#elif defined(TARGET_SPARC)
SPARCCPU *sparc_cpu = SPARC_CPU(cpu);
CPUSPARCState *env = &sparc_cpu->env;
#elif defined(TARGET_RISCV)
RISCVCPU *riscv_cpu = RISCV_CPU(cpu);
CPURISCVState *env = &riscv_cpu->env;
#elif defined(TARGET_MIPS)
MIPSCPU *mips_cpu = MIPS_CPU(cpu);
CPUMIPSState *env = &mips_cpu->env;
@@ -2120,6 +2153,9 @@ CpuInfoList *qmp_query_cpus(Error **errp)
#elif defined(TARGET_S390X)
info->value->arch = CPU_INFO_ARCH_S390;
info->value->u.s390.cpu_state = env->cpu_state;
#elif defined(TARGET_RISCV)
info->value->arch = CPU_INFO_ARCH_RISCV;
info->value->u.riscv.pc = env->pc;
#else
info->value->arch = CPU_INFO_ARCH_OTHER;
#endif

View File

@@ -4,7 +4,12 @@ include pci.mak
include usb.mak
CONFIG_SERIAL=y
CONFIG_SERIAL_ISA=y
CONFIG_I82374=y
CONFIG_I8254=y
CONFIG_I8257=y
CONFIG_PARALLEL=y
CONFIG_PARALLEL_ISA=y
CONFIG_FDC=y
CONFIG_PCKBD=y
CONFIG_VGA_CIRRUS=y
CONFIG_IDE_CORE=y

View File

@@ -47,6 +47,7 @@ CONFIG_A9MPCORE=y
CONFIG_A15MPCORE=y
CONFIG_ARM_V7M=y
CONFIG_NETDUINO2=y
CONFIG_ARM_GIC=y
CONFIG_ARM_GIC_KVM=$(CONFIG_KVM)
@@ -109,6 +110,7 @@ CONFIG_TZ_PPC=y
CONFIG_IOTKIT=y
CONFIG_IOTKIT_SECCTL=y
CONFIG_VERSATILE=y
CONFIG_VERSATILE_PCI=y
CONFIG_VERSATILE_I2C=y
@@ -117,6 +119,7 @@ CONFIG_VFIO_XGMAC=y
CONFIG_VFIO_AMD_XGBE=y
CONFIG_SDHCI=y
CONFIG_INTEGRATOR=y
CONFIG_INTEGRATOR_DEBUG=y
CONFIG_ALLWINNER_A10_PIT=y
@@ -126,6 +129,7 @@ CONFIG_ALLWINNER_A10=y
CONFIG_FSL_IMX6=y
CONFIG_FSL_IMX31=y
CONFIG_FSL_IMX25=y
CONFIG_FSL_IMX7=y
CONFIG_IMX_I2C=y
@@ -140,3 +144,8 @@ CONFIG_GPIO_KEY=y
CONFIG_MSF2=y
CONFIG_FW_CFG_DMA=y
CONFIG_XILINX_AXI=y
CONFIG_PCI_DESIGNWARE=y
CONFIG_STRONGARM=y
CONFIG_HIGHBANK=y
CONFIG_MUSICPAL=y

View File

@@ -63,3 +63,6 @@ CONFIG_PXB=y
CONFIG_ACPI_VMGENID=y
CONFIG_FW_CFG_DMA=y
CONFIG_I2C=y
CONFIG_SEV=$(CONFIG_KVM)
CONFIG_VTD=y
CONFIG_AMD_IOMMU=y

View File

@@ -0,0 +1 @@
# Default configuration for riscv-linux-user

View File

@@ -0,0 +1,4 @@
# Default configuration for riscv-softmmu
CONFIG_SERIAL=y
CONFIG_VIRTIO=y

View File

@@ -0,0 +1 @@
# Default configuration for riscv-linux-user

View File

@@ -0,0 +1,4 @@
# Default configuration for riscv-softmmu
CONFIG_SERIAL=y
CONFIG_VIRTIO=y

View File

@@ -63,3 +63,6 @@ CONFIG_PXB=y
CONFIG_ACPI_VMGENID=y
CONFIG_FW_CFG_DMA=y
CONFIG_I2C=y
CONFIG_SEV=$(CONFIG_KVM)
CONFIG_VTD=y
CONFIG_AMD_IOMMU=y

View File

@@ -0,0 +1 @@
# Default configuration for xtensa-linux-user

View File

@@ -0,0 +1 @@
# Default configuration for xtensa-linux-user

View File

@@ -522,6 +522,8 @@ void disas(FILE *out, void *code, unsigned long size)
# ifdef _ARCH_PPC64
s.info.cap_mode = CS_MODE_64;
# endif
#elif defined(__riscv__)
print_insn = print_insn_riscv;
#elif defined(__aarch64__) && defined(CONFIG_ARM_A64_DIS)
print_insn = print_insn_arm_a64;
s.info.cap_arch = CS_ARCH_ARM64;

View File

@@ -17,6 +17,7 @@ common-obj-$(CONFIG_MIPS_DIS) += mips.o
common-obj-$(CONFIG_NIOS2_DIS) += nios2.o
common-obj-$(CONFIG_MOXIE_DIS) += moxie.o
common-obj-$(CONFIG_PPC_DIS) += ppc.o
common-obj-$(CONFIG_RISCV_DIS) += riscv.o
common-obj-$(CONFIG_S390_DIS) += s390.o
common-obj-$(CONFIG_SH4_DIS) += sh4.o
common-obj-$(CONFIG_SPARC_DIS) += sparc.o

3048
disas/riscv.c Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,109 @@
Secure Encrypted Virtualization (SEV) is a feature found on AMD processors.
SEV is an extension to the AMD-V architecture which supports running encrypted
virtual machine (VMs) under the control of KVM. Encrypted VMs have their pages
(code and data) secured such that only the guest itself has access to the
unencrypted version. Each encrypted VM is associated with a unique encryption
key; if its data is accessed to a different entity using a different key the
encrypted guests data will be incorrectly decrypted, leading to unintelligible
data.
The key management of this feature is handled by separate processor known as
AMD secure processor (AMD-SP) which is present in AMD SOCs. Firmware running
inside the AMD-SP provide commands to support common VM lifecycle. This
includes commands for launching, snapshotting, migrating and debugging the
encrypted guest. Those SEV command can be issued via KVM_MEMORY_ENCRYPT_OP
ioctls.
Launching
---------
Boot images (such as bios) must be encrypted before guest can be booted.
MEMORY_ENCRYPT_OP ioctl provides commands to encrypt the images :LAUNCH_START,
LAUNCH_UPDATE_DATA, LAUNCH_MEASURE and LAUNCH_FINISH. These four commands
together generate a fresh memory encryption key for the VM, encrypt the boot
images and provide a measurement than can be used as an attestation of the
successful launch.
LAUNCH_START is called first to create a cryptographic launch context within
the firmware. To create this context, guest owner must provides guest policy,
its public Diffie-Hellman key (PDH) and session parameters. These inputs
should be treated as binary blob and must be passed as-is to the SEV firmware.
The guest policy is passed as plaintext and hypervisor may able to read it
but should not modify it (any modification of the policy bits will result
in bad measurement). The guest policy is a 4-byte data structure containing
several flags that restricts what can be done on running SEV guest.
See KM Spec section 3 and 6.2 for more details.
The guest policy can be provided via the 'policy' property (see below)
# ${QEMU} \
sev-guest,id=sev0,policy=0x1...\
Guest owners provided DH certificate and session parameters will be used to
establish a cryptographic session with the guest owner to negotiate keys used
for the attestation.
The DH certificate and session blob can be provided via 'dh-cert-file' and
'session-file' property (see below
# ${QEMU} \
sev-guest,id=sev0,dh-cert-file=<file1>,session-file=<file2>
LAUNCH_UPDATE_DATA encrypts the memory region using the cryptographic context
created via LAUNCH_START command. If required, this command can be called
multiple times to encrypt different memory regions. The command also calculates
the measurement of the memory contents as it encrypts.
LAUNCH_MEASURE command can be used to retrieve the measurement of encrypted
memory. This measurement is a signature of the memory contents that can be
sent to the guest owner as an attestation that the memory was encrypted
correctly by the firmware. The guest owner may wait to provide the guest
confidential information until it can verify the attestation measurement.
Since the guest owner knows the initial contents of the guest at boot, the
attestation measurement can be verified by comparing it to what the guest owner
expects.
LAUNCH_FINISH command finalizes the guest launch and destroy's the cryptographic
context.
See SEV KM API Spec [1] 'Launching a guest' usage flow (Appendix A) for the
complete flow chart.
To launch a SEV guest
# ${QEMU} \
-machine ...,memory-encryption=sev0 \
-object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1
Debugging
-----------
Since memory contents of SEV guest is encrypted hence hypervisor access to the
guest memory will get a cipher text. If guest policy allows debugging, then
hypervisor can use DEBUG_DECRYPT and DEBUG_ENCRYPT commands access the guest
memory region for debug purposes. This is not supported in QEMU yet.
Snapshot/Restore
-----------------
TODO
Live Migration
----------------
TODO
References
-----------------
AMD Memory Encryption whitepaper:
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf
Secure Encrypted Virutualization Key Management:
[1] http://support.amd.com/TechDocs/55766_SEV-KM API_Specification.pdf
KVM Forum slides:
http://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf
AMD64 Architecture Programmer's Manual:
http://support.amd.com/TechDocs/24593.pdf
SME is section 7.10
SEV is section 15.34

Some files were not shown because too many files have changed in this diff Show More