Compare commits

..

417 Commits

Author SHA1 Message Date
Thomas Huth
10f371c756 hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)
Git-commit: b987718bbb
References: bsc#1207205 (CVE-2023-0330)

We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function has already a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here that it also takes into account if the function
has been called too often in a reentrant way.

The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-10-20 16:02:13 +02:00
zhenwei pi
d0c2a993b8 virtio-crypto: verify src&dst buffer length for sym request
For symmetric algorithms, the length of ciphertext must be as same
as the plaintext.
The missing verification of the src_len and the dst_len in
virtio_crypto_sym_op_helper() may lead buffer overflow/divulged.

This patch is originally written by Yiming Tao for QEMU-SECURITY,
resend it(a few changes of error message) in qemu-devel.

Fixes: CVE-2023-3180
Fixes: 04b9b37edda("virtio-crypto: add data queue processing handler")
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Cc: Yiming Tao <taoym@zju.edu.cn>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230803024314.29962-2-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9d38a84347)
References: bsc#1213925
References: CVE-2023-3180
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-10-20 15:56:38 +02:00
Daniel P. Berrang_
6e201a2068 io: remove io watch if TLS channel is closed during handshake
Git-commit: 10be627d2b
References: bsc#1212850 (CVE-2023-3354)

The TLS handshake make take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.

CVE-2023-3354
Reported-by: jiangyegen <jiangyegen@huawei.com>
Signed-off-by: Daniel P. Berrang© <berrange@redhat.co>
[DB: Open-code g_clear_handle_id(), as it's not available in glib
     until version 2.56.]
Signed-off-by: Brahmajit Das <brahmajit.das@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-10-20 15:50:22 +02:00
Olaf Hering
ac78038d9f disable reentrancy guard on ppc
References: bsc#1190011 (CVE-2021-3750)

40p and g3beige will hang forever with these messages:
qemu-system-ppc: warning: Blocked re-entrant IO on MemoryRegion: pci-conf-idx at addr: 0x0
qemu-system-ppc64: warning: Blocked re-entrant IO on MemoryRegion: lpc-hc at addr: 0x34

Signed-off-by: Olaf Hering <olaf@aepfle.de>
2023-10-20 15:47:57 +02:00
Alexander Bulekov
d218f827af memory: prevent dma-reentracy issues
Git-commit: a2e1753b80
References: bsc#1190011 (CVE-2021-3750)

Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
This flag is set/checked prior to calling a device's MemoryRegion
handlers, and set when device code initiates DMA.  The purpose of this
flag is to prevent two types of DMA-based reentrancy issues:

1.) mmio -> dma -> mmio case
2.) bh -> dma write -> mmio case

These issues have led to problems such as stack-exhaustion and
use-after-frees.

Summary of the problem from Peter Maydell:
https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282
Resolves: CVE-2023-0330

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-2-alxndr@bu.edu>
[thuth: Replace warn_report() with warn_report_once()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-10-20 15:46:57 +02:00
Philippe Mathieu-Daudé
9ccb8f0280 hw/sd/sdcard: Do not switch to ReceivingData if address is invalid
Git-commit: 790762e548
References: bsc#1172033, CVE-2020-13253

Only move the state machine to ReceivingData if there is no
pending error. This avoids later OOB access while processing
commands queued.

  "SD Specifications Part 1 Physical Layer Simplified Spec. v3.01"

  4.3.3 Data Read

  Read command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
  occurred and no data transfer is performed.

  4.3.4 Data Write

  Write command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
  occurred and no data transfer is performed.

WP_VIOLATION errors are not modified: the error bit is set, we
stay in receive-data state, wait for a stop command. All further
data transfer is ignored. See the check on sd->card_status at the
beginning of sd_read_data() and sd_write_data().

Fixes: CVE-2020-13253
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Buglink: https://bugs.launchpad.net/qemu/+bug/1880822
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20200630133912.9428-6-f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:28 +01:00
Philippe Mathieu-Daudé
e698f326be hw/sd/sdcard: Update coding style to make checkpatch.pl happy
Git-commit: 794d68de2f
References: bsc#1172033, CVE-2020-13253

To make the next commit easier to review, clean this code first.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200630133912.9428-3-f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:28 +01:00
Thomas Huth
b3f6320ee4 hw/usb/hcd-xhci: Fix unbounded loop in xhci_ring_chain_length() (CVE-2020-14394)
Git-commit: effaf5a240
References: bsc#1180207, CVE-2020-14394

The loop condition in xhci_ring_chain_length() is under control of
the guest, and additionally the code does not check for failed DMA
transfers (e.g. if reaching the end of the RAM), so the loop there
could run for a very long time or even forever. Fix it by checking
the return value of dma_memory_read() and by introducing a maximum
loop length.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/646
Message-Id: <20220804131300.96368-1-thuth@redhat.com>
Reviewed-by: Mauro Matteo Cascella <mcascell@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:28 +01:00
Philippe Mathieu-Daudé
1de0d28447 hw/block/fdc: Kludge missing floppy drive to fix CVE-2021-20196
Git-commit: 1ab95af033
References: bsc#1181361, CVE-2021-20196

Guest might select another drive on the bus by setting the
DRIVE_SEL bit of the DIGITAL OUTPUT REGISTER (DOR).
The current controller model doesn't expect a BlockBackend
to be NULL. A simple way to fix CVE-2021-20196 is to create
an empty BlockBackend when it is missing. All further
accesses will be safely handled, and the controller state
machines keep behaving correctly.

Cc: qemu-stable@nongnu.org
Fixes: CVE-2021-20196
Reported-by: Gaoning Pan (Ant Security Light-Year Lab) <pgn@zju.edu.cn>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20211124161536.631563-3-philmd@redhat.com
BugLink: https://bugs.launchpad.net/qemu/+bug/1912780
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/338
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:28 +01:00
Philippe Mathieu-Daudé
33604e6a3b hw/block/fdc: Extract blk_create_empty_drive()
Git-commit: b154791e7b
References: bsc#1181361, CVE-2021-20196

We are going to re-use this code in the next commit,
so extract it as a new blk_create_empty_drive() function.

Inspired-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20211124161536.631563-2-philmd@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:28 +01:00
Mauro Matteo Cascella
d24c9ab148 hw/scsi/scsi-disk: MODE_PAGE_ALLS not allowed in MODE SELECT commands
Git-commit: b3af7fdf9c
References: bsc#1192525, CVE-2021-3930

This avoids an off-by-one read of 'mode_sense_valid' buffer in
hw/scsi/scsi-disk.c:mode_sense_page().

Fixes: CVE-2021-3930
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: a8f4bbe290 ("scsi-disk: store valid mode pages in a table")
Fixes: #546
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:28 +01:00
Mauro Matteo Cascella
670358f687 scsi/lsi53c895a: really fix use-after-free in lsi_do_msgout (CVE-2022-0216)
Git-commit: 4367a20cc4
References: bsc#1198038, CVE-2022-0216

Set current_req to NULL, not current_req->req, to prevent reusing a free'd
buffer in case of repeated SCSI cancel requests.  Also apply the fix to
CLEAR QUEUE and BUS DEVICE RESET messages as well, since they also cancel
the request.

Thanks to Alexander Bulekov for providing a reproducer.

Fixes: CVE-2022-0216
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/972
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20220711123316.421279-1-mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
[DF: Kill the hunk that patches the test as it's not there]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:28 +01:00
Klaus Jensen
8e7f55c6b8 hw/nvme: fix CVE-2021-3929
Git-commit: 736b01642d
References: bsc#1193880, CVE-2021-3929

This fixes CVE-2021-3929 "locally" by denying DMA to the iomem of the
device itself. This still allows DMA to MMIO regions of other devices
(e.g. doing P2P DMA to the controller memory buffer of another NVMe
device).

Fixes: CVE-2021-3929
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>41
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Stefano Garzarella
3716c2d7d9 vhost-vsock: detach the virqueue element in case of error
Git-commit: 8d1b247f37
References: bsc#1198712, CVE-2022-26354

In vhost_vsock_common_send_transport_reset(), if an element popped from
the virtqueue is invalid, we should call virtqueue_detach_element() to
detach it from the virtqueue before freeing its memory.

Fixes: fc0b9b0e1c ("vhost-vsock: add virtio sockets device")
Fixes: CVE-2022-26354
Cc: qemu-stable@nongnu.org
Reported-by: VictorV <vv474172261@gmail.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220228095058.27899-1-sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Mauro Matteo Cascella
92f45c0d76 display/qxl-render: fix race condition in qxl_cursor (CVE-2021-4207)
Git-commit: 9569f5cb5b
References: bsc#1198037, CVE-2021-4207

Avoid fetching 'width' and 'height' a second time to prevent possible
race condition. Refer to security advisory
https://starlabs.sg/advisories/22-4207/ for more information.

Fixes: CVE-2021-4207
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220407081106.343235-1-mcascell@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Philippe Mathieu-Daudé
b36e5b2fbe hw/display/qxl: Assert memory slot fits in preallocated MemoryRegion
Git-commit: 86fdb0582c
References: bsc#1205808, CVE-2022-4144

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-6-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Philippe Mathieu-Daudé
0f022f64b4 hw/display/qxl: Avoid buffer overrun in qxl_phys2virt (CVE-2022-4144)
Git-commit: 6dbbf05514
References: bsc#1205808, CVE-2022-4144

Have qxl_get_check_slot_offset() return false if the requested
buffer size does not fit within the slot memory region.

Similarly qxl_phys2virt() now returns NULL in such case, and
qxl_dirty_one_surface() aborts.

This avoids buffer overrun in the host pointer returned by
memory_region_get_ram_ptr().

Fixes: CVE-2022-4144 (out-of-bounds read)
Reported-by: Wenxu Yin (@awxylitol)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1336
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-5-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Philippe Mathieu-Daudé
6602b423e4 hw/display/qxl: Pass requested buffer size to qxl_phys2virt()
Git-commit: 8efec0ef8b)
References: bsc#1205808, CVE-2022-4144

Currently qxl_phys2virt() doesn't check for buffer overrun.
In order to do so in the next commit, pass the buffer size
as argument.

For QXLCursor in qxl_render_cursor() -> qxl_cursor() we
verify the size of the chunked data ahead, checking we can
access 'sizeof(QXLCursor) + chunk->data_size' bytes.
Since in the SPICE_CURSOR_TYPE_MONO case the cursor is
assumed to fit in one chunk, no change are required.
In SPICE_CURSOR_TYPE_ALPHA the ahead read is handled in
qxl_unpack_chunks().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-4-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Philippe Mathieu-Daudé
4485ad113b hw/display/qxl: Document qxl_phys2virt()
Git-commit: b1901de83a
References: bac#1205808, CVE-2022-4144

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-3-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Philippe Mathieu-Daudé
62dc82929e hw/display/qxl: Have qxl_log_command Return early if no log_cmd handler
Git-commit: 61c34fc194
References: bac#1205808, CVE-2022-4144

Only 3 command types are logged: no need to call qxl_phys2virt()
for the other types. Using different cases will help to pass
different structure sizes to qxl_phys2virt() in a pair of commits.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-2-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Gerd Hoffmann
713a23057b qxl: use guest_monitor_config for local renderer.
Git-commit: 979f7ef896
References: bsc#1198035, CVE-2021-4206

When processing monitor config from guest store head0 width and height
for single-head configurations.  Use these when creating the
DisplaySurface in the local renderer.

This fixes a rendering issue with wayland.  Wayland rounds up the
framebuffer width and height to a multiple of 64, so with odd
resolutions (800x600 for example) the framebuffer is larger than the
actual screen.  The monitor config has the actual screen size though.

This fixes guest display for anything using the local renderer
(non-spice UI, screendump monitor command).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180919103057.9666-1-kraxel@redhat.com
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Philippe Mathieu-Daudé
c49a901446 hw/block/fdc: Prevent end-of-track overrun (CVE-2021-3507)
Git-commit: defac5e2fb
References: bsc#1185000, CVE-2021-3507

Per the 82078 datasheet, if the end-of-track (EOT byte in
the FIFO) is more than the number of sectors per side, the
command is terminated unsuccessfully:

* 5.2.5 DATA TRANSFER TERMINATION

  The 82078 supports terminal count explicitly through
  the TC pin and implicitly through the underrun/over-
  run and end-of-track (EOT) functions. For full sector
  transfers, the EOT parameter can define the last
  sector to be transferred in a single or multisector
  transfer. If the last sector to be transferred is a par-
  tial sector, the host can stop transferring the data in
  mid-sector, and the 82078 will continue to complete
  the sector as if a hardware TC was received. The
  only difference between these implicit functions and
  TC is that they return "abnormal termination" result
  status. Such status indications can be ignored if they
  were expected.

* 6.1.3 READ TRACK

  This command terminates when the EOT specified
  number of sectors have been read. If the 82078
  does not find an I D Address Mark on the diskette
  after the second· occurrence of a pulse on the
  INDX# pin, then it sets the IC code in Status Regis-
  ter 0 to "01" (Abnormal termination), sets the MA bit
  in Status Register 1 to "1", and terminates the com-
  mand.

* 6.1.6 VERIFY

  Refer to Table 6-6 and Table 6-7 for information
  concerning the values of MT and EC versus SC and
  EOT value.

* Table 6·6. Result Phase Table

* Table 6-7. Verify Command Result Phase Table

Fix by aborting the transfer when EOT > # Sectors Per Side.

Cc: qemu-stable@nongnu.org
Cc: Hervé Poussineau <hpoussin@reactos.org>
Fixes: baca51faff ("floppy driver: disk geometry auto detect")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/339
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211118115733.4038610-2-philmd@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Mauro Matteo Cascella
9ccbc11f9d ui/cursor: fix integer overflow in cursor_alloc (CVE-2021-4206)
Git-commit: fa892e9abb
References: bsc#1198035, CVE-2021-4206

Prevent potential integer overflow by limiting 'width' and 'height' to
512x512. Also change 'datasize' type to size_t. Refer to security
advisory https://starlabs.sg/advisories/22-4206/ for more information.

Fixes: CVE-2021-4206
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220407081712.345609-1-mcascell@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Signed-off-by: Lin Ma <lma@suse.com>
2023-03-08 09:59:27 +01:00
Bin Meng
681b504aad hw/sd: sdhci: Reset the data pointer of s->fifo_buffer[] when a different block size is programmed
Git-commit: cffb446e8f
References: bsc#1175144

If the block size is programmed to a different value from the
previous one, reset the data pointer of s->fifo_buffer[] so that
s->fifo_buffer[] can be filled in using the new block size in
the next transfer.

With this fix, the following reproducer:

outl 0xcf8 0x80001010
outl 0xcfc 0xe0000000
outl 0xcf8 0x80001001
outl 0xcfc 0x06000000
write 0xe000002c 0x1 0x05
write 0xe0000005 0x1 0x02
write 0xe0000007 0x1 0x01
write 0xe0000028 0x1 0x10
write 0x0 0x1 0x23
write 0x2 0x1 0x08
write 0xe000000c 0x1 0x01
write 0xe000000e 0x1 0x20
write 0xe000000f 0x1 0x00
write 0xe000000c 0x1 0x32
write 0xe0000004 0x2 0x0200
write 0xe0000028 0x1 0x00
write 0xe0000003 0x1 0x40

cannot be reproduced with the following QEMU command line:

$ qemu-system-x86_64 -nographic -machine accel=qtest -m 512M \
      -nodefaults -device sdhci-pci,sd-spec-version=3 \
      -drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
      -device sd-card,drive=mydrive -qtest stdio

Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-6-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Bin Meng
0efc35a9cf hw/sd: sdhci: Limit block size only when SDHC_BLKSIZE register is writable
Git-commit: 5cd7aa3451
References: bsc#1175144

The codes to limit the maximum block size is only necessary when
SDHC_BLKSIZE register is writable.

Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-5-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Bin Meng
6adc1ceeed hw/sd: sdhci: Correctly set the controller status for ADMA
Git-commit: bc6f28995f)
References: bsc#1175144

When an ADMA transfer is started, the codes forget to set the
controller status to indicate a transfer is in progress.

With this fix, the following 2 reproducers:

https://paste.debian.net/plain/1185136
https://paste.debian.net/plain/1185141

cannot be reproduced with the following QEMU command line:

$ qemu-system-x86_64 -nographic -machine accel=qtest -m 512M \
      -nodefaults -device sdhci-pci,sd-spec-version=3 \
      -drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
      -device sd-card,drive=mydrive -qtest stdio

Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-4-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Bin Meng
03feb45b58 hw/sd: sdhci: Don't write to SDHC_SYSAD register when transfer is in progress
Git-commit: 8be45cc947
References: bsc#1175144

Per "SD Host Controller Standard Specification Version 7.00"
chapter 2.2.1 SDMA System Address Register:

This register can be accessed only if no transaction is executing
(i.e., after a transaction has stopped).

With this fix, the following reproducer:

outl 0xcf8 0x80001010
outl 0xcfc 0xfbefff00
outl 0xcf8 0x80001001
outl 0xcfc 0x06000000
write 0xfbefff2c 0x1 0x05
write 0xfbefff0f 0x1 0x37
write 0xfbefff0a 0x1 0x01
write 0xfbefff0f 0x1 0x29
write 0xfbefff0f 0x1 0x02
write 0xfbefff0f 0x1 0x03
write 0xfbefff04 0x1 0x01
write 0xfbefff05 0x1 0x01
write 0xfbefff07 0x1 0x02
write 0xfbefff0c 0x1 0x33
write 0xfbefff0e 0x1 0x20
write 0xfbefff0f 0x1 0x00
write 0xfbefff2a 0x1 0x01
write 0xfbefff0c 0x1 0x00
write 0xfbefff03 0x1 0x00
write 0xfbefff05 0x1 0x00
write 0xfbefff2a 0x1 0x02
write 0xfbefff0c 0x1 0x32
write 0xfbefff01 0x1 0x01
write 0xfbefff02 0x1 0x01
write 0xfbefff03 0x1 0x01

cannot be reproduced with the following QEMU command line:

$ qemu-system-x86_64 -nographic -machine accel=qtest -m 512M \
       -nodefaults -device sdhci-pci,sd-spec-version=3 \
       -drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
       -device sd-card,drive=mydrive -qtest stdio

Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-3-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Bin Meng
df85fb2e8b hw/sd: sdhci: Don't transfer any data when command time out
Git-commit: b263d8f928
References: bsc#1175144

At the end of sdhci_send_command(), it starts a data transfer if the
command register indicates data is associated. But the data transfer
should only be initiated when the command execution has succeeded.

With this fix, the following reproducer:

outl 0xcf8 0x80001810
outl 0xcfc 0xe1068000
outl 0xcf8 0x80001804
outw 0xcfc 0x7
write 0xe106802c 0x1 0x0f
write 0xe1068004 0xc 0x2801d10101fffffbff28a384
write 0xe106800c 0x1f 0x9dacbbcad9e8f7061524334251606f7e8d9cabbac9d8e7f60514233241505f
write 0xe1068003 0x28 0x80d000251480d000252280d000253080d000253e80d000254c80d000255a80d000256880d0002576
write 0xe1068003 0x1 0xfe

cannot be reproduced with the following QEMU command line:

$ qemu-system-x86_64 -nographic -M pc-q35-5.0 \
      -device sdhci-pci,sd-spec-version=3 \
      -drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
      -device sd-card,drive=mydrive \
      -monitor none -serial none -qtest stdio

Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-2-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Philippe Mathieu-Daudé
8faab1254a hw/sd/sdhci: Stop multiple transfers when block count is cleared
Git-commit: 6a9e5cc61c
References: bsc#

Clearing BlockCount stops multiple transfers.

See "SD Host Controller Simplified Specification Version 2.00":

- 2.2.3. Block Count Register (Offset 006h)
- Table 2-8 : Determination of Transfer Type

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200903172806.489710-2-f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:27 +01:00
Philippe Mathieu-Daudé
4444fb5020 hw/sd/sdhci: Fix DMA Transfer Block Size field
Git-commit: dfba99f17f
Reference: bsc#1176681, CVE-2020-25085

The 'Transfer Block Size' field is 12-bit wide.

See section '2.2.2. Block Size Register (Offset 004h)' in datasheet.

Two different bug reproducer available:
- https://bugs.launchpad.net/qemu/+bug/1892960
- https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Fsdhci_oob_write1

Cc: qemu-stable@nongnu.org
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200901140411.112150-3-f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-03-08 09:59:17 +01:00
Jessica Clarke
48c8cfdf55 Partially revert "build: -no-pie is no functional linker flag"
Git-commit: ffd205ef29
References: bsc#1192463

This partially reverts commit bbd2d5a812.

This commit was misguided and broke using --disable-pie on any distro
that enables PIE by default in their compiler driver, including Debian
and its derivatives. Whilst -no-pie is not a linker flag, it is a
compiler driver flag that ensures -pie is not automatically passed by it
to the linker. Without it, all compile_prog checks will fail as any code
built with the explicit -fno-pie will fail to link with the implicit
default -pie due to trying to use position-dependent relocations. The
only bug that needed fixing was LDFLAGS_NOPIE being used as a flag for
the linker itself in pc-bios/optionrom/Makefile.

Note this does not reinstate exporting LDFLAGS_NOPIE, as it is unused,
since the only previous use was the one that should not have existed. I
have also updated the comment for the -fno-pie and -no-pie checks to
reflect what they're actually needed for.

Fixes: bbd2d5a812
Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Message-Id: <20210805192545.38279-1-jrtc27@jrtc27.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
2021-12-19 00:46:38 -05:00
Christian Ehrhardt
fd85006157 build: -no-pie is no functional linker flag
Git-commit: bbd2d5a812
References: bsc#1192463

Recent binutils changes dropping unsupported options [1] caused a build
issue in regard to the optionroms.

  ld -m elf_i386 -T /<<PKGBUILDDIR>>/pc-bios/optionrom//flat.lds -no-pie \
    -s -o multiboot.img multiboot.o
  ld.bfd: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)

This isn't really a regression in ld.bfd, filing the bug upstream
revealed that this never worked as a ld flag [2] - in fact it seems we
were by accident setting --nmagic).

Since it never had the wanted effect this usage of LDFLAGS_NOPIE, should be
droppable without any effect. This also is the only use-case of LDFLAGS_NOPIE
in .mak, therefore we can also remove it from being added there.

[1]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=983d925d
[2]: https://sourceware.org/bugzilla/show_bug.cgi?id=27050#c5

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Message-Id: <20201214150938.1297512-1-christian.ehrhardt@canonical.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
2021-12-19 00:46:27 -05:00
Jason Wang
6e65a35ab2 virtio-net: fix use after unmap/free for sg
Git-commit: bedd7e93d0
References: bsc#1189938 CVE-2021-3748

When mergeable buffer is enabled, we try to set the num_buffers after
the virtqueue elem has been unmapped. This will lead several issues,
E.g a use after free when the descriptor has an address which belongs
to the non direct access region. In this case we use bounce buffer
that is allocated during address_space_map() and freed during
address_space_unmap().

Fixing this by storing the elems temporarily in an array and delay the
unmap after we set the the num_buffers.

This addresses CVE-2021-3748.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: fbe78f4f55 ("virtio-net support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-09-30 07:59:28 -06:00
Gerd Hoffmann
75bbd01a2a uas: add stream number sanity checks.
Git-commit: 13b250b12a
References: bsc#1189702 CVE-2021-3713

The device uses the guest-supplied stream number unchecked, which can
lead to guest-triggered out-of-band access to the UASDevice->data3 and
UASDevice->status3 fields.  Add the missing checks.

Fixes: CVE-2021-3713
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reported-by: Chen Zhe <chenzhe@huawei.com>
Reported-by: Tan Jingguo <tanjingguo@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210818120505.1258262-2-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-09-30 07:59:27 -06:00
Gerd Hoffmann
7045286d96 usbredir: fix free call
Git-commit: 5e796671e6
References: bsc#1189145 CVE-2021-3682

data might point into the middle of a larger buffer, there is a separate
free_on_destroy pointer passed into bufp_alloc() to handle that.  It is
only used in the normal workflow though, not when dropping packets due
to the queue being full.  Fix that.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/491
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210722072756.647673-1-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-06 09:36:10 -06:00
Mark Cave-Ayland
56628efd25 esp: ensure cmdfifo is not empty and current_dev is non-NULL
Git-commit: 9954575173
References: bsc#1180433, CVE-2020-35504
            bsc#1180434, CVE-2020-35505
            bsc#1180435, CVE-2020-35506

When about to execute a SCSI command, ensure that cmdfifo is not empty and
current_dev is non-NULL. This can happen if the guest tries to execute a TI
(Transfer Information) command without issuing one of the select commands
first.

Buglink: https://bugs.launchpad.net/qemu/+bug/1910723
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 12:38:52 -06:00
Mark Cave-Ayland
1e67d33f27 esp: always check current_req is not NULL before use in DMA callbacks
Git-commit: 0db895361b
References: bsc#1180433, CVE-2020-35504
            bsc#1180434, CVE-2020-35505
            bsc#1180435, CVE-2020-35506

After issuing a SCSI command the SCSI layer can call the SCSIBusInfo .cancel
callback which resets both current_req and current_dev to NULL. If any data
is left in the transfer buffer (async_len != 0) then the next TI (Transfer
Information) command will attempt to reference the NULL pointer causing a
segfault.

Buglink: https://bugs.launchpad.net/qemu/+bug/1910723
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 12:38:52 -06:00
Gerd Hoffmann
e230164c8a usb: limit combined packets to 1 MiB (CVE-2021-3527)
Git-commit: 05a40b172e
References: bsc#1186012, CVE-2021-3527

usb-host and usb-redirect try to batch bulk transfers by combining many
small usb packets into a single, large transfer request, to reduce the
overhead and improve performance.

This patch adds a size limit of 1 MiB for those combined packets to
restrict the host resources the guest can bind that way.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210503132915.2335822-6-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 12:38:52 -06:00
Gerd Hoffmann
2aed7ecb23 usb/mtp: avoid dynamic stack allocation
Git-commit: 06aa50c06c
References: bsc#1186012, CVE-2021-3527

Use autofree heap allocation instead.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210503132915.2335822-4-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 12:38:52 -06:00
Gerd Hoffmann
8b0de5d282 usb/redir: avoid dynamic stack allocation (CVE-2021-3527)
Git-commit: 7ec54f9eb6
References: bsc#1186012, CVE-2021-3527

Use autofree heap allocation instead.

Fixes: 4f4321c11f ("usb: use iovecs in USBPacket")
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210503132915.2335822-3-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 12:38:52 -06:00
Gerd Hoffmann
d2e365e5a7 usb/hid: avoid dynamic stack allocation
Git-commit: 3f67e2e7f1
References: bsc#1186012, CVE-2021-3527

Use autofree heap allocation instead.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210503132915.2335822-2-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 12:38:52 -06:00
Philippe Mathieu-Daudé
35e06371ed hw/usb/host-stub: Remove unused header
Git-commit: 1081607bfa
References: bsc#1186012, CVE-2021-3527

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210424224110.3442424-2-f4bug@amsat.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 12:38:52 -06:00
Jose R Ziviani
5f975d28d8 net: eepro100: validate various address values
Git-commit: 000000000000000000000000000000000000000000000
References: bsc#1182651, CVE-2021-20255

Patch based on discussion:
https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg06098.html

While processing controller commands, eepro100 emulator gets
command unit(CU) base address OR receive unit (RU) base address
OR command block (CB) address from guest. If these values are not
checked, it may lead to an infinite loop kind of issues. Add checks
to avoid it.

Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-By: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-30 08:04:54 -06:00
Mauro Matteo Cascella
523c8d9df4 hw/scsi/megasas: check for NULL frame in megasas_command_cancelled()
Git-commit: 00000000000000000000000000000000000000000000
References: bsc#1180432, CVE-2020-35503

Ensure that 'cmd->frame' is not NULL before accessing the 'header' field.
This check prevents a potential NULL pointer dereference issue.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1910346
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Acked-By: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-30 08:04:54 -06:00
Jose R Ziviani
ddab4997b5 libslirp: tftp: check tftp_input buffer size
Git-commit: 3f17948137155f025f7809fdc38576d5d2451c3d
References: bsc#1187366, CVE-2021-3595

Fixes: CVE-2021-3595
Fixes: https://gitlab.freedesktop.org/slirp/libslirp/-/issues/46

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-08 08:20:55 -06:00
Jose R Ziviani
d0824d8b73 libslirp: upd6: check udp6_input buffer size
Git-commit: de71c15de66ba9350bf62c45b05f8fbff166517b
References: bsc#1187365, CVE-2021-3593

Fixes: CVE-2021-3593
Fixes: https://gitlab.freedesktop.org/slirp/libslirp/-/issues/45

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 13:41:20 -06:00
Jose R Ziviani
fc8535d39a libslirp: dhcp: Always send DHCP_OPT_LEN bytes in options
Git-commit: d7fb54218424c3b2517aee5b391ced0f75386a5d
References: bsc#1187364, CVE-2021-3592

RFC2131 suggests that the options field may be at least 312 bytes.
Some DHCP clients seem to assume that it has to be at least 312 bytes.

Fixes #51
Fixes: f13cad45b25d92760bb0ad67bec0300a4d7d5275 ("bootp: limit
vendor-specific area to input packet memory buffer")

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 13:41:10 -06:00
Jose R Ziviani
15504e92d8 libslirp: udp: check upd_input buffer size
Git-commit: 74572be49247c8c5feae7c6e0b50c4f569ca9824
References: bsc#1187367, CVE-2021-3594

Fixes: CVE-2021-3594
Fixes: https://gitlab.freedesktop.org/slirp/libslirp/-/issues/47

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:36:25 -06:00
Jose R Ziviani
9703efb0b0 libslirp: bootp: check bootp_input buffer size
Git-commit: 2eca0838eee1da96204545e22cdaed860d9d7c6c
References: bsc#1187364, CVE-2021-3592

Fixes: CVE-2021-3592
Fixes: https://gitlab.freedesktop.org/slirp/libslirp/-/issues/44

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:36:24 -06:00
Jose R Ziviani
21bb584214 libslirp: bootp: limit vendor-specific area to input packet memory buffer
Git-commit: f13cad45b25d92760bb0ad67bec0300a4d7d5275
References: bsc#1187364, CVE-2021-3592

sizeof(bootp_t) currently holds DHCP_OPT_LEN. Remove this optional field
from the structure, to help with the following patch checking for
minimal header size. Modify the bootp_reply() function to take the
buffer boundaries and avoiding potential buffer overflow.

Related to CVE-2021-3592.

https://gitlab.freedesktop.org/slirp/libslirp/-/issues/44

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:36:23 -06:00
Jose R Ziviani
4abc1dad65 libslirp: slirp: Add sanity check for str option length
Git commit: 6fd057f7960bc7b3a69f3c53de5c9d0d5d34a79c
References: bsc#1187364, CVE-2021-3592

When user provides a long domainname or hostname that doesn't fit in the
DHCP packet, we mustn't overflow the response packet buffer. Instead,
report errors, following the g_warning() in the slirp->vdnssearch
branch.

Also check the strlen against 256 when initializing slirp, which limit
is also from the protocol where one byte represents the string length.
This gives an early error before the warning which is harder to notice
or diagnose.

Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:36:21 -06:00
Jose R Ziviani
1631e50a75 libslirp: Add mtod_check()
Git-commit: 93e645e72a056ec0b2c16e0299fc5c6b94e4ca17
References: bsc#1187364, CVE-2021-3592
            bsc#1187367, CVE-2021-3594

Recent security issues demonstrate the lack of safety care when casting
a mbuf to a particular structure type. At least, it should check that
the buffer is large enough. The following patches will make use of this
function.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:36:20 -06:00
Ani Sinha
407cff2a1b qom: code hardening - have bound checking while looping with integer value
Git-commit: 1bf8b88f14
References: bsc#1187529 CVE-2021-3611

Object property insertion code iterates over an integer to get an unused
index that can be used as an unique name for an object property. This loop
increments the integer value indefinitely. Although very unlikely, this can
still cause an integer overflow.
In this change, we fix the above code by checking against INT16_MAX and making
sure that the interger index does not overflow beyond that value. If no
available index is found, the code would cause an assertion failure. This
assertion failure is necessary because the callers of the function do not check
the return value for NULL.

Signed-off-by: Ani Sinha <ani@anisinha.ca>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200921093325.25617-1-ani@anisinha.ca>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
2021-07-05 18:35:45 +08:00
Philippe Mathieu-Daudé
d0d021a34f hw/char/bcm2835_aux: Allow less than 32-bit accesses
Git-commit: 3059344f01
References: bsc#1172382 CVE-2020-13754

The "BCM2835 ARM Peripherals" datasheet [*] chapter 2
("Auxiliaries: UART1 & SPI1, SPI2"), list the register
sizes as 3/8/16/32 bits. We assume this means this
peripheral allows 8-bit accesses.

This was not an issue until commit 5d971f9e67 which reverted
("memory: accept mismatching sizes in memory_region_access_valid").

The model is implemented as 32-bit accesses (see commit 97398d900c,
all registers are 32-bit) so replace MemoryRegionOps.valid as
MemoryRegionOps.impl, and re-introduce MemoryRegionOps.valid
with a 8/32-bit range.

[*] https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf

Fixes: 97398d900c ("bcm2835_aux: add emulation of BCM2835 AUX (aka UART1) block")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201002181032.1899463-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-26 15:53:58 -06:00
Michael Tokarev
115c2cc2c9 acpi: accept byte and word access to core ACPI registers
Git-commit: dba04c3488
References: bsc#1172382 CVE-2020-13754

All ISA registers should be accessible as bytes, words or dwords
(if wide enough).  Fix the access constraints for acpi-pm-evt,
acpi-pm-tmr & acpi-cnt registers.

Fixes: 5d971f9e67 (memory: Revert "memory: accept mismatching sizes in memory_region_access_valid")
Fixes: afafe4bbe0 (apci: switch cnt to memory api)
Fixes: 77d58b1e47 (apci: switch timer to memory api)
Fixes: b5a7c024d2 (apci: switch evt to memory api)
Buglink: https://lore.kernel.org/xen-devel/20200630170913.123646-1-anthony.perard@citrix.com/T/
Buglink: https://bugs.debian.org/964793
BugLink: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=964247
BugLink: https://bugs.launchpad.net/bugs/1886318
Reported-By: Simon John <git@the-jedi.co.uk>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20200720160627.15491-1-mjt@msgid.tls.msk.ru>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-11 13:18:29 -06:00
Laurent Vivier
57adab99f3 xhci: fix valid.max_access_size to access address registers
Git-commit: 8e67fda2dd
References: bsc#1172382 CVE-2020-13754

QEMU XHCI advertises AC64 (64-bit addressing) but doesn't allow
64-bit mode access in "runtime" and "operational" MemoryRegionOps.

Set the max_access_size based on sizeof(dma_addr_t) as AC64 is set.

XHCI specs:
"If the xHC supports 64-bit addressing (AC64 = ‘1’), then software
should write 64-bit registers using only Qword accesses.  If a
system is incapable of issuing Qword accesses, then writes to the
64-bit address fields shall be performed using 2 Dword accesses;
low Dword-first, high-Dword second.  If the xHC supports 32-bit
addressing (AC64 = ‘0’), then the high Dword of registers containing
64-bit address fields are unused and software should write addresses
using only Dword accesses"

The problem has been detected with SLOF, as linux kernel always accesses
registers using 32-bit access even if AC64 is set and revealed by
5d971f9e67 ("memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"")

Suggested-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 20200721083322.90651-1-lvivier@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-11 13:18:26 -06:00
Andrew Melnychenko
cbb156e2f0 Fixed integer overflow in e1000e
Git-commit: f22a57ac09
References: bsc#1172382 CVE-2020-13754

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1737400
Fixed setting max_queue_num if there are no peers in
NICConf. qemu_new_nic() creates NICState with 1 NetClientState(index
0) without peers, set max_queue_num to 0 - It prevents undefined
behavior and possible crashes, especially during pcie hotplug.

Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-11 13:18:23 -06:00
Jose R Ziviani
ddd3735454 memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"
Git-commit: 5d971f9e67
References: bsc#1172382 CVE-2020-13754

Memory API documentation documents valid .min_access_size and .max_access_size
fields and explains that any access outside these boundaries is blocked.

This is what devices seem to assume.

However this is not what the implementation does: it simply
ignores the boundaries unless there's an "accepts" callback.

Naturally, this breaks a bunch of devices.

Revert to the documented behaviour.

Devices that want to allow any access can just drop the valid field,
or add the impl field to have accesses converted to appropriate
length.

Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Fixes: CVE-2020-13754
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1842363
Fixes: a014ed07bd ("memory: accept mismatching sizes in memory_region_access_valid")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200610134731.1514409-1-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jose R. Ziviani <jose.ziviani@suse.com>
2021-05-11 13:18:20 -06:00
Jose R Ziviani
dbab676a49 libqos: usb-hcd-ehci: use 32-bit write for config register
Git-commit: 89ed83d8b2
References: bsc#1172382 CVE-2020-13754

The memory region ops have min_access_size == 4 so obey it.

Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-11 13:18:18 -06:00
Jose R Ziviani
5e131a074c libqos: pci-pc: use 32-bit write for EJ register
Git-commit: 4b7c06837a
References: bsc#1172382 CVE-2020-13754

The memory region ops have min_access_size == 4 so obey it.

Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-11 13:18:15 -06:00
Ralf Haferkamp
f5f9d9355d Drop bogus IPv6 messages
Git-commit: c7ede54cbd2e2b25385325600958ba0124e31cc0
References: bsc#1172380 CVE-2020-10756

Drop IPv6 message shorter than what's mentioned in the payload
length header (+ the size of the IPv6 header). They're invalid an could
lead to data leakage in icmp6_send_echoreply().

Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-07 15:03:36 -06:00
Philippe Mathieu-Daudé
b52ccc94f9 hw/intc/arm_gic: Fix interrupt ID in GICD_SGIR register
Git-commit: edfe2eb436
References: bsc#1181933 CVE-2021-20221

Per the ARM Generic Interrupt Controller Architecture specification
(document "ARM IHI 0048B.b (ID072613)"), the SGIINTID field is 4 bit,
not 10:

  - 4.3 Distributor register descriptions
  - 4.3.15 Software Generated Interrupt Register, GICD_SG

    - Table 4-21 GICD_SGIR bit assignments

    The Interrupt ID of the SGI to forward to the specified CPU
    interfaces. The value of this field is the Interrupt ID, in
    the range 0-15, for example a value of 0b0011 specifies
    Interrupt ID 3.

Correct the irq mask to fix an undefined behavior (which eventually
lead to a heap-buffer-overflow, see [Buglink]):

   $ echo 'writel 0x8000f00 0xff4affb0' | qemu-system-aarch64 -M virt,accel=qtest -qtest stdio
   [I 1612088147.116987] OPENED
  [R +0.278293] writel 0x8000f00 0xff4affb0
  ../hw/intc/arm_gic.c:1498:13: runtime error: index 944 out of bounds for type 'uint8_t [16][8]'
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../hw/intc/arm_gic.c:1498:13

This fixes a security issue when running with KVM on Arm with
kernel-irqchip=off. (The default is kernel-irqchip=on, which is
unaffected, and which is also the correct choice for performance.)

Cc: qemu-stable@nongnu.org
Fixes: CVE-2021-20221
Fixes: 9ee6e8bb85 ("ARMv7 support.")
Buglink: https://bugs.launchpad.net/qemu/+bug/1913916
Buglink: https://bugs.launchpad.net/qemu/+bug/1913917
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210131103401.217160-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-05-07 15:03:28 -06:00
Gerd Hoffmann
57641b14d1 usb: fix setup_len init (CVE-2020-14364)
Git-commit: b946434f26
References: bsc#1175441, CVE-2020-14364

Store calculated setup_len in a local variable, verify it, and only
write it to the struct (USBDevice->setup_len) in case it passed the
sanity checks.

This prevents other code (do_token_{in,out} functions specifically)
from working with invalid USBDevice->setup_len values and overrunning
the USBDevice->setup_buf[] buffer.

Fixes: CVE-2020-14364
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200825053636.29648-1-kraxel@redhat.com
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:55:06 -06:00
Mauro Matteo Cascella
42d70f9850 hw/net/net_tx_pkt: fix assertion failure in net_tx_pkt_add_raw_fragment()
Git-commit: 035e69b063
References: bsc#1174641, CVE-2020-16092

An assertion failure issue was found in the code that processes network packets
while adding data fragments into the packet context. It could be abused by a
malicious guest to abort the QEMU process on the host. This patch replaces the
affected assert() with a conditional statement, returning false if the current
data fragment exceeds max_raw_frags.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:55:06 -06:00
Mauro Matteo Cascella
fa9605ad43 hw/net/xgmac: Fix buffer overflow in xgmac_enet_send()
Git-commit: 5519724a13
References: bsc#1174386 CVE-2020-15863

A buffer overflow issue was reported by Mr. Ziming Zhang, CC'd here. It
occurs while sending an Ethernet frame due to missing break statements
and improper checking of the buffer size.

Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:55:06 -06:00
Prasad J Pandit
f1e2517d3c net: vmxnet3: validate configuration values during activate (CVE-2021-20203)
Git-commit: 0000000000000000000000000000000000000000
References: bsc#1181639

While activating device in vmxnet3_acticate_device(), it does not
validate guest supplied configuration values against predefined
minimum - maximum limits. This may lead to integer overflow or
OOB access issues. Add checks to avoid it.

Fixes: CVE-2021-20203
Buglink: https://bugs.launchpad.net/qemu/+bug/1913873
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:55:06 -06:00
Chen Qun
38ecc93f0a block/iscsi:fix heap-buffer-overflow in iscsi_aio_ioctl_cb
Git-commit: ff0507c239
References: bsc#1180523, CVE-2020-11947

There is an overflow, the source 'datain.data[2]' is 100 bytes,
 but the 'ss' is 252 bytes.This may cause a security issue because
 we can access a lot of unrelated memory data.

The len for sbp copy data should take the minimum of mx_sb_len and
 sb_len_wr, not the maximum.

If we use iscsi device for VM backend storage, ASAN show stack:

READ of size 252 at 0xfffd149dcfc4 thread T0
    #0 0xaaad433d0d34 in __asan_memcpy (aarch64-softmmu/qemu-system-aarch64+0x2cb0d34)
    #1 0xaaad45f9d6d0 in iscsi_aio_ioctl_cb /qemu/block/iscsi.c:996:9
    #2 0xfffd1af0e2dc  (/usr/lib64/iscsi/libiscsi.so.8+0xe2dc)
    #3 0xfffd1af0d174  (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
    #4 0xfffd1af19fac  (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
    #5 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
    #6 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
    #7 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
    #8 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
    #9 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
    #10 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
    #11 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
    #12 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
    #13 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
    #14 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
    #15 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
    #16 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
    #17 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)

0xfffd149dcfc4 is located 0 bytes to the right of 100-byte region [0xfffd149dcf60,0xfffd149dcfc4)
allocated by thread T0 here:
    #0 0xaaad433d1e70 in __interceptor_malloc (aarch64-softmmu/qemu-system-aarch64+0x2cb1e70)
    #1 0xfffd1af0e254  (/usr/lib64/iscsi/libiscsi.so.8+0xe254)
    #2 0xfffd1af0d174  (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
    #3 0xfffd1af19fac  (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
    #4 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
    #5 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
    #6 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
    #7 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
    #8 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
    #9 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
    #10 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
    #11 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
    #12 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
    #13 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
    #14 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
    #15 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
    #16 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200418062602.10776-1-kuhn.chenqun@huawei.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:55:06 -06:00
Prasad J Pandit
ea41fa8656 exec: set map length to zero when returning NULL
Git-commit: 77f55eac6c
References: bsc#1172386, CVE-2020-13659

When mapping physical memory into host's virtual address space,
'address_space_map' may return NULL if BounceBuffer is in_use.
Set and return '*plen = 0' to avoid later NULL pointer dereference.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: https://bugs.launchpad.net/qemu/+bug/1878259
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200526111743.428367-1-ppandit@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:55:06 -06:00
Prasad J Pandit
a9e0c4dfad slirp: check pkt_len before reading protocol header
Git-commit: 2e1dcbc0c2af64fcb17009eaf2ceedd81be2b27f
References: bsc#1179467

While processing ARP/NCSI packets in 'arp_input' or 'ncsi_input'
routines, ensure that pkt_len is large enough to accommodate the
respective protocol headers, lest it should do an OOB access.
Add check to avoid it.

CVE-2020-29129 CVE-2020-29130
  QEMU: slirp: out-of-bounds access while processing ARP/NCSI packets
 -> https://www.openwall.com/lists/oss-security/2020/11/27/1

Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20201126135706.273950-1-ppandit@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
[BR: no NCSI code, so only ARP patched]
2021-04-06 16:54:54 -06:00
Alexander Bulekov
f1c4f22d7b lan9118: switch to use qemu_receive_packet() for loopback
Git-commit: 37cee01784

This patch switches to use qemu_receive_packet() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Alexander Bulekov
c79f80370e cadence_gem: switch to use qemu_receive_packet() for loopback
Git-commit: e73adfbeec

This patch switches to use qemu_receive_packet() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Alexander Bulekov
38676a402b pcnet: switch to use qemu_receive_packet() for loopback
Git-commit: 99ccfaa1ed

This patch switches to use qemu_receive_packet() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Buglink: https://bugs.launchpad.net/qemu/+bug/1917085
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Jason Wang
0ac2182285 tx_pkt: switch to use qemu_receive_packet_iov() for loopback
Git-commit: 8c552542b8

This patch switches to use qemu_receive_receive_iov() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Jason Wang
c9b45adb86 dp8393x: switch to use qemu_receive_packet() for loopback packet
Git-commit: 331d2ac9ea

This patch switches to use qemu_receive_packet() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Jason Wang
e016294ce3 e1000: switch to use qemu_receive_packet() for loopback
Git-commit: 1caff0340f

This patch switches to use qemu_receive_packet() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Alexander Bulekov
f075f63c7e rtl8139: switch to use qemu_receive_packet() for loopback
Git-commit: 5311fb805a
References: bsc#1182968, CVE-2021-3416

This patch switches to use qemu_receive_packet() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Buglink: https://bugs.launchpad.net/qemu/+bug/1910826
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Jason Wang
fd85166fe9 net: introduce qemu_receive_packet()
Git-commit: 705df5466c
References: bsc#1182968, CVE-2021-3416
Some NIC supports loopback mode and this is done by calling
nc->info->receive() directly which in fact suppresses the effort of
reentrancy check that is done in qemu_net_queue_send().

Unfortunately we can't use qemu_net_queue_send() here since for
loopback there's no sender as peer, so this patch introduce a
qemu_receive_packet() which is used for implementing loopback mode
for a NIC with this check.

NIC that supports loopback mode will be converted to this helper.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Jason Wang
e880f7e5d4 e1000: fail early for evil descriptor
Git-commit: 3de46e6fc4
References: bsc#1182577, CVE-2021-20257

During procss_tx_desc(), driver can try to chain data descriptor with
legacy descriptor, when will lead underflow for the following
calculation in process_tx_desc() for bytes:

            if (tp->size + bytes > msh)
                bytes = msh - tp->size;

This will lead a infinite loop. So check and fail early if tp->size if
greater or equal to msh.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Prasad J Pandit
2266a34ef4 spapr_pci: add spapr msi read method
Git-commit: 921604e175
References: bsc#1173612, CVE-2020-15469

Add spapr msi mmio read method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-7-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Prasad J Pandit
891cde0691 prep: add ppc-parity write method
Git-commit: f867cebaed
References: bsc#1173612, CVE-2020-15469

Add ppc-parity mmio write method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-Id: <20200811114133.672647-5-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Prasad J Pandit
839bb7ede8 vfio: add quirk device write method
Git-commit: 24202d2b56
References: bsc#1173612, CVE-2020-15469

Add vfio quirk device mmio write method to avoid NULL pointer
dereference issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-4-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Prasad J Pandit
5e19462069 hw/pci-host: add pci-intack write method
Git-commit: 520f26fc6d
References: bsc#1173612, CVE-2020-15469

Add pci-intack mmio write method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Paolo Bonzini
b48c0e8c41 ide: atapi: assert that the buffer pointer is in range
Git-commit: 8132122889
References: bsc#1181108, CVE-2020-29443

A case was reported where s->io_buffer_index can be out of range.
The report skimped on the details but it seems to be triggered
by s->lba == -1 on the READ/READ CD paths (e.g. by sending an
ATAPI command with LBA = 0xFFFFFFFF).  For now paper over it
with assertions.  The first one ensures that there is no overflow
when incrementing s->io_buffer_index, the second checks for the
buffer overrun.

Note that the buffer overrun is only a read, so I am not sure
if the assertion failure is actually less harmful than the overrun.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20201201120926.56559-1-pbonzini@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Prasad J Pandit
c8b804bed2 hw/net/e1000e: advance desc_offset in case of null descriptor
Git-commit: c2cb511634
References: bsc#1179468, CVE-2020-28916

While receiving packets via e1000e_write_packet_to_guest() routine,
'desc_offset' is advanced only when RX descriptor is processed. And
RX descriptor is not processed if it has NULL buffer address.
This may lead to an infinite loop condition. Increament 'desc_offset'
to process next descriptor in the ring to avoid infinite loop.

Reported-by: Cheol-woo Myung <330cjfdn@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Prasad J Pandit
746152bd8e net: remove an assert call in eth_get_gso_type
Git-commit: 7564bf7701
References: bsc#1178174, CVE-2020-27617

eth_get_gso_type() routine returns segmentation offload type based on
L3 protocol type. It calls g_assert_not_reached if L3 protocol is
unknown, making the following return statement unreachable. Remove the
g_assert call, it maybe triggered by a guest user.

Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Prasad J Pandit
4d2180574c hw: usb: hcd-ohci: check for processed TD before retire
Git-commit: 1be90ebecc
References: bsc#1176684, CVE-2020-25625

While servicing OHCI transfer descriptors(TD), ohci_service_iso_td
retires a TD if it has passed its time frame. It does not check if
the TD was already processed once and holds an error code in TD_CC.
It may happen if the TD list has a loop. Add check to avoid an
infinite loop condition.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200915182259.68522-3-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Prasad J Pandit
aec8c7da86 hw: usb: hcd-ohci: check len and frame_number variables
Git-commit: 1328fe0c32
References: bsc#1176682, CVE-2020-25624

While servicing the OHCI transfer descriptors(TD), OHCI host
controller derives variables 'start_addr', 'end_addr', 'len'
etc. from values supplied by the host controller driver.
Host controller driver may supply values such that using
above variables leads to out-of-bounds access issues.
Add checks to avoid them.

AddressSanitizer: stack-buffer-overflow on address 0x7ffd53af76a0
  READ of size 2 at 0x7ffd53af76a0 thread T0
  #0 ohci_service_iso_td ../hw/usb/hcd-ohci.c:734
  #1 ohci_service_ed_list ../hw/usb/hcd-ohci.c:1180
  #2 ohci_process_lists ../hw/usb/hcd-ohci.c:1214
  #3 ohci_frame_boundary ../hw/usb/hcd-ohci.c:1257
  #4 timerlist_run_timers ../util/qemu-timer.c:572
  #5 qemu_clock_run_timers ../util/qemu-timer.c:586
  #6 qemu_clock_run_all_timers ../util/qemu-timer.c:672
  #7 main_loop_wait ../util/main-loop.c:527
  #8 qemu_main_loop ../softmmu/vl.c:1676
  #9 main ../softmmu/main.c:50

Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Reported-by: Yongkang Jia <j_kangel@163.com>
Reported-by: Yi Ren <yunye.ry@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200915182259.68522-2-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Li Qiang
9ea47050ed hw: ehci: check return value of 'usb_packet_map'
Git-commit: 2fdb42d840
References: bsc#1178934, CVE-2020-25723

If 'usb_packet_map' fails, we should stop to process the usb
request.

Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20200812161727.29412-1-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Li Qiang
d82ce8d705 hw: xhci: check return value of 'usb_packet_map'
Git-commit: 21bc31524e
References: bsc#1176673, CVE-2020-25084

Currently we don't check the return value of 'usb_packet_map',
this will cause an UAF issue. This is LP#1891341.
Following is the reproducer provided in:
-->https://bugs.launchpad.net/qemu/+bug/1891341

cat << EOF | ./i386-softmmu/qemu-system-i386 -device nec-usb-xhci \
-trace usb\* -device usb-audio -device usb-storage,drive=mydrive \
-drive id=mydrive,file=null-co://,size=2M,format=raw,if=none \
-nodefaults -nographic -qtest stdio
outl 0xcf8 0x80001016
outl 0xcfc 0x3c009f0d
outl 0xcf8 0x80001004
outl 0xcfc 0xc77695e
writel 0x9f0d000000000040 0xffff3655
writeq 0x9f0d000000002000 0xff2f9e0000000000
write 0x1d 0x1 0x27
write 0x2d 0x1 0x2e
write 0x17232 0x1 0x03
write 0x17254 0x1 0x06
write 0x17278 0x1 0x34
write 0x3d 0x1 0x27
write 0x40 0x1 0x2e
write 0x41 0x1 0x72
write 0x42 0x1 0x01
write 0x4d 0x1 0x2e
write 0x4f 0x1 0x01
writeq 0x9f0d000000002000 0x5c051a0100000000
write 0x34001d 0x1 0x13
write 0x340026 0x1 0x30
write 0x340028 0x1 0x08
write 0x34002c 0x1 0xfe
write 0x34002d 0x1 0x08
write 0x340037 0x1 0x5e
write 0x34003a 0x1 0x05
write 0x34003d 0x1 0x05
write 0x34004d 0x1 0x13
writeq 0x9f0d000000002000 0xff00010100400009
EOF

This patch fixes this.

Buglink: https://bugs.launchpad.net/qemu/+bug/1891341
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-id: 20200812153139.15146-1-liq3ea@163.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:26:46 -06:00
Prasad J Pandit
5232fb8315 megasas: use unsigned type for reply_queue_head and check index
Git-commit: f50ab86a26
References: bsc#1172383, CVE-2020-13362

A guest user may set 'reply_queue_head' field of MegasasState to
a negative value. Later in 'megasas_lookup_frame' it is used to
index into s->frames[] array. Use unsigned type to avoid OOB
access issue.

Also check that 'index' value stays within s->frames[] bounds
through the while() loop in 'megasas_lookup_frame' to avoid OOB
access.

Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200513192540.1583887-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:13:10 -06:00
Bruce Rogers
47ff090c3d sm501: add qemu/log.h include
References: bsc#1172385, CVE-2020-12829

This is in place of commit 4a1f253adb which added the log.h
reference, but did other changes we're not interested in.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:13:10 -06:00
BALATON Zoltan
20514a7521 sm501: Replace hand written implementation with pixman where possible
Git-commit: b15a22bbcb
References: bsc#1172385, CVE-2020-12829

Besides being faster this should also prevent malicious guests to
abuse 2D engine to overwrite data or cause a crash.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-id: 58666389b6cae256e4e972a32c05cf8aa51bffc0.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit 6bed160dd4e4f68a85e032b08d40d6bd5449274f)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:13:10 -06:00
BALATON Zoltan
881edaf75a sm501: Clean up local variables in sm501_2d_operation
Git-commit: 3d0b096298
References: bsc#1172385, CVE-2020-12829

Make variables local to the block they are used in to make it clearer
which operation they are needed for.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: ae59f8138afe7f6a5a4a82539d0f61496a906b06.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit e8a152c862db6fdd316345d9cd932e4602e9b29e)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:13:09 -06:00
BALATON Zoltan
4674b6bb04 sm501: Use BIT(x) macro to shorten constant
Git-commit: 2824809b7f
References: bsc#1172385, CVE-2020-12829

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 124bf5de8d7cf503b32b377d0445029a76bfbd49.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit 6f7231dc2939aca80adc5080464696e7d5f0ca45)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:13:09 -06:00
BALATON Zoltan
7dd455124e sm501: Shorten long variable names in sm501_2d_operation
Git-commit: 6f8183b5dc
References: bsc#1172385, CVE-2020-12829

This increases readability and cleans up some confusing naming.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-id: b9b67b94c46e945252a73c77dfd117132c63c4fb.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:12:30 -06:00
BALATON Zoltan
a4b9fd7dde sm501: Convert printf + abort to qemu_log_mask
Git-commit: e29da77e5f
References: bsc#1172385, CVE-2020-12829

Some places already use qemu_log_mask() to log unimplemented features
or errors but some others have printf() then abort(). Convert these to
qemu_log_mask() and avoid aborting to prevent guests to easily cause
denial of service.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 305af87f59d81e92f2aaff09eb8a3603b8baa322.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:12:30 -06:00
Marcus Comstedt
204d25c450 sm501: Adjust endianness of pixel value in rectangle fill
Git-commit: f3a60058c9
References: bsc#1172385, CVE-2020-12829

The value from twoD_foreground (which is in host endian format) must
be converted to the endianness of the framebuffer (currently always
little endian) before it can be used to perform the fill operation.

Signed-off-by: Marcus Comstedt <marcus@mc.pp.se>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:12:30 -06:00
BALATON Zoltan
386ca5f7b6 sm501: Set updated region dirty after 2D operation
Git-commit: eb76243c9d
References: bsc#1172385, CVE-2020-12829

Set the changed memory region dirty after performed a 2D operation to
ensure that the screen is updated properly.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:12:30 -06:00
BALATON Zoltan
e6a428806a sm501: Fix support for non-zero frame buffer start address
Git-commit: 33159dd7ce
References: bsc#1172385, CVE-2020-12829

Display updates and drawing hardware cursor did not work when frame
buffer address was non-zero. Fix this by taking the frame buffer
address into account in these cases. This fixes screen dragging on
AmigaOS. Based on patch by Sebastian Bauer.

Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:12:30 -06:00
Sebastian Bauer
1ac6b4e5d9 sm501: Log unimplemented raster operation modes
Git-commit: 06cb926aaa
References: bsc#1172385, CVE-2020-12829

The sm501 currently implements only a very limited set of raster operation
modes. After this change, unknown raster operation modes are logged so
these can be easily spotted.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:12:30 -06:00
Sebastian Bauer
6dac9775cd sm501: Implement negated destination raster operation mode
Git-commit: debc7e7dad
References: bsc#1172385, CVE-2020-12829

Add support for the negated destination operation mode. This is used e.g.
by AmigaOS for the INVERSEVID drawing mode. With this change, the cursor
in the shell and non-immediate window adjustment are working now.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:12:30 -06:00
Sebastian Bauer
32be0f4fbb sm501: Use values from the pitch register for 2D operations
Git-commit: 54b2a4339c
References: bsc#1172385, CVE-2020-12829

Before, crt_h_total was used for src_width and dst_width. This is a
property of the current display setting and not relevant for the 2D
operation that also can be done off-screen. The pitch register's purpose
is to describe line pitch relevant of the 2D operation.

Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:12:30 -06:00
BALATON Zoltan
10a8e77101 sm501: Add some more unimplemented registers
Git-commit: 5690d9ecef
References: bsc#1172385, CVE-2020-12829

These are not really implemented (just return zero or default values)
but add these so guests accessing them can run.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:12:30 -06:00
BALATON Zoltan
32d7d1ea3d sm501: Add support for panel layer
Git-commit: 1ae5e6eb42
References: bsc#1172385, CVE-2020-12829

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 2029a276362c0c3a14c78acb56baa9466848dd51.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:11:26 -06:00
BALATON Zoltan
f7713e0f17 sm501: Fix hardware cursor
Git-commit: 6a2a5aae02
References: bsc#1172385, CVE-2020-12829

Rework HWC handling to simplify it and fix cursor not updating on
screen as needed. Previously cursor was not updated because checking
for changes in a line overrode the update flag set for the cursor but
fixing this is not enough because the cursor should also be updated if
its shape or location changes. Introduce hwc_invalidate() function to
handle that similar to other display controller models.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 6970a5e9868b7246656c1d02038dc5d5fa369507.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:10:19 -06:00
Thomas Huth
858d332e0f hw/core/loader: Fix possible crash in rom_copy()
Git-commit: e423455c4f
References: bsc#1172478, CVE-2020-13765

Both, "rom->addr" and "addr" are derived from the binary image
that can be loaded with the "-kernel" paramer. The code in
rom_copy() then calculates:

    d = dest + (rom->addr - addr);

and uses "d" as destination in a memcpy() some lines later. Now with
bad kernel images, it is possible that rom->addr is smaller than addr,
thus "rom->addr - addr" gets negative and the memcpy() then tries to
copy contents from the image to a bad memory location. This could
maybe be used to inject code from a kernel image into the QEMU binary,
so we better fix it with an additional sanity check here.

Cc: qemu-stable@nongnu.org
Reported-by: Guangming Liu
Buglink: https://bugs.launchpad.net/qemu/+bug/1844635
Message-Id: <20190925130331.27825-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Prasad J Pandit
eeb47ad354 es1370: check total frame count against current frame
Git-commit: 369ff955a8
References: bsc#1172384, CVE-2020-13361

A guest user may set channel frame count via es1370_write()
such that, in es1370_transfer_audio(), total frame count
'size' is lesser than the number of frames that are processed
'cnt'.

    int cnt = d->frame_cnt >> 16;
    int size = d->frame_cnt & 0xffff;

if (size < cnt), it results in incorrect calculations leading
to OOB access issue(s). Add check to avoid it.

Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200514200608.1744203-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Paolo Bonzini
c2122f7168 scsi: lsi: exit infinite loop while executing script (CVE-2019-12068)
When executing script in lsi_execute_script(), the LSI scsi adapter
emulator advances 's->dsp' index to read next opcode. This can lead
to an infinite loop if the next opcode is empty. Move the existing
loop exit after 10k iterations so that it covers no-op opcodes as
well.

Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit de594e4765)
[BR: BSC#1146873 CVE-2019-12068 - minor trace related tweak applied]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Sven Schnelle
6d74a3386e lsi: use enum type for s->waiting
This makes the code easier to read - no functional change.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-3-svens@stackframe.org>
(cherry picked from commit f08ec2b82a)
[BR: BSC#1146873 CVE-2019-12068 - small tweak made related to trace code]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Felipe Franciosi
6335f5aded iscsi: Cap block count from GET LBA STATUS (CVE-2020-1711)
Git-commit: 693fd2acdf
References: bsc#1166240

When querying an iSCSI server for the provisioning status of blocks (via
GET LBA STATUS), Qemu only validates that the response descriptor zero's
LBA matches the one requested. Given the SCSI spec allows servers to
respond with the status of blocks beyond the end of the LUN, Qemu may
have its heap corrupted by clearing/setting too many bits at the end of
its allocmap for the LUN.

A malicious guest in control of the iSCSI server could carefully program
Qemu's heap (by selectively setting the bitmap) and then smash it.

This limits the number of bits that iscsi_co_block_status() will try to
update in the allocmap so it can't overflow the bitmap.

Fixes: CVE-2020-1711
Cc: qemu-stable@nongnu.org
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: Converted patch from byte based to sector based]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Marc-André Lureau
29d86468e1 Fix use-afte-free in ip_reass() (CVE-2020-1983)
Git-commit: 2faae0f778f818fadc873308f983289df697eb93
References: bsc#1170940, CVE-2020-1983

The q pointer is updated when the mbuf data is moved from m_dat to
m_ext.

m_ext buffer may also be realloc()'ed and moved during m_cat():
q should also be updated in this case.

Reported-by: Aviv Sasson <asasson@paloaltonetworks.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 9bd6c5913271eabcb7768a58197ed3301fe19f2d)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Marc-André Lureau
faab0c6c0f tcp_emu: fix unsafe snprintf() usages
Git-commit: 68ccb8021a838066f0951d4b2817eb6b6f10a843
References: bsc#1163018, CVE-2020-8608

Various calls to snprintf() assume that snprintf() returns "only" the
number of bytes written (excluding terminating NUL).

https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04

"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."

Before patch ce131029, if there isn't enough room in "m_data" for the
"DCC ..." message, we overflow "m_data".

After the patch, if there isn't enough room for the same, we don't
overflow "m_data", but we set "m_len" out-of-bounds. The next time an
access is bounded by "m_len", we'll have a buffer overflow then.

Use slirp_fmt*() to fix potential OOB memory access.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-7-marcandre.lureau@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Prasad J Pandit
df5ccaec1d slirp: use correct size while emulating commands
Git-commit: 82ebe9c370a0e2970fb5695aa19aa5214a6a1c80
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608

While emulating services in tcp_emu(), it uses 'mbuf' size
'm->m_size' to write commands via snprintf(3). Use M_FREEROOM(m)
size to avoid possible OOB access.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-3-ppandit@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Prasad J Pandit
db504ecf42 slirp: use correct size while emulating IRC commands
Git-commit: ce131029d6d4a405cb7d3ac6716d03e58fb4a5d9
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608

While emulating IRC DCC commands, tcp_emu() uses 'mbuf' size
'm->m_size' to write DCC commands via snprintf(3). This may
lead to OOB write access, because 'bptr' points somewhere in
the middle of 'mbuf' buffer, not at the start. Use M_FREEROOM(m)
size to avoid OOB access.

Reported-by: Vishnu Dev TJ <vishnudevtj@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-2-ppandit@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Samuel Thibault
3590aad90a tcp_emu: Fix oob access
Git-commit: 2655fffed7a9e765bcb4701dd876e9dab975f289
References: bsc#1161066, CVE2020-7039, bsc#1161066, CVS-2020-7039

The main loop only checks for one available byte, while we sometimes
need two bytes.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Marc-André Lureau
f807d653a0 util: add slirp_fmt() helpers
Git-commit: 30648c03b27fb8d9611b723184216cd3174b6775
References: bsc#1163018, CVE-2020-8608

Various calls to snprintf() in libslirp assume that snprintf() returns
"only" the number of bytes written (excluding terminating NUL).

https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04

"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."

Introduce slirp_fmt() that handles several pathological cases the
way libslirp usually expect:

- treat error as fatal (instead of silently returning -1)

- fmt0() will always \0 end

- return the number of bytes actually written (instead of what would
have been written, which would usually result in OOB later), including
the ending \0 for fmt0()

- warn if truncation happened (instead of ignoring)

Other less common cases can still be handled with strcpy/snprintf() etc.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-2-marcandre.lureau@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Marc-André Lureau
3956bab5c1 slirp: don't manipulate so_rcv in tcp_emu()
For some reason, EMU_IDENT is not like other "emulated" protocols and
tries to reconstitute the original buffer, if it came in multiple
packets. Unfortunately, it does so wrongly, as it doesn't respect the
sbuf circular buffer appending rules, nor does it maintain some of the
invariants (rptr is incremented without bounds, etc): this leads to
further memory corruption revealed by ASAN or various malloc
errors. Furthermore, the so_rcv buffer is regularly flushed, so there
is no guarantee that buffer reconstruction will do what is expected.

Instead, do what the function comment says: "XXX Assumes the whole
command came in one packet", and don't touch so_rcv.

Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1664205

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2021-03-16 20:23:04 -06:00
Marc-André Lureau
807b875ac0 slirp: ensure there is enough space in mbuf to null-terminate
Prevents from buffer overflows.
Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1664205

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[BFR: In support of BSC#1123156]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Samuel Thibault
a077d9f433 slirp: fix big/little endian conversion in ident protocol
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

---
Based-on: <1551476756-25749-1-git-send-email-will@wbowling.info>

(cherry picked from commit 1fd71067da)
[BFR: In support of BSC#1123156]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Michael Roth
06acbfe92a slrip: ip_reass: Fix use after free
Using ip_deq after m_free might read pointers from an allocation reuse.

This would be difficult to exploit, but that is still related with
CVE-2019-14378 which generates fragmented IP packets that would trigger this
issue and at least produce a DoS.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(from libslirp.git commit c59279437eda91841b9d26079c70b8a540d41204)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[BR: BSC#1149811 CVE-2019-15890]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Olaf Hering
5b64848374 xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings
Under certain circumstances normal xen-mapcache functioning may be broken
by guest's actions. This may lead to either QEMU performing exit() due to
a caught bad pointer (and with QEMU process gone the guest domain simply
appears hung afterwards) or actual use of the incorrect pointer inside
QEMU address space -- a write to unmapped memory is possible. The bug is
hard to reproduce on a i440 machine as multiple DMA sources are required
(though it's possible in theory, using multiple emulated devices), but can
be reproduced somewhat easily on a Q35 machine using an emulated AHCI
controller -- each NCQ queue command slot may be used as an independent
DMA source ex. using READ FPDMA QUEUED command, so a single storage
device on the AHCI controller port will be enough to produce multiple DMAs
(up to 32). The detailed description of the issue follows.

Xen-mapcache provides an ability to map parts of a guest memory into
QEMU's own address space to work with.

There are two types of cache lookups:
 - translating a guest physical address into a pointer in QEMU's address
   space, mapping a part of guest domain memory if necessary (while trying
   to reduce a number of such (re)mappings to a minimum)
 - translating a QEMU's pointer back to its physical address in guest RAM

These lookups are managed via two linked-lists of structures.
MapCacheEntry is used for forward cache lookups, while MapCacheRev -- for
reverse lookups.

Every guest physical address is broken down into 2 parts:
    address_index  = phys_addr >> MCACHE_BUCKET_SHIFT;
    address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1);

MCACHE_BUCKET_SHIFT depends on a system (32/64) and is equal to 20 for
a 64-bit system (which assumed for the further description). Basically,
this means that we deal with 1 MB chunks and offsets within those 1 MB
chunks. All mappings are created with 1MB-granularity, i.e. 1MB/2MB/3MB
etc. Most DMA transfers typically are less than 1MB, however, if the
transfer crosses any 1MB border(s) - than a nearest larger mapping size
will be used, so ex. a 512-byte DMA transfer with the start address
700FFF80h will actually require a 2MB range.

Current implementation assumes that MapCacheEntries are unique for a given
address_index and size pair and that a single MapCacheEntry may be reused
by multiple requests -- in this case the 'lock' field will be larger than
1. On other hand, each requested guest physical address (with 'lock' flag)
is described by each own MapCacheRev. So there may be multiple MapCacheRev
entries corresponding to a single MapCacheEntry. The xen-mapcache code
uses MapCacheRev entries to retrieve the address_index & size pair which
in turn used to find a related MapCacheEntry. The 'lock' field within
a MapCacheEntry structure is actually a reference counter which shows
a number of corresponding MapCacheRev entries.

The bug lies in ability for the guest to indirectly manipulate with the
xen-mapcache MapCacheEntries list via a special sequence of DMA
operations, typically for storage devices. In order to trigger the bug,
guest needs to issue DMA operations in specific order and timing.
Although xen-mapcache is protected by the mutex lock -- this doesn't help
in this case, as the bug is not due to a race condition.

Suppose we have 3 DMA transfers, namely A, B and C, where
- transfer A crosses 1MB border and thus uses a 2MB mapping
- transfers B and C are normal transfers within 1MB range
- and all 3 transfers belong to the same address_index

In this case, if all these transfers are to be executed one-by-one
(without overlaps), no special treatment necessary -- each transfer's
mapping lock will be set and then cleared on unmap before starting
the next transfer.
The situation changes when DMA transfers overlap in time, ex. like this:

  |===== transfer A (2MB) =====|

              |===== transfer B (1MB) =====|

                          |===== transfer C (1MB) =====|
 time --->

In this situation the following sequence of actions happens:

1. transfer A creates a mapping to 2MB area (lock=1)
2. transfer B (1MB) tries to find available mapping but cannot find one
   because transfer A is still in progress, and it has 2MB size + non-zero
   lock. So transfer B creates another mapping -- same address_index,
   but 1MB size.
3. transfer A completes, making 1st mapping entry available by setting its
   lock to 0
4. transfer C starts and tries to find available mapping entry and sees
   that 1st entry has lock=0, so it uses this entry but remaps the mapping
   to a 1MB size
5. transfer B completes and by this time
  - there are two locked entries in the MapCacheEntry list with the SAME
    values for both address_index and size
  - the entry for transfer B actually resides farther in list while
    transfer C's entry is first
6. xen_ram_addr_from_mapcache() for transfer B gets correct address_index
   and size pair from corresponding MapCacheRev entry, but then it starts
   looking for MapCacheEntry with these values and finds the first entry
   -- which belongs to transfer C.

At this point there may be following possible (bad) consequences:

1. xen_ram_addr_from_mapcache() will use a wrong entry->vaddr_base value
   in this statement:

   raddr = (reventry->paddr_index << MCACHE_BUCKET_SHIFT) +
       ((unsigned long) ptr - (unsigned long) entry->vaddr_base);

resulting in an incorrent raddr value returned from the function. The
(ptr - entry->vaddr_base) expression may produce both positive and negative
numbers and its actual value may differ greatly as there are many
map/unmap operations take place. If the value will be beyond guest RAM
limits then a "Bad RAM offset" error will be triggered and logged,
followed by exit() in QEMU.

2. If raddr value won't exceed guest RAM boundaries, the same sequence
of actions will be performed for xen_invalidate_map_cache_entry() on DMA
unmap, resulting in a wrong MapCacheEntry being unmapped while DMA
operation which uses it is still active. The above example must
be extended by one more DMA transfer in order to allow unmapping as the
first mapping in the list is sort of resident.

The patch modifies the behavior in which MapCacheEntry's are added to the
list, avoiding duplicates.

Signed-off-by: Alexey Gerasimenko <x1917x@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 7fb394ad8a)
[BSC#1160024]
2021-03-16 20:23:04 -06:00
Olaf Hering
940cf557e5 xen: don't use xenstore to save/restore physmap anymore
If we have a system with xenforeignmemory_map2() implemented
we don't need to save/restore physmap on suspend/restore
anymore. In case we resume a VM without physmap - try to
recreate the physmap during memory region restore phase and
remap map cache entries accordingly. The old code is left
for compatibility reasons.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 331b5189d7)
(add only code within XEN_COMPAT_PHYSMAP)
[BSC#1160024]
2021-03-16 20:23:04 -06:00
Olaf Hering
7782ec9f50 xen/mapcache: introduce xen_replace_cache_entry()
This new call is trying to update a requested map cache entry
according to the changes in the physmap. The call is searching
for the entry, unmaps it and maps again at the same place using
a new guest address. If the mapping is dummy this call will
make it real.

This function makes use of a new xenforeignmemory_map2() call
with an extended interface that was recently introduced in
libxenforeignmemory [1].

[1] https://www.mail-archive.com/xen-devel@lists.xen.org/msg113007.html

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 5ba3d75645)
(adjusted to use xenforeignmemory_map instead of xenforeignmemory_map2)
[BSC#1160024]
2021-03-16 20:23:04 -06:00
Olaf Hering
3787b85bb4 xen/mapcache: add an ability to create dummy mappings
Dummys are simple anonymous mappings that are placed instead
of regular foreign mappings in certain situations when we need
to postpone the actual mapping but still have to give a
memory region to QEMU to play with.

This is planned to be used for restore on Xen.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 759235653d)
[BSC#1160024]
2021-03-16 20:23:04 -06:00
Leonid Bloch
d078492e46 qcow2: Explicit number replaced by a constant
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit bd016b912c)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Leonid Bloch
3644bf7f46 qcow2: Set the default cache-clean-interval to 10 minutes
The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (before the default was 0, i.e.
disabled).

* For non-Linux platforms the default is kept at 0, because
  cache-clean-interval is not supported there yet.

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e957b50b8d)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Leonid Bloch
a03218f09c qcow2: Increase the default upper limit on the L2 cache size
The upper limit on the L2 cache size is increased from 1 MB to 32 MB
on Linux platforms, and to 8 MB on other platforms (this difference is
caused by the ability to set intervals for cache cleaning on Linux
platforms only).

This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 80668d0fb7)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Leonid Bloch
87bcc27ce2 qcow2: Assign the L2 cache relatively to the image size
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.

Previously, unless 'cache-size' was specified and was large enough, the
L2 cache was set to a certain size without taking the virtual image size
into account.

Now, the L2 cache assignment is aware of the virtual size of the image,
and will cover the entire image, unless the cache size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b749562d98)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Leonid Bloch
51faf966da qcow2: Avoid duplication in setting the refcount cache size
The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 657ada52ab)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Leonid Bloch
980aca61cc qcow2: Make sizes more humanly readable
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b6a95c6d10)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Leonid Bloch
97746a0fe5 include: Add a lookup table of sizes
Adding a lookup table for the powers of two, with the appropriate size
prefixes. This is needed when a size has to be stringified, in which
case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))'
string. Powers of two are used very often for sizes, so such a table
will also make it easier and more intuitive to write them.

This table is generatred using the following AWK script:

BEGIN {
    suffix="KMGTPE";
    for(i=10; i<64; i++) {
        val=2**i;
        s=substr(suffix, int(i/10), 1);
        n=2**(i%10);
        pad=21-int(log(n)/log(10));
        printf("#define S_%d%siB %*d\n", n, s, pad, val);
    }
}

Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 540b849261)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Leonid Bloch
e4e169830b qcow2: Options' documentation fixes
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 40fb215d48)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Alberto Garcia
61b3b74481 qcow2: Fix Coverity warning when calculating the refcount cache size
MIN_REFCOUNT_CACHE_SIZE is 4 and the cluster size is guaranteed to be
at most 2MB, so the minimum refcount cache size (in bytes) is always
going to fit in a 32-bit integer.

Coverity doesn't know that, and since we're storing the result in a
uint64_t (*refcount_cache_size) it thinks that we need the 64 bits and
that we probably want to do a 64-bit multiplication to prevent the
result from being truncated.

This is a false positive in this case, but it's a fair warning.
We could do a 64-bit multiplication to get rid of it, but since we
know that a 32-bit variable is enough to store this value let's simply
reuse min_refcount_cache, make it a normal int and stop doing casts.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7af5eea9b3)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Alberto Garcia
466ff26361 docs: Document the new default sizes of the qcow2 caches
We have just reduced the refcount cache size to the minimum unless
the user explicitly requests a larger one, so we have to update the
documentation to reflect this change.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: c5f0bde23558dd9d33b21fffc76ac9953cc19c56.1523968389.git.berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 603790ef3a)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Alberto Garcia
8e2f2e916c qcow2: Give the refcount cache the minimum possible size by default
The L2 and refcount caches have default sizes that can be overridden
using the l2-cache-size and refcount-cache-size (an additional
parameter named cache-size sets the combined size of both caches).

Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.

This is based on the premise that the refcount metadata needs to be
only a fourth of the L2 metadata to cover the same amount of disk
space. This is incorrect for two reasons:

 a) The amount of disk covered by an L2 table depends solely on the
    cluster size, but in the case of a refcount block it depends on
    the cluster size *and* the width of each refcount entry.
    The 4/1 ratio is only valid with 16-bit entries (the default).

 b) When we talk about disk space and L2 tables we are talking about
    guest space (L2 tables map guest clusters to host clusters),
    whereas refcount blocks are used for host clusters (including
    L1/L2 tables and the refcount blocks themselves). On a fully
    populated (and uncompressed) qcow2 file, image size > virtual size
    so there are more refcount entries than L2 entries.

Problem (a) could be fixed by adjusting the algorithm to take into
account the refcount entry width. Problem (b) could be fixed by
increasing a bit the refcount cache size to account for the clusters
used for qcow2 metadata.

However this patch takes a completely different approach and instead
of keeping a ratio between both cache sizes it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.

The reason is that L2 tables are used for every single I/O request
from the guest and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks are however only used for
cluster allocation and internal snapshots and in practice are accessed
sequentially in most cases, so the effect of increasing the cache is
negligible (even when doing random writes from the guest).

So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 9695182c2eb11b77cb319689a1ebaa4e7c9d6591.1523968389.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 52253998ec)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Philippe Mathieu-Daudé
f2f7cc2de1 include: Add IEC binary prefixes in "qemu/units.h"
Loosely based on 076b35b5a5.

Suggested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180625124238.25339-2-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7ecdc94c40)
[LM: BSC#1139926]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Samuel Thibault
fc72749949 Fix heap overflow in ip_reass on big packet input
When the first fragment does not fit in the preallocated buffer, q will
already be pointing to the ext buffer, so we mustn't try to update it.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 126c04acbabd7ad32c2b018fe10dfac2a3bc1210)
[LY: CVE-2019-14378 BSC#1143794]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:04 -06:00
Liang Yan
10b7b7e12c qemu-bridge-helper: restrict interface name
The interface names in qemu-bridge-helper are defined to be
of size IFNAMSIZ(=16), including the terminating null('\0') byte.
The same is applied to interface names read from 'bridge.conf'
file to form ACLs rules. If user supplied '--br=bridge' name
is not restricted to the same length, it could lead to ACL bypass
issue. Restrict bridge name to IFNAMSIZ, including null byte.

Reported-by: Riccardo Schirone <rschiron@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[LY: BSC#1140402 CVE-2019-13164]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:04 -06:00
Prasad J Pandit
9d5500370d qxl: check release info object
When releasing spice resources in release_resource() routine,
if release info object 'ext.info' is null, it leads to null
pointer dereference. Add check to avoid it.

Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20190425063534.32747-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d52680fc93)
[LY: BSC#1135902 CVE-2019-12155]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:04 -06:00
bd0e04d311 Revert "postcopy: Send whole huge pages"
This reverts commit 4c011c37ec.

The commit causes speed setting of migration doesn't work while vm uses
hugepage.

Dropping this commit is harmless due to postcopy needs kernel 4.11+ support.

(bsc#1127077)
2021-03-16 20:23:04 -06:00
Paolo Bonzini
f8aedf231a target/i386: define md-clear bit
md-clear is a new CPUID bit which is set when microcode provides the
mechanism to invoke a flush of various exploitable CPU buffers by invoking
the VERW instruction.  Add the new feature.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1111331 CVE-2018-12126 CVE-2018-12127 CVE-2018-12130
CVE-2019-11091]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
David Gibson
c08957b93b spapr: Simplify handling of host-serial and host-model values
27461d69a0 "ppc: add host-serial and host-model machine attributes
(CVE-2019-8934)" introduced 'host-serial' and 'host-model' machine
properties for spapr to explicitly control the values advertised to the
guest in device tree properties with the same names.

The previous behaviour on KVM was to unconditionally populate the device
tree with the real host serial number and model, which leaks possibly
sensitive information about the host to the guest.

To maintain compatibility for old machine types, we allowed those props
to be set to "passthrough" to take the value from the host as before.  Or
they could be set to "none" to explicitly omit the device tree items.

Special casing specific values on what's otherwise a user supplied string
is very ugly.  So, this patch simplifies things by implementing the
backwards compatibility in a different way: we have a machine class flag
set for the older machines, and we only load the host values into the
device tree if A) they're not set by the user and B) we have that flag set.

This does mean that the "passthrough" functionality is no longer available
with the current machine type.  That's ok though: if a user or management
layer really wants the information passed through they can read it
themselves (OpenStack Nova already does something similar for x86).

It also means the user can't explicitly ask for the values to be omitted
on the old machine types.  I think that's an acceptable trade-off: if you
care enough about not leaking the host information you can either move to
the new machine type, or use a dummy value for the properties.

For the new machine type, this also removes an odd inconsistency
between running on a POWER and non-POWER (or non-Linux) hosts: if the
host information couldn't be read from where we expect (in the host's
device tree as exposed by Linux), we'd fallback to omitting the guest
device tree items.

While we're there, improve some poorly worded comments, and the help text
for the properties.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 0a794529bd)
[LM: BSC#1126455 CVE-2019-03812]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Peter Maydell
3cbc2371bb device_tree.c: Don't use load_image()
The load_image() function is deprecated, as it does not let the
caller specify how large the buffer to read the file into is.
Instead use load_image_size().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20181130151712.2312-9-peter.maydell@linaro.org
(cherry picked from commit da885fe1ee)
[LM: BSC#1130675 CVE-2018-20815]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Gerd Hoffmann
021ad81921 i2c-ddc: fix oob read
Suggested-by: Michael Hanselmann <public@hansmi.ch>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael Hanselmann <public@hansmi.ch>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190108102301.1957-1-kraxel@redhat.com
(cherry picked from commit b05b267840)
[LM: BSC#1125721 CVE-2019-3812]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Prasad J Pandit
289934370a ppc: add host-serial and host-model machine attributes (CVE-2019-8934)
On ppc hosts, hypervisor shares following system attributes

  - /proc/device-tree/system-id
  - /proc/device-tree/model

with a guest. This could lead to information leakage and misuse.[*]
Add machine attributes to control such system information exposure
to a guest.

[*] https://wiki.openstack.org/wiki/OSSN/OSSN-0028

Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Fix-suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20190218181349.23885-1-ppandit@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 27461d69a0)
[LM: BSC#1126455 CVE-2019-03812]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
William Bowling
ba544a67f9 slirp: check sscanf result when emulating ident
When emulating ident in tcp_emu, if the strchr checks passed but the
sscanf check failed, two uninitialized variables would be copied and
sent in the reply, so move this code inside the if(sscanf()) clause.

Signed-off-by: William Bowling <will@wbowling.info>
Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Message-Id: <1551476756-25749-1-git-send-email-will@wbowling.info>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit d3222975c7)
[LM: BSC#1129622 CVE-2019-9824 To pass our checkpatch check, I changed
the patch to use spaces, not tabs, as in the initially proposed]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-16 20:23:04 -06:00
Richard W.M. Jones
a5d4cfea95 virtio-scsi: Add virtqueue_size parameter allowing virtqueue size to be set.
Since Linux switched to blk-mq as the default in Linux commit
5c279bd9e406 ("scsi: default to scsi-mq"), virtio-scsi LUNs consume
about 10x as much guest kernel memory.

This commit allows you to choose the virtqueue size for each
virtio-scsi-pci controller like this:

  -device virtio-scsi-pci,id=scsi,virtqueue_size=16

The default is still 128 as before.  Using smaller virtqueue_size
allows many more disks to be added to small memory virtual machines.
For a 1 vCPU, 500 MB, no swap VM I observed:

  With scsi-mq enabled (upstream kernel):              175 disks
    -"- ditto -"-   virtqueue_size=64:                 318 disks
    -"- ditto -"-   virtqueue_size=16:                 775 disks
  With scsi-mq disabled (kernel before 5c279bd9e406): 1755 disks

Note that to have any effect, this requires a kernel patch:

  https://lkml.org/lkml/2017/8/10/689

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20170810165255.20865-1-rjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5c0919d020)
[BR: FATE#327255. Fate was for SLE12-SP2, but we need to be able to
forward migrate this to SLE12-SP3 (and beyond)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Eduardo Habkost
6525e1a8db i386: Add new -IBRS versions of Intel CPU models
The new MSR IA32_SPEC_CTRL MSR was introduced by a recent Intel
microcode updated and can be used by OSes to mitigate
CVE-2017-5715.  Unfortunately we can't change the existing CPU
models without breaking existing setups, so users need to
explicitly update their VM configuration to use the new *-IBRS
CPU model if they want to expose IBRS to guests.

The new CPU models are simple copies of the existing CPU models,
with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated.

Cc: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-6-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit ac96c41354)
[BR: FATE#327261]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Boqun Feng (Intel)
b2b664b9d5 i386: add Skylake-Server cpu model
Introduce Skylake-Server cpu mode which inherits the features from
Skylake-Client and supports some additional features that are: AVX512,
CLWB and PGPE1GB.

Signed-off-by: Boqun Feng (Intel) <boqun.feng@gmail.com>
Message-Id: <20170621052935.20715-1-boqun.feng@gmail.com>
[ehabkost: copied comment about XSAVES from Skylake-Client]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 53f9a6f45f)
[BR: FATE#327261]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:04 -06:00
Gerd Hoffmann
09bc44f0e2 vga: catch depth 0
depth == 0 is used to indicate 256 color modes.  Our region calculation
goes wrong in that case.  So detect that and just take the safe code
path we already have for the wraparound case.

While being at it also catch depth == 15 (where our region size
calculation goes wrong too).  And make the comment more verbose,
explaining what is going on here.

Without this windows guest install might trigger an assert due to trying
to check dirty bitmap outside the snapshot region.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1575541
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180514103117.21059-1-kraxel@redhat.com
(cherry picked from commit a89fe6c329)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
f91ba1aa90 vga: fix region calculation
Typically the scanline length and the line offset are identical.  But
in case they are not our calculation for region_end is incorrect.  Using
line_offset is fine for all scanlines, except the last one where we have
to use the actual scanline length.

Fixes: CVE-2018-7550
Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Message-id: 20180309143704.13420-1-kraxel@redhat.com
(cherry picked from commit 7cdc61becd)
[BR: BSC#1084604 CVE-2018-7858 (it appears the above CVE ref. is wrong)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
374f704b22 vga: fix region checks in wraparound case
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20171030102830.4469-1-kraxel@redhat.com
(cherry picked from commit 115788d7a7)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
bebba0ae38 vga: add ram_addr_t cast
Reported by Coverity.

Fixes: CID 1381409
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010141323.14049-4-kraxel@redhat.com
(cherry picked from commit b0898b42ef)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
d4af53b238 vga: handle cirrus vbe mode wraparounds.
Commit "3d90c62548 vga: stop passing pointers to vga_draw_line*
functions" is incomplete.  It doesn't handle the case that the vga
rendering code tries to create a shared surface, i.e. a pixman image
backed by vga video memory.  That can not work in case the guest display
wraps from end of video memory to the start.  So force shadowing in that
case.  Also adjust the snapshot region calculation.

Can trigger with cirrus only, when programming vbe modes using the bochs
api (stdvga, also qxl and virtio-vga in vga compat mode) wrap arounds
can't happen.

Fixes: CVE-2017-13672
Fixes: 3d90c62548
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010141323.14049-3-kraxel@redhat.com
(cherry picked from commit 28f77de26a)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
a403628b83 vga: drop line_offset variable
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 362f811793)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
796a476acd vga: fix display update region calculation (split screen)
vga display update mis-calculated the region for the dirty bitmap
snapshot in case split screen mode is used.  This can trigger an
assert in cpu_physical_memory_snapshot_get_dirty().

Impact:  DoS for privileged guest users.

Fixes: CVE-2017-13673
Fixes: fec5e8c92b
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170828123307.15392-1-kraxel@redhat.com
(cherry picked from commit e65294157d)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Marc-André Lureau
5b6d8acaf4 vga: use DIV_ROUND_UP
I used the clang-tidy qemu-round check to generate the fix:
https://github.com/elmarco/clang-tools-extra

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit 2c23ce22c6)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
f8ed983fe7 vga: fix display update region calculation
vga display update mis-calculated the region for the dirty bitmap
snapshot in case the scanlines are padded.  This can triggere an
assert in cpu_physical_memory_snapshot_get_dirty().

Fixes: fec5e8c92b
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170509104839.19415-1-kraxel@redhat.com
(cherry picked from commit bfc56535f7)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
a09ee91be7 vga: make display updates thread safe.
The vga code clears the dirty bits *after* reading the framebuffer
memory.  So if the guest framebuffer updates hits the race window
between vga reading the framebuffer and vga clearing the dirty bits
vga will miss that update

Fix it by using the new memory_region_copy_and_clear_dirty()
memory_region_copy_get_dirty() functions.  That way we clear the
dirty bitmap before reading the framebuffer.  Any guest display
updates happening in parallel will be properly tracked in the
dirty bitmap then and the next display refresh will pick them up.

Problem triggers with mttcg only.  Before mttcg was merged tcg
never ran in parallel to vga emulation.  Using kvm will hide the
problem too, due to qemu operating on a userspace copy of the
kernel's dirty bitmap.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-5-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fec5e8c92b)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
943d2007fd vga: add vga_scanline_invalidated helper
Add vga_scanline_invalidated helper to check whenever a scanline was
invalidated.  Add a sanity check to fix OOB read access for display
heights larger than 2048.

Only cirrus uses this, for hardware cursor rendering, so having this
work properly for the first 2048 scanlines only shouldn't be a problem
as the cirrus can't handle large resolutions anyway.  Also changing the
invalidated_y_table size would break live migration.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-4-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f3289f6f0f)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
af276afcfc memory: add support getting and using a dirty bitmap copy.
This patch adds support for getting and using a local copy of the dirty
bitmap.

memory_region_snapshot_and_clear_dirty() will create a snapshot of the
dirty bitmap for the specified range, clear the dirty bitmap and return
the copy.  The returned bitmap can be a bit larger than requested, the
range is expanded so the code can copy unsigned longs from the bitmap
and avoid atomic bit update operations.

memory_region_snapshot_get_dirty() will return the dirty status of
pages, pretty much like memory_region_get_dirty(), but using the copy
returned by memory_region_copy_and_clear_dirty().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-3-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8deaf12ca1)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
c684372b63 bitmap: add bitmap_copy_and_clear_atomic
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-2-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d6eb141392)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Prasad J Pandit
e063db9bf8 ppc/pnv: check size before data buffer access
While performing PowerNV memory r/w operations, the access length
'sz' could exceed the data[4] buffer size. Add check to avoid OOB
access.

Reported-by: Moguofang <moguofang@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit d07945e78e)
[BR: BSC#1114957 CVE-2018-18954]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Greg Kurz
d571e26c32 9p: take write lock on fid path updates (CVE-2018-19364)
Recent commit 5b76ef50f6 fixed a race where v9fs_co_open2() could
possibly overwrite a fid path with v9fs_path_copy() while it is being
accessed by some other thread, ie, use-after-free that can be detected
by ASAN with a custom 9p client.

It turns out that the same can happen at several locations where
v9fs_path_copy() is used to set the fid path. The fix is again to
take the write lock.

Fixes CVE-2018-19364.

Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 5b3c77aa58)
[BR: BSC#1116717 CVE-2018-19364]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Greg Kurz
36a7041dc8 9p: write lock path in v9fs_co_open2()
The assumption that the fid cannot be used by any other operation is
wrong. At least, nothing prevents a misbehaving client to create a
file with a given fid, and to pass this fid to some other operation
at the same time (ie, without waiting for the response to the creation
request). The call to v9fs_path_copy() performed by the worker thread
after the file was created can race with any access to the fid path
performed by some other thread. This causes use-after-free issues that
can be detected by ASAN with a custom 9p client.

Unlike other operations that only read the fid path, v9fs_co_open2()
does modify it. It should hence take the write lock.

Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 5b76ef50f6)
[BR: BSC#1116717 CVE-2018-19364]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Greg Kurz
d1f3397b11 9p: fix QEMU crash when renaming files
When using the 9P2000.u version of the protocol, the following shell
command line in the guest can cause QEMU to crash:

    while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done

With 9P2000.u, file renaming is handled by the WSTAT command. The
v9fs_wstat() function calls v9fs_complete_rename(), which calls
v9fs_fix_path() for every fid whose path is affected by the change.
The involved calls to v9fs_path_copy() may race with any other access
to the fid path performed by some worker thread, causing a crash like
shown below:

Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0,
 flags=65536, mode=0) at hw/9pfs/9p-local.c:59
59          while (*path && fd != -1) {
(gdb) bt
 path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59
 path=0x0) at hw/9pfs/9p-local.c:92
 fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185
 path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53
 at hw/9pfs/9p.c:1083
 at util/coroutine-ucontext.c:116
(gdb)

The fix is to take the path write lock when calling v9fs_complete_rename(),
like in v9fs_rename().

Impact:  DoS triggered by unprivileged guest users.

Fixes: CVE-2018-19489
Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 1d20398694)
[BR: BSC#1117275]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Gerd Hoffmann
5c7d55ae0d usb-mtp: use O_NOFOLLOW and O_CLOEXEC.
Open files and directories with O_NOFOLLOW to avoid symlinks attacks.
While being at it also add O_CLOEXEC.

usb-mtp only handles regular files and directories and ignores
everything else, so users should not see a difference.

Because qemu ignores symlinks, carrying out a successful symlink attack
requires swapping an existing file or directory below rootdir for a
symlink and winning the race against the inotify notification to qemu.

Fixes: CVE-2018-16872
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: Bandan Das <bsd@redhat.com>
Reported-by: Michael Hanselmann <public@hansmi.ch>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael Hanselmann <public@hansmi.ch>
Message-id: 20181213122511.13853-1-kraxel@redhat.com
(cherry picked from commit bab9df35ce)
[BR: BSC#1119493]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Prasad J Pandit
0d07b80846 slirp: check data length while emulating ident function
While emulating identification protocol, tcp_emu() does not check
available space in the 'sc_rcv->sb_data' buffer. It could lead to
heap buffer overflow issue. Add check to avoid it.

Reported-by: Kira <864786842@qq.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit a7104eda7d)
[BR: BSC#1123156 CVE-2019-6778,  modify patch to use spaces instead of tabs]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Jim Somerville
9a7cd62f9d kvmclock: use the updated system_timer_msr
Fixes e2b6c17 (kvmclock: update system_time_msr address forcibly)
which makes a call to get the latest value of the address
stored in system_timer_msr, but then uses the old address anyway.

Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com>
Message-Id: <59b67db0bd15a46ab47c3aa657c81a4c11f168ea.1506702472.git.Jim.Somerville@windriver.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 346b1215b1)
[BR: BSC#1113231]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Denis Plotnikov
98333963a8 kvmclock: update system_time_msr address forcibly
Do an update of system_time_msr address every time before reading
the value of tsc_timestamp from guest's kvmclock page.

There is no other code paths which ensure that qemu has an up-to-date
value of system_time_msr. So, force this update on guest's tsc_timestamp
reading.

This bug causes effect on those nested setups which turn off TPR access
interception for L2 guests and that access being intercepted by L0 doesn't
show up in L1.
Linux bootstrap initiate kvmclock before APIC initializing causing TPR access.
That's why on L1 guests, having TPR interception turned on for L2, the effect
of the bug is not revealed.

This patch fixes this problem by making sure it knows the correct
system_time_msr address every time it is needed.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <1496054944-25623-1-git-send-email-dplotnikov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e2b6c1712e)
[BR: BSC#1113231]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Peter Maydell
e7768429c8 linux-user: make pwrite64/pread64(fd, NULL, 0, offset) return 0
Linux returns success if pwrite64() or pread64() are called with a
zero length NULL buffer, but QEMU was returning -TARGET_EFAULT.

This is the same bug that we fixed in commit 58cfa6c2e6
for the write syscall, and long before that in 38d840e679
for the read syscall.

Fixes: https://bugs.launchpad.net/qemu/+bug/1810433

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190108184900.9654-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 2bd3f8998e)
[LY: BSC#1121600]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:03 -06:00
Tony Garnock-Jones
8da6b2014b linux-user: write(fd, NULL, 0) parity with linux's treatment of same
Bring linux-user write(2) handling into line with linux for the case
of a 0-byte write with a NULL buffer. Based on a patch originally
written by Zhuowei Zhang.

Addresses https://bugs.launchpad.net/qemu/+bug/1716292.

>From Zhuowei Zhang's patch (https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg08073.html):

    Linux returns success for the special case of calling write with a
    zero-length NULL buffer: compiling and running

    int main() {
       ssize_t ret = write(STDOUT_FILENO, NULL, 0);
       fprintf(stderr, "write returned %ld\n", ret);
       return 0;
    }

    gives "write returned 0" when run directly, but "write returned
    -1" in QEMU.

    This commit checks for this situation and returns success if
    found.

Subsequent discussion raised the following questions (and my answers):

 - Q. Should TARGET_NR_read pass through to safe_read in this
      situation too?
   A. I'm wary of changing unrelated code to the specific problem I'm
      addressing. TARGET_NR_read is already consistent with Linux for
      this case.

 - Q. Do pread64/pwrite64 need to be changed similarly?
   A. Experiment suggests not: both linux and linux-user yield -1 for
      NULL 0-length reads/writes.

Signed-off-by: Tony Garnock-Jones <tonygarnockjones@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180908182205.GB409@mornington.dcs.gla.ac.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 58cfa6c2e6)
[LY: BSC#1121600]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:03 -06:00
Tim Smith
803ece930d xen_disk: Avoid repeated memory allocation
xen_disk currently allocates memory to hold the data for each ioreq
as that ioreq is used, and frees it afterwards. Because it requires
page-aligned blocks, this interacts poorly with non-page-aligned
allocations and balloons the heap.

Instead, allocate the maximum possible requirement, which is
BLKIF_MAX_SEGMENTS_PER_REQUEST pages (currently 11 pages) when
the ioreq is created, and keep that allocation until it is destroyed.
Since the ioreqs themselves are re-used via a free list, this
should actually improve memory usage.

Signed-off-by: Tim Smith <tim.smith@citrix.com>
[BSC#1100408]
2021-03-16 20:23:03 -06:00
Marc-André Lureau
1297515103 hw/char/serial: retry write if EAGAIN
If the chardev returns -1 with EAGAIN errno on write(), it should try
to send it again (EINTR is handled by the chardev itself).

This fixes commit 019288bf13
"hw/char/serial: Only retry if qemu_chr_fe_write returns 0"

Tested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180716110755.12499-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f3575af130)
[LY: BSC#1108474]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:03 -06:00
Sergio Lopez
758fb5dcda hw/char/serial: Only retry if qemu_chr_fe_write returns 0
Only retry on serial_xmit if qemu_chr_fe_write returns 0, as this is the
only recoverable error.

Retrying with any other scenario, in addition to being a waste of CPU
cycles, can compromise the Guest stability if by the vCPU issuing the
write and the main loop thread are, by chance or explicit pinning,
running on the same pCPU.

Previous discussion:

https://lists.nongnu.org/archive/html/qemu-devel/2018-05/msg06998.html

Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-Id: <1528185295-14199-1-git-send-email-slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 019288bf13)
[LY: BSC#1108474]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:03 -06:00
Paolo Bonzini
ae311d01d0 lsi53c895a: check message length value is valid
While writing a message in 'lsi_do_msgin', message length value
in 'msg_len' could be invalid due to an invalid migration stream.
Add an assertion to avoid an out of bounds access, and reject
the incoming migration data if it contains an invalid message
length.

Discovered by Deja vu Security. Reported by Oracle.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20181026194314.18663-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e58ccf0396)
[BR: BSC#1114422 CVE-2018-18849]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Jason Wang
50ebd014ca net: ignore packet size greater than INT_MAX
There should not be a reason for passing a packet size greater than
INT_MAX. It's usually a hint of bug somewhere, so ignore packet size
greater than INT_MAX in qemu_deliver_packet_iov()

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1592a99470)
[LD: BSC#1111013 CVE-2018-17963]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Jason Wang
aad97dd60b pcnet: fix possible buffer overflow
In pcnet_receive(), we try to assign size_ to size which converts from
size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access
for both buf and buf1.

Fixing by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit b1d80d12c5)
[LD: BSC#1111010 CVE-2018-17962]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Jason Wang
b8f82a41dd rtl8139: fix possible out of bound access
In rtl8139_do_receive(), we try to assign size_ to size which converts
from size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access of
for both buf and buf1.

Fixing by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1a326646fe)
[LD: BSC#1111006 CVE-2018-17958]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Jason Wang
4eb8ed37f9 ne2000: fix possible out of bound access in ne2000_receive
In ne2000_receive(), we try to assign size_ to size which converts
from size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access of
for both buf and buf1.

Fixing by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit fdc89e90fa)
[LD: BSC#1110910 CVE-2018-10839]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
f92acbfb9b xen_disk: use a single entry iovec
Since xen_disk now always copies data to and from a guest there is no need
to maintain a vector entry corresponding to every page of a request.
This means there is less per-request state to maintain so the ioreq
structure can shrink significantly.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 5ebf962652)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
d18c118c26 xen_disk: remove use of grant map/unmap
Now that the (native or emulated) xen_be_copy_grant_refs() helper is
always available, the xen_disk code can be significantly simplified by
removing direct use of grant map and unmap operations.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 06454c24ad)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
ecabdb4271 xen_backend: add an emulation of grant copy
Not all Xen environments support the xengnttab_grant_copy() operation.
E.g. where the OS is FreeBSD or Xen is older than 4.8.0.

This patch introduces an emulation of that operation using
xengnttab_map_domain_grant_refs() and memcpy() for those environments.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 3fe12b8403)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
40094b2ca7 xen: remove other open-coded use of libxengnttab
Now that helpers are available in xen_backend, use them throughout all
Xen PV backends.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 58560f2ae7)
[LD: BSC#1100408 - Removed xen_9p* file because it doesn't exist]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
2b91625158 xen_disk: remove open-coded use of libxengnttab
Now that helpers are present in xen_backend, this patch removes open-coded
calls to libxengnttab from the xen_disk code.

This patch also fixes one whitspace error in the assignment of the
XenDevOps initialise method.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 5ee1d99913)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
874d0095ec xen_backend: add grant table helpers
This patch adds grant table helper functions to the xen_backend code to
localize error reporting and use of xen_domid.

The patch also defers the call to xengnttab_open() until just before the
initialise method in XenDevOps is invoked. This method is responsible for
mapping the shared ring. No prior method requires access to the grant table.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 9838824aff)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
34ad9e61a2 xen: add a meaningful declaration of grant_copy_segment into xen_common.h
Currently the xen_disk source has to carry #ifdef exclusions to compile
against Xen older then 4.8. This is a bit messy so this patch lifts the
definition of struct xengnttab_grant_copy_segment and adds it into the
pre-4.8 compat area in xen_common.h, which allows xen_disk to be cleaned
up.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 5c0d914a9b)
[LD: BSC#1100408 - Modified as the Xen interface versioning has changed]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Juergen Gross
f87457686f xen: dont try setting max grants multiple times
Trying to call xengnttab_set_max_grants() with the same file handle
might fail on some kernels, as this operation is allowed only once.

This is a problem for the qdisk backend as blk_connect() can be
called multiple times for a domain, e.g. in case grub-xen is being
used to boot it.

So instead of letting the generic backend code open the gnttab device
do it in blk_connect() and close it again in blk_disconnect.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit e38c3e86df)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Juergen Gross
be41a22032 xen: add a global indicator for grant copy being available
The Xen qdisk backend needs to test whether grant copy operations is
available in the kernel. Unfortunately this collides with using
xengnttab_set_max_grants() on some kernels as this operation has to
be the first one after opening the gnttab device.

In order to solve this problem test for the availability of grant copy
in xen_be_init() opening the gnttab device just for that purpose and
closing it again afterwards. Advertise the availability via a global
flag and use that flag in the qdisk backend.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit b5e397a79e)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Olaf Hering
5e528f1296 xen-disk: use g_new0 to fix build
g_malloc0_n is available since glib-2.24. To allow build with older glib
versions use the generic g_new0, which is already used in many other
places in the code.

Fixes commit 3284fad728 ("xen-disk: add support for multi-page shared rings")

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit a3fd781f65)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
aecb30c502 xen-disk: add support for multi-page shared rings
The blkif protocol has had provision for negotiation of multi-page shared
rings for some time now and many guest OS have support in their frontend
drivers.

This patch makes the necessary modifications to xen-disk support a shared
ring up to order 4 (i.e. 16 pages).

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 3284fad728)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
d312b398e7 xen-disk: only advertize feature-persistent if grant copy is not available
If grant copy is available then it will always be used in preference to
persistent maps. In this case feature-persistent should not be advertized
to the frontend, otherwise it may needlessly copy data into persistently
granted buffers.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 976eba1c88)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
32fff02604 xen: use libxendevicemodel when available
This patch modifies the wrapper functions in xen_common.h to use the
new xendevicemodel interface if it is available along with compatibility
code to use the old libxenctrl interface if it is not.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit d655f34e6d)
[LD: BSC#1100408 - Commit combined with 8f25e7544 for simplification]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
1aa5e90b10 xen: make use of xen_xc implicit in xen_common.h inlines
Doing this will make the transition to using the new libxendevicemodel
interface less intrusive on the callers of these functions, since using
the new library will require a change of handle.

NOTE: The patch also moves the 'externs' for xen_xc and xen_fmem from
      xen_backend.h to xen_common.h, and the declarations from
      xen_backend.c to xen-common.c, which is where they belong.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 260cabed71)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Paul Durrant
859fc90b91 xen: rename xen_modified_memory() to xen_hvm_modified_memory()
This patch is a purely cosmetic change that avoids a name collision in
a subsequent patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 5100afb5f5)
[LD: BSC#1100408]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Marc-André Lureau
40c6a97775 seccomp: check TSYNC host capability
Remove -sandbox option if the host is not capable of TSYNC, since the
sandbox will fail at setup time otherwise. This will help libvirt, for
ex, to figure out if -sandbox will work.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 5780760f5e)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Yi Min Zhao
b0e032e5f2 sandbox: disable -sandbox if CONFIG_SECCOMP undefined
If CONFIG_SECCOMP is undefined, the option 'elevatedprivileges' remains
compiled. This would make libvirt set the corresponding capability and
then trigger failure during guest startup. This patch moves the code
regarding seccomp command line options to qemu-seccomp.c file and
wraps qemu_opts_foreach finding sandbox option with CONFIG_SECCOMP.
Because parse_sandbox() is moved into qemu-seccomp.c file, change
seccomp_start() to static function.

Signed-off-by: Yi Min Zhao <zyimin@linux.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 9d0fdecbad)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Marc-André Lureau
704c459703 seccomp: set the seccomp filter to all threads
When using "-seccomp on", the seccomp policy is only applied to the
main thread, the vcpu worker thread and other worker threads created
after seccomp policy is applied; the seccomp policy is not applied to
e.g. the RCU thread because it is created before the seccomp policy is
applied and SECCOMP_FILTER_FLAG_TSYNC isn't used.

This can be verified with
for task in /proc/`pidof qemu`/task/*; do cat $task/status | grep Secc ; done
Seccomp:	2
Seccomp:	0
Seccomp:	0
Seccomp:	2
Seccomp:	2
Seccomp:	2

Starting with libseccomp 2.2.0 and kernel >= 3.17, we can use
seccomp_attr_set(ctx, > SCMP_FLTATR_CTL_TSYNC, 1) to update the policy
on all threads.

libseccomp requirement was bumped to 2.2.0 in previous patch.
libseccomp should fail to set the filter if it can't honour
SCMP_FLTATR_CTL_TSYNC (untested), and thus -sandbox will now fail on
kernel < 3.17.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 70dfabeaa7)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Marc-André Lureau
9abac624f6 configure: require libseccomp 2.2.0
The following patch is going to require TSYNC, which is only available
since libseccomp 2.2.0.

libseccomp 2.2.0 was released February 12, 2015.

According to repology, libseccomp version in different distros:

  RHEL-7: 2.3.1
  Debian (Stretch): 2.3.1
  OpenSUSE Leap 15: 2.3.2
  Ubuntu (Xenial):  2.3.1

This will drop support for -sandbox on:

  Debian (Jessie): 2.1.1 (but 2.2.3 in backports)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit d0699bd37c)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Marc-André Lureau
79b95cc6d5 seccomp: prefer SCMP_ACT_KILL_PROCESS if available
The upcoming libseccomp release should have SCMP_ACT_KILL_PROCESS
action (https://github.com/seccomp/libseccomp/issues/96).

SCMP_ACT_KILL_PROCESS is preferable to immediately terminate the
offending process, rather than having the SIGSYS handler running.

Use SECCOMP_GET_ACTION_AVAIL to check availability of kernel support,
as libseccomp will fallback on SCMP_ACT_KILL otherwise, and we still
prefer SCMP_ACT_TRAP.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit bda08a5764)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Marc-André Lureau
bc40b68ed9 seccomp: allow sched_setscheduler() with SCHED_IDLE policy
Current and upcoming mesa releases rely on a shader disk cash. It uses
a thread job queue with low priority, set with
sched_setscheduler(SCHED_IDLE). However, that syscall is rejected by
the "resourcecontrol" seccomp qemu filter.

Since it should be safe to allow lowering thread priority, let's allow
scheduling thread to idle policy.

Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=1594456

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 056de1e894)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Eduardo Otubo
e41f0cbaec seccomp: add resourcecontrol argument to command line
This patch adds [,resourcecontrol=deny] to `-sandbox on' option. It
blacklists all process affinity and scheduler priority system calls to
avoid any bigger of the process.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 24f8cdc572)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Eduardo Otubo
c3c9110dc2 seccomp: add spawn argument to command line
This patch adds [,spawn=deny] argument to `-sandbox on' option. It
blacklists fork and execve system calls, avoiding Qemu to spawn new
threads or processes.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 995a226f88)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Eduardo Otubo
c42e4ce77b seccomp: add elevateprivileges argument to command line
This patch introduces the new argument
[,elevateprivileges=allow|deny|children] to the `-sandbox on'. It allows
or denies Qemu process to elevate its privileges by blacklisting all
set*uid|gid system calls. The 'children' option will let forks and
execves run unprivileged.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 73a1e64725)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Eduardo Otubo
80727fbf56 seccomp: add obsolete argument to command line
This patch introduces the argument [,obsolete=allow] to the `-sandbox on'
option. It allows Qemu to run safely on old system that still relies on
old system calls.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 2b716fa6d6)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Eduardo Otubo
4959004b58 seccomp: changing from whitelist to blacklist
This patch changes the default behavior of the seccomp filter from
whitelist to blacklist. By default now all system calls are allowed and
a small black list of definitely forbidden ones was created.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 1bd6152ae2)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-16 20:23:03 -06:00
Konrad Rzeszutek Wilk
f3082defdd i386: define the AMD 'virt-ssbd' CPUID feature bit (CVE-2018-3639)
AMD Zen expose the Intel equivalant to Speculative Store Bypass Disable
via the 0x80000008_EBX[25] CPUID feature bit.

This needs to be exposed to guest OS to allow them to protect
against CVE-2018-3639.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180521215424.13520-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 403503b162)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Konrad Rzeszutek Wilk
f6e9348cbe i386: Define the Virt SSBD MSR and handling of it (CVE-2018-3639)
"Some AMD processors only support a non-architectural means of enabling
speculative store bypass disable (SSBD).  To allow a simplified view of
this to a guest, an architectural definition has been created through a new
CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f.  With this, a
hypervisor can virtualize the existence of this definition and provide an
architectural method for using SSBD to a guest.

Add the new CPUID feature, the new MSR and update the existing SSBD
support to use this MSR when present." (from x86/speculation: Add virtualized
speculative store bypass disable support in Linux).

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180521215424.13520-4-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit cfeea0c021)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Prasad J Pandit
c30bf53560 qga: check bytes count read by guest-file-read
While reading file content via 'guest-file-read' command,
'qmp_guest_file_read' routine allocates buffer of count+1
bytes. It could overflow for large values of 'count'.
Add check to avoid it.

Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 141b197408)
[FL: BSC#1098735 CVE-2018-12617]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-16 20:23:03 -06:00
Prasad J Pandit
3d1025efbb slirp: correct size computation while concatenating mbuf
While reassembling incoming fragmented datagrams, 'm_cat' routine
extends the 'mbuf' buffer, if it has insufficient room. It computes
a wrong buffer size, which leads to overwriting adjacent heap buffer
area. Correct this size computation in m_cat.

Reported-by: ZDI Disclosures <zdi-disclosures@trendmicro.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 864036e251)
[FL: BSC#1096223 CVE-2018-11806]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-16 20:23:03 -06:00
Bruce Rogers
8ef8e4b748 xen: add block resize support for xen disks
Provide monitor naming of xen disks, and plumb guest driver
notification through xenstore of resizing instigated via the
monitor.

[BR: FATE#325467]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Marc-André Lureau
777b1d3551 tpm: lookup cancel path under tpm device class
Since Linux commit 313d21eeab9282e, tpm devices have their own device
class "tpm" and the cancel path must be looked up under
/sys/class/tpm/ instead of /sys/class/misc/.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
(cherry picked from commit 05b71fb207)
[LY: BSC#1070615]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:03 -06:00
Daniel P. Berrangé
a4f7d4e685 i386: define the 'ssbd' CPUID feature bit (CVE-2018-3639)
New microcode introduces the "Speculative Store Bypass Disable"
CPUID feature bit. This needs to be exposed to guest OS to allow
them to protect against CVE-2018-3639.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Message-Id: <20180521215424.13520-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit d19d1f9659)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Bruce Rogers
ecb7c53be7 migration: warn about inconsistent spec_ctrl state
As an attempt to help the user do the right thing, warn if we
detect spec_ctrl data in the migration stream, but where the
cpu defined doesn't have the feature. This would indicate the
migration is from the quick and dirty qemu produced in January
2018 to handle Spectre v2. That qemu version exposed the IBRS
cpu feature to all vcpu types, which helped in the short term
but wasn't a well designed approach.
Warn the user that the now migrated guest needs to be restarted
as soon as possible, using the spec_ctrl cpu feature flag or a
*-IBRS vcpu model specified as appropriate.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Eduardo Habkost
cdd90b4d50 i386: Add EPYC-IBPB CPU model
EPYC-IBPB is a copy of the EPYC CPU model with
just CPUID_8000_0008_EBX_IBPB added.

Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-7-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 6cfbc54e89)
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Eduardo Habkost
6b533935ee i386: Add new -IBRS versions of Intel CPU models
The new MSR IA32_SPEC_CTRL MSR was introduced by a recent Intel
microcode updated and can be used by OSes to mitigate
CVE-2017-5715.  Unfortunately we can't change the existing CPU
models without breaking existing setups, so users need to
explicitly update their VM configuration to use the new *-IBRS
CPU model if they want to expose IBRS to guests.

The new CPU models are simple copies of the existing CPU models,
with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated.

Cc: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-6-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit ac96c41354)
[BR: BSC#1068032 CVE-2017-5715 CPU models are reduced as appropriate
to match this QEMU version]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Eduardo Habkost
1e0e253bdb i386: Add FEAT_8000_0008_EBX CPUID feature word
Add the new feature word and the "ibpb" feature flag.

Based on a patch by Paolo Bonzini.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 1b3420e1c4)
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Eduardo Habkost
7d60f8b34c i386: Add spec-ctrl CPUID bit
Add the feature name and a CPUID_7_0_EDX_SPEC_CTRL macro.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit a2381f0934)
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Bruce Rogers
689e263e1e x86: cpu: increase model_id array size from 48 to 49
This is perhaps the easiest way to handle the longer model names
introduced with the -IBRS models. This is in lew of backporting
commit 807e9869b8 and other commits
which that would require.
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:03 -06:00
Daniel P. Berrange
1943be857a ui: place a hard cap on VNC server output buffer size
The previous patches fix problems with throttling of forced framebuffer updates
and audio data capture that would cause the QEMU output buffer size to grow
without bound. Those fixes are graceful in that once the client catches up with
reading data from the server, everything continues operating normally.

There is some data which the server sends to the client that is impractical to
throttle. Specifically there are various pseudo framebuffer update encodings to
inform the client of things like desktop resizes, pointer changes, audio
playback start/stop, LED state and so on. These generally only involve sending
a very small amount of data to the client, but a malicious guest might be able
to do things that trigger these changes at a very high rate. Throttling them is
not practical as missed or delayed events would cause broken behaviour for the
client.

This patch thus takes a more forceful approach of setting an absolute upper
bound on the amount of data we permit to be present in the output buffer at
any time. The previous patch set a threshold for throttling the output buffer
by allowing an amount of data equivalent to one complete framebuffer update and
one seconds worth of audio data. On top of this it allowed for one further
forced framebuffer update to be queued.

To be conservative, we thus take that throttling threshold and multiply it by
5 to form an absolute upper bound. If this bound is hit during vnc_write() we
forceably disconnect the client, refusing to queue further data. This limit is
high enough that it should never be hit unless a malicious client is trying to
exploit the sever, or the network is completely saturated preventing any sending
of data on the socket.

This completes the fix for CVE-2017-15124 started in the previous patches.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-12-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f887cf165d)
[LY: BSC#1073489 CVE-2017-15124]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
f9a755c55d ui: fix VNC client throttling when forced update is requested
The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).

The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check is disabled if the client has requested
a forced update, because we want to send these as soon as possible.

As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
They can first start something in the guest that triggers lots of framebuffer
updates eg play a youtube video. Then repeatedly send full framebuffer update
requests, but never read data back from the server. This can easily make QEMU's
VNC server send buffer consume 100MB of RAM per second, until the OOM killer
starts reaping processes (hopefully the rogue QEMU process, but it might pick
others...).

To address this we make the throttling more intelligent, so we can throttle
full updates. When we get a forced update request, we keep track of exactly how
much data we put on the output buffer. We will not process a subsequent forced
update request until this data has been fully sent on the wire. We always allow
one forced update request to be in flight, regardless of what data is queued
for incremental updates or audio data. The slight complication is that we do
not initially know how much data an update will send, as this is done in the
background by the VNC job thread. So we must track the fact that the job thread
has an update pending, and not process any further updates until this job is
has been completed & put data on the output buffer.

This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but its possible other processes can get taken out as
collateral damage.

This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:

  commit a7b20a8efa
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Oct 9 14:43:42 2017 +0100

    io: monitor encoutput buffer size from websocket GSource

This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-11-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ada8d2e436)
[LY: BSC#1073489 CVE-2017-15124]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
f7c4b1bc33 ui: fix VNC client throttling when audio capture is active
The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).

The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check must be disabled if audio capture is
enabled, because when streaming audio the output buffer offset will rarely be
zero due to queued audio data, and so this would starve framebuffer updates.

As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
They can first start something in the guest that triggers lots of framebuffer
updates eg play a youtube video. Then enable audio capture, and simply never
read data back from the server. This can easily make QEMU's VNC server send
buffer consume 100MB of RAM per second, until the OOM killer starts reaping
processes (hopefully the rogue QEMU process, but it might pick others...).

To address this we make the throttling more intelligent, so we can throttle
when audio capture is active too. To determine how to throttle incremental
updates or audio data, we calculate a size threshold. Normally the threshold is
the approximate number of bytes associated with a single complete framebuffer
update. ie width * height * bytes per pixel. We'll send incremental updates
until we hit this threshold, at which point we'll stop sending updates until
data has been written to the wire, causing the output buffer offset to fall
back below the threshold.

If audio capture is enabled, we increase the size of the threshold to also
allow for upto 1 seconds worth of audio data samples. ie nchannels * bytes
per sample * frequency. This allows the output buffer to have a mixture of
incremental framebuffer updates and audio data queued, but once the threshold
is exceeded, audio data will be dropped and incremental updates will be
throttled.

This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but its possible other processes can get taken out as
collateral damage.

This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:

  commit a7b20a8efa
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Oct 9 14:43:42 2017 +0100

    io: monitor encoutput buffer size from websocket GSource

This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-10-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e2b72cb6e0)
[LY: BSC#1073489 CVE-2017-15124]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
beaa3c1fbc ui: refactor code for determining if an update should be sent to the client
The logic for determining if it is possible to send an update to the client
will become more complicated shortly, so pull it out into a separate method
for easier extension later.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-9-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0bad834228)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
1e3261d3ba ui: correctly reset framebuffer update state after processing dirty regions
According to the RFB protocol, a client sends one or more framebuffer update
requests to the server. The server can reply with a single framebuffer update
response, that covers all previously received requests. Once the client has
read this update from the server, it may send further framebuffer update
requests to monitor future changes. The client is free to delay sending the
framebuffer update request if it needs to throttle the amount of data it is
reading from the server.

The QEMU VNC server, however, has never correctly handled the framebuffer
update requests. Once QEMU has received an update request, it will continue to
send client updates forever, even if the client hasn't asked for further
updates. This prevents the client from throttling back data it gets from the
server. This change fixes the flawed logic such that after a set of updates are
sent out, QEMU waits for a further update request before sending more data.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-8-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 728a7ac954)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
19f8dd8dcb ui: introduce enum to track VNC client framebuffer update request state
Currently the VNC servers tracks whether a client has requested an incremental
or forced update with two boolean flags. There are only really 3 distinct
states to track, so create an enum to more accurately reflect permitted states.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-7-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fef1bbadfb)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
efa1299c85 ui: track how much decoded data we consumed when doing SASL encoding
When we encode data for writing with SASL, we encode the entire pending output
buffer. The subsequent write, however, may not be able to send the full encoded
data in one go though, particularly with a slow network. So we delay setting the
output buffer offset back to zero until all the SASL encoded data is sent.

Between encoding the data and completing sending of the SASL encoded data,
however, more data might have been placed on the pending output buffer. So it
is not valid to set offset back to zero. Instead we must keep track of how much
data we consumed during encoding and subtract only that amount.

With the current bug we would be throwing away some pending data without having
sent it at all. By sheer luck this did not previously cause any serious problem
because appending data to the send buffer is always an atomic action, so we
only ever throw away complete RFB protocol messages. In the case of frame buffer
updates we'd catch up fairly quickly, so no obvious problem was visible.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-6-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8f61f1c5a6)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
adba409664 ui: avoid pointless VNC updates if framebuffer isn't dirty
The vnc_update_client() method checks the 'has_dirty' flag to see if there are
dirty regions that are pending to send to the client. Regardless of this flag,
if a forced update is requested, updates must be sent. For unknown reasons
though, the code also tries to sent updates if audio capture is enabled. This
makes no sense as audio capture state does not impact framebuffer contents, so
this check is removed.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-5-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3541b08475)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
8fdd8e74ef ui: remove redundant indentation in vnc_client_update
Now that previous dead / unreachable code has been removed, we can simplify
the indentation in the vnc_client_update method.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-4-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit b939eb89b6)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
935902c8e6 ui: remove unreachable code in vnc_update_client
A previous commit:

  commit 5a8be0f73d
  Author: Gerd Hoffmann <kraxel@redhat.com>
  Date:   Wed Jul 13 12:21:20 2016 +0200

    vnc: make sure we finish disconnect

Added a check for vs->disconnecting at the very start of the
vnc_update_client method. This means that the very next "if"
statement check for !vs->disconnecting always evaluates true,
and is thus redundant. This in turn means the vs->disconnecting
check at the very end of the method never evaluates true, and
is thus unreachable code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-3-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c53df96161)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
62b7110af5 ui: remove 'sync' parameter from vnc_update_client
There is only one caller of vnc_update_client and that always passes false
for the 'sync' parameter.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-2-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 6af998db05)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Marc-André Lureau
6ebfa04257 vnc: fix debug spelling
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171220140618.12701-1-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 090fdc83b0)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Prasad J Pandit
d3bfb8a47c multiboot: check mh_load_end_addr address field
While loading kernel via multiboot-v1 image, (flags & 0x00010000)
indicates that multiboot header contains valid addresses to load
the kernel image. In that, end of the data segment address
'mh_load_end_addr' should be less than the bss segment address
'mh_bss_end_addr'. Add check to validate that.

Reported-by: CERT CC <cert.cc@orange.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[LY: BSC#1083291 CVE-2018-7550]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
linzhecheng
2950bc28c9 vga: check the validation of memory addr when draw text
Start a vm with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda
redhat_5.11.qcow2  -device pcnet -vga cirrus,
then use VNC client to connect to VM, and excute the code below in guest
OS will lead to qemu crash:

int main()
 {
    iopl(3);
    srand(time(NULL));
    int a,b;
    while(1){
	a = rand()%0x100;
	b = 0x3c0 + (rand()%0x20);
        outb(a,b);
    }
    return 0;
}

The above code is writing the registers of VGA randomly.
We can write VGA CRT controller registers index 0x0C or 0x0D
(which is the start address register) to modify the
the display memory address of the upper left pixel
or character of the screen. The address may be out of the
range of vga ram. So we should check the validation of memory address
when reading or writing it to avoid segfault.

Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20180111132724.13744-1-linzhecheng@huawei.com
Fixes: CVE-2018-5683
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 191f59dc17)
[LY: BSC#1076114 CVE-2018-5683]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Eric Blake
a2fcf05bca osdep: Fix ROUND_UP(64-bit, 32-bit)
When using bit-wise operations that exploit the power-of-two
nature of the second argument of ROUND_UP(), we still need to
ensure that the mask is as wide as the first argument (done
by using a ternary to force proper arithmetic promotion).
Unpatched, ROUND_UP(2ULL*1024*1024*1024*1024, 512U) produces 0,
instead of the intended 2TiB, because negation of an unsigned
32-bit quantity followed by widening to 64-bits does not
sign-extend the mask.

Broken since its introduction in commit 292c8e50 (v1.5.0).
Callers that passed the same width type to both macro parameters,
or that had other code to ensure the first parameter's maximum
runtime value did not exceed the second parameter's width, are
unaffected, but I did not audit to see which (if any) existing
clients of the macro could trigger incorrect behavior (I found
the bug while adding a new use of the macro).

While preparing the patch, checkpatch complained about poor
spacing, so I also fixed that here and in the nearby DIV_ROUND_UP.

CC: qemu-trivial@nongnu.org
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 2098b073f3)
[LY: BSC#1076775 CVE-2017-18043]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Eric Blake
4f97b1174d nbd/server: CVE-2017-15119 Reject options larger than 32M
The NBD spec gives us permission to abruptly disconnect on clients
that send outrageously large option requests, rather than having
to spend the time reading to the end of the option.  No real
option request requires that much data anyways; and meanwhile, we
already have the practice of abruptly dropping the connection on
any client that sends NBD_CMD_WRITE with a payload larger than 32M.

For comparison, nbdkit drops the connection on any request with
more than 4096 bytes; however, that limit is probably too low
(as the NBD spec states an export name can theoretically be up
to 4096 bytes, which means a valid NBD_OPT_INFO could be even
longer) - even if qemu doesn't permit exports longer than 256
bytes.

It could be argued that a malicious client trying to get us to
read nearly 4G of data on a bad request is a form of denial of
service.  In particular, if the server requires TLS, but a client
that does not know the TLS credentials sends any option (other
than NBD_OPT_STARTTLS or NBD_OPT_EXPORT_NAME) with a stated
payload of nearly 4G, then the server was keeping the connection
alive trying to read all the payload, tying up resources that it
would rather be spending on a client that can get past the TLS
handshake.  Hence, this warranted a CVE.

Present since at least 2.5 when handling known options, and made
worse in 2.6 when fixing support for NBD_FLAG_C_FIXED_NEWSTYLE
to handle unknown options.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit fdad35ef6c)
[LY: BSC#1070144 CVE-2017-15119]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Prasad J Pandit
1911813223 ps2: check PS2Queue pointers in post_load routine
During Qemu guest migration, a destination process invokes ps2
post_load function. In that, if 'rptr' and 'count' values were
invalid, it could lead to OOB access or infinite loop issue.
Add check to avoid it.

Reported-by: Cyrille Chatras <cyrille.chatras@orange.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20171116075155.22378-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 802cbcb730)
[LY: BSC#1068613 CVE-2017-16845]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Prasad J Pandit
24c46bd538 virtio: check VirtQueue Vring object is set
A guest could attempt to use an uninitialised VirtQueue object
or unset Vring.align leading to a arithmetic exception. Add check
to avoid it.

Reported-by: Zhangboxian <zhangboxian@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 758ead31c7)
[LY: BSC#1071228 CVE-2017-17381]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Fei Li
c807f0a100 pcihp: fix pcihp for 1.6 machine type and older
Commit id f0c9d64a avoids adding ACPI_PCIHP_PROP_BSEL twice for
machine type 1.7, but also removes the adding for 1.6 and older
and xen, which causes an error when hotplugging a pci device using
qemu v2.9.* and v2.10.0.

There are two chances for intializing ACPI_PCIHP_PROP_BSEL before
commit f0c9d64a, one is in acpi_pcihp_init and the other is in
acpi_set_bsel. This patch restores the first chance, and adds a check
for the second chance: if the property has already been intialized,
just skip.

[FL: BSC#1074572]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-16 20:23:02 -06:00
Christian Borntraeger
2bfc928c01 s390x/kvm: provide stfle.81
stfle.81 (ppa15) is a transparent facility that can be passed to the
guest without the need to implement hypervisor support. As this feature
can be provided by firmware we add it to all full models.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 9f0d13f4f1)
[LY: BSC#1076813]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Christian Borntraeger
348253ba35 s390x/kvm: Handle bpb feature
We need to handle the bpb control on reset and migration. Normally
stfle.82 is transparent (and the normal guest part works without
hypervisor activity). To prevent any issues we require full
host kernel support for this feature.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit b073c87517)
[LY: BSC#1076813]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Christian Borntraeger
b3ec6ae512 header sync
replace with proper header sync

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 9cbb636270)
[LY: BSC#1076813]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Paolo Bonzini
1d40b7862f i386: Add support for SPEC_CTRL MSR
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit a33a2cfe2f)
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Brijesh Singh
d325162b9d target-i386/cpu: Add new EPYC CPU model
Add a new base CPU model called 'EPYC' to model processors from AMD EPYC
family (which includes EPYC 76xx,75xx,74xx, 73xx and 72xx).

The following features bits have been added/removed compare to Opteron_G5

Added: monitor, movbe, rdrand, mmxext, ffxsr, rdtscp, cr8legacy, osvw,
       fsgsbase, bmi1, avx2, smep, bmi2, rdseed, adx, smap, clfshopt, sha
       xsaveopt, xsavec, xgetbv1, arat

Removed: xop, fma4, tbm

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Message-Id: <20170815170051.127257-1-brijesh.singh@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 2e2efc7dbe)
[BR: BSC#1052825 FATE#324038]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Prasad J Pandit
e07ff195e3 9pfs: use g_malloc0 to allocate space for xattr
9p back-end first queries the size of an extended attribute,
allocates space for it via g_malloc() and then retrieves its
value into allocated buffer. Race between querying attribute
size and retrieving its could lead to memory bytes disclosure.
Use g_malloc0() to avoid it.

Reported-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 7bd9275630)
[BR: BSC#1062069 CVE-2017-15038]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Gerd Hoffmann
0df0623b78 cirrus: fix oob access in mode4and5 write functions
Move dst calculation into the loop, so we apply the mask on each
interation and will not overflow vga memory.

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Reported-by: Niu Guoxiang <niuguoxiang@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171011084314.21752-1-kraxel@redhat.com
[BR: BSC#1063122 CVE-2017-15289]
Signed-off-by: Bruce Rogers <brogers@suse.com
2021-03-16 20:23:02 -06:00
Daniel P. Berrange
9b477b16e2 io: monitor encoutput buffer size from websocket GSource
The websocket GSource is monitoring the size of the rawoutput
buffer to determine if the channel can accepts more writes.
The rawoutput buffer, however, is merely a temporary staging
buffer before data is copied into the encoutput buffer. Thus
its size will always be zero when the GSource runs.

This flaw causes the encoutput buffer to grow without bound
if the other end of the underlying data channel doesn't
read data being sent. This can be seen with VNC if a client
is on a slow WAN link and the guest OS is sending many screen
updates. A malicious VNC client can act like it is on a slow
link by playing a video in the guest and then reading data
very slowly, causing QEMU host memory to expand arbitrarily.

This issue is assigned CVE-2017-15268, publically reported in

  https://bugs.launchpad.net/qemu/+bug/1718964

(cherry picked from commit a7b20a8efa)

Reviewed-by: Eric Blake <eblake@redhat.com>

[Dan: Added extra checks to deal with code refactored in master but
 not stable 2.10]

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[BR: BSC#1062942 CVE-2017-15268]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Gerd Hoffmann
329671a216 vga: stop passing pointers to vga_draw_line* functions
Instead pass around the address (aka offset into vga memory).
Add vga_read_* helper functions which apply vbe_size_mask to
the address, to make sure the address stays within the valid
range, similar to the cirrus blitter fixes (commits ffaf857778
and 026aeffcb4).

Impact:  DoS for privileged guest users.  qemu crashes with
a segfault, when hitting the guard page after vga memory
allocation, while reading vga memory for display updates.

Fixes: CVE-2017-13672
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170828122906.18993-1-kraxel@redhat.com
(cherry picked from commit 3d90c62548)
[FL: BSC#1056334 CVE-2017-13672]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-16 20:23:02 -06:00
Prasad J Pandit
8d18b02dc9 multiboot: validate multiboot header address values
While loading kernel via multiboot-v1 image, (flags & 0x00010000)
indicates that multiboot header contains valid addresses to load
the kernel image. These addresses are used to compute kernel
size and kernel text offset in the OS image. Validate these
address values to avoid an OOB access issue.

This is CVE-2017-14167.

Reported-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170907063256.7418-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ed4f86e8b6)
[FL: BSC#1057585 CVE-2017-14167]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-16 20:23:02 -06:00
Kevin Wolf
92f9d92993 IDE: test flush on empty CDROM
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170809160212.29976-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ce317e8dea)
[FL: BSC#1054724 CVE-2017-12809]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-16 20:23:02 -06:00
Stefan Hajnoczi
9a3d9e4fc3 IDE: Do not flush empty CDROM drives
The block backend changed in a way that flushing empty CDROM drives now
crashes.  Amend IDE to avoid doing so until the root problem can be
addressed for 2.11.

Original patch by John Snow <jsnow@redhat.com>.

Reported-by: Kieron Shorrock <kshorrock@paloaltonetworks.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170809160212.29976-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 4da97120d5)
[FL: BSC#1054724 CVE-2017-12809]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-16 20:23:02 -06:00
Stefano Stabellini
217c09fe63 xen/disk: don't leak stack data via response ring
Rather than constructing a local structure instance on the stack, fill
the fields directly on the shared ring, just like other (Linux)
backends do. Build on the fact that all response structure flavors are
actually identical (aside from alignment and padding at the end).

This is XSA-216.

Reported by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit b0ac694fdb)
[FL: BSC#1057378 CVE-2017-10911]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-16 20:23:02 -06:00
Khem Raj
fa3a41f1b1 Replace 'struct ucontext' with 'ucontext_t' type
glibc used to have:

   typedef struct ucontext { ... } ucontext_t;

glibc now has:

   typedef struct ucontext_t { ... } ucontext_t;

(See https://sourceware.org/bugzilla/show_bug.cgi?id=21457
 for detail and rationale for the glibc change)

However, QEMU used "struct ucontext" in declarations. This is a
private name and compatibility cannot be guaranteed. Switch to
only using the standardized type name.

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Message-id: 20170628204452.41230-1-raj.khem@gmail.com
Cc: Kamil Rytarowski <kamil@netbsd.org>
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Laurent Vivier <laurent@vivier.eu>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[PMM: Rewrote commit message, based mostly on the one from
 Nathaniel McCallum]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 04b33e2186)
[BR: BOO#1055587]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Prasad J Pandit
c110404cf0 slirp: check len against dhcp options array end
While parsing dhcp options string in 'dhcp_decode', if an options'
length 'len' appeared towards the end of 'bp_vend' array, ensuing
read could lead to an OOB memory access issue. Add check to avoid it.

This is CVE-2017-11434.

Reported-by: Reno Robert <renorobert@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 413d463f43)
[BR: BSC#1049381 CVE-2017-11434]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Paolo Bonzini
ce26797251 megasas: always store SCSIRequest* into MegasasCmd
This ensures that the request is unref'ed properly, and avoids a
segmentation fault in the new qtest testcase that is added.

Reported-by: Zhangyanyu <zyy4013@stu.ouc.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1043296 CVE-2017-9503, dropped testcase from patch]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Paolo Bonzini
b75ffd4f05 altera_timer: fix incorrect memset
Use sizeof instead of ARRAY_SIZE, fixing -Wmemset-elt-size with recent
GCC versions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[LY: BSC#1040228]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Dr. David Alan Gilbert
3ec305b8a5 slirp/smb: Replace constant strings by glib string
gcc 7 (on fedora 26) objects to many of the snprintf's
in the smb path and command creation because it can't
figure out that the smb_dir (i.e. the /tmp dir for the configuration)
is known to be short.

Replace all these fixed length buffers by g_str* functions that dynamically
allocate and use g_dir_make_tmp to make the directory.
(It's fairly new glib but we have a compat function for it).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit f95cc8b6cc)
[LY: BSC#1040228]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Paolo Bonzini
4558a0441e jazz_led: fix bad snprintf
Detected by GCC 7's -Wformat-truncation.  snprintf writes at most
2 bytes here including the terminating NUL, so the result is
truncated.  In addition, the newline at the end is pointless.
Fix the buffer size and the format string.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit e9c6ab62c7)
[LY: BSC#1040228]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-16 20:23:02 -06:00
Alexander Graf
12be6c73c4 i386: Allow cpuid bit override
KVM has a feature bitmap of CPUID bits that it knows works for guests.
QEMU removes bits that are not part of that bitmap automatically on VM
start.

However, some times we just don't list features in that list because
they don't make sense for normal scenarios, but may be useful in specific,
targeted workloads.

For that purpose, add a new =force option to all CPUID feature flags in
the CPU property. With that we can override the accel filtering and give
users full control over the CPUID feature bits exposed into guests.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-16 20:23:02 -06:00
Alexander Graf
a79c6050a7 input: Add trace event for empty keyboard queue
When driving QEMU from the outside, we have basically no chance to
determine how quickly the guest OS picks up key events, so we usually
have to limit ourselves to very slow keyboard presses to make sure
the guest always has enough chance to pick them up.

This patch adds a trace events when the keyboarde queue is drained.
An external driver can use that as hint that new keys can be pressed.

Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 1490883775-94658-1-git-send-email-agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#1031692]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Alexander Graf
a8cc7712a8 ARM: KVM: Enable in-kernel timers with user space gic
When running with KVM enabled, you can choose between emulating the
gic in kernel or user space. If the kernel supports in-kernel virtualization
of the interrupt controller, it will default to that. If not, if will
default to user space emulation.

Unfortunately when running in user mode gic emulation, we miss out on
timer events which are only available from kernel space. This patch leverages
the new kernel/user space pending line synchronization for those timer events.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-16 20:23:02 -06:00
Christoffer Dall
3914146879 RFC: update Linux headers from irqs-to-user-v3
Get ioctl number and definitions for KVM_CAP_ARM_USER_IRQ.

Signed-off-by: Christoffer Dall <cdall@linaro.org>
[agraf: change cap to indicate downstream status]
Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-16 20:23:02 -06:00
Andreas Färber
af99d39bec tests: Add scsi-disk test
Test scsi-{disk,hd,cd} wwn properties for correct 64-bit parsing.

For now piggyback on virtio-scsi.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Andreas Färber
ffeb43f6ac tests: Add QOM property unit tests
Add a test for parsing and setting a uint64 property.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Andreas Färber
da8ccab7b2 test-string-input-visitor: Add uint64 test
Test parsing of decimal and hexadecimal uint64 numbers with most
significant bit set.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Andreas Färber
852914a4a3 test-string-input-visitor: Add int test case
In addition to -42 also parse the maximum int64.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Andreas Färber
1961198aa1 string-input-visitor: Fix uint64 parsing
All integers would get parsed by strtoll(), not handling the case of
UINT64 properties with the most significient bit set.

Implement a .type_uint64 visitor callback, reusing the existing
parse_str() code through a new argument, using strtoull().

As this is a bug fix, it intentionally ignores checkpatch warnings to
prefer the use of qemu_strto[u]ll() over strto[u]ll().

Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Chunyan Liu
3f180d13bc fix xen hvm direct kernel boot
Since commit a1666142: acpi-build: make ROMs RAM blocks resizeable,
xen HVM direct kernel boot failed. Xen HVM direct kernel boot will
insert a linuxboot.bin or multiboot.bin to /genroms, before this
commit, in acpi_setup, for rom linuxboot.bin/multiboot.bin, it
only needs 0x20000 size; after the commit, it will reserve x16
size for resize, that is 0x200000 size. It causes xen_ram_alloc
failed due to running out of memory.

To resolve it, either:
1. keep using original rom size instead of max size, don't reserve x16 size.
2. guest maxmem needs to be increased. (commit c1d322e6 "xen-hvm: increase
   maxmem before calling xc_domain_populate_physmap" solved the problem for
   a time, by accident. But then it is reverted in commit ffffbb369 due to
   other problem.)

For 2, more discussion is needed about howto. So this patch tries 1, to
use unresizable rom size in xen case in rom_set_mr.

[CYL: BSC#970791]

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2021-03-16 20:23:02 -06:00
Chunyan Liu
a92541e62a Fix tigervnc long press issue
Using xen tools 'xl vncviewer' with tigervnc (default on SLE-12),
found that: the display of the guest is unexpected while keep
pressing a key. We expect the same character multiple times, but
it prints only one time. This happens on a PV guest in text mode.

After debugging, found that tigervnc sends repeated key down events
in this case, to differentiate from user pressing the same key many
times. Vnc server only prints the character when it finally receives
key up event.

To solve this issue, this patch tries to add additional key up event
before the next repeated key down event (if the key is not a control
key).

[CYL: BSC#882405]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
2021-03-16 20:23:02 -06:00
Andreas Färber
79b382a16d acpi_piix4: Fix migration from SLE11 SP2
qemu-kvm 0.15 uses the same GPE format as qemu 1.4, but as version 2
rather than 3.

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Andreas Färber
7f17cde23a i8254: Fix migration from SLE11 SP2
qemu-kvm 0.15 had a VMSTATE_UINT32(flags, PITState) field that
qemu 1.4 does not have.

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Andreas Färber
0cab85fb1c vga: Raise VRAM to 16 MiB for pc-0.15 and below
qemu-kvm.git commit a7fe0297840908a4fd65a1cf742481ccd45960eb
(Extend vram size to 16MB) deviated from qemu.git since kvm-61, and only
in commit 9e56edcf8d (vga: raise default
vgamem size) did qemu.git adjust the VRAM size for v1.2.

Add compatibility properties so that up to and including pc-0.15 we
maintain migration compatibility with qemu-kvm rather than QEMU and
from pc-1.0 on with QEMU (last qemu-kvm release was 1.2).

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: adjust comma position in list in macro for v2.5.0 compat]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Bruce Rogers
e4c17119dd increase x86_64 physical bits to 42
Allow for guests with higher amounts of ram. The current thought
is that 2TB specified on qemu commandline would be an appropriate
limit. Note that this requires the next higher bit value since
the highest address is actually more than 2TB due to the pci
memory hole.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Andreas Färber
e8d2218fcd Raise soft address space limit to hard limit
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.

Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: add include for sys/resource.h]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
Bruce Rogers
2902852e58 roms/Makefile: pass a packaging timestamp to subpackages with date info
Certain rom subpackages build from qemu git-submodules call the date
program to include date information in the packaged binaries. This
causes repeated builds of the package to be different, wkere the only
real difference is due to the fact that time build timestamp has
changed. To promote reproducible builds and avoid customers being
prompted to update packages needlessly, we'll use the timestamp of the
VERSION file as the packaging timestamp for all packages that build in a
timestamp for whatever reason.

[BR: BSC#1011213]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:02 -06:00
0bbe0a9624 linux-user: remove all traces of qemu from /proc/self/cmdline
Instead of post-processing the real contents use the remembered target
argv.  That removes all traces of qemu, including command line options,
and handles QEMU_ARGV0.

Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
33a4759f97 linux-user: properly test for infinite timeout in poll (#8)
After "linux-user: use target_ulong" the poll syscall was no longer
handling infinite timeout.

/home/abuild/rpmbuild/BUILD/qemu-2.7.0-rc5/linux-user/syscall.c:9773:26: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
                 if (arg3 >= 0) {
                          ^~

Signed-off-by: Andreas Schwab <schwab@suse.de>
2021-03-16 20:23:02 -06:00
markkp
617054c693 configure: Fix detection of seccomp on s390x
Signed-off-by: Mark Post <mpost@suse.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
5a82b83061 qemu-binfmt-conf: use qemu-ARCH-binfmt
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Bruce Rogers
706a3b0849 qemu-bridge-helper: reduce security profile
Change from using glib alloc and free routines to those
from libc. Also perform safety measure of dropping privs
to user if configured no-caps.

[BR: BOO#988279]
Signed-off-by: Bruce Rogers <brogers@suse.com>
[AF: Rebased for v2.7.0-rc2]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Bruce Rogers
f9705a9de8 xen_disk: Add suse specific flush disable handling and map to QEMU equiv
Add code to read the suse specific suse-diskcache-disable-flush flag out
of xenstore, and set the equivalent flag within QEMU.

Patch taken from Xen's patch queue, Olaf Hering being the original author.
[bsc#879425]

Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Olaf Hering <olaf@aepfle.de>
2021-03-16 20:23:02 -06:00
Alexander Graf
6358fd6880 dictzip: Fix on big endian systems
The dictzip code in SLE11 received some treatment over time to support
running on big endian hosts. Somewhere in the transition to SLE12 this
support got lost. Add it back in again from the SLE11 code base.

Furthermore while at it, fix up the debug prints to not emit warnings.

[AG: BSC#937572]
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased for v2.7.0-rc2]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Alexander Graf
3cc08e3691 AIO: Reduce number of threads for 32bit hosts
On hosts with limited virtual address space (32bit pointers), we can very
easily run out of virtual memory with big thread pools.

Instead, we should limit ourselves to small pools to keep memory footprint
low on those systems.

This patch fixes random VM stalls like

  (process:25114): GLib-ERROR **: gmem.c:103: failed to allocate 1048576 bytes

on 32bit ARM systems for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-16 20:23:02 -06:00
Dinar Valeev
386c721eb5 configure: Enable PIE for ppc and ppc64 hosts
Signed-off-by: Dinar Valeev <dvaleev@suse.com>
[AF: Rebased for v1.7]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Alexander Graf
2ee5e1d731 linux-user: lseek: explicitly cast non-set offsets to signed
When doing lseek, SEEK_SET indicates that the offset is an unsigned variable.
Other seek types have parameters that can be negative.

When converting from 32bit to 64bit parameters, we need to take this into
account and enable SEEK_END and SEEK_CUR to be negative, while SEEK_SET stays
absolute positioned which we need to maintain as unsigned.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-16 20:23:02 -06:00
Alexander Graf
060c4bdb93 Make char muxer more robust wrt small FIFOs
Virtio-Console can only process one character at a time. Using it on S390
gave me strage "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.

While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.

To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.

This patch fixes input when using -nographic on s390 for me.

[AF: Rebased for v2.7.0-rc2]
2021-03-16 20:23:02 -06:00
Alexander Graf
ae4499c938 console: add question-mark escape operator
Some termcaps (found using SLES11SP1) use [? sequences. According to man
console_codes (http://linux.die.net/man/4/console_codes) the question mark
is a nop and should simply be ignored.

This patch does exactly that, rendering screen output readable when
outputting guest serial consoles to the graphical console emulator.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-16 20:23:02 -06:00
Alexander Graf
a52d3e0faf Legacy Patch kvm-qemu-preXX-dictzip3.patch 2021-03-16 20:23:02 -06:00
Alexander Graf
17440d3df7 block: Add tar container format
Tar is a very widely used format to store data in. Sometimes people even put
virtual machine images in there.

So it makes sense for qemu to be able to read from tar files. I implemented a
written from scratch reader that also knows about the GNU sparse format, which
is what pigz creates.

This version checks for filenames that end on well-known extensions. The logic
could be changed to search for filenames given on the command line, but that
would require changes to more parts of qemu.

The tar reader in conjunctiuon with dzip gives us the chance to download
tar'ed up virtual machine images (even via http) and instantly make use of
them.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[TH: Use bdrv_open options instead of filename]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
[AF: bdrv_file_open got an Error **errp argument, bdrv_delete -> brd_unref]
[AF: qemu_opts_create_nofail() -> qemu_opts_create(),
     bdrv_file_open() -> bdrv_open(), based on work by brogers]
[AF: error_is_set() dropped for v2.1.0-rc0]
[AF: BlockDriverAIOCB -> BlockAIOCB,
     BlockDriverCompletionFunc -> BlockCompletionFunc,
     qemu_aio_release() -> qemu_aio_unref(),
     drop tar_aio_cancel()]
[AF: common-obj-y -> block-obj-y, drop probe hook (bsc#945778)]
[AF: Drop bdrv_open() drv parameter for 2.5]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
[AF: Changed bdrv_open() bs parameter and return value for v2.7.0-rc2,
     for bdrv_pread() and bdrv_aio_readv() s/s->hd/s->hd->file/]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Alexander Graf
515180dec7 block: Add support for DictZip enabled gzip files
DictZip is an extension to the gzip format that allows random seeks in gzip
compressed files by cutting the file into pieces and storing the piece offsets
in the "extra" header of the gzip format.

Thanks to that extension, we can use gzip compressed files as block backend,
though only in read mode.

This makes a lot of sense when stacked with tar files that can then be shipped
to VM users. If a VM image is inside a tar file that is inside a DictZip
enabled gzip file, the user can run the tar.gz file as is without having to
extract the image first.

Tar patch follows.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[TH: Use bdrv_open options instead of filename]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
[AF: Error **errp added for bdrv_file_open, bdrv_delete -> bdrv_unref]
[AF: qemu_opts_create_nofail() -> qemu_opts_create(),
     bdrv_file_open() -> bdrv_open(), based on work by brogers]
[AF: error_is_set() dropped for v2.1.0-rc0]
[AF: BlockDriverAIOCB -> BlockAIOCB,
     BlockDriverCompletionFunc -> BlockCompletionFunc,
     qemu_aio_release() -> qemu_aio_unref(),
     drop dictzip_aio_cancel()]
[AF: common-obj-y -> block-obj-y, drop probe hook (bsc#945778)]
[AF: Drop bdrv_open() drv parameter for 2.5]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
[AF: Drop bdrv_open() bs parameter and change return value for v2.7.0-rc2,
     for bdrv_pread() and bdrv_aio_readv() do s/s->hd/s->hd->file/]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:02 -06:00
Alexander Graf
0b96efddf3 linux-user: use target_ulong
Linux syscalls pass pointers or data length or other information of that sort
to the kernel. This is all stuff you don't want to have sign extended.
Otherwise a host 64bit variable parameter with a size parameter will extend
it to a negative number, breaking lseek for example.

Pass syscall arguments as ulong always.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-16 20:23:02 -06:00
Andreas Färber
ca655c7032 vnc: password-file= and incoming-connections=
TBD (from SUSE Studio team)
2021-03-16 20:23:02 -06:00
Andreas Färber
eb95c111c5 slirp: -nooutgoing
TBD (from SUSE Studio team)
2021-03-16 20:23:01 -06:00
Alexander Graf
e1250d6316 linux-user: XXX disable fiemap
agraf: fiemap breaks in libarchive. Disable it for now.
2021-03-16 20:23:01 -06:00
Alexander Graf
8f51ff90ca linux-user: Fake /proc/cpuinfo
Fedora 17 for ARM reads /proc/cpuinfo and fails if it doesn't contain
ARM related contents. This patch implements a quick hack to expose real
/proc/cpuinfo data taken from a real world machine.

The real fix would be to generate at least the flags automatically based
on the selected CPU. Please do not submit this patch upstream until this
has happened.

Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased for v1.6 and v1.7]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:01 -06:00
Alexander Graf
e755ba527b linux-user: binfmt: support host binaries
When we have a working host binary equivalent for the guest binary we're
trying to run, let's just use that instead as it will be a lot faster.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-16 20:23:01 -06:00
Alexander Graf
1637529607 linux-user: fix segfault deadlock
When entering the guest we take a lock to ensure that nobody else messes
with our TB chaining while we're doing it. If we get a segfault inside that
code, we manage to work on, but will not unlock the lock.

This patch forces unlocking of that lock in the segv handler. I'm not sure
this is the right approach though. Maybe we should rather make sure we don't
segfault in the code? I would greatly appreciate someone more intelligible
than me to look at this :).

Example code to trigger this is at: http://csgraf.de/tmp/conftest.c

Reported-by: Fabio Erculiani <lxnay@sabayon.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Drop spinlock_safe_unlock() and switch to tb_lock_reset() (bonzini)]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:01 -06:00
Alexander Graf
e02f3a2c2c PPC: KVM: Disable mmu notifier check
When using hugetlbfs (which is required for HV mode KVM on 970), we
check for MMU notifiers that on 970 can not be implemented properly.

So disable the check for mmu notifiers on PowerPC guests, making
KVM guests work there, even if possibly racy in some odd circumstances.
2021-03-16 20:23:01 -06:00
Alexander Graf
aacb3a44f0 linux-user: add binfmt wrapper for argv[0] handling
When using qemu's linux-user binaries through binfmt, argv[0] gets lost
along the execution because qemu only gets passed in the full file name
to the executable while argv[0] can be something completely different.

This breaks in some subtile situations, such as the grep and make test
suites.

This patch adds a wrapper binary called qemu-$TARGET-binfmt that can be
used with binfmt's P flag which passes the full path _and_ argv[0] to
the binfmt handler.

The binary would be smart enough to be versatile and only exist in the
system once, creating the qemu binary path names from its own argv[0].
However, this seemed like it didn't fit the make system too well, so
we're currently creating a new binary for each target archictecture.

CC: Reinhard Max <max@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto new Makefile infrastructure, twice]
[AF: Updated for aarch64 for v2.0.0-rc1]
[AF: Rebased onto Makefile changes for v2.1.0-rc0]
[AF: Rebased onto script rewrite for v2.7.0-rc2 - to be fixed]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:01 -06:00
Alexander Graf
0f4cd18742 qemu-cvs-ioctl_nodirection
the direction given in the ioctl should be correct so we can assume the
communication is uni-directional. The alsa developers did not like this
concept though and declared ioctls IOC_R and IOC_W even though they were
IOC_RW.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
2021-03-16 20:23:01 -06:00
Alexander Graf
c384d2b0a3 qemu-cvs-ioctl_debug
Extends unsupported ioctl debug output.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
2021-03-16 20:23:01 -06:00
Ulrich Hecht
2df2d22b7a qemu-cvs-gettimeofday
No clue what this is for.
2021-03-16 20:23:01 -06:00
Alexander Graf
2e88b0133e qemu-cvs-alsa_mmap
Hack to prevent ALSA from using mmap() interface to simplify emulation.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
2021-03-16 20:23:01 -06:00
Alexander Graf
3288a6528a qemu-cvs-alsa_ioctl
Implements ALSA ioctls on PPC hosts.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
[AF: Rebased for v2.7.0-rc2]
[BR: Rebased for v2.9.0-rc0: removed timespec ref. from syscall_types_alsa.h]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:01 -06:00
Alexander Graf
c0535d90a0 qemu-cvs-alsa_bitfield
Implements TYPE_INTBITFIELD partially. (required for ALSA support)

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
2021-03-16 20:23:01 -06:00
Andreas Färber
a70e1761f0 qemu-binfmt-conf: Modify default path
Change QEMU_PATH from /usr/local/bin to /usr/bin prefix.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-16 20:23:01 -06:00
Alexander Graf
16059d7ed4 XXX dont dump core on sigabort 2021-03-16 20:23:01 -06:00
Greg Kurz
5bbd90e113 9pfs: Fully restart unreclaim loop (CVE-2021-20181)
Git-commit: 89fbea8737
References: bsc#1182137

Depending on the client activity, the server can be asked to open a huge
number of file descriptors and eventually hit RLIMIT_NOFILE. This is
currently mitigated using a reclaim logic : the server closes the file
descriptors of idle fids, based on the assumption that it will be able
to re-open them later. This assumption doesn't hold of course if the
client requests the file to be unlinked. In this case, we loop on the
entire fid list and mark all related fids as unreclaimable (the reclaim
logic will just ignore them) and, of course, we open or re-open their
file descriptors if needed since we're about to unlink the file.

This is the purpose of v9fs_mark_fids_unreclaim(). Since the actual
opening of a file can cause the coroutine to yield, another client
request could possibly add a new fid that we may want to mark as
non-reclaimable as well. The loop is thus restarted if the re-open
request was actually transmitted to the backend. This is achieved
by keeping a reference on the first fid (head) before traversing
the list.

This is wrong in several ways:
- a potential clunk request from the client could tear the first
  fid down and cause the reference to be stale. This leads to a
  use-after-free error that can be detected with ASAN, using a
  custom 9p client
- fids are added at the head of the list : restarting from the
  previous head will always miss fids added by a some other
  potential request

All these problems could be avoided if fids were being added at the
end of the list. This can be achieved with a QSIMPLEQ, but this is
probably too much change for a bug fix. For now let's keep it
simple and just restart the loop from the current head.

Fixes: CVE-2021-20181
Buglink: https://bugs.launchpad.net/qemu/+bug/1911666
Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <161064025265.1838153.15185571283519390907.stgit@bahia.lan>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-16 20:23:01 -06:00
Michael Roth
4cd42653f5 Update version for 2.9.1 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-09-07 11:25:12 -05:00
Greg Kurz
c24c5910b7 virtfs: error out gracefully when mandatory suboptions are missing
We internally convert -virtfs to -fsdev/-device. If the user doesn't
provide the path or security_model suboptions, and the fsdev backend
requires them, we hit an assertion when populating the internal -fsdev
option:

util/qemu-option.c:547: opt_set: Assertion `opt->str' failed.
Aborted (core dumped)

Let's test the suboption presence on the command line before trying
to set it in the internal -fsdev option, and let the backend code
error out gracefully (ie, like it already does when the user passes
-fsdev on the command line).

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 32b6943699)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-09-05 15:42:17 -05:00
Richard Henderson
2d1bbf51c2 target/arm: Fix aa64 ldp register writeback
For "ldp x0, x1, [x0]", if the second load is on a second page and
the second page is unmapped, the exception would be raised with x0
already modified.  This means the instruction couldn't be restarted.

Cc: qemu-arm@nongnu.org
Cc: qemu-stable@nongnu.org
Reported-by: Andrew <andrew@fubar.geek.nz>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20170825224833.4463-1-richard.henderson@linaro.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1713066
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: tweaked comment format]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

(cherry picked from commit 3e4d91b94c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-09-04 23:12:48 -05:00
Anthony PERARD
30b76b2439 exec: Add lock parameter to qemu_ram_ptr_length
Commit 04bf2526ce (exec: use
qemu_ram_ptr_length to access guest ram) start using qemu_ram_ptr_length
instead of qemu_map_ram_ptr, but when used with Xen, the behavior of
both function is different. They both call xen_map_cache, but one with
"lock", meaning the mapping of guest memory is never released
implicitly, and the second one without, which means, mapping can be
release later, when needed.

In the context of address_space_{read,write}_continue, the ptr to those
mapping should not be locked because it is used immediatly and never
used again.

The lock parameter make it explicit in which context qemu_ram_ptr_length
is called.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20170726165326.10327-1-anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f5aa69bdc3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-09-01 18:15:39 -05:00
Stefano Stabellini
2f64063f4e xen/mapcache: store dma information in revmapcache entries for debugging
The Xen mapcache is able to create long term mappings, they are called
"locked" mappings. The third parameter of the xen_map_cache call
specifies if a mapping is a "locked" mapping.

>From the QEMU point of view there are two kinds of long term mappings:

[a] device memory mappings, such as option roms and video memory
[b] dma mappings, created by dma_memory_map & friends

After certain operations, ballooning a VM in particular, Xen asks QEMU
kindly to destroy all mappings. However, certainly [a] mappings are
present and cannot be removed. That's not a problem as they are not
affected by balloonning. The *real* problem is that if there are any
mappings of type [b], any outstanding dma operations could fail. This is
a known shortcoming. In other words, when Xen asks QEMU to destroy all
mappings, it is an error if any [b] mappings exist.

However today we have no way of distinguishing [a] from [b]. Because of
that, we cannot even print a decent warning.

This patch introduces a new "dma" bool field to MapCacheRev entires, to
remember if a given mapping is for dma or is a long term device memory
mapping. When xen_invalidate_map_cache is called, we print a warning if
any [b] mappings exist. We ignore [a] mappings.

Mappings created by qemu_map_ram_ptr are assumed to be [a], while
mappings created by address_space_map->qemu_ram_ptr_length are assumed
to be [b].

The goal of the patch is to make debugging and system understanding
easier.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit 1ff7c5986a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-09-01 18:15:24 -05:00
Prasad J Pandit
15d8f91446 exec: use qemu_ram_ptr_length to access guest ram
When accessing guest's ram block during DMA operation, use
'qemu_ram_ptr_length' to get ram block pointer. It ensures
that DMA operation of given length is possible; And avoids
any OOB memory access situations.

Reported-by: Alex <broscutamaker@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170712123840.29328-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 04bf2526ce)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-09-01 18:10:46 -05:00
Gerd Hoffmann
f9c313f70f xhci: only update dequeue ptr on completed transfers
The dequeue pointer should only be updated in case the transfer
is actually completed.  If we update it for inflight transfers
we will not pick them up again after migration, which easily
triggers with HID devices as they typically have a pending
transfer, waiting for user input to happen.

Fixes: 243afe858b
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451631
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 20170608074122.32099-1-kraxel@redhat.com
(cherry picked from commit d54fddea98)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-09-01 18:10:36 -05:00
Dr. David Alan Gilbert
53206753ba vl.c/exit: pause cpus before closing block devices
There's a rare exit seg if the guest is accessing
IO during exit.
It's always hitting the atomic_inc(&bs->in_flight) with a NULL
bs. This was added recently in 99723548  but I don't see it
as the cause.

Flip vl.c around so we pause the cpus before closing the block devices,
that way we shouldn't have anything trying to access them when
they're gone.

This was originally Red Hat bz https://bugzilla.redhat.com/show_bug.cgi?id=1451015

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Cong Li <coli@redhat.com>

--
This is a very rare race, I'll leave it running in a loop to see if
we hit anything else and to check this really fixes it.

I do worry if there are other cases that can trigger this - e.g.
hot-unplug or ejecting a CD.

Message-Id: <20170713190116.21608-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 452589b6b4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 17:09:34 -05:00
Michael Roth
167e76494e PPC: E500: update u-boot to match shipped binary
Previous submodule commit contained some unused files with comments
that made redistribution requirements unclear, and the submodule commit
should really match the blob we're shipped anyway.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 16:59:16 -05:00
Farhan Ali
e22e199b2b s390-ccw: Fix alignment for CCW1
The commit 198c0d1f9d s390x/css: check ccw address validity
exposes an alignment issue in ccw bios.

According to PoP the CCW must be doubleword aligned. Let's fix
this in the bios.

Cc: qemu-stable@nongnu.org
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <3ed8b810b6592daee6a775037ce21f850e40647d.1503667215.git.alifm@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 3a1e4561ad)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 12:01:54 -05:00
Alexander Graf
5035184165 vnc: Set default kbd delay to 10ms
The current VNC default keyboard delay is 1ms. With that we're constantly
typing faster than the guest receives keyboard events from an XHCI attached
USB HID device.

The default keyboard delay time in the input layer however is 10ms. I don't know
how that number came to be, but empirical tests on some OpenQA driven ARM
systems show that 10ms really is a reasonable default number for the delay.

This patch moves the VNC delay also to 10ms. That way our default is much
safer (good!) and also consistent with the input layer default (also good!).

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1499863425-103133-1-git-send-email-agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d3b0db6dfe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:17 -05:00
Max Reitz
20920f4db6 qemu-nbd: Ignore SIGPIPE
qemu proper has done so for 13 years
(8a7ddc38a6), qemu-img and qemu-io have
done so for four years (526eda14a6).
Ignoring this signal is especially important in qemu-nbd because
otherwise a client can easily take down the qemu-nbd server by dropping
the connection when the server wants to send something, for example:

$ qemu-nbd -x foo -f raw -t null-co:// &
[1] 12726
$ qemu-io -c quit nbd://localhost/bar
can't open device nbd://localhost/bar: No export with name 'bar' available
[1]  + 12726 broken pipe  qemu-nbd -x foo -f raw -t null-co://

In this case, the client sends an NBD_OPT_ABORT and closes the
connection (because it is not required to wait for a reply), but the
server replies with an NBD_REP_ACK (because it is required to reply).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20170611123714.31292-1-mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 041e32b8d9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:17 -05:00
Gerd Hoffmann
4e6889b76b usb-redir: fix stack overflow in usbredir_log_data
Don't reinvent a broken wheel, just use the hexdump function we have.

Impact: low, broken code doesn't run unless you have debug logging
enabled.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170509110128.27261-1-kraxel@redhat.com
(cherry picked from commit bd4a683505)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Paolo Bonzini
244a3ef947 megasas: do not read SCSI req parameters more than once from frame
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b356807fcd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Paolo Bonzini
578fb5003f megasas: do not read command more than once from frame
Avoid TOC-TOU bugs by passing the frame_cmd down, and checking
cmd->dcmd_opcode instead of cmd->frame->header.frame_cmd.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 36c327a69d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Paolo Bonzini
50b9353315 megasas: do not read DCMD opcode more than once from frame
Avoid TOC-TOU bugs by storing the DCMD opcode in the MegasasCmd

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5104fac853)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Paolo Bonzini
d016071ca5 megasas: do not read iovec count more than once from frame
Avoid TOC-TOU bugs depending on how the compiler behaves.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 24c0c77af5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Paolo Bonzini
20fd62d374 megasas: do not read sense length more than once from frame
Avoid TOC-TOU bugs depending on how the compiler behaves.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 134550bf81)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Greg Kurz
7442018a00 9pfs: local: forbid client access to metadata (CVE-2017-7493)
When using the mapped-file security mode, we shouldn't let the client mess
with the metadata. The current code already tries to hide the metadata dir
from the client by skipping it in local_readdir(). But the client can still
access or modify it through several other operations. This can be used to
escalate privileges in the guest.

Affected backend operations are:
- local_mknod()
- local_mkdir()
- local_open2()
- local_symlink()
- local_link()
- local_unlinkat()
- local_renameat()
- local_rename()
- local_name_to_path()

Other operations are safe because they are only passed a fid path, which
is computed internally in local_name_to_path().

This patch converts all the functions listed above to fail and return
EINVAL when being passed the name of the metadata dir. This may look
like a poor choice for errno, but there's no such thing as an illegal
path name on Linux and I could not think of anything better.

This fixes CVE-2017-7493.

Reported-by: Leo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 7a95434e0c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Prasad J Pandit
0f590e798f scsi: avoid an off-by-one error in megasas_mmio_write
While reading magic sequence(MFI_SEQ) in megasas_mmio_write,
an off-by-one error could occur as 's->adp_reset' index is not
reset after reading the last sequence.

Reported-by: YY Z <bigbird475958471@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170424120634.12268-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 24dfa9fa2f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Gerd Hoffmann
3c69132635 audio: release capture buffers
AUD_add_capture() allocates two buffers which are never released.
Add the missing calls to AUD_del_capture().

Impact: Allows vnc clients to exhaust host memory by repeatedly
starting and stopping audio capture.

Fixes: CVE-2017-8309
Cc: P J P <ppandit@redhat.com>
Cc: Huawei PSIRT <PSIRT@huawei.com>
Reported-by: "Jiangxin (hunter, SCC)" <jiangxin1@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170428075612.9997-1-kraxel@redhat.com
(cherry picked from commit 3268a845f4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
P J P
40a7d47c5a vmw_pvscsi: check message ring page count at initialisation
A guest could set the message ring page count to zero, resulting in
infinite loop. Add check to avoid it.

Reported-by: YY Z <bigbird475958471@gmail.com>
Signed-off-by: P J P <ppandit@redhat.com>
Message-Id: <20170425130623.3649-1-ppandit@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f68826989c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Thomas Huth
9b9b4423d7 hw/ppc/spapr_iommu: Fix crash when removing the "spapr-tce-table" device
QEMU currently aborts unexpectedly when the user tries to add and
remove a "spapr-tce-table" device:

$ qemu-system-ppc64 -nographic -S -nodefaults -monitor stdio
QEMU 2.9.92 monitor - type 'help' for more information
(qemu) device_add spapr-tce-table,id=x
(qemu) device_del x
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)

The device should not be accessable for the users at all, it's just
used internally, so mark it with user_creatable = false.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 1f98e55385)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Thomas Huth
980e8260e6 hw/ppc/spapr_rtc: Mark the RTC device with user_creatable = false
QEMU currently aborts unexpectedly when a user tries to do something
like this:

$ qemu-system-ppc64 -nographic -S -nodefaults -monitor stdio
QEMU 2.9.92 monitor - type 'help' for more information
(qemu) device_add spapr-rtc,id=spapr-rtc
(qemu) device_del spapr-rtc
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)

The RTC device is not meant to be hot-pluggable - it's an internal
device only and it even should not be possible to create it a
second time with the "-device" parameter, so let's mark this
with "user_creatable = false".

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 8ccccff9dd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Eduardo Habkost
aab00230aa qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable
cannot_instantiate_with_device_add_yet was introduced by commit
efec3dd631 to replace no_user. It was
supposed to be a temporary measure.

When it was introduced, we had 54
cannot_instantiate_with_device_add_yet=true lines in the code.
Today (3 years later) this number has not shrunk: we now have
57 cannot_instantiate_with_device_add_yet=true lines. I think it
is safe to say it is not a temporary measure, and we won't see
the flag go away soon.

Instead of a long field name that misleads people to believe it
is temporary, replace it a shorter and less misleading field:
user_creatable.

Except for code comments, changes were generated using the
following Coccinelle patch:

  @@
  expression DC;
  @@
  (
  -DC->cannot_instantiate_with_device_add_yet = false;
  +DC->user_creatable = true;
  |
  -DC->cannot_instantiate_with_device_add_yet = true;
  +DC->user_creatable = false;
  )

  @@
  typedef ObjectClass;
  expression dc;
  identifier class, data;
  @@
   static void device_class_init(ObjectClass *class, void *data)
   {
   ...
   dc->hotpluggable = true;
  +dc->user_creatable = true;
   ...
   }

  @@
  @@
   struct DeviceClass {
   ...
  -bool cannot_instantiate_with_device_add_yet;
  +bool user_creatable;
   ...
  }

  @@
  expression DC;
  @@
  (
  -!DC->cannot_instantiate_with_device_add_yet
  +DC->user_creatable
  |
  -DC->cannot_instantiate_with_device_add_yet
  +!DC->user_creatable
  )

Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Thomas Huth <thuth@redhat.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170503203604.31462-2-ehabkost@redhat.com>
[ehabkost: kept "TODO remove once we're there" comment]
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit e90f2a8c3e)
 Conflicts:
	include/hw/qdev-core.h
* remove context dep on 08f00df
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:51:16 -05:00
Eduardo Otubo
cfc65be6fb fix qemu-system-unicore32 crashing when calling without -kernel
Starting qemu-system-unicore32 without the -kernel parameter results in
an assert() returns false and aborts qemu. This patch replaces it with a
proper error message followed by exit(1).

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 36bed541ca)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:24:20 -05:00
Thomas Huth
ac0038f182 hw/s390x/ipl: Fix crash with virtio-scsi-pci device
qemu-system-s390x currently crashes when it is started with a
virtio-scsi-pci device, e.g.:

 qemu-system-s390x -nographic -enable-kvm -device virtio-scsi-pci \
                   -drive file=/tmp/disk.dat,if=none,id=d1,format=raw \
                   -device scsi-cd,drive=d1,bootindex=1

The problem is that the code in s390_gen_initial_iplb() currently assumes
that all SCSI devices are also CCW devices, which is not the case for
virtio-scsi-pci of course. Fix it by adding an appropriate check for
TYPE_CCW_DEVICE here.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1493126327-13162-1-git-send-email-thuth@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 99efaa2696)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:24:04 -05:00
Samuel Thibault
62708c7c12 slirp: fix clearing ifq_so from pending packets
The if_fastq and if_batchq contain not only packets, but queues of packets
for the same socket. When sofree frees a socket, it thus has to clear ifq_so
from all the packets from the queues, not only the first.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 1201d30851)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:22:38 -05:00
Marc-André Lureau
746e1fd000 slirp: tftp, copy sockaddr_size
ASAN detects an "unknown-crash" when running pxe-test:

/ppc64/pxe/spapr-vlan: =================================================================
==7143==ERROR: AddressSanitizer: unknown-crash on address 0x7f6dcd298d30 at pc 0x55e22218830d bp 0x7f6dcd2989e0 sp 0x7f6dcd2989d0
READ of size 128 at 0x7f6dcd298d30 thread T2
    #0 0x55e22218830c in tftp_session_allocate /home/elmarco/src/qq/slirp/tftp.c:73
    #1 0x55e22218a1f8 in tftp_handle_rrq /home/elmarco/src/qq/slirp/tftp.c:289
    #2 0x55e22218b54c in tftp_input /home/elmarco/src/qq/slirp/tftp.c:446
    #3 0x55e2221833fe in udp6_input /home/elmarco/src/qq/slirp/udp6.c:82
    #4 0x55e222137b17 in ip6_input /home/elmarco/src/qq/slirp/ip6_input.c:67

Address 0x7f6dcd298d30 is located in stack of thread T2 at offset 96 in frame
    #0 0x55e222182420 in udp6_input /home/elmarco/src/qq/slirp/udp6.c:13

  This frame has 3 object(s):
    [32, 48) '<unknown>'
    [96, 124) 'lhost' <== Memory access at offset 96 partially overflows this variable
    [160, 200) 'save_ip' <== Memory access at offset 96 partially underflows this variable

The sockaddr_storage pointer is the sockaddr_in6 lhost on the
stack. Copy only the source addr size.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 17eb587aeb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:22:14 -05:00
Thomas Huth
e8679f5b45 monitor: Check whether TCG is enabled before running the "info jit" code
The "info jit" command currently aborts on Mac OS X with the message
"qemu_mutex_lock: Invalid argument" when running with "-M accel=qtest".
We should only call into the TCG code here if TCG has really been
enabled and initialized.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1493179907-22516-1-git-send-email-thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
(cherry picked from commit b7da97eef7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:21:45 -05:00
Philipp Kern
c152efc943 target-s390x: Mask the SIGP order_code to 8bit.
According to "CPU Signaling and Response", "Signal-Processor Orders",
the order field is bit position 56-63. Without this, the Linux
guest kernel is sometimes unable to stop emulation and enters
an infinite loop of "XXX unknown sigp: 0xffffffff00000005".

Signed-off-by: Philipp Kern <phil@philkern.de>
Reviewed-by: Thomas Huth <thuth@tuxfamily.org>
[agraf: add comment according to email]
Signed-off-by: Alexander Graf <agraf@suse.de>

(cherry picked from commit 601b9a9008)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-31 11:21:20 -05:00
Greg Kurz
077a67e929 9pfs: local: fix fchmodat_nofollow() limitations
This function has to ensure it doesn't follow a symlink that could be used
to escape the virtfs directory. This could be easily achieved if fchmodat()
on linux honored the AT_SYMLINK_NOFOLLOW flag as described in POSIX, but
it doesn't. There was a tentative to implement a new fchmodat2() syscall
with the correct semantics:

https://patchwork.kernel.org/patch/9596301/

but it didn't gain much momentum. Also it was suggested to look at an O_PATH
based solution in the first place.

The current implementation covers most use-cases, but it notably fails if:
- the target path has access rights equal to 0000 (openat() returns EPERM),
  => once you've done chmod(0000) on a file, you can never chmod() again
- the target path is UNIX domain socket (openat() returns ENXIO)
  => bind() of UNIX domain sockets fails if the file is on 9pfs

The solution is to use O_PATH: openat() now succeeds in both cases, and we
can ensure the path isn't a symlink with fstat(). The associated entry in
"/proc/self/fd" can hence be safely passed to the regular chmod() syscall.

The previous behavior is kept for older systems that don't have O_PATH.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Zhi Yong Wu <zhiyong.wu@ucloud.cn>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
(cherry picked from commit 4751fd5328)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:42:15 -05:00
Jeff Cody
f4f3529cfe block/nfs: fix mutex assertion in nfs_file_close()
Commit c096358e74 introduced assertion
checks for when qemu_mutex() functions are called without the
corresponding qemu_mutex_init() having initialized the mutex.

This uncovered a latent bug in qemu's nfs driver - in
nfs_client_close(), the NFSClient structure is overwritten with zeros,
prior to the mutex being destroyed.

Go ahead and destroy the mutex in nfs_client_close(), and change where
we call qemu_mutex_init() so that it is correctly balanced.

There are also a couple of memory leaks obscured by the memset, so this
fixes those as well.

Finally, we should be able to get rid of the memset(), as it isn't
necessary.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 113fe792fd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:41:21 -05:00
Aleksandr Bezzubikov
5f7f7e4f05 hw/i386: allow SHPC for Q35 machine
Unmask previously masked SHPC feature in _OSC method.

Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a41c78c135)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:40:47 -05:00
Laurent Vivier
de9b6728ff cpu: don't allow negative core id
With pseries machine type a negative core-id is not managed properly:
-1 gives an inaccurate error message ("core -1 already populated"),
-2 crashes QEMU (core dump)

As it seems a negative value is invalid for any architecture,
instead of checking this in spapr_core_pre_plug() I think it's better
to check this in the generic part, core_prop_set_core_id()

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170802103259.25940-1-lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit be2960baae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:40:13 -05:00
Kevin Wolf
a0ddbcfb68 block: Skip implicit nodes in query-block/blockstats
Commits 0db832f and 6cdbceb introduced the automatic insertion of filter
nodes above the top layer of mirror and commit block jobs. The
assumption made there was that since libvirt doesn't do node-level
management of the block layer yet, it shouldn't be affected by added
nodes.

This is true as far as commands issued by libvirt are concerned. It only
uses BlockBackend names to address nodes, so any operations it performs
still operate on the root of the tree as intended.

However, the assumption breaks down when you consider query commands,
which return data for the wrong node now. These commands also return
information on some child nodes (bs->file and/or bs->backing), which
libvirt does make use of, and which refer to the wrong nodes, too.

One of the consequences is that oVirt gets wrong information about the
image size and stops the VM in response as long as a mirror or commit
job is running:

https://bugzilla.redhat.com/show_bug.cgi?id=1470634

This patch fixes the problem by hiding the implicit nodes created
automatically by the mirror and commit block jobs in the output of
query-block and BlockBackend-based query-blockstats as long as the user
doesn't indicate that they are aware of those nodes by providing a node
name for them in the QMP command to start the block job.

The node-based commands query-named-block-nodes and query-blockstats
with query-nodes=true still show all nodes, including implicit ones.
This ensures that users that are capable of node-level management can
still access the full information; users that only know BlockBackends
won't use these commands.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit d3c8c67469)
 Conflicts:
	block/qapi.c
	include/block/block_int.h
* fix context deps on 46eade7b and 5a9347c6
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:21:56 -05:00
Kevin Wolf
d445e0a022 qemu-iotests: Test automatic commit job cancel on hot unplug
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
(cherry picked from commit c3971b883a)
*prereq for d3c8c674
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:18:34 -05:00
Alexander Graf
ad480ab20b input: Decrement queue count on kbd delay
Delays in the input layer are special cased input events. Every input
event is accounted for in a global intput queue count. The special cased
delays however did not get removed from the queue, leading to queue overruns
and thus silent key drops after typing quite a few characters.

Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 1498117318-162102-1-git-send-email-agraf@suse.de
Fixes: be1a7176 ("input: add support for kbd delays")
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 77b0359bf4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:07:42 -05:00
Gerd Hoffmann
f8d050a20a input: limit kbd queue depth
Apply a limit to the number of items we accept into the keyboard queue.

Impact: Without this limit vnc clients can exhaust host memory by
sending keyboard events faster than qemu feeds them to the guest.

Fixes: CVE-2017-8379
Cc: P J P <ppandit@redhat.com>
Cc: Huawei PSIRT <PSIRT@huawei.com>
Reported-by: jiangxin1@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170428084237.23960-1-kraxel@redhat.com
(cherry picked from commit fa18f36a46)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:07:25 -05:00
Jason Wang
9527514ceb virtio-net: fix offload ctrl endian
Spec said offloads should be le64, so use virtio_ldq_p() to guarantee
valid endian.

Fixes: 644c98587d ("virtio-net: dynamic network offloads configuration")
Cc: qemu-stable@nongnu.org
Cc: Dmitry Fleytman <dfleytma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 189ae6bb5c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:07:16 -05:00
Greg Kurz
2a7526b0ce spapr: fix memory leak in spapr_core_pre_plug()
In case of error, we must ensure the dynamically allocated base_core_type
is freed, like it is done everywhere else in this function.

This is a regression introduced in QEMU 2.9 by commit 8149e2992f.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit df8658de43)
 Conflicts:
	hw/ppc/spapr.c
* fix context dep on 459264ef2
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:07:16 -05:00
Kevin Wolf
2e40aad231 commit: Add NULL check for overlay_bs
I can't see how overlay_bs could become NULL with the current code, but
other code in this function already checks it and we can make Coverity
happy with this check, so let's add it.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit b1e1fa0c3a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:07:16 -05:00
Jason Wang
70da03f10c virtio-scsi: finalize IOMMU support
After converting to use DMA api for virtio devices, we should use
dma_as instead of address_space_memory. Otherwise it won't work if
IOMMU is enabled.

Fixes: commit 8607f5c307 ("virtio: convert to use DMA api")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1499170866-9068-1-git-send-email-jasowang@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 025bdeab3c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:07:16 -05:00
Laurent Vivier
19284a0141 spapr: fix migration to pseries machine < 2.8
since commit 5c4537bd ("spapr: Fix 2.7<->2.8 migration of PCI host bridge"),
some migration fields are forged from the new ones in spapr_pci_pre_save().

It works well, except when the number of MSI devices is 0,
because in this case the function exits immediately.

This fix moves the migration code before the exit code.

The problem can be reproduced with these commands:

source qemu-2.9:

    qemu-system-ppc64 -monitor stdio -M pseries-2.6 -nodefaults -S

destination qemu-2.6:

    qemu-system-ppc64 -monitor stdio -M pseries-2.6 -nodefaults \
                      -incoming tcp:0:4444

on the source:

    migrate tcp:localhost:4444

Destination fails with the following error:

    qemu-system-ppc64: error while loading state for
                       instance 0x0 of device 'spapr_pci'
    qemu-system-ppc64: load of migration failed: Invalid argument

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit e806b4db14)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 18:07:16 -05:00
Alexander Graf
0060a3e936 hid: Reset kbd modifiers on reset
When resetting the keyboard, we need to reset not just the pending keystrokes,
but also any pending modifiers. Otherwise there's a race when we're getting
reset while running an escape sequence (modifier 0x100).

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 1498117295-162030-1-git-send-email-agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 51dbea77a2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:59:22 -05:00
Bruce Rogers
e0398ccdf4 9pfs: local: remove: use correct path component
Commit a0e640a8 introduced a path processing error.
Pass fstatat the dirpath based path component instead
of the entire path.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 790db7efdb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:58:51 -05:00
Max Reitz
438cd1e6aa block: Do not strcmp() with NULL uri->scheme
uri_parse(...)->scheme may be NULL. In fact, probably every field may be
NULL, and the callers do test this for all of the other fields but not
for scheme (except for block/gluster.c; block/vxhs.c does not access
that field at all).

We can easily fix this by using g_strcmp0() instead of strcmp().

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613205726.13544-1-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit f69165a8fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:58:11 -05:00
Paolo Bonzini
40ed5cdf72 nbd: fix NBD over TLS
When attaching the NBD QIOChannel to an AioContext, the TLS channel should
be used, not the underlying socket channel.  This is because, trivially,
the TLS channel will be the one that we read/write to and thus the one
that will get the qio_channel_yield() call.

Fixes: ff82911cd3
Cc: qemu-stable@nongnu.org
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Tested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 96d06835dc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:57:32 -05:00
Max Reitz
2182791734 blkverify: Catch bs->exact_filename overflow
The bs->exact_filename field may not be sufficient to store the full
blkverify node filename. In this case, we should not generate a filename
at all instead of an unusable one.

Cc: qemu-stable@nongnu.org
Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613172006.19685-3-mreitz@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 05cc758a3d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:57:05 -05:00
Max Reitz
1828d47845 blkdebug: Catch bs->exact_filename overflow
The bs->exact_filename field may not be sufficient to store the full
blkdebug node filename. In this case, we should not generate a filename
at all instead of an unusable one.

Cc: qemu-stable@nongnu.org
Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613172006.19685-2-mreitz@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit de81d72d3d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:57:00 -05:00
Kevin Wolf
1dd3ba38f6 commit: Fix completion with extra reference
commit_complete() can't assume that after its block_job_completed() the
job is actually immediately freed; someone else may still be holding
references. In this case, the op blockers on the intermediate nodes make
the graph reconfiguration in the completion code fail.

Call block_job_remove_all_bdrv() manually so that we know for sure that
any blockers on intermediate nodes are given up.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 4f78a16fee)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:53:45 -05:00
Eric Blake
ecc7a24c11 nbd: Fix regression on resiliency to port scan
Back in qemu 2.5, qemu-nbd was immune to port probes (a transient
server would not quit, regardless of how many probe connections
came and went, until a connection actually negotiated).  But we
broke that in commit ee7d7aa when removing the return value to
nbd_client_new(), although that patch also introduced a bug causing
an assertion failure on a client that fails negotiation.  We then
made it worse during refactoring in commit 1a6245a (a segfault
before we could even assert); the (masked) assertion was cleaned
up in d3780c2 (still in 2.6), and just recently we finally fixed
the segfault ("nbd: Fully intialize client in case of failed
negotiation").  But that still means that ever since we added
TLS support to qemu-nbd, we have been vulnerable to an ill-timed
port-scan being able to cause a denial of service by taking down
qemu-nbd before a real client has a chance to connect.

Since negotiation is now handled asynchronously via coroutines,
we no longer have a synchronous point of return by re-adding a
return value to nbd_client_new().  So this patch instead wires
things up to pass the negotiation status through the close_fn
callback function.

Simple test across two terminals:
$ qemu-nbd -f raw -p 30001 file
$ nmap 127.0.0.1 -p 30001 && \
  qemu-io -c 'r 0 512' -f raw nbd://localhost:30001

Note that this patch does not change what constitutes successful
negotiation (thus, a client must enter transmission phase before
that client can be considered as a reason to terminate the server
when the connection ends).  Perhaps we may want to tweak things
in a later patch to also treat a client that uses NBD_OPT_ABORT
as being a 'successful' negotiation (the client correctly talked
the NBD protocol, and informed us it was not going to use our
export after all), but that's a discussion for another day.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170608222617.20376-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0c9390d978)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:52:49 -05:00
Eric Blake
ec49c8ac57 nbd: Fully initialize client in case of failed negotiation
If a non-NBD client connects to qemu-nbd, we would end up with
a SIGSEGV in nbd_client_put() because we were trying to
unregister the client's association to the export, even though
we skipped inserting the client into that list.  Easy trigger
in two terminals:

$ qemu-nbd -p 30001 --format=raw file
$ nmap 127.0.0.1 -p 30001

nmap claims that it thinks it connected to a pago-services1
server (which probably means nmap could be updated to learn the
NBD protocol and give a more accurate diagnosis of the open
port - but that's not our problem), then terminates immediately,
so our call to nbd_negotiate() fails.  The fix is to reorder
nbd_co_client_start() to ensure that all initialization occurs
before we ever try talking to a client in nbd_negotiate(), so
that the teardown sequence on negotiation failure doesn't fault
while dereferencing a half-initialized object.

While debugging this, I also noticed that nbd_update_server_watch()
called by nbd_client_closed() was still adding a channel to accept
the next client, even when the state was no longer RUNNING.  That
is fixed by making nbd_can_accept() pay attention to the current
state.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170527030421.28366-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit df8ad9f128)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:52:10 -05:00
Kevin Wolf
f28b8906dd commit: Fix use after free in completion
The final bdrv_set_backing_hd() could be working on already freed nodes
because the commit job drops its references (through BlockBackends) to
both overlay_bs and top already a bit earlier.

One way to trigger the bug is hot unplugging a disk for which
blockdev_mark_auto_del() cancels the block job.

Fix this by taking BDS-level references while we're still using the
nodes.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 19ebd13ed4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:51:19 -05:00
Max Filippov
bace1f90f9 target/xtensa: handle unknown registers in gdbstub
Xtensa cores may have registers of types/sizes not supported by the
gdbstub accessors. Ignore writes to such registers and return zero on
read, but always return correct register size, so that gdb on the other
side is able to access all registers in the packet holding unsupported
registers in the middle. This fixes gdb interaction with cores that have
vector/custom TIE registers.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit dd7b952b79)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:50:05 -05:00
Greg Kurz
3b2f3a4691 spapr: fix memory leak in spapr_memory_pre_plug()
The string returned by object_property_get_str() is dynamically allocated.

(Spotted by Coverity, CID 1375942)

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 8a9e0e7b89)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:47:06 -05:00
Laurent Vivier
7f4c9f5f3b spapr: add pre_plug function for memory
This allows to manage errors before the memory
has started to be hotplugged. We already have
the function for the CPU cores.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Fixed a couple of style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

(cherry picked from commit c871bc70bb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:46:50 -05:00
Greg Kurz
592ee4007e target/ppc: fix memory leak in kvmppc_is_mem_backend_page_size_ok()
The string returned by object_property_get_str() is dynamically allocated.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 2d3e302ec2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:39:31 -05:00
Greg Kurz
917a5b9f2f target/ppc: pass const string to kvmppc_is_mem_backend_page_size_ok()
This function has three implementations. Two are stubs that do nothing
and the third one only passes the obj_path argument to:

Object *object_resolve_path(const char *path, bool *ambiguous);

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit ec69355bef)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-24 16:39:25 -05:00
Eduardo Habkost
2401d8a490 pc: Use "min-[x]level" on compat_props
Since the automatic cpuid-level code was introduced in commit
c39c0edf9b ("target-i386: Automatically
set level/xlevel/xlevel2 when needed"), the CPU model tables just define
the default CPUID level code (set using "min-level").  Setting
"[x]level" forces CPUID level to a specific value and disable the
automatic-level logic.

But the PC compat code was not updated and the existing "[x]level"
compat properties broke compatibility for people using features that
triggered the auto-level code.  To keep previous behavior, we should set
"min-[x]level" instead of "[x]level" on compat_props.

This was not a problem for most cases, because old machine-types don't
have full-cpuid-auto-level enabled.  The only common use case it broke
was the CPUID[7] auto-level code, that was already enabled since the
first CPUID[7] feature was introduced (in QEMU 1.4.0).

This causes the regression reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1454641

Change the PC compat code to use "min-[x]level" instead of "[x]level" on
compat_props, and add new test cases to ensure we don't break this
again.

Reported-by: "Guo, Zhiyi" <zhguo@redhat.com>
Fixes: c39c0edf9b ("target-i386: Automatically set level/xlevel/xlevel2 when needed")
Cc: qemu-stable@nongnu.org
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 1f43571604)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:29 -05:00
Michael Roth
1775fe6148 monitor: fix object_del for command-line-created objects
Currently objects specified on the command-line are only partially
cleaned up when 'object_del' is issued in either HMP or QMP: the
object itself is fully finalized, but the QemuOpts are not removed.
This results in the following behavior:

  x86_64-softmmu/qemu-system-x86_64 -monitor stdio \
    -object memory-backend-ram,id=ram1,size=256M

  QEMU 2.7.91 monitor - type 'help' for more information
  (qemu) object_del ram1
  (qemu) object_del ram1
  object 'ram1' not found
  (qemu) object_add memory-backend-ram,id=ram1,size=256M
  Duplicate ID 'ram1' for object
  Try "help object_add" for more information

which can be an issue for use-cases like memory hotplug.

This happens on the HMP side because hmp_object_add() attempts to
create a temporary QemuOpts entry with ID 'ram1', which ends up
conflicting with the command-line-created entry, since it was never
cleaned up during the previous hmp_object_del() call.

We address this by adding a check in user_creatable_del(), which
is called by both qmp_object_del() and hmp_object_del() to handle
the actual object cleanup, to determine whether an option group entry
matching the object's ID is present and removing it if it is.

Note that qmp_object_add() never attempts to create a temporary
QemuOpts entry, so it does not encounter the duplicate ID error,
which is why this isn't generally visible in libvirt.

Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Daniel Berrange <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1496531612-22166-3-git-send-email-mdroth@linux.vnet.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit c645d5acee)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:28 -05:00
Michael Roth
b0a3eadd8c tests: check-qom-proplist: add checks for cmdline-created objects
check-qom-proplist originally added tests for verifying that
object-creation helpers object_new_with_{props,propv} behaved in
similar fashion to the "traditional" method involving setting each
individual property separately after object creation rather than
via a single call.

Another similar "helper" for creating Objects exists in the form of
objects specified via -object command-line parameters. By that
rationale, we extend check-qom-proplist to include similar checks
for command-line-created objects by employing the same
qemu_opts_parse()-based parsing the vl.c employs.

This parser has a side-effect of parsing the object's options into
a QemuOpt structure and registering this in the global QemuOptsList
using the Object's ID. This can conflict with future Object instances
that attempt to use the same ID if we don't ensure this is cleaned
up as part of Object finalization, so we include a FIXME stub to test
for this case, which will then be resolved in a subsequent patch.

Suggested-by: Daniel Berrange <berrange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Daniel Berrange <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1496531612-22166-2-git-send-email-mdroth@linux.vnet.ibm.com>
[Comment formatting tidied up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>

(cherry picked from commit a1af255f06)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:28 -05:00
Paolo Bonzini
3b428e9512 linuxboot_dma: compile for i486
The ROM uses the cmovne instruction, which is new in Pentium Pro and does not
work when running QEMU with "-cpu 486".  Avoid producing that instruction.

Suggested-by: Richard W.M. Jones <rjones@redhat.com>
Suggested-by: Thomas Huth <thuth@redhat.com>
Reported-by: Rob Landley <rob@landley.net>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7e01838510)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:28 -05:00
Ladi Prosek
11bac2f941 virtio-serial-bus: Unset hotplug handler when unrealize
Virtio serial device controls the lifetime of virtio-serial-bus and
virtio-serial-bus links back to the device via its hotplug-handler
property. This extra ref-count prevents the device from getting
finalized, leaving the VirtIODevice memory listener registered and
leading to use-after-free later on.

This patch addresses the same issue as Fam Zheng's
"virtio-scsi: Unset hotplug handler when unrealize"
only for a different virtio device.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
(cherry picked from commit f811f97040)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:28 -05:00
Kevin Wolf
0ebbef1fa1 mirror: Drop permissions on s->target on completion
This fixes an assertion failure that was triggered by qemu-iotests 129
on some CI host, while the same test case didn't seem to fail on other
hosts.

Essentially the problem is that the blk_unref(s->target) in
mirror_exit() doesn't necessarily mean that the BlockBackend goes away
immediately. It is possible that the job completion was triggered nested
in mirror_drain(), which looks like this:

    BlockBackend *target = s->target;
    blk_ref(target);
    blk_drain(target);
    blk_unref(target);

In this case, the write permissions for s->target are retained until
after blk_drain(), which makes removing mirror_top_bs fail for the
active commit case (can't have a writable backing file in the chain
without the filter driver).

Explicitly dropping the permissions first means that the additional
reference doesn't hurt and the job can complete successfully even if
called from the nested blk_drain().

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 63c8ef2890)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:28 -05:00
Eric Blake
64945cb5f3 block: Guarantee that *file is set on bdrv_get_block_status()
We document that *file is valid if the return is not an error and
includes BDRV_BLOCK_OFFSET_VALID, but forgot to obey this contract
when a driver (such as blkdebug) lacks a callback.  Messed up in
commit 67a0fd2 (v2.6), when we added the file parameter.

Enhance qemu-iotest 177 to cover this, using a sequence that would
print garbage or even SEGV, because it was dererefencing through
uninitialized memory.  [The resulting test output shows that we
have less-than-ideal block status from the blkdebug driver, but
that's a separate fix coming up soon.]

Setting *file on all paths that return BDRV_BLOCK_OFFSET_VALID is
enough to fix the crash, but we can go one step further: always
setting *file, even on error, means that a broken caller that
blindly dereferences file without checking for error is now more
likely to get a reliable SEGV instead of randomly acting on garbage,
making it easier to diagnose such buggy callers.  Adding an
assertion that file is set where expected doesn't hurt either.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 81c219ac6c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:28 -05:00
Eric Blake
6a3f9c5c6e block: Simplify BDRV_BLOCK_RAW recursion
Since we are already in coroutine context during the body of
bdrv_co_get_block_status(), we can shave off a few layers of
wrappers when recursing to query the protocol when a format driver
returned BDRV_BLOCK_RAW.

Note that we are already using the correct recursion later on in
the same function, when probing whether the protocol layer is sparse
in order to find out if we can add BDRV_BLOCK_ZERO to an existing
BDRV_BLOCK_DATA|BDRV_BLOCK_OFFSET_VALID.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170504173745.27414-1-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ee29d6adef)
* prereq for 81c219a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:28 -05:00
Eric Blake
3f3fe284ef tests: Add coverage for recent block geometry fixes
Use blkdebug's new geometry constraints to emulate setups that
have needed past regression fixes: write zeroes asserting
when running through a loopback block device with max-transfer
smaller than cluster size, and discard rounding away portions
of requests not aligned to preferred boundaries.  Also, add
coverage that the block layer is honoring max transfer limits.

For now, a single iotest performs all actions, with the idea
that we can add future blkdebug constraint test cases in the
same file; but it can be split into multiple iotests if we find
reason to run one portion of the test in more setups than what
are possible in the other.

For reference, the final portion of the test (checking whether
discard passes as much as possible to the lowest layers of the
stack) works as follows:

qemu-io: discard 30M at 80000001, passed to blkdebug
  blkdebug: discard 511 bytes at 80000001, -ENOTSUP (smaller than
blkdebug's 512 align)
  blkdebug: discard 14371328 bytes at 80000512, passed to qcow2
    qcow2: discard 739840 bytes at 80000512, -ENOTSUP (smaller than
qcow2's 1M align)
    qcow2: discard 13M bytes at 77M, succeeds
  blkdebug: discard 15M bytes at 90M, passed to qcow2
    qcow2: discard 15M bytes at 90M, succeeds
  blkdebug: discard 1356800 bytes at 105M, passed to qcow2
    qcow2: discard 1M at 105M, succeeds
    qcow2: discard 308224 bytes at 106M, -ENOTSUP (smaller than qcow2's
1M align)
  blkdebug: discard 1 byte at 111457280, -ENOTSUP (smaller than
blkdebug's 512 align)

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-10-eblake@redhat.com
[mreitz: For cooperation with image locking, add -r to the qemu-io
         invocation which verifies the image content]
Signed-off-by: Max Reitz <mreitz@redhat.com>

(cherry picked from commit 40812d9373)
 Conflicts:
	tests/qemu-iotests/group
* dropped context dependency on other test groups
* prereq for 81c219a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:24 -05:00
Eric Blake
48f2dc0657 blkdebug: Add ability to override unmap geometries
Make it easier to simulate various unusual hardware setups (for
example, recent commits 3482b9b and b8d0a98 affect the Dell
Equallogic iSCSI with its 15M preferred and maximum unmap and
write zero sizing, or b2f95fe deals with the Linux loopback
block device having a max_transfer of 64k), by allowing blkdebug
to wrap any other device with further restrictions on various
alignments.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-9-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 430b26a82d)
* prereq for 81c219a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:19 -05:00
Eric Blake
3ae74003b5 blkdebug: Simplify override logic
Rather than store into a local variable, then copy to the struct
if the value is valid, then reporting errors otherwise, it is
simpler to just store into the struct and report errors if the
value is invalid.  This however requires that the struct store
a 64-bit number, rather than a narrower type.  Likewise, setting
a sane errno value in ret prior to the sequence of parsing and
jumping to out: on error makes it easier for the next patch to
add a chain of similar checks.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170429191419.30051-8-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 3dc834f879)
* prereq for 81c219a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:15 -05:00
Eric Blake
577cf9e6bb blkdebug: Add pass-through write_zero and discard support
In order to test the effects of artificial geometry constraints
on operations like write zero or discard, we first need blkdebug
to manage these actions.  It also allows us to inject errors on
those operations, just like we can for read/write/flush.

We can also test the contract promised by the block layer; namely,
if a device has specified limits on alignment or maximum size,
then those limits must be obeyed (for now, the blkdebug driver
merely inherits limits from whatever it is wrapping, but the next
patch will further enhance it to allow specific limit overrides).

This patch intentionally refuses to service requests smaller than
the requested alignments; this is because an upcoming patch adds
a qemu-iotest to prove that the block layer is correctly handling
fragmentation, but the test only works if there is a way to tell
the difference at artificial alignment boundaries when blkdebug is
using a larger-than-default alignment.  If we let the blkdebug
layer always defer to the underlying layer, which potentially has
a smaller granularity, the iotest will be thwarted.

Tested by setting up an NBD server with export 'foo', then invoking:
$ ./qemu-io
qemu-io> open -o driver=blkdebug blkdebug::nbd://localhost:10809/foo
qemu-io> d 0 15M
qemu-io> w -z 0 15M

Pre-patch, the server never sees the discard (it was silently
eaten by the block layer); post-patch it is passed across the
wire.  Likewise, pre-patch the write is always passed with
NBD_WRITE (with 15M of zeroes on the wire), while post-patch
it can utilize NBD_WRITE_ZEROES (for less traffic).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-7-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 63188c2450)
* prereq for 81c219a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:11 -05:00
Eric Blake
138cf638ba blkdebug: Refactor error injection
Rather than repeat the logic at each caller of checking if a Rule
exists that warrants an error injection, fold that logic into
inject_error(); and rename it to rule_check() for legibility.
This will help the next patch, which adds two more callers that
need to check rules for the potential of injecting errors.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-6-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit d157ed5f72)
* prereq for 81c219a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:01:08 -05:00
Eric Blake
a1a3d603c4 blkdebug: Sanity check block layer guarantees
Commits 04ed95f4 and 1a62d0ac updated the block layer to auto-fragment
any I/O to fit within device boundaries. Additionally, when using a
minimum alignment of 4k, we want to ensure the block layer does proper
read-modify-write rather than requesting I/O on a slice of a sector.
Let's enforce that the contract is obeyed when using blkdebug.  For
now, blkdebug only allows alignment overrides, and just inherits other
limits from whatever device it is wrapping, but a future patch will
further enhance things.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170429191419.30051-5-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit e0ef439588)
* prereq for 81c219a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 16:00:51 -05:00
Yunjian Wang
0b185544c9 virtio-net: fix wild pointer when remove virtio-net queues
The tx_bh or tx_timer will free in virtio_net_del_queue() function, when
removing virtio-net queues if the guest doesn't support multiqueue. But
it might be still referenced by virtio_net_set_status(), which needs to
be set NULL. And also the tx_waiting needs to be set zero to prevent
virtio_net_set_status() accessing tx_bh or tx_timer.

Cc: qemu-stable@nongnu.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit f989c30cf8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:52:02 -05:00
Halil Pasic
f3676379cc s390x/css: catch section mismatch on load
Prior to the virtio-ccw-2.7 machine (and commit 2a79eb1a), our virtio
devices residing under the virtual-css bus do not have qdev_path based
migration stream identifiers (because their qdev_path is NULL). The ids
are instead generated when the device is registered as a composition of
the so called idstr, which takes the vmsd name as its value, and an
instance_id, which is which is calculated as a maximal instance_id
registered with the same idstr plus one, or zero (if none was registered
previously).

That means, under certain circumstances, one device might try, and even
succeed, to load the state of a different device. This can lead to
trouble.

Let us fail the migration if the above problem is detected during load.

How to reproduce the problem:
1) start qemu-system-s390x making sure you have the following devices
   defined on your command line:
     -device virtio-rng-ccw,id=rng1,devno=fe.0.0001
     -device virtio-rng-ccw,id=rng2,devno=fe.0.0002
2) detach the devices and reattach in reverse order using the monitor:
     (qemu) device_del rng1
     (qemu) device_del rng2
     (qemu) device_add virtio-rng-ccw,id=rng2,devno=fe.0.0002
     (qemu) device_add virtio-rng-ccw,id=rng1,devno=fe.0.0001
3) save the state of the vm into a temporary file and quit QEMU:
     (qemu) migrate "exec:gzip -c > /tmp/tmp_vmstate.gz"
     (qemu) q
4) use your command line from step 1 with
     -incoming "exec:gzip -c -d /tmp/tmp_vmstate.gz"
   appended to reproduce the problem (while trying to to load the saved vm)

CC: qemu-stable@nongnu.org
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-Id: <20170518111405.56947-1-pasic@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 8ed179c937)
* removed context dep on d8d98db5
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:50:49 -05:00
Sameeh Jubran
4921c573c8 e1000e: Fix ICR "Other" causes clear logic
This commit fixes a bug which causes the guest to hang. The bug was
observed upon a "receive overrun" (bit #6 of the ICR register)
interrupt which could be triggered post migration in a heavy traffic
environment. Even though the "receive overrun" bit (#6) is masked out
by the IMS register (refer to the log below) the driver still receives
an interrupt as the "receive overrun" bit (#6) causes the "Other" -
bit #24 of the ICR register - bit to be set as documented below. The
driver handles the interrupt and clears the "Other" bit (#24) but
doesn't clear the "receive overrun" bit (#6) which leads to an
infinite loop. Apparently the Windows driver expects that the "receive
overrun" bit and other ones - documented below - to be cleared when
the "Other" bit (#24) is cleared.

So to sum that up:
1. Bit #6 of the ICR register is set by heavy traffic
2. As a results of setting bit #6, bit #24 is set
3. The driver receives an interrupt for bit 24 (it doesn't receieve an
   interrupt for bit #6 as it is masked out by IMS)
4. The driver handles and clears the interrupt of bit #24
5. Bit #6 is still set.
6. 2 happens all over again

The Interrupt Cause Read - ICR register:

The ICR has the "Other" bit - bit #24 - that is set when one or more
of the following ICR register's bits are set:

LSC - bit #2, RXO - bit #6, MDAC - bit #9, SRPD - bit #16, ACK - bit
#17, MNG - bit #18

This bug can occur with any of these bits depending on the driver's
behaviour and the way it configures the device. However, trying to
reproduce it with any bit other than RX0 is challenging and came to
failure as the drivers don't implement most of these bits, trying to
reproduce it with LSC (Link Status Change - bit #2) bit didn't succeed
too as it seems that Windows handles this bit differently.

Log sample of the storm:

27563@1494850819.411877:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
27563@1494850819.411900:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.411915:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412380:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412395:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412436:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412441:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
27563@1494850819.412998:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)

* This bug behaviour wasn't observed with the Linux driver.

This commit solves:
https://bugzilla.redhat.com/show_bug.cgi?id=1447935
https://bugzilla.redhat.com/show_bug.cgi?id=1449490

Cc: qemu-stable@nongnu.org
Signed-off-by: Sameeh Jubran <sjubran@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 82342e91b6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:44:08 -05:00
Fam Zheng
952cc38204 virtio-scsi: Unset hotplug handler when unrealize
This matches the qbus_set_hotplug_handler in realize, and it releases
the final reference to the embedded VirtIODevice so that it is
properly finalized.

A use-after-free is fixed with this patch, indirectly:
virtio_device_instance_finalize wasn't called at hot-unplug, and the
vdev->listener would be a dangling pointer in the global and the per
address space listener list. See also RHBZ 1449031.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170518102808.30046-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2cbe2de545)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:42:49 -05:00
Greg Kurz
c6b510d1e5 virtio: allow broken device to notify guest
According to section 2.1.2 of the virtio-1 specification:

"The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that
a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET,
the device MUST send a device configuration change notification to the
driver."

Commit "f5ed36635d8f virtio: stop virtqueue processing if device is broken"
introduced a virtio_error() call that just does that:

- internally mark the device as broken
- set the DEVICE_NEEDS_RESET bit in the status
- send a configuration change notification

Unfortunately, virtio_notify_vector(), called by virtio_notify_config(),
returns right away when the device is marked as broken and the notification
isn't sent in this case.

The spec doesn't say whether a broken device can send notifications
in other situations or not. But since the driver isn't supposed to do
anything but to reset the device, it makes sense to keep the check in
virtio_notify_config().

Marking the device as broken AFTER the configuration change notification was
sent is enough to fix the issue.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 66453cff9e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:42:05 -05:00
Hervé Poussineau
636eacb641 vvfat: fix qemu-img map and qemu-img convert
- bs->total_sectors is the number of sectors of the whole disk
- s->sector_count is the number of sectors of the FAT partition

This fixes the following assert in qemu-img map:
qemu-img.c:2641: get_block_status: Assertion `nb_sectors' failed.

This also fixes an infinite loop in qemu-img convert.

Fixes: 4480e0f924
Fixes: https://bugs.launchpad.net/qemu/+bug/1599539
Cc: qemu-stable@nongnu.org
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 139921aaa7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:35:15 -05:00
Alberto Garcia
c60a8ed89b stream: fix crash in stream_start() when block_job_create() fails
The code that tries to reopen a BlockDriverState in stream_start()
when the creation of a new block job fails crashes because it attempts
to dereference a pointer that is known to be NULL.

This is a regression introduced in a170a91fd3,
likely because the code was copied from stream_complete().

Cc: qemu-stable@nongnu.org
Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 525989a50a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:33:26 -05:00
Paolo Bonzini
c79bef68c4 curl: avoid recursive locking of BDRVCURLState mutex
The curl driver has a ugly hack where, if it cannot find an empty CURLState,
it just uses aio_poll to wait for one to be empty.  This is probably
buggy when used together with dataplane, and the simplest way to fix it
is to use coroutines instead.

A more immediate effect of the bug however is that it can cause a
recursive call to curl_readv_bh_cb and recursively taking the
BDRVCURLState mutex.  This causes a deadlock.

The fix is to unlock the mutex around aio_poll, but for cleanliness we
should also take the mutex around all calls to curl_init_state, even if
reaching the unlock/lock pair is impossible.  The same is true for
curl_clean_state.

Reported-by: Kun Wei <kuwei@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170515100059.15795-4-pbonzini@redhat.com
Cc: qemu-stable@nongnu.org
Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 456af34629)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:32:58 -05:00
Paolo Bonzini
4b519b9fd7 curl: never invoke callbacks with s->mutex held
All curl callbacks go through curl_multi_do, and hence are called with
s->mutex held.  Note that with comments, and make curl_read_cb drop the
lock before invoking the callback.

Likewise for curl_find_buf, where the callback can be invoked by the
caller.

Cc: qemu-stable@nongnu.org
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-3-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 34db05e7ff)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:32:52 -05:00
Paolo Bonzini
f00c08cbac curl: strengthen assertion in curl_clean_state
curl_clean_state should only be called after all AIOCBs have been
completed.  This is not so obvious for the call from curl_detach_aio_context,
so assert that.

Cc: qemu-stable@nongnu.org
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170515100059.15795-2-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 675a775633)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:32:46 -05:00
Max Filippov
d81db0beca target/xtensa: fix return value of read/write simcalls
Return value of read/write simcalls is not calculated correctly in case
of operations crossing page boundary and in case of short reads/writes.
Read and write simcalls should return the size of data actually
read/written or -1 in case of error.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit 347ec03093)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:32:06 -05:00
Max Filippov
e442253479 target/xtensa: fix mapping direction in read/write simcalls
Read and write simcalls map physical memory to access I/O buffers, but
'read' simcall need to map it for writing and 'write' simcall need to
map it for reading, i.e. the opposite of what they do now. Fix that.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit 30c2afd151)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:31:52 -05:00
John Snow
af8ca55a6b blockdev: use drained_begin/end for qmp_block_resize
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1447551

If one tries to issue a block_resize while a guest is busy
accessing the disk, it is possible that qemu may deadlock
when invoking aio_poll from both the main loop and the iothread.

Replace another instance of bdrv_drain_all that doesn't
quite belong.

Cc: qemu-stable@nongnu.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 698bdfa07d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:29:55 -05:00
Max Reitz
5797a36abd block: Add errp to b{lk,drv}_truncate()
For one thing, this allows us to drop the error message generation from
qemu-img.c and blockdev.c and instead have it unified in
bdrv_truncate().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-3-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit ed3d2ec98a)
* prereq for 698bdfa
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:29:55 -05:00
Max Reitz
73aa7ad7d2 block/vhdx: Make vhdx_create() always set errp
This patch makes vhdx_create() always set errp in case of an error. It
also adds errp parameters to vhdx_create_bat() and
vhdx_create_new_region_table() so we can pass on the error object
generated by blk_truncate() as of a future commit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170328205129.15138-2-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 55b9392b98)
* prereq for 698bdfa
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-08-03 14:29:10 -05:00
Anton Nefedov
d8cddcc2b9 qemu-img: wait for convert coroutines to complete
On error path (like i/o error in one of the coroutines), it's required to
  - wait for coroutines completion before cleaning the common structures
  - reenter dependent coroutines so they ever finish

Introduced in 2d9187bc65.

Cc: qemu-stable@nongnu.org
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b91127edd0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 17:23:52 -05:00
Stefan Hajnoczi
ce119247a1 aio: add missing aio_notify() to aio_enable_external()
The main loop uses aio_disable_external()/aio_enable_external() to
temporarily disable processing of external AioContext clients like
device emulation.

This allows monitor commands to quiesce I/O and prevent the guest from
submitting new requests while a monitor command is in progress.

The aio_enable_external() API is currently broken when an IOThread is in
aio_poll() waiting for fd activity when the main loop re-enables
external clients.  Incrementing ctx->external_disable_cnt does not wake
the IOThread from ppoll(2) so fd processing remains suspended and leads
to unresponsive emulated devices.

This patch adds an aio_notify() call to aio_enable_external() so the
IOThread is kicked out of ppoll(2) and will re-arm the file descriptors.

The bug can be reproduced as follows:

  $ qemu -M accel=kvm -m 1024 \
         -object iothread,id=iothread0 \
         -device virtio-scsi-pci,iothread=iothread0,id=virtio-scsi-pci0 \
         -drive if=none,id=drive0,aio=native,cache=none,format=raw,file=test.img \
         -device scsi-hd,id=scsi-hd0,drive=drive0 \
         -qmp tcp::5555,server,nowait

  $ scripts/qmp/qmp-shell localhost:5555
  (qemu) blockdev-snapshot-sync device=drive0 snapshot-file=sn1.qcow2
         mode=absolute-paths format=qcow2

After blockdev-snapshot-sync completes the SCSI disk will be
unresponsive.  This leads to request timeouts inside the guest.

Reported-by: Qianqian Zhu <qizhu@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170508180705.20609-1-stefanha@redhat.com
Suggested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 321d1dba8b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 17:21:41 -05:00
Zhiyong Yang
0e727a207e hw/virtio: fix vhost user fails to startup when MQ
Qemu2.7~2.9 and vhost user for dpdk 17.02 release work together
to cause failures of new connection when negotiating to set MQ.
(one queue pair works well).
   Because there exist some bugs in qemu code when introducing
VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. When vhost_user_set_mem_table
is invoked to deal with the vhost message VHOST_USER_SET_MEM_TABLE
for the second time, qemu indeed doesn't send the messge (The message
needs to be sent only once)but still will be waiting for dpdk's reply
ack, then, qemu is always freezing, while DPDK is always waiting for
next vhost message from qemu.
  The patch aims to fix the bug, MQ can work well.
  The same bug is found in function vhost_user_net_set_mtu, it is fixed
at the same time.
  DPDK related patch is as following:
  http://www.dpdk.org/dev/patchwork/patch/23955/

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Cc: qemu-stable@nongnu.org
Fixes: ca525ce561 ("vhost-user: Introduce a new protocol feature REPLY_ACK.")
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jens Freimann <jfreiman@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit 60cd11024f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 17:19:32 -05:00
Fam Zheng
d2fcb92b18 block: Reuse bs as backing hd for drive-backup sync=none
Opening the backing image for the second time is bad, especially here
when it is also in use as the active image as the source. The
drive-backup job itself doesn't read from target->backing for COW,
instead it gets data from the write notifier, so it's not a big problem.
However, exporting the target to NBD etc. won't work, because of the
likely stale metadata cache.

Use BDRV_O_NO_BACKING in this case and manually set up the backing
BdrvChild.

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit fc0932fdcf)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 17:11:50 -05:00
Eric Blake
e59084b5b2 qobject: Use simpler QDict/QList scalar insertion macros
We now have macros in place to make it less verbose to add a scalar
to QDict and QList, so use them.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then touched up manually to fix a couple of '?:' back to original
spacing, as well as avoiding a long line in monitor.c.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-7-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit 46f5ac205a)
* prereq for fc0932f
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 17:11:11 -05:00
Eric Blake
1eaf431077 s390x: Drop useless casts
An upcoming Coccinelle cleanup script wanted to reformat the casts
present in this file - but on closer look, we don't need the casts
at all because C automatically converts void* to any other pointer.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170405194741.18956-4-eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit cb55c19a26)
* prereq for 46f5ac2
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 17:09:19 -05:00
Eric Blake
396474a18c qobject: Add helper macros for common scalar insertions
Rather than making lots of callers wrap a scalar in a QInt, QString,
or QBool, provide helper macros that do the wrapping automatically.

Update the Coccinelle script to make mass conversions easy, although
the conversion itself will be done as a separate patches to ease
review and backport efforts.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-6-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit a92c21591b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 17:07:14 -05:00
Eric Blake
3f308bf3ac qobject: Drop useless QObject casts
We have macros in place to make it less verbose to add a subtype
of QObject to both QDict and QList. While we have made cleanups
like this in the past (see commit fcfcd8ffc, for example), having
it be automated by Coccinelle makes it easier to maintain.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then I verified that no manual touchups were required.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-5-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit de6e7951fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 17:07:06 -05:00
Eric Blake
21047240d0 coccinelle: Add script to remove useless QObject casts
We have macros in place to make it less verbose to add a subtype
of QObject to both QDict and QList. While we have made cleanups
like this in the past (see commit fcfcd8ffc, for example), having
it be automated by Coccinelle makes it easier to maintain.

The script is separate from the cleanups, for ease of review and
backporting.  A later patch will then add further possible cleanups.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-4-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit a2f3453ebc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 17:06:58 -05:00
Greg Kurz
785d9ab216 9pfs: local: fix unlink of alien files in mapped-file mode
When trying to remove a file from a directory, both created in non-mapped
mode, the file remains and EBADF is returned to the guest.

This is a regression introduced by commit "df4938a6651b 9pfs: local:
unlinkat: don't follow symlinks" when fixing CVE-2016-9602. It changed the
way we unlink the metadata file from

    ret = remove("$dir/.virtfs_metadata/$name");
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

to

    fd = openat("$dir/.virtfs_metadata")
    unlinkat(fd, "$name")
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

If $dir was created in non-mapped mode, openat() fails with ENOENT and
we pass -1 to unlinkat(), which fails in turn with EBADF.

We just need to check the return of openat() and ignore ENOENT, in order
to restore the behaviour we had with remove().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[groug: rewrote the comments as suggested by Eric]

(cherry picked from commit 6a87e7929f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:56:59 -05:00
Markus Armbruster
45b3eac752 replication: Make --disable-replication compile again
Broken in commit daa33c5.

Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Message-id: 1493298053-17140-1-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 38bb54f323)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:55:42 -05:00
Bruce Rogers
c64d184584 ACPI: don't call acpi_pcihp_device_plug_cb on xen
Commit f0c9d64a exposed the issue that with a xenfv machine using
pci passthrough, acpi pci hotplug code was being executed by mistake.
Guard calls to acpi_pcihp_device_plug_cb (and corresponding
acpi_pcihp_device_unplug_cb) with a check for xen_enabled(). Without
this check I am seeing an error that the bus doesn't have the
acpi-pcihp-bsel property set.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 153eba4726)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:55:12 -05:00
Max Reitz
c1059a3a3c block: Do not unref bs->file on error in BD's open
The block layer takes care of removing the bs->file child if the block
driver's bdrv_open()/bdrv_file_open() implementation fails. The block
driver therefore does not need to do so, and indeed should not unless it
sets bs->file to NULL afterwards -- because if this is not done, the
bdrv_unref_child() in bdrv_open_inherit() will dereference the freed
memory block at bs->file afterwards, which is not good.

We can now decide whether to add a "bs->file = NULL;" after each of the
offending bdrv_unref_child() invocations, or just drop them altogether.
The latter is simpler, so let's do that.

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit de234897b6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:44:54 -05:00
Herongguang (Stephen)
0b906e4d0a pci: deassert intx when pci device unrealize
If a pci device is not reset by VM (by writing into config space)
and unplugged by VM, after that when VM reboots, qemu may assert:
pcibus_reset: Assertion `bus->irq_count[i] == 0' failed

Cc: qemu-stable@nongnu.org
Signed-off-by: herongguang <herongguang.he@huawei.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3936161f1f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:43:34 -05:00
Daniel P. Berrange
181e005f14 migration: setup bi-directional I/O channel for exec: protocol
Historically the migration data channel has only needed to be
unidirectional. Thus the 'exec:' protocol was requesting an
I/O channel with O_RDONLY on incoming side, and O_WRONLY on
the outgoing side.

This is fine for classic migration, but if you then try to run
TLS over it, this fails because the TLS handshake requires a
bi-directional channel.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 062d81f0e9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:42:57 -05:00
Max Reitz
b8420f7102 iotests/051: Add test for empty filename
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 42dc10f17a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:23:33 -05:00
Max Reitz
bd1039b463 block: An empty filename counts as no filename
Reproducer:
    $ ./qemu-img info ''
    qemu-img: ./block.c:1008: bdrv_open_driver: Assertion
        `!drv->bdrv_needs_filename || bs->filename[0]' failed.
    [1]    26105 abort (core dumped)  ./qemu-img info ''

This patch fixes this to be:
    $ ./qemu-img info ''
    qemu-img: Could not open '': The 'file' block driver requires a file
    name

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 4a0082401a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:23:04 -05:00
Max Reitz
bc70597a48 qemu-img/convert: Move bs_n > 1 && -B check down
It does not make much sense to use a backing image for the target when
you concatenate multiple images (because then there is no correspondence
between the source images' backing files and the target's); but it was
still possible to give one by using -o backing_file=X instead of -B X.

Fix this by moving the check.

(Also, change the error message because -B is not the only way to
 specify the backing file, evidently.)

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
* applied patch from v1 of series as suggested by author
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:22:00 -05:00
Max Reitz
a1c850fd9d qemu-img/convert: Use @opts for one thing only
After storing the creation options for the new image into @opts, we
fetch some things for our own information, like the backing file name,
or whether to use encryption or preallocation.

With the -n parameter, there will not be any creation options; this is
not too bad because this just means that querying a NULL @opts will
always return the default value.

However, we also use @opts for the --object options. Therefore, @opts is
not necessarily NULL if -n was specified; instead, it may contain those
options. In practice, this probably does not cause any problems because
there most likely is no object that supports any of the parameters we
query here, but this is neither something we should rely on nor does
this variable reuse make the code very nice to read.

Therefore, just use an own variable for the --object options.

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
* applied patch from v1 of series as suggested by author
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 16:20:06 -05:00
Max Reitz
c37a62b751 qemu-img/convert: Always set ret < 0 on error
Otherwise the qemu-img process will exit with EXIT_SUCCESS instead of
EXIT_FAILURE.

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* applied directly to stable, upstream code has issue fixed via a
  refactoring introduced by 9fd77f9, which isn't targetted for stable
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 15:55:25 -05:00
Eric Blake
4aa16db9cf dirty-bitmap: Report BlockDirtyInfo.count in bytes, as documented
We've been documenting the value in bytes since its introduction
in commit b9a9b3a4 (v1.3), where it was actually reported in bytes.

Commit e4654d2 (v2.0) then removed things from block/qapi.c, in
preparation for a rewrite to a list of dirty sectors in the next
commit 21b5683 in block.c, but the new code mistakenly started
reporting in sectors.

Fixes: https://bugzilla.redhat.com/1441460

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6c98c57af3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 15:33:09 -05:00
Sameeh Jubran
27dd31f164 qga-win: Enable 'can-offline' field in 'guest-get-vcpus' reply
The QGA schema states:

@can-offline: Whether offlining the VCPU is possible. This member
               is always filled in by the guest agent when the structure
               is returned, and always ignored on input (hence it can be
               omitted then).

Currently 'can-offline' is missing entirely from the reply. This causes
errors in libvirt which is expecting the reply to be compliant with the
schema docs.

BZ#1438735: https://bugzilla.redhat.com/show_bug.cgi?id=1438735

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 54858553de)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-07-31 15:25:56 -05:00
3040 changed files with 117181 additions and 353732 deletions

View File

@@ -1,15 +0,0 @@
# http://editorconfig.org
root = true
[*]
end_of_line = lf
insert_final_newline = true
charset = utf-8
[Makefile*]
indent_style = tab
indent_size = 8
[*.{c,h}]
indent_style = space
indent_size = 4

View File

@@ -1,8 +0,0 @@
# GDB may have ./.gdbinit loading disabled by default. In that case you can
# follow the instructions it prints. They boil down to adding the following to
# your home directory's ~/.gdbinit file:
#
# add-auto-load-safe-path /path/to/qemu/.gdbinit
# Load QEMU-specific sub-commands and settings
source scripts/qemu-gdb.py

107
.gitignore vendored
View File

@@ -14,8 +14,6 @@
/trace/generated-tcg-tracers.h
/ui/shader/texture-blit-frag.h
/ui/shader/texture-blit-vert.h
/ui/shader/texture-blit-flip-vert.h
/ui/input-keymap-*.c
*-timestamp
/*-softmmu
/*-darwin-user
@@ -27,79 +25,13 @@
/libuser
/linux-headers/asm
/qga/qapi-generated
/qapi-gen-timestamp
/qapi/qapi-builtin-types.[ch]
/qapi/qapi-builtin-visit.[ch]
/qapi/qapi-commands-block-core.[ch]
/qapi/qapi-commands-block.[ch]
/qapi/qapi-commands-char.[ch]
/qapi/qapi-commands-common.[ch]
/qapi/qapi-commands-crypto.[ch]
/qapi/qapi-commands-introspect.[ch]
/qapi/qapi-commands-migration.[ch]
/qapi/qapi-commands-misc.[ch]
/qapi/qapi-commands-net.[ch]
/qapi/qapi-commands-rocker.[ch]
/qapi/qapi-commands-run-state.[ch]
/qapi/qapi-commands-sockets.[ch]
/qapi/qapi-commands-tpm.[ch]
/qapi/qapi-commands-trace.[ch]
/qapi/qapi-commands-transaction.[ch]
/qapi/qapi-commands-ui.[ch]
/qapi/qapi-commands.[ch]
/qapi/qapi-events-block-core.[ch]
/qapi/qapi-events-block.[ch]
/qapi/qapi-events-char.[ch]
/qapi/qapi-events-common.[ch]
/qapi/qapi-events-crypto.[ch]
/qapi/qapi-events-introspect.[ch]
/qapi/qapi-events-migration.[ch]
/qapi/qapi-events-misc.[ch]
/qapi/qapi-events-net.[ch]
/qapi/qapi-events-rocker.[ch]
/qapi/qapi-events-run-state.[ch]
/qapi/qapi-events-sockets.[ch]
/qapi/qapi-events-tpm.[ch]
/qapi/qapi-events-trace.[ch]
/qapi/qapi-events-transaction.[ch]
/qapi/qapi-events-ui.[ch]
/qapi/qapi-events.[ch]
/qapi/qapi-introspect.[ch]
/qapi/qapi-types-block-core.[ch]
/qapi/qapi-types-block.[ch]
/qapi/qapi-types-char.[ch]
/qapi/qapi-types-common.[ch]
/qapi/qapi-types-crypto.[ch]
/qapi/qapi-types-introspect.[ch]
/qapi/qapi-types-migration.[ch]
/qapi/qapi-types-misc.[ch]
/qapi/qapi-types-net.[ch]
/qapi/qapi-types-rocker.[ch]
/qapi/qapi-types-run-state.[ch]
/qapi/qapi-types-sockets.[ch]
/qapi/qapi-types-tpm.[ch]
/qapi/qapi-types-trace.[ch]
/qapi/qapi-types-transaction.[ch]
/qapi/qapi-types-ui.[ch]
/qapi/qapi-types.[ch]
/qapi/qapi-visit-block-core.[ch]
/qapi/qapi-visit-block.[ch]
/qapi/qapi-visit-char.[ch]
/qapi/qapi-visit-common.[ch]
/qapi/qapi-visit-crypto.[ch]
/qapi/qapi-visit-introspect.[ch]
/qapi/qapi-visit-migration.[ch]
/qapi/qapi-visit-misc.[ch]
/qapi/qapi-visit-net.[ch]
/qapi/qapi-visit-rocker.[ch]
/qapi/qapi-visit-run-state.[ch]
/qapi/qapi-visit-sockets.[ch]
/qapi/qapi-visit-tpm.[ch]
/qapi/qapi-visit-trace.[ch]
/qapi/qapi-visit-transaction.[ch]
/qapi/qapi-visit-ui.[ch]
/qapi/qapi-visit.[ch]
/qapi/qapi-doc.texi
/qapi-generated
/qapi-types.[ch]
/qapi-visit.[ch]
/qapi-event.[ch]
/qmp-commands.h
/qmp-introspect.[ch]
/qmp-marshal.c
/qemu-doc.html
/qemu-doc.info
/qemu-doc.txt
@@ -112,17 +44,13 @@
/qemu-io
/qemu-ga
/qemu-bridge-helper
/qemu-keymap
/qemu-monitor.texi
/qemu-monitor-info.texi
/qemu-version.h
/qemu-version.h.tmp
/module_block.h
/scsi/qemu-pr-helper
/vhost-user-scsi
/vhost-user-blk
/vscclient
/fsdev/virtfs-proxy-helper
*.tmp
*.[1-9]
*.a
*.aux
@@ -171,25 +99,22 @@
/pc-bios/optionrom/kvmvapic.img
/pc-bios/s390-ccw/s390-ccw.elf
/pc-bios/s390-ccw/s390-ccw.img
/docs/interop/qemu-ga-qapi.texi
/docs/interop/qemu-ga-ref.html
/docs/interop/qemu-ga-ref.info*
/docs/interop/qemu-ga-ref.txt
/docs/interop/qemu-qmp-qapi.texi
/docs/interop/qemu-qmp-ref.html
/docs/interop/qemu-qmp-ref.info*
/docs/interop/qemu-qmp-ref.txt
/docs/qemu-ga-qapi.texi
/docs/qemu-ga-ref.html
/docs/qemu-ga-ref.info*
/docs/qemu-ga-ref.txt
/docs/qemu-qmp-qapi.texi
/docs/qemu-qmp-ref.html
/docs/qemu-qmp-ref.info*
/docs/qemu-qmp-ref.txt
/docs/version.texi
*.tps
.stgit-*
.git-submodule-status
cscope.*
tags
TAGS
docker-src.*
*~
*.ast_raw
*.depend_raw
trace.h
trace.c
trace-ust.h

18
.gitmodules vendored
View File

@@ -22,6 +22,9 @@
[submodule "roms/sgabios"]
path = roms/sgabios
url = git://git.qemu-project.org/sgabios.git
[submodule "pixman"]
path = pixman
url = git://anongit.freedesktop.org/pixman
[submodule "dtc"]
path = dtc
url = git://git.qemu-project.org/dtc.git
@@ -31,18 +34,3 @@
[submodule "roms/skiboot"]
path = roms/skiboot
url = git://git.qemu.org/skiboot.git
[submodule "roms/QemuMacDrivers"]
path = roms/QemuMacDrivers
url = git://git.qemu.org/QemuMacDrivers.git
[submodule "ui/keycodemapdb"]
path = ui/keycodemapdb
url = git://git.qemu.org/keycodemapdb.git
[submodule "capstone"]
path = capstone
url = git://git.qemu.org/capstone.git
[submodule "roms/seabios-hppa"]
path = roms/seabios-hppa
url = git://github.com/hdeller/seabios-hppa.git
[submodule "roms/u-boot-sam460ex"]
path = roms/u-boot-sam460ex
url = git://github.com/zbalaton/u-boot-sam460ex

View File

@@ -1,51 +0,0 @@
#
# Common git-publish profiles that can be used to send patches to QEMU upstream.
#
# See https://github.com/stefanha/git-publish for more information
#
[gitpublishprofile "default"]
base = master
to = qemu-devel@nongnu.org
cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
[gitpublishprofile "rfc"]
base = master
prefix = RFC PATCH
to = qemu-devel@nongnu.org
cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
[gitpublishprofile "stable"]
base = master
to = qemu-devel@nongnu.org
cc = qemu-stable@nongnu.org
cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
[gitpublishprofile "trivial"]
base = master
to = qemu-devel@nongnu.org
cc = qemu-trivial@nongnu.org
cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
[gitpublishprofile "block"]
base = master
to = qemu-devel@nongnu.org
cc = qemu-block@nongnu.org
cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
[gitpublishprofile "arm"]
base = master
to = qemu-devel@nongnu.org
cc = qemu-arm@nongnu.org
cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
[gitpublishprofile "s390"]
base = master
to = qemu-devel@nongnu.org
cc = qemu-s390@nongnu.org
cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null
[gitpublishprofile "ppc"]
base = master
to = qemu-devel@nongnu.org
cc = qemu-ppc@nongnu.org
cccmd = scripts/get_maintainer.pl --noroles --norolestats --nogit --nogit-fallback 2>/dev/null

View File

@@ -8,17 +8,10 @@ Aurelien Jarno <aurelien@aurel32.net> aurel32 <aurel32@c046a42c-6fe2-441c-8c8c-7
Blue Swirl <blauwirbel@gmail.com> blueswir1 <blueswir1@c046a42c-6fe2-441c-8c8c-71466251a162>
Edgar E. Iglesias <edgar.iglesias@gmail.com> edgar_igl <edgar_igl@c046a42c-6fe2-441c-8c8c-71466251a162>
Fabrice Bellard <fabrice@bellard.org> bellard <bellard@c046a42c-6fe2-441c-8c8c-71466251a162>
James Hogan <jhogan@kernel.org> <james.hogan@imgtec.com>
Jocelyn Mayer <l_indien@magic.fr> j_mayer <j_mayer@c046a42c-6fe2-441c-8c8c-71466251a162>
Paul Brook <paul@codesourcery.com> pbrook <pbrook@c046a42c-6fe2-441c-8c8c-71466251a162>
Paul Burton <paul.burton@mips.com> <paul.burton@imgtec.com>
Paul Burton <paul.burton@mips.com> <paul@archlinuxmips.org>
Thiemo Seufer <ths@networkno.de> ths <ths@c046a42c-6fe2-441c-8c8c-71466251a162>
malc <av1474@comtv.ru> malc <malc@c046a42c-6fe2-441c-8c8c-71466251a162>
# There is also a:
# (no author) <(no author)@c046a42c-6fe2-441c-8c8c-71466251a162>
# for the cvs2svn initialization commit e63c3dc74bf.
#
# Also list preferred name forms where people have changed their
# git author config
Daniel P. Berrangé <berrange@redhat.com>

View File

@@ -1,33 +1,15 @@
language: c
git:
submodules: false
env:
global:
- LC_ALL=C
matrix:
- IMAGE=debian-amd64
TARGET_LIST=x86_64-softmmu,x86_64-linux-user
- IMAGE=debian-win32-cross
TARGET_LIST=arm-softmmu,i386-softmmu,lm32-softmmu
- IMAGE=debian-win64-cross
TARGET_LIST=aarch64-softmmu,sparc64-softmmu,x86_64-softmmu
- IMAGE=debian-armel-cross
TARGET_LIST=arm-softmmu,arm-linux-user,armeb-linux-user
- IMAGE=debian-armhf-cross
TARGET_LIST=arm-softmmu,arm-linux-user,armeb-linux-user
TARGET_LIST=arm-softmmu,arm-linux-user
- IMAGE=debian-arm64-cross
TARGET_LIST=aarch64-softmmu,aarch64-linux-user
- IMAGE=debian-s390x-cross
TARGET_LIST=s390x-softmmu,s390x-linux-user
- IMAGE=debian-mips-cross
TARGET_LIST=mips-softmmu,mipsel-linux-user
- IMAGE=debian-mips64el-cross
TARGET_LIST=mips64el-softmmu,mips64el-linux-user
- IMAGE=debian-ppc64el-cross
TARGET_LIST=ppc64-softmmu,ppc64-linux-user,ppc64abi32-linux-user
build:
pre_ci:
- make docker-image-${IMAGE} V=1
- make docker-image-${IMAGE}
pre_ci_boot:
image_name: qemu
image_tag: ${IMAGE}
@@ -35,13 +17,5 @@ build:
options: "-e HOME=/root"
ci:
- unset CC
# some targets require newer up to date packages, for example TARGET_LIST matching
# aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu)
# see the configure script:
# error_exit "DTC (libfdt) version >= 1.4.2 not present. Your options:"
# " (1) Preferred: Install the DTC (libfdt) devel package"
# " (2) Fetch the DTC submodule, using:"
# " git submodule update --init dtc"
- dpkg --compare-versions `dpkg-query --showformat='${Version}' --show libfdt-dev` ge 1.4.2 || git submodule update --init dtc
- ./configure ${QEMU_CONFIGURE_OPTS} --target-list=${TARGET_LIST}
- make -j$(($(getconf _NPROCESSORS_ONLN) + 1))
- make -j2

View File

@@ -1,7 +1,7 @@
sudo: false
language: c
python:
- "2.6"
- "2.4"
compiler:
- gcc
cache: ccache
@@ -13,13 +13,12 @@ addons:
- libattr1-dev
- libbrlapi-dev
- libcap-ng-dev
- libgcc-4.8-dev
- libgnutls-dev
- libgtk-3-dev
- libiscsi-dev
- liblttng-ust-dev
- libncurses5-dev
- libnfs-dev
- libncurses5-dev
- libnss3-dev
- libpixman-1-dev
- libpng12-dev
@@ -47,14 +46,13 @@ notifications:
env:
global:
- TEST_CMD="make check"
- MAKEFLAGS="-j3"
matrix:
- CONFIG=""
- CONFIG="--enable-debug --enable-debug-tcg --enable-trace-backends=log"
- CONFIG="--disable-linux-aio --disable-cap-ng --disable-attr --disable-brlapi --disable-uuid --disable-libusb"
- CONFIG="--enable-modules --disable-linux-user"
- CONFIG="--with-coroutine=ucontext --disable-linux-user"
- CONFIG="--with-coroutine=sigaltstack --disable-linux-user"
- CONFIG="--enable-modules"
- CONFIG="--with-coroutine=ucontext"
- CONFIG="--with-coroutine=sigaltstack"
git:
# we want to do this ourselves
submodules: false
@@ -66,7 +64,7 @@ before_install:
before_script:
- ./configure ${CONFIG}
script:
- make ${MAKEFLAGS} && ${TEST_CMD}
- make -j3 && ${TEST_CMD}
matrix:
include:
# Test with CLang for compile portability
@@ -88,7 +86,7 @@ matrix:
- env: CONFIG="--enable-trace-backends=ust"
TEST_CMD=""
compiler: gcc
- env: CONFIG="--disable-tcg"
- env: CONFIG="--with-coroutine=gthread"
TEST_CMD=""
compiler: gcc
- env: CONFIG=""
@@ -116,17 +114,15 @@ matrix:
- sudo apt-get build-dep -qq qemu
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
# Trusty System build with latest stable clang & python 3.0
# Trusty System build with latest stable clang
- sudo: required
addons:
dist: trusty
language: generic
compiler: none
python:
- "3.0"
env:
- COMPILER_NAME=clang CXX=clang++-3.9 CC=clang-3.9
- CONFIG="--disable-linux-user --cc=clang-3.9 --cxx=clang++-3.9 --python=/usr/bin/python3"
- CONFIG="--disable-linux-user --cc=clang-3.9 --cxx=clang++-3.9"
before_install:
- wget -nv -O - http://llvm.org/apt/llvm-snapshot.gpg.key | sudo apt-key add -
- sudo apt-add-repository -y 'deb http://llvm.org/apt/trusty llvm-toolchain-trusty-3.9 main'
@@ -137,17 +133,15 @@ matrix:
- git submodule update --init --recursive
before_script:
- ./configure ${CONFIG} || cat config.log
# Trusty Linux User build with latest stable clang & python 3.6
# Trusty Linux User build with latest stable clang
- sudo: required
addons:
dist: trusty
language: generic
compiler: none
python:
- "3.6"
env:
- COMPILER_NAME=clang CXX=clang++-3.9 CC=clang-3.9
- CONFIG="--disable-system --cc=clang-3.9 --cxx=clang++-3.9 --python=/usr/bin/python3"
- CONFIG="--disable-system --cc=clang-3.9 --cxx=clang++-3.9"
before_install:
- wget -nv -O - http://llvm.org/apt/llvm-snapshot.gpg.key | sudo apt-key add -
- sudo apt-add-repository -y 'deb http://llvm.org/apt/trusty llvm-toolchain-trusty-3.9 main'
@@ -197,7 +191,7 @@ matrix:
compiler: none
env:
- COMPILER_NAME=gcc CXX=g++-5 CC=gcc-5
- CONFIG="--cc=gcc-5 --cxx=g++-5 --disable-pie --disable-linux-user"
- CONFIG="--cc=gcc-5 --cxx=g++-5 --disable-pie --disable-linux-user --with-coroutine=gthread"
- TEST_CMD=""
before_script:
- ./configure ${CONFIG} --extra-cflags="-g3 -O0 -fsanitize=thread -fuse-ld=gold" || cat config.log

View File

@@ -123,38 +123,3 @@ We use traditional C-style /* */ comments and avoid // comments.
Rationale: The // form is valid in C99, so this is purely a matter of
consistency of style. The checkpatch script will warn you about this.
8. trace-events style
8.1 0x prefix
In trace-events files, use a '0x' prefix to specify hex numbers, as in:
some_trace(unsigned x, uint64_t y) "x 0x%x y 0x" PRIx64
An exception is made for groups of numbers that are hexadecimal by
convention and separated by the symbols '.', '/', ':', or ' ' (such as
PCI bus id):
another_trace(int cssid, int ssid, int dev_num) "bus id: %x.%x.%04x"
However, you can use '0x' for such groups if you want. Anyway, be sure that
it is obvious that numbers are in hex, ex.:
data_dump(uint8_t c1, uint8_t c2, uint8_t c3) "bytes (in hex): %02x %02x %02x"
Rationale: hex numbers are hard to read in logs when there is no 0x prefix,
especially when (occasionally) the representation doesn't contain any letters
and especially in one line with other decimal numbers. Number groups are allowed
to not use '0x' because for some things notations like %x.%x.%x are used not
only in Qemu. Also dumping raw data bytes with '0x' is less readable.
8.2 '#' printf flag
Do not use printf flag '#', like '%#x'.
Rationale: there are two ways to add a '0x' prefix to printed number: '0x%...'
and '%#...'. For consistency the only one way should be used. Arguments for
'0x%' are:
- it is more popular
- '%#' omits the 0x for the value 0 which makes output inconsistent

View File

@@ -1,270 +0,0 @@
A. HISTORY OF THE SOFTWARE
==========================
Python was created in the early 1990s by Guido van Rossum at Stichting
Mathematisch Centrum (CWI, see http://www.cwi.nl) in the Netherlands
as a successor of a language called ABC. Guido remains Python's
principal author, although it includes many contributions from others.
In 1995, Guido continued his work on Python at the Corporation for
National Research Initiatives (CNRI, see http://www.cnri.reston.va.us)
in Reston, Virginia where he released several versions of the
software.
In May 2000, Guido and the Python core development team moved to
BeOpen.com to form the BeOpen PythonLabs team. In October of the same
year, the PythonLabs team moved to Digital Creations (now Zope
Corporation, see http://www.zope.com). In 2001, the Python Software
Foundation (PSF, see http://www.python.org/psf/) was formed, a
non-profit organization created specifically to own Python-related
Intellectual Property. Zope Corporation is a sponsoring member of
the PSF.
All Python releases are Open Source (see http://www.opensource.org for
the Open Source Definition). Historically, most, but not all, Python
releases have also been GPL-compatible; the table below summarizes
the various releases.
Release Derived Year Owner GPL-
from compatible? (1)
0.9.0 thru 1.2 1991-1995 CWI yes
1.3 thru 1.5.2 1.2 1995-1999 CNRI yes
1.6 1.5.2 2000 CNRI no
2.0 1.6 2000 BeOpen.com no
1.6.1 1.6 2001 CNRI yes (2)
2.1 2.0+1.6.1 2001 PSF no
2.0.1 2.0+1.6.1 2001 PSF yes
2.1.1 2.1+2.0.1 2001 PSF yes
2.2 2.1.1 2001 PSF yes
2.1.2 2.1.1 2002 PSF yes
2.1.3 2.1.2 2002 PSF yes
2.2.1 2.2 2002 PSF yes
2.2.2 2.2.1 2002 PSF yes
2.2.3 2.2.2 2003 PSF yes
2.3 2.2.2 2002-2003 PSF yes
2.3.1 2.3 2002-2003 PSF yes
2.3.2 2.3.1 2002-2003 PSF yes
2.3.3 2.3.2 2002-2003 PSF yes
2.3.4 2.3.3 2004 PSF yes
2.3.5 2.3.4 2005 PSF yes
2.4 2.3 2004 PSF yes
2.4.1 2.4 2005 PSF yes
2.4.2 2.4.1 2005 PSF yes
2.4.3 2.4.2 2006 PSF yes
2.5 2.4 2006 PSF yes
2.7 2.6 2010 PSF yes
Footnotes:
(1) GPL-compatible doesn't mean that we're distributing Python under
the GPL. All Python licenses, unlike the GPL, let you distribute
a modified version without making your changes open source. The
GPL-compatible licenses make it possible to combine Python with
other software that is released under the GPL; the others don't.
(2) According to Richard Stallman, 1.6.1 is not GPL-compatible,
because its license has a choice of law clause. According to
CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1
is "not incompatible" with the GPL.
Thanks to the many outside volunteers who have worked under Guido's
direction to make these releases possible.
B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON
===============================================================
PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2
--------------------------------------------
1. This LICENSE AGREEMENT is between the Python Software Foundation
("PSF"), and the Individual or Organization ("Licensee") accessing and
otherwise using this software ("Python") in source or binary form and
its associated documentation.
2. Subject to the terms and conditions of this License Agreement, PSF
hereby grants Licensee a nonexclusive, royalty-free, world-wide
license to reproduce, analyze, test, perform and/or display publicly,
prepare derivative works, distribute, and otherwise use Python
alone or in any derivative version, provided, however, that PSF's
License Agreement and PSF's notice of copyright, i.e., "Copyright (c)
2001, 2002, 2003, 2004, 2005, 2006 Python Software Foundation; All Rights
Reserved" are retained in Python alone or in any derivative version
prepared by Licensee.
3. In the event Licensee prepares a derivative work that is based on
or incorporates Python or any part thereof, and wants to make
the derivative work available to others as provided herein, then
Licensee hereby agrees to include in any such work a brief summary of
the changes made to Python.
4. PSF is making Python available to Licensee on an "AS IS"
basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON,
OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
6. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
7. Nothing in this License Agreement shall be deemed to create any
relationship of agency, partnership, or joint venture between PSF and
Licensee. This License Agreement does not grant permission to use PSF
trademarks or trade name in a trademark sense to endorse or promote
products or services of Licensee, or any third party.
8. By copying, installing or otherwise using Python, Licensee
agrees to be bound by the terms and conditions of this License
Agreement.
BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0
-------------------------------------------
BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1
1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an
office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the
Individual or Organization ("Licensee") accessing and otherwise using
this software in source or binary form and its associated
documentation ("the Software").
2. Subject to the terms and conditions of this BeOpen Python License
Agreement, BeOpen hereby grants Licensee a non-exclusive,
royalty-free, world-wide license to reproduce, analyze, test, perform
and/or display publicly, prepare derivative works, distribute, and
otherwise use the Software alone or in any derivative version,
provided, however, that the BeOpen Python License is retained in the
Software, alone or in any derivative version prepared by Licensee.
3. BeOpen is making the Software available to Licensee on an "AS IS"
basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE
SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS
AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY
DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
5. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
6. This License Agreement shall be governed by and interpreted in all
respects by the law of the State of California, excluding conflict of
law provisions. Nothing in this License Agreement shall be deemed to
create any relationship of agency, partnership, or joint venture
between BeOpen and Licensee. This License Agreement does not grant
permission to use BeOpen trademarks or trade names in a trademark
sense to endorse or promote products or services of Licensee, or any
third party. As an exception, the "BeOpen Python" logos available at
http://www.pythonlabs.com/logos.html may be used according to the
permissions granted on that web page.
7. By copying, installing or otherwise using the software, Licensee
agrees to be bound by the terms and conditions of this License
Agreement.
CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1
---------------------------------------
1. This LICENSE AGREEMENT is between the Corporation for National
Research Initiatives, having an office at 1895 Preston White Drive,
Reston, VA 20191 ("CNRI"), and the Individual or Organization
("Licensee") accessing and otherwise using Python 1.6.1 software in
source or binary form and its associated documentation.
2. Subject to the terms and conditions of this License Agreement, CNRI
hereby grants Licensee a nonexclusive, royalty-free, world-wide
license to reproduce, analyze, test, perform and/or display publicly,
prepare derivative works, distribute, and otherwise use Python 1.6.1
alone or in any derivative version, provided, however, that CNRI's
License Agreement and CNRI's notice of copyright, i.e., "Copyright (c)
1995-2001 Corporation for National Research Initiatives; All Rights
Reserved" are retained in Python 1.6.1 alone or in any derivative
version prepared by Licensee. Alternately, in lieu of CNRI's License
Agreement, Licensee may substitute the following text (omitting the
quotes): "Python 1.6.1 is made available subject to the terms and
conditions in CNRI's License Agreement. This Agreement together with
Python 1.6.1 may be located on the Internet using the following
unique, persistent identifier (known as a handle): 1895.22/1013. This
Agreement may also be obtained from a proxy server on the Internet
using the following URL: http://hdl.handle.net/1895.22/1013".
3. In the event Licensee prepares a derivative work that is based on
or incorporates Python 1.6.1 or any part thereof, and wants to make
the derivative work available to others as provided herein, then
Licensee hereby agrees to include in any such work a brief summary of
the changes made to Python 1.6.1.
4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS"
basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1,
OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
6. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
7. This License Agreement shall be governed by the federal
intellectual property law of the United States, including without
limitation the federal copyright law, and, to the extent such
U.S. federal law does not apply, by the law of the Commonwealth of
Virginia, excluding Virginia's conflict of law provisions.
Notwithstanding the foregoing, with regard to derivative works based
on Python 1.6.1 that incorporate non-separable material that was
previously distributed under the GNU General Public License (GPL), the
law of the Commonwealth of Virginia shall govern this License
Agreement only as to issues arising under or with respect to
Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this
License Agreement shall be deemed to create any relationship of
agency, partnership, or joint venture between CNRI and Licensee. This
License Agreement does not grant permission to use CNRI trademarks or
trade name in a trademark sense to endorse or promote products or
services of Licensee, or any third party.
8. By clicking on the "ACCEPT" button where indicated, or by copying,
installing or otherwise using Python 1.6.1, Licensee agrees to be
bound by the terms and conditions of this License Agreement.
ACCEPT
CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2
--------------------------------------------------
Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam,
The Netherlands. All rights reserved.
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Stichting Mathematisch
Centrum or CWI not be used in advertising or publicity pertaining to
distribution of the software without specific, written prior
permission.
STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

View File

@@ -1,6 +1,6 @@
This file documents changes for QEMU releases 0.12 and earlier.
For changelog information for later releases, see
https://wiki.qemu.org/ChangeLog or look at the git history for
http://wiki.qemu-project.org/ChangeLog or look at the git history for
more detailed information.

File diff suppressed because it is too large Load Diff

498
Makefile
View File

@@ -6,13 +6,7 @@ BUILD_DIR=$(CURDIR)
# Before including a proper config-host.mak, assume we are in the source tree
SRC_PATH=.
UNCHECKED_GOALS := %clean TAGS cscope ctags dist \
html info pdf txt \
help check-help print-% \
docker docker-% vm-test vm-build-%
print-%:
@echo '$*=$($*)'
UNCHECKED_GOALS := %clean TAGS cscope ctags docker docker-%
# All following code might depend on configuration variables
ifneq ($(wildcard config-host.mak),)
@@ -20,45 +14,13 @@ ifneq ($(wildcard config-host.mak),)
all:
include config-host.mak
PYTHON_UTF8 = LC_ALL= LANG=C LC_CTYPE=en_US.UTF-8 $(PYTHON)
git-submodule-update:
.PHONY: git-submodule-update
git_module_status := $(shell \
cd '$(SRC_PATH)' && \
GIT="$(GIT)" ./scripts/git-submodule.sh status $(GIT_SUBMODULES); \
echo $$?; \
)
ifeq (1,$(git_module_status))
ifeq (no,$(GIT_UPDATE))
git-submodule-update:
$(call quiet-command, \
echo && \
echo "GIT submodule checkout is out of date. Please run" && \
echo " scripts/git-submodule.sh update $(GIT_SUBMODULES)" && \
echo "from the source directory checkout $(SRC_PATH)" && \
echo && \
exit 1)
else
git-submodule-update:
$(call quiet-command, \
(cd $(SRC_PATH) && GIT="$(GIT)" ./scripts/git-submodule.sh update $(GIT_SUBMODULES)), \
"GIT","$(GIT_SUBMODULES)")
endif
endif
.git-submodule-status: git-submodule-update config-host.mak
# Check that we're not trying to do an out-of-tree build from
# a tree that's been used for an in-tree build.
ifneq ($(realpath $(SRC_PATH)),$(realpath .))
ifneq ($(wildcard $(SRC_PATH)/config-host.mak),)
$(error This is an out of tree build but your source tree ($(SRC_PATH)) \
seems to have been used for an in-tree build. You can fix this by running \
"$(MAKE) distclean && rm -rf *-linux-user *-softmmu" in your source tree)
"make distclean && rm -rf *-linux-user *-softmmu" in your source tree)
endif
endif
@@ -90,78 +52,10 @@ endif
include $(SRC_PATH)/rules.mak
GENERATED_FILES = qemu-version.h config-host.h qemu-options.def
GENERATED_FILES += qapi/qapi-builtin-types.h qapi/qapi-builtin-types.c
GENERATED_FILES += qapi/qapi-types.h qapi/qapi-types.c
GENERATED_FILES += qapi/qapi-types-block-core.h qapi/qapi-types-block-core.c
GENERATED_FILES += qapi/qapi-types-block.h qapi/qapi-types-block.c
GENERATED_FILES += qapi/qapi-types-char.h qapi/qapi-types-char.c
GENERATED_FILES += qapi/qapi-types-common.h qapi/qapi-types-common.c
GENERATED_FILES += qapi/qapi-types-crypto.h qapi/qapi-types-crypto.c
GENERATED_FILES += qapi/qapi-types-introspect.h qapi/qapi-types-introspect.c
GENERATED_FILES += qapi/qapi-types-migration.h qapi/qapi-types-migration.c
GENERATED_FILES += qapi/qapi-types-misc.h qapi/qapi-types-misc.c
GENERATED_FILES += qapi/qapi-types-net.h qapi/qapi-types-net.c
GENERATED_FILES += qapi/qapi-types-rocker.h qapi/qapi-types-rocker.c
GENERATED_FILES += qapi/qapi-types-run-state.h qapi/qapi-types-run-state.c
GENERATED_FILES += qapi/qapi-types-sockets.h qapi/qapi-types-sockets.c
GENERATED_FILES += qapi/qapi-types-tpm.h qapi/qapi-types-tpm.c
GENERATED_FILES += qapi/qapi-types-trace.h qapi/qapi-types-trace.c
GENERATED_FILES += qapi/qapi-types-transaction.h qapi/qapi-types-transaction.c
GENERATED_FILES += qapi/qapi-types-ui.h qapi/qapi-types-ui.c
GENERATED_FILES += qapi/qapi-builtin-visit.h qapi/qapi-builtin-visit.c
GENERATED_FILES += qapi/qapi-visit.h qapi/qapi-visit.c
GENERATED_FILES += qapi/qapi-visit-block-core.h qapi/qapi-visit-block-core.c
GENERATED_FILES += qapi/qapi-visit-block.h qapi/qapi-visit-block.c
GENERATED_FILES += qapi/qapi-visit-char.h qapi/qapi-visit-char.c
GENERATED_FILES += qapi/qapi-visit-common.h qapi/qapi-visit-common.c
GENERATED_FILES += qapi/qapi-visit-crypto.h qapi/qapi-visit-crypto.c
GENERATED_FILES += qapi/qapi-visit-introspect.h qapi/qapi-visit-introspect.c
GENERATED_FILES += qapi/qapi-visit-migration.h qapi/qapi-visit-migration.c
GENERATED_FILES += qapi/qapi-visit-misc.h qapi/qapi-visit-misc.c
GENERATED_FILES += qapi/qapi-visit-net.h qapi/qapi-visit-net.c
GENERATED_FILES += qapi/qapi-visit-rocker.h qapi/qapi-visit-rocker.c
GENERATED_FILES += qapi/qapi-visit-run-state.h qapi/qapi-visit-run-state.c
GENERATED_FILES += qapi/qapi-visit-sockets.h qapi/qapi-visit-sockets.c
GENERATED_FILES += qapi/qapi-visit-tpm.h qapi/qapi-visit-tpm.c
GENERATED_FILES += qapi/qapi-visit-trace.h qapi/qapi-visit-trace.c
GENERATED_FILES += qapi/qapi-visit-transaction.h qapi/qapi-visit-transaction.c
GENERATED_FILES += qapi/qapi-visit-ui.h qapi/qapi-visit-ui.c
GENERATED_FILES += qapi/qapi-commands.h qapi/qapi-commands.c
GENERATED_FILES += qapi/qapi-commands-block-core.h qapi/qapi-commands-block-core.c
GENERATED_FILES += qapi/qapi-commands-block.h qapi/qapi-commands-block.c
GENERATED_FILES += qapi/qapi-commands-char.h qapi/qapi-commands-char.c
GENERATED_FILES += qapi/qapi-commands-common.h qapi/qapi-commands-common.c
GENERATED_FILES += qapi/qapi-commands-crypto.h qapi/qapi-commands-crypto.c
GENERATED_FILES += qapi/qapi-commands-introspect.h qapi/qapi-commands-introspect.c
GENERATED_FILES += qapi/qapi-commands-migration.h qapi/qapi-commands-migration.c
GENERATED_FILES += qapi/qapi-commands-misc.h qapi/qapi-commands-misc.c
GENERATED_FILES += qapi/qapi-commands-net.h qapi/qapi-commands-net.c
GENERATED_FILES += qapi/qapi-commands-rocker.h qapi/qapi-commands-rocker.c
GENERATED_FILES += qapi/qapi-commands-run-state.h qapi/qapi-commands-run-state.c
GENERATED_FILES += qapi/qapi-commands-sockets.h qapi/qapi-commands-sockets.c
GENERATED_FILES += qapi/qapi-commands-tpm.h qapi/qapi-commands-tpm.c
GENERATED_FILES += qapi/qapi-commands-trace.h qapi/qapi-commands-trace.c
GENERATED_FILES += qapi/qapi-commands-transaction.h qapi/qapi-commands-transaction.c
GENERATED_FILES += qapi/qapi-commands-ui.h qapi/qapi-commands-ui.c
GENERATED_FILES += qapi/qapi-events.h qapi/qapi-events.c
GENERATED_FILES += qapi/qapi-events-block-core.h qapi/qapi-events-block-core.c
GENERATED_FILES += qapi/qapi-events-block.h qapi/qapi-events-block.c
GENERATED_FILES += qapi/qapi-events-char.h qapi/qapi-events-char.c
GENERATED_FILES += qapi/qapi-events-common.h qapi/qapi-events-common.c
GENERATED_FILES += qapi/qapi-events-crypto.h qapi/qapi-events-crypto.c
GENERATED_FILES += qapi/qapi-events-introspect.h qapi/qapi-events-introspect.c
GENERATED_FILES += qapi/qapi-events-migration.h qapi/qapi-events-migration.c
GENERATED_FILES += qapi/qapi-events-misc.h qapi/qapi-events-misc.c
GENERATED_FILES += qapi/qapi-events-net.h qapi/qapi-events-net.c
GENERATED_FILES += qapi/qapi-events-rocker.h qapi/qapi-events-rocker.c
GENERATED_FILES += qapi/qapi-events-run-state.h qapi/qapi-events-run-state.c
GENERATED_FILES += qapi/qapi-events-sockets.h qapi/qapi-events-sockets.c
GENERATED_FILES += qapi/qapi-events-tpm.h qapi/qapi-events-tpm.c
GENERATED_FILES += qapi/qapi-events-trace.h qapi/qapi-events-trace.c
GENERATED_FILES += qapi/qapi-events-transaction.h qapi/qapi-events-transaction.c
GENERATED_FILES += qapi/qapi-events-ui.h qapi/qapi-events-ui.c
GENERATED_FILES += qapi/qapi-introspect.c qapi/qapi-introspect.h
GENERATED_FILES += qapi/qapi-doc.texi
GENERATED_FILES += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_FILES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_FILES += qmp-introspect.h
GENERATED_FILES += qmp-introspect.c
GENERATED_FILES += trace/generated-tcg-tracers.h
@@ -190,7 +84,6 @@ endif
GENERATED_FILES += $(TRACE_HEADERS)
GENERATED_FILES += $(TRACE_SOURCES)
GENERATED_FILES += $(BUILD_DIR)/trace-events-all
GENERATED_FILES += .git-submodule-status
trace-group-name = $(shell dirname $1 | sed -e 's/[^a-zA-Z0-9]/_/g')
@@ -298,50 +191,13 @@ trace-dtrace-root.h: trace-dtrace-root.dtrace
trace-dtrace-root.o: trace-dtrace-root.dtrace
KEYCODEMAP_GEN = $(SRC_PATH)/ui/keycodemapdb/tools/keymap-gen
KEYCODEMAP_CSV = $(SRC_PATH)/ui/keycodemapdb/data/keymaps.csv
KEYCODEMAP_FILES = \
ui/input-keymap-atset1-to-qcode.c \
ui/input-keymap-linux-to-qcode.c \
ui/input-keymap-qcode-to-atset1.c \
ui/input-keymap-qcode-to-atset2.c \
ui/input-keymap-qcode-to-atset3.c \
ui/input-keymap-qcode-to-linux.c \
ui/input-keymap-qcode-to-qnum.c \
ui/input-keymap-qcode-to-sun.c \
ui/input-keymap-qnum-to-qcode.c \
ui/input-keymap-usb-to-qcode.c \
ui/input-keymap-win32-to-qcode.c \
ui/input-keymap-x11-to-qcode.c \
ui/input-keymap-xorgevdev-to-qcode.c \
ui/input-keymap-xorgkbd-to-qcode.c \
ui/input-keymap-xorgxquartz-to-qcode.c \
ui/input-keymap-xorgxwin-to-qcode.c \
$(NULL)
GENERATED_FILES += $(KEYCODEMAP_FILES)
ui/input-keymap-%.c: $(KEYCODEMAP_GEN) $(KEYCODEMAP_CSV) $(SRC_PATH)/ui/Makefile.objs
$(call quiet-command,\
stem=$* && src=$${stem%-to-*} dst=$${stem#*-to-} && \
test -e $(KEYCODEMAP_GEN) && \
$(PYTHON) $(KEYCODEMAP_GEN) \
--lang glib2 \
--varname qemu_input_map_$${src}_to_$${dst} \
code-map $(KEYCODEMAP_CSV) $${src} $${dst} \
> $@ || rm -f $@, "GEN", "$@")
$(KEYCODEMAP_GEN): .git-submodule-status
$(KEYCODEMAP_CSV): .git-submodule-status
# Don't try to regenerate Makefile or configure
# We don't generate any of them
Makefile: ;
configure: ;
.PHONY: all clean cscope distclean html info install install-doc \
pdf txt recurse-all dist msi FORCE
pdf txt recurse-all speed test dist msi FORCE
$(call set-vpath, $(SRC_PATH))
@@ -351,9 +207,8 @@ HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
ifdef BUILD_DOCS
DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
DOCS+=docs/interop/qemu-qmp-ref.html docs/interop/qemu-qmp-ref.txt docs/interop/qemu-qmp-ref.7
DOCS+=docs/interop/qemu-ga-ref.html docs/interop/qemu-ga-ref.txt docs/interop/qemu-ga-ref.7
DOCS+=docs/qemu-block-drivers.7
DOCS+=docs/qemu-qmp-ref.html docs/qemu-qmp-ref.txt docs/qemu-qmp-ref.7
DOCS+=docs/qemu-ga-ref.html docs/qemu-ga-ref.txt docs/qemu-ga-ref.7
ifdef CONFIG_VIRTFS
DOCS+=fsdev/virtfs-proxy-helper.1
endif
@@ -361,7 +216,7 @@ else
DOCS=
endif
SUBDIR_MAKEFLAGS=$(if $(V),,--no-print-directory --quiet) BUILD_DIR=$(BUILD_DIR)
SUBDIR_MAKEFLAGS=$(if $(V),,--no-print-directory) BUILD_DIR=$(BUILD_DIR)
SUBDIR_DEVICES_MAK=$(patsubst %, %/config-devices.mak, $(TARGET_DIRS))
SUBDIR_DEVICES_MAK_DEP=$(patsubst %, %-config-devices.mak.d, $(TARGET_DIRS))
@@ -391,7 +246,7 @@ endif
else \
echo "WARNING: $@ out of date.";\
fi; \
echo "Run \"$(MAKE) defconfig\" to regenerate."; \
echo "Run \"make defconfig\" to regenerate."; \
rm $@.tmp; \
fi; \
else \
@@ -414,8 +269,6 @@ dummy := $(call unnest-vars,, \
ivshmem-client-obj-y \
ivshmem-server-obj-y \
libvhost-user-obj-y \
vhost-user-scsi-obj-y \
vhost-user-blk-obj-y \
qga-vss-dll-obj-y \
block-obj-y \
block-obj-m \
@@ -425,13 +278,11 @@ dummy := $(call unnest-vars,, \
io-obj-y \
common-obj-y \
common-obj-m \
ui-obj-y \
ui-obj-m \
audio-obj-y \
audio-obj-m \
trace-obj-y)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/tests/Makefile.include
endif
all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
@@ -475,33 +326,27 @@ $(SOFTMMU_SUBDIR_RULES): config-all-devices.mak
subdir-%:
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C $* V="$(V)" TARGET_DIR="$*/" all,)
subdir-pixman: pixman/Makefile
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C pixman V="$(V)" all,)
pixman/Makefile: $(SRC_PATH)/pixman/configure
(cd pixman; CFLAGS="$(CFLAGS) -fPIC $(extra_cflags) $(extra_ldflags)" $(SRC_PATH)/pixman/configure $(AUTOCONF_HOST) --disable-gtk --disable-shared --enable-static)
$(SRC_PATH)/pixman/configure:
(cd $(SRC_PATH)/pixman; autoreconf -v --install)
DTC_MAKE_ARGS=-I$(SRC_PATH)/dtc VPATH=$(SRC_PATH)/dtc -C dtc V="$(V)" LIBFDT_srcdir=$(SRC_PATH)/dtc/libfdt
DTC_CFLAGS=$(CFLAGS) $(QEMU_CFLAGS)
DTC_CPPFLAGS=-I$(BUILD_DIR)/dtc -I$(SRC_PATH)/dtc -I$(SRC_PATH)/dtc/libfdt
subdir-dtc: .git-submodule-status dtc/libfdt dtc/tests
subdir-dtc:dtc/libfdt dtc/tests
$(call quiet-command,$(MAKE) $(DTC_MAKE_ARGS) CPPFLAGS="$(DTC_CPPFLAGS)" CFLAGS="$(DTC_CFLAGS)" LDFLAGS="$(LDFLAGS)" ARFLAGS="$(ARFLAGS)" CC="$(CC)" AR="$(AR)" LD="$(LD)" $(SUBDIR_MAKEFLAGS) libfdt/libfdt.a,)
dtc/%: .git-submodule-status
dtc/%:
mkdir -p $@
# Overriding CFLAGS causes us to lose defines added in the sub-makefile.
# Not overriding CFLAGS leads to mis-matches between compilation modes.
# Therefore we replicate some of the logic in the sub-makefile.
# Remove all the extra -Warning flags that QEMU uses that Capstone doesn't;
# no need to annoy QEMU developers with such things.
CAP_CFLAGS = $(patsubst -W%,,$(CFLAGS) $(QEMU_CFLAGS))
CAP_CFLAGS += -DCAPSTONE_USE_SYS_DYN_MEM
CAP_CFLAGS += -DCAPSTONE_HAS_ARM
CAP_CFLAGS += -DCAPSTONE_HAS_ARM64
CAP_CFLAGS += -DCAPSTONE_HAS_POWERPC
CAP_CFLAGS += -DCAPSTONE_HAS_X86
subdir-capstone: .git-submodule-status
$(call quiet-command,$(MAKE) -C $(SRC_PATH)/capstone CAPSTONE_SHARED=no BUILDDIR="$(BUILD_DIR)/capstone" CC="$(CC)" AR="$(AR)" LD="$(LD)" RANLIB="$(RANLIB)" CFLAGS="$(CAP_CFLAGS)" $(SUBDIR_MAKEFLAGS) $(BUILD_DIR)/capstone/$(LIBCAPSTONE))
$(SUBDIR_RULES): libqemuutil.a $(common-obj-y) $(chardev-obj-y) \
$(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY))
$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y) $(chardev-obj-y) \
$(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY)) $(trace-obj-y)
ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
# Only keep -O and -g cflags
@@ -520,12 +365,12 @@ Makefile: $(version-obj-y)
######################################################################
# Build libraries
libqemuutil.a: $(util-obj-y) $(trace-obj-y) $(stub-obj-y)
libvhost-user.a: $(libvhost-user-obj-y)
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y)
######################################################################
COMMON_LDADDS = libqemuutil.a
COMMON_LDADDS = $(trace-obj-y) libqemuutil.a libqemustub.a
qemu-img.o: qemu-img-cmds.h
@@ -535,143 +380,69 @@ qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj
qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o $(COMMON_LDADDS)
qemu-keymap$(EXESUF): qemu-keymap.o ui/input-keymap.o $(COMMON_LDADDS)
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/9p-marshal.o fsdev/9p-iov-marshal.o $(COMMON_LDADDS)
fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
scsi/qemu-pr-helper$(EXESUF): scsi/qemu-pr-helper.o scsi/utils.o $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS)
ifdef CONFIG_MPATH
scsi/qemu-pr-helper$(EXESUF): LIBS += -ludev -lmultipath -lmpathpersist
endif
qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"GEN","$@")
qemu-ga$(EXESUF): LIBS = $(LIBS_QGA)
qemu-ga$(EXESUF): QEMU_CFLAGS += -I qga/qapi-generated
qemu-keymap$(EXESUF): LIBS += $(XKBCOMMON_LIBS)
qemu-keymap$(EXESUF): QEMU_CFLAGS += $(XKBCOMMON_CFLAGS)
gen-out-type = $(subst .,-,$(suffix $@))
qapi-py = $(SRC_PATH)/scripts/qapi/commands.py \
$(SRC_PATH)/scripts/qapi/events.py \
$(SRC_PATH)/scripts/qapi/introspect.py \
$(SRC_PATH)/scripts/qapi/types.py \
$(SRC_PATH)/scripts/qapi/visit.py \
$(SRC_PATH)/scripts/qapi/common.py \
$(SRC_PATH)/scripts/qapi/doc.py \
$(SRC_PATH)/scripts/ordereddict.py \
$(SRC_PATH)/scripts/qapi-gen.py
qapi-py = $(SRC_PATH)/scripts/qapi.py $(SRC_PATH)/scripts/ordereddict.py
qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h \
qga/qapi-generated/qga-qapi-visit.c qga/qapi-generated/qga-qapi-visit.h \
qga/qapi-generated/qga-qapi-commands.h qga/qapi-generated/qga-qapi-commands.c \
qga/qapi-generated/qga-qapi-doc.texi: \
qga/qapi-generated/qapi-gen-timestamp ;
qga/qapi-generated/qapi-gen-timestamp: $(SRC_PATH)/qga/qapi-schema.json $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-gen.py \
-o qga/qapi-generated -p "qga-" $<, \
"GEN","$(@:%-timestamp=%)")
@>$@
qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
"GEN","$@")
qga/qapi-generated/qga-qapi-visit.c qga/qapi-generated/qga-qapi-visit.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
"GEN","$@")
qga/qapi-generated/qga-qmp-commands.h qga/qapi-generated/qga-qmp-marshal.c :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
"GEN","$@")
qapi-modules = $(SRC_PATH)/qapi/qapi-schema.json $(SRC_PATH)/qapi/common.json \
qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
$(SRC_PATH)/qapi/char.json \
$(SRC_PATH)/qapi/crypto.json \
$(SRC_PATH)/qapi/introspect.json \
$(SRC_PATH)/qapi/migration.json \
$(SRC_PATH)/qapi/misc.json \
$(SRC_PATH)/qapi/net.json \
$(SRC_PATH)/qapi/rocker.json \
$(SRC_PATH)/qapi/run-state.json \
$(SRC_PATH)/qapi/sockets.json \
$(SRC_PATH)/qapi/tpm.json \
$(SRC_PATH)/qapi/trace.json \
$(SRC_PATH)/qapi/transaction.json \
$(SRC_PATH)/qapi/ui.json
$(SRC_PATH)/qapi/event.json $(SRC_PATH)/qapi/introspect.json \
$(SRC_PATH)/qapi/crypto.json $(SRC_PATH)/qapi/rocker.json \
$(SRC_PATH)/qapi/trace.json
qapi/qapi-builtin-types.c qapi/qapi-builtin-types.h \
qapi/qapi-types.c qapi/qapi-types.h \
qapi/qapi-types-block-core.c qapi/qapi-types-block-core.h \
qapi/qapi-types-block.c qapi/qapi-types-block.h \
qapi/qapi-types-char.c qapi/qapi-types-char.h \
qapi/qapi-types-common.c qapi/qapi-types-common.h \
qapi/qapi-types-crypto.c qapi/qapi-types-crypto.h \
qapi/qapi-types-introspect.c qapi/qapi-types-introspect.h \
qapi/qapi-types-migration.c qapi/qapi-types-migration.h \
qapi/qapi-types-misc.c qapi/qapi-types-misc.h \
qapi/qapi-types-net.c qapi/qapi-types-net.h \
qapi/qapi-types-rocker.c qapi/qapi-types-rocker.h \
qapi/qapi-types-run-state.c qapi/qapi-types-run-state.h \
qapi/qapi-types-sockets.c qapi/qapi-types-sockets.h \
qapi/qapi-types-tpm.c qapi/qapi-types-tpm.h \
qapi/qapi-types-trace.c qapi/qapi-types-trace.h \
qapi/qapi-types-transaction.c qapi/qapi-types-transaction.h \
qapi/qapi-types-ui.c qapi/qapi-types-ui.h \
qapi/qapi-builtin-visit.c qapi/qapi-builtin-visit.h \
qapi/qapi-visit.c qapi/qapi-visit.h \
qapi/qapi-visit-block-core.c qapi/qapi-visit-block-core.h \
qapi/qapi-visit-block.c qapi/qapi-visit-block.h \
qapi/qapi-visit-char.c qapi/qapi-visit-char.h \
qapi/qapi-visit-common.c qapi/qapi-visit-common.h \
qapi/qapi-visit-crypto.c qapi/qapi-visit-crypto.h \
qapi/qapi-visit-introspect.c qapi/qapi-visit-introspect.h \
qapi/qapi-visit-migration.c qapi/qapi-visit-migration.h \
qapi/qapi-visit-misc.c qapi/qapi-visit-misc.h \
qapi/qapi-visit-net.c qapi/qapi-visit-net.h \
qapi/qapi-visit-rocker.c qapi/qapi-visit-rocker.h \
qapi/qapi-visit-run-state.c qapi/qapi-visit-run-state.h \
qapi/qapi-visit-sockets.c qapi/qapi-visit-sockets.h \
qapi/qapi-visit-tpm.c qapi/qapi-visit-tpm.h \
qapi/qapi-visit-trace.c qapi/qapi-visit-trace.h \
qapi/qapi-visit-transaction.c qapi/qapi-visit-transaction.h \
qapi/qapi-visit-ui.c qapi/qapi-visit-ui.h \
qapi/qapi-commands.h qapi/qapi-commands.c \
qapi/qapi-commands-block-core.c qapi/qapi-commands-block-core.h \
qapi/qapi-commands-block.c qapi/qapi-commands-block.h \
qapi/qapi-commands-char.c qapi/qapi-commands-char.h \
qapi/qapi-commands-common.c qapi/qapi-commands-common.h \
qapi/qapi-commands-crypto.c qapi/qapi-commands-crypto.h \
qapi/qapi-commands-introspect.c qapi/qapi-commands-introspect.h \
qapi/qapi-commands-migration.c qapi/qapi-commands-migration.h \
qapi/qapi-commands-misc.c qapi/qapi-commands-misc.h \
qapi/qapi-commands-net.c qapi/qapi-commands-net.h \
qapi/qapi-commands-rocker.c qapi/qapi-commands-rocker.h \
qapi/qapi-commands-run-state.c qapi/qapi-commands-run-state.h \
qapi/qapi-commands-sockets.c qapi/qapi-commands-sockets.h \
qapi/qapi-commands-tpm.c qapi/qapi-commands-tpm.h \
qapi/qapi-commands-trace.c qapi/qapi-commands-trace.h \
qapi/qapi-commands-transaction.c qapi/qapi-commands-transaction.h \
qapi/qapi-commands-ui.c qapi/qapi-commands-ui.h \
qapi/qapi-events.c qapi/qapi-events.h \
qapi/qapi-events-block-core.c qapi/qapi-events-block-core.h \
qapi/qapi-events-block.c qapi/qapi-events-block.h \
qapi/qapi-events-char.c qapi/qapi-events-char.h \
qapi/qapi-events-common.c qapi/qapi-events-common.h \
qapi/qapi-events-crypto.c qapi/qapi-events-crypto.h \
qapi/qapi-events-introspect.c qapi/qapi-events-introspect.h \
qapi/qapi-events-migration.c qapi/qapi-events-migration.h \
qapi/qapi-events-misc.c qapi/qapi-events-misc.h \
qapi/qapi-events-net.c qapi/qapi-events-net.h \
qapi/qapi-events-rocker.c qapi/qapi-events-rocker.h \
qapi/qapi-events-run-state.c qapi/qapi-events-run-state.h \
qapi/qapi-events-sockets.c qapi/qapi-events-sockets.h \
qapi/qapi-events-tpm.c qapi/qapi-events-tpm.h \
qapi/qapi-events-trace.c qapi/qapi-events-trace.h \
qapi/qapi-events-transaction.c qapi/qapi-events-transaction.h \
qapi/qapi-events-ui.c qapi/qapi-events-ui.h \
qapi/qapi-introspect.h qapi/qapi-introspect.c \
qapi/qapi-doc.texi: \
qapi-gen-timestamp ;
qapi-gen-timestamp: $(qapi-modules) $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-gen.py \
-o "qapi" -b $<, \
"GEN","$(@:%-timestamp=%)")
@>$@
qapi-types.c qapi-types.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o "." -b $<, \
"GEN","$@")
qapi-visit.c qapi-visit.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o "." -b $<, \
"GEN","$@")
qapi-event.c qapi-event.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-event.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-event.py \
$(gen-out-type) -o "." $<, \
"GEN","$@")
qmp-commands.h qmp-marshal.c :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o "." $<, \
"GEN","$@")
qmp-introspect.h qmp-introspect.c :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-introspect.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-introspect.py \
$(gen-out-type) -o "." $<, \
"GEN","$@")
QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qapi-commands.h)
$(qga-obj-y): $(QGALIB_GEN)
QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qmp-commands.h)
$(qga-obj-y) qemu-ga.o: $(QGALIB_GEN)
qemu-ga$(EXESUF): $(qga-obj-y) $(COMMON_LDADDS)
$(call LINK, $^)
@@ -698,16 +469,10 @@ ifneq ($(EXESUF),)
qemu-ga: qemu-ga$(EXESUF) $(QGA_VSS_PROVIDER) $(QEMU_GA_MSI)
endif
ifdef CONFIG_IVSHMEM
ivshmem-client$(EXESUF): $(ivshmem-client-obj-y) $(COMMON_LDADDS)
$(call LINK, $^)
ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) $(COMMON_LDADDS)
$(call LINK, $^)
endif
vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a
$(call LINK, $^)
vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y) libvhost-user.a
$(call LINK, $^)
module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
$(call quiet-command,$(PYTHON) $< $@ \
@@ -721,14 +486,14 @@ clean:
rm -f *.msi
find . \( -name '*.so' -o -name '*.dll' -o -name '*.mo' -o -name '*.[oda]' \) -type f -exec rm {} +
rm -f $(filter-out %.tlb,$(TOOLS)) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -f fsdev/*.pod scsi/*.pod
rm -f fsdev/*.pod
rm -f qemu-img-cmds.h
rm -f ui/shader/*-vert.h ui/shader/*-frag.h
@# May not be present in GENERATED_FILES
rm -f trace/generated-tracers-dtrace.dtrace*
rm -f trace/generated-tracers-dtrace.h*
rm -f $(foreach f,$(GENERATED_FILES),$(f) $(f)-timestamp)
rm -f qapi-gen-timestamp
rm -rf qapi-generated
rm -rf qga/qapi-generated
for d in $(ALL_SUBDIRS); do \
if test -d $$d; then $(MAKE) -C $$d $@ || exit 1; fi; \
@@ -754,17 +519,16 @@ distclean: clean
rm -f qemu-doc.vr qemu-doc.txt
rm -f config.log
rm -f linux-headers/asm
rm -f docs/version.texi
rm -f docs/interop/qemu-ga-qapi.texi docs/interop/qemu-qmp-qapi.texi
rm -f docs/interop/qemu-qmp-ref.7 docs/interop/qemu-ga-ref.7
rm -f docs/interop/qemu-qmp-ref.txt docs/interop/qemu-ga-ref.txt
rm -f docs/interop/qemu-qmp-ref.pdf docs/interop/qemu-ga-ref.pdf
rm -f docs/interop/qemu-qmp-ref.html docs/interop/qemu-ga-ref.html
rm -f docs/qemu-block-drivers.7
rm -f docs/qemu-ga-qapi.texi docs/qemu-qmp-qapi.texi docs/version.texi
rm -f docs/qemu-qmp-ref.7 docs/qemu-ga-ref.7
rm -f docs/qemu-qmp-ref.txt docs/qemu-ga-ref.txt
rm -f docs/qemu-qmp-ref.pdf docs/qemu-ga-ref.pdf
rm -f docs/qemu-qmp-ref.html docs/qemu-ga-ref.html
for d in $(TARGET_DIRS); do \
rm -rf $$d || exit 1 ; \
done
rm -Rf .sdk
if test -f pixman/config.log; then $(MAKE) -C pixman distclean; fi
if test -f dtc/version_gen.h; then $(MAKE) $(DTC_MAKE_ARGS) clean; fi
KEYMAPS=da en-gb et fr fr-ch is lt modifiers no pt-br sv \
@@ -783,14 +547,12 @@ efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
efi-pcnet.rom efi-rtl8139.rom efi-virtio.rom \
efi-e1000e.rom efi-vmxnet3.rom \
qemu-icon.bmp qemu_logo_no_text.svg \
bamboo.dtb canyonlands.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
bamboo.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin \
s390-ccw.img s390-netboot.img \
s390-ccw.img \
spapr-rtas.bin slof.bin skiboot.lid \
palcode-clipper \
u-boot.e500 u-boot-sam460-20100605.bin \
qemu_vga.ndrv \
hppa-firmware.img
u-boot.e500
else
BLOBS=
endif
@@ -799,14 +561,13 @@ install-doc: $(DOCS)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-qmp-ref.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-qmp-ref.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/qemu-qmp-ref.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/qemu-qmp-ref.txt "$(DESTDIR)$(qemu_docdir)"
ifdef CONFIG_POSIX
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man7"
$(INSTALL_DATA) docs/interop/qemu-qmp-ref.7 "$(DESTDIR)$(mandir)/man7"
$(INSTALL_DATA) docs/qemu-block-drivers.7 "$(DESTDIR)$(mandir)/man7"
$(INSTALL_DATA) docs/qemu-qmp-ref.7 "$(DESTDIR)$(mandir)/man7"
ifneq ($(TOOLS),)
$(INSTALL_DATA) qemu-img.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man8"
@@ -814,9 +575,9 @@ ifneq ($(TOOLS),)
endif
ifneq (,$(findstring qemu-ga,$(TOOLS)))
$(INSTALL_DATA) qemu-ga.8 "$(DESTDIR)$(mandir)/man8"
$(INSTALL_DATA) docs/interop/qemu-ga-ref.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-ga-ref.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-ga-ref.7 "$(DESTDIR)$(mandir)/man7"
$(INSTALL_DATA) docs/qemu-ga-ref.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/qemu-ga-ref.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/qemu-ga-ref.7 "$(DESTDIR)$(mandir)/man7"
endif
endif
ifdef CONFIG_VIRTFS
@@ -855,7 +616,7 @@ ifneq ($(BLOBS),)
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/$$x "$(DESTDIR)$(qemu_datadir)"; \
done
endif
ifeq ($(CONFIG_GTK),m)
ifeq ($(CONFIG_GTK),y)
$(MAKE) -C po $@
endif
$(INSTALL_DIR) "$(DESTDIR)$(qemu_datadir)/keymaps"
@@ -867,6 +628,10 @@ endif
$(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \
done
# various test targets
test speed: all
$(MAKE) -C tests/tcg $@
.PHONY: ctags
ctags:
rm -f tags
@@ -895,34 +660,33 @@ ui/shader/%-frag.h: $(SRC_PATH)/ui/shader/%.frag $(SRC_PATH)/scripts/shaderinclu
perl $(SRC_PATH)/scripts/shaderinclude.pl $< > $@,\
"FRAG","$@")
ui/shader.o: $(SRC_PATH)/ui/shader.c \
ui/shader/texture-blit-vert.h \
ui/shader/texture-blit-flip-vert.h \
ui/shader/texture-blit-frag.h
ui/console-gl.o: $(SRC_PATH)/ui/console-gl.c \
ui/shader/texture-blit-vert.h ui/shader/texture-blit-frag.h
# documentation
MAKEINFO=makeinfo
MAKEINFOINCLUDES= -I docs -I $(<D) -I $(@D)
MAKEINFOFLAGS=--no-split --number-sections $(MAKEINFOINCLUDES)
TEXI2PODFLAGS=$(MAKEINFOINCLUDES) "-DVERSION=$(VERSION)"
TEXI2PDFFLAGS=$(if $(V),,--quiet) -I $(SRC_PATH) $(MAKEINFOINCLUDES)
MAKEINFOFLAGS=--no-split --number-sections -I docs
TEXIFLAG=$(if $(V),,--quiet)
docs/version.texi: $(SRC_PATH)/VERSION
$(call quiet-command,echo "@set VERSION $(VERSION)" > $@,"GEN","$@")
%.html: %.texi docs/version.texi
%.html: %.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers \
--html $< -o $@,"GEN","$@")
%.info: %.texi docs/version.texi
%.info: %.texi
$(call quiet-command,$(MAKEINFO) $(MAKEINFOFLAGS) $< -o $@,"GEN","$@")
%.txt: %.texi docs/version.texi
%.txt: %.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers \
--plaintext $< -o $@,"GEN","$@")
%.pdf: %.texi docs/version.texi
$(call quiet-command,texi2pdf $(TEXI2PDFFLAGS) $< -o $@,"GEN","$@")
%.pdf: %.texi
$(call quiet-command,texi2pdf $(TEXIFLAG) -I $(SRC_PATH) -I docs $< -o $@,"GEN","$@")
docs/qemu-ga-ref.html docs/qemu-ga-ref.info docs/qemu-ga-ref.txt docs/qemu-ga-ref.pdf docs/qemu-ga-ref.7.pod: docs/version.texi
docs/qemu-qmp-ref.html docs/qemu-qmp-ref.info docs/qemu-qmp-ref.txt docs/qemu-qmp-ref.pdf docs/qemu-qmp-ref.pod: docs/version.texi
qemu-options.texi: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
@@ -936,11 +700,13 @@ qemu-monitor-info.texi: $(SRC_PATH)/hmp-commands-info.hx $(SRC_PATH)/scripts/hxt
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
docs/interop/qemu-qmp-qapi.texi: qapi/qapi-doc.texi
@cp -p $< $@
docs/qemu-qmp-qapi.texi docs/qemu-ga-qapi.texi: $(SRC_PATH)/scripts/qapi2texi.py $(qapi-py)
docs/interop/qemu-ga-qapi.texi: qga/qapi-generated/qga-qapi-doc.texi
@cp -p $< $@
docs/qemu-qmp-qapi.texi: $(qapi-modules)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi2texi.py $< > $@,"GEN","$@")
docs/qemu-ga-qapi.texi: $(SRC_PATH)/qga/qapi-schema.json
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi2texi.py $< > $@,"GEN","$@")
qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi qemu-monitor-info.texi
qemu.1: qemu-option-trace.texi
@@ -948,27 +714,22 @@ qemu-img.1: qemu-img.texi qemu-option-trace.texi qemu-img-cmds.texi
fsdev/virtfs-proxy-helper.1: fsdev/virtfs-proxy-helper.texi
qemu-nbd.8: qemu-nbd.texi qemu-option-trace.texi
qemu-ga.8: qemu-ga.texi
docs/qemu-block-drivers.7: docs/qemu-block-drivers.texi
html: qemu-doc.html docs/interop/qemu-qmp-ref.html docs/interop/qemu-ga-ref.html
info: qemu-doc.info docs/interop/qemu-qmp-ref.info docs/interop/qemu-ga-ref.info
pdf: qemu-doc.pdf docs/interop/qemu-qmp-ref.pdf docs/interop/qemu-ga-ref.pdf
txt: qemu-doc.txt docs/interop/qemu-qmp-ref.txt docs/interop/qemu-ga-ref.txt
html: qemu-doc.html docs/qemu-qmp-ref.html docs/qemu-ga-ref.html
info: qemu-doc.info docs/qemu-qmp-ref.info docs/qemu-ga-ref.info
pdf: qemu-doc.pdf docs/qemu-qmp-ref.pdf docs/qemu-ga-ref.pdf
txt: qemu-doc.txt docs/qemu-qmp-ref.txt docs/qemu-ga-ref.txt
qemu-doc.html qemu-doc.info qemu-doc.pdf qemu-doc.txt: \
qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
qemu-monitor.texi qemu-img-cmds.texi qemu-ga.texi \
qemu-monitor-info.texi docs/qemu-block-drivers.texi
qemu-monitor-info.texi
docs/interop/qemu-ga-ref.dvi docs/interop/qemu-ga-ref.html \
docs/interop/qemu-ga-ref.info docs/interop/qemu-ga-ref.pdf \
docs/interop/qemu-ga-ref.txt docs/interop/qemu-ga-ref.7: \
docs/interop/qemu-ga-ref.texi docs/interop/qemu-ga-qapi.texi
docs/qemu-ga-ref.dvi docs/qemu-ga-ref.html docs/qemu-ga-ref.info docs/qemu-ga-ref.pdf docs/qemu-ga-ref.txt docs/qemu-ga-ref.7: \
docs/qemu-ga-ref.texi docs/qemu-ga-qapi.texi
docs/interop/qemu-qmp-ref.dvi docs/interop/qemu-qmp-ref.html \
docs/interop/qemu-qmp-ref.info docs/interop/qemu-qmp-ref.pdf \
docs/interop/qemu-qmp-ref.txt docs/interop/qemu-qmp-ref.7: \
docs/interop/qemu-qmp-ref.texi docs/interop/qemu-qmp-qapi.texi
docs/qemu-qmp-ref.dvi docs/qemu-qmp-ref.html docs/qemu-qmp-ref.info docs/qemu-qmp-ref.pdf docs/qemu-qmp-ref.txt docs/qemu-qmp-ref.7: \
docs/qemu-qmp-ref.texi docs/qemu-qmp-qapi.texi
ifdef CONFIG_WIN32
@@ -1029,11 +790,9 @@ endif # CONFIG_WIN
# Add a dependency on the generated files, so that they are always
# rebuilt before other object files
ifneq ($(wildcard config-host.mak),)
ifneq ($(filter-out $(UNCHECKED_GOALS),$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
Makefile: $(GENERATED_FILES)
endif
endif
.SECONDARY: $(TRACE_HEADERS) $(TRACE_HEADERS:%=%-timestamp) \
$(TRACE_SOURCES) $(TRACE_SOURCES:%=%-timestamp) \
@@ -1044,7 +803,6 @@ endif
-include $(wildcard *.d tests/*.d)
include $(SRC_PATH)/tests/docker/Makefile.include
include $(SRC_PATH)/tests/vm/Makefile.include
.PHONY: help
help:
@@ -1068,7 +826,6 @@ help:
@echo 'Test targets:'
@echo ' check - Run all tests (check-help for details)'
@echo ' docker - Help about targets running tests inside Docker containers'
@echo ' vm-test - Help about targets running tests inside VM'
@echo ''
@echo 'Documentation targets:'
@echo ' html info pdf txt'
@@ -1082,5 +839,4 @@ ifdef QEMU_GA_MSI_ENABLED
endif
@echo ''
endif
@echo ' $(MAKE) [targets] (quiet build, default)'
@echo ' $(MAKE) V=1 [targets] (verbose build)'
@echo ' make V=0|1 [targets] 0 => quiet build (default), 1 => verbose build'

View File

@@ -2,60 +2,7 @@
# Common libraries for tools and emulators
stub-obj-y = stubs/ crypto/
util-obj-y = util/ qobject/ qapi/
util-obj-y += qapi/qapi-builtin-types.o
util-obj-y += qapi/qapi-types.o
util-obj-y += qapi/qapi-types-block-core.o
util-obj-y += qapi/qapi-types-block.o
util-obj-y += qapi/qapi-types-char.o
util-obj-y += qapi/qapi-types-common.o
util-obj-y += qapi/qapi-types-crypto.o
util-obj-y += qapi/qapi-types-introspect.o
util-obj-y += qapi/qapi-types-migration.o
util-obj-y += qapi/qapi-types-misc.o
util-obj-y += qapi/qapi-types-net.o
util-obj-y += qapi/qapi-types-rocker.o
util-obj-y += qapi/qapi-types-run-state.o
util-obj-y += qapi/qapi-types-sockets.o
util-obj-y += qapi/qapi-types-tpm.o
util-obj-y += qapi/qapi-types-trace.o
util-obj-y += qapi/qapi-types-transaction.o
util-obj-y += qapi/qapi-types-ui.o
util-obj-y += qapi/qapi-builtin-visit.o
util-obj-y += qapi/qapi-visit.o
util-obj-y += qapi/qapi-visit-block-core.o
util-obj-y += qapi/qapi-visit-block.o
util-obj-y += qapi/qapi-visit-char.o
util-obj-y += qapi/qapi-visit-common.o
util-obj-y += qapi/qapi-visit-crypto.o
util-obj-y += qapi/qapi-visit-introspect.o
util-obj-y += qapi/qapi-visit-migration.o
util-obj-y += qapi/qapi-visit-misc.o
util-obj-y += qapi/qapi-visit-net.o
util-obj-y += qapi/qapi-visit-rocker.o
util-obj-y += qapi/qapi-visit-run-state.o
util-obj-y += qapi/qapi-visit-sockets.o
util-obj-y += qapi/qapi-visit-tpm.o
util-obj-y += qapi/qapi-visit-trace.o
util-obj-y += qapi/qapi-visit-transaction.o
util-obj-y += qapi/qapi-visit-ui.o
util-obj-y += qapi/qapi-events.o
util-obj-y += qapi/qapi-events-block-core.o
util-obj-y += qapi/qapi-events-block.o
util-obj-y += qapi/qapi-events-char.o
util-obj-y += qapi/qapi-events-common.o
util-obj-y += qapi/qapi-events-crypto.o
util-obj-y += qapi/qapi-events-introspect.o
util-obj-y += qapi/qapi-events-migration.o
util-obj-y += qapi/qapi-events-misc.o
util-obj-y += qapi/qapi-events-net.o
util-obj-y += qapi/qapi-events-rocker.o
util-obj-y += qapi/qapi-events-run-state.o
util-obj-y += qapi/qapi-events-sockets.o
util-obj-y += qapi/qapi-events-tpm.o
util-obj-y += qapi/qapi-events-trace.o
util-obj-y += qapi/qapi-events-transaction.o
util-obj-y += qapi/qapi-events-ui.o
util-obj-y += qapi/qapi-introspect.o
util-obj-y += qmp-introspect.o qapi-types.o qapi-visit.o qapi-event.o
chardev-obj-y = chardev/
@@ -64,7 +11,7 @@ chardev-obj-y = chardev/
block-obj-y += nbd/
block-obj-y += block.o blockjob.o
block-obj-y += block/ scsi/
block-obj-y += block/
block-obj-y += qemu-io-cmds.o
block-obj-$(CONFIG_REPLICATION) += replication.o
@@ -93,7 +40,7 @@ io-obj-y = io/
ifeq ($(CONFIG_SOFTMMU),y)
common-obj-y = blockdev.o blockdev-nbd.o block/
common-obj-y += bootdevice.o iothread.o
common-obj-y += iothread.o
common-obj-y += net/
common-obj-y += qdev-monitor.o device-hotplug.o
common-obj-$(CONFIG_WIN32) += os-win32.o
@@ -102,55 +49,38 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration/
common-obj-y += page_cache.o #aio.o
common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-m += audio/
common-obj-y += hw/
common-obj-y += accel.o
common-obj-y += replay/
common-obj-y += ui/
common-obj-m += ui/
common-obj-y += bt-host.o bt-vhci.o
bt-host.o-cflags := $(BLUEZ_CFLAGS)
common-obj-y += dma-helpers.o
common-obj-y += vl.o
vl.o-cflags := $(GPROF_CFLAGS) $(SDL_CFLAGS)
common-obj-$(CONFIG_TPM) += tpm.o
common-obj-y += tpm.o
common-obj-$(CONFIG_SLIRP) += slirp/
common-obj-y += backends/
common-obj-y += chardev/
common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
qemu-seccomp.o-cflags := $(SECCOMP_CFLAGS)
qemu-seccomp.o-libs := $(SECCOMP_LIBS)
common-obj-$(CONFIG_FDT) += device_tree.o
######################################################################
# qapi
common-obj-y += qapi/qapi-commands.o
common-obj-y += qapi/qapi-commands-block-core.o
common-obj-y += qapi/qapi-commands-block.o
common-obj-y += qapi/qapi-commands-char.o
common-obj-y += qapi/qapi-commands-common.o
common-obj-y += qapi/qapi-commands-crypto.o
common-obj-y += qapi/qapi-commands-introspect.o
common-obj-y += qapi/qapi-commands-migration.o
common-obj-y += qapi/qapi-commands-misc.o
common-obj-y += qapi/qapi-commands-net.o
common-obj-y += qapi/qapi-commands-rocker.o
common-obj-y += qapi/qapi-commands-run-state.o
common-obj-y += qapi/qapi-commands-sockets.o
common-obj-y += qapi/qapi-commands-tpm.o
common-obj-y += qapi/qapi-commands-trace.o
common-obj-y += qapi/qapi-commands-transaction.o
common-obj-y += qapi/qapi-commands-ui.o
common-obj-y += qapi/qapi-introspect.o
common-obj-y += qmp-marshal.o
common-obj-y += qmp-introspect.o
common-obj-y += qmp.o hmp.o
endif
@@ -173,21 +103,16 @@ target-obj-y += trace/
######################################################################
# guest agent
# FIXME: a few definitions from qapi/qapi-types.o and
# qapi/qapi-visit.o are needed by libqemuutil.a. These should be
# extracted into a QAPI schema module, or perhaps a separate schema.
# FIXME: a few definitions from qapi-types.o/qapi-visit.o are needed
# by libqemuutil.a. These should be moved to a separate .json schema.
qga-obj-y = qga/
qga-vss-dll-obj-y = qga/
######################################################################
# contrib
ivshmem-client-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem-client/
ivshmem-server-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem-server/
ivshmem-client-obj-y = contrib/ivshmem-client/
ivshmem-server-obj-y = contrib/ivshmem-server/
libvhost-user-obj-y = contrib/libvhost-user/
vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
vhost-user-blk-obj-y = contrib/vhost-user-blk/
######################################################################
trace-events-subdirs =
@@ -196,18 +121,15 @@ trace-events-subdirs += crypto
trace-events-subdirs += io
trace-events-subdirs += migration
trace-events-subdirs += block
trace-events-subdirs += chardev
trace-events-subdirs += backends
trace-events-subdirs += hw/block
trace-events-subdirs += hw/block/dataplane
trace-events-subdirs += hw/char
trace-events-subdirs += hw/intc
trace-events-subdirs += hw/net
trace-events-subdirs += hw/rdma
trace-events-subdirs += hw/rdma/vmw
trace-events-subdirs += hw/virtio
trace-events-subdirs += hw/audio
trace-events-subdirs += hw/misc
trace-events-subdirs += hw/misc/macio
trace-events-subdirs += hw/usb
trace-events-subdirs += hw/scsi
trace-events-subdirs += hw/nvram
@@ -216,7 +138,6 @@ trace-events-subdirs += hw/input
trace-events-subdirs += hw/timer
trace-events-subdirs += hw/dma
trace-events-subdirs += hw/sparc
trace-events-subdirs += hw/sparc64
trace-events-subdirs += hw/sd
trace-events-subdirs += hw/isa
trace-events-subdirs += hw/mem
@@ -225,16 +146,12 @@ trace-events-subdirs += hw/i386/xen
trace-events-subdirs += hw/9pfs
trace-events-subdirs += hw/ppc
trace-events-subdirs += hw/pci
trace-events-subdirs += hw/pci-host
trace-events-subdirs += hw/s390x
trace-events-subdirs += hw/vfio
trace-events-subdirs += hw/acpi
trace-events-subdirs += hw/arm
trace-events-subdirs += hw/alpha
trace-events-subdirs += hw/hppa
trace-events-subdirs += hw/xen
trace-events-subdirs += hw/ide
trace-events-subdirs += hw/tpm
trace-events-subdirs += ui
trace-events-subdirs += audio
trace-events-subdirs += net
@@ -247,10 +164,6 @@ trace-events-subdirs += target/ppc
trace-events-subdirs += qom
trace-events-subdirs += linux-user
trace-events-subdirs += qapi
trace-events-subdirs += accel/tcg
trace-events-subdirs += accel/kvm
trace-events-subdirs += nbd
trace-events-subdirs += scsi
trace-events-files = $(SRC_PATH)/trace-events $(trace-events-subdirs:%=$(SRC_PATH)/%/trace-events)

View File

@@ -22,7 +22,7 @@ QEMU_PROG_BUILD = $(QEMU_PROG)
else
# system emulator name
QEMU_PROG=qemu-system-$(TARGET_NAME)$(EXESUF)
ifneq (,$(findstring -mwindows,$(SDL_LIBS)))
ifneq (,$(findstring -mwindows,$(libs_softmmu)))
# Terminate program name with a 'w' because the linker builds a windows executable.
QEMU_PROGW=qemu-system-$(TARGET_NAME)w$(EXESUF)
$(QEMU_PROG): $(QEMU_PROGW)
@@ -36,6 +36,10 @@ endif
PROGS=$(QEMU_PROG) $(QEMU_PROGW)
STPFILES=
ifdef CONFIG_LINUX_USER
PROGS+=$(QEMU_PROG)-binfmt
endif
config-target.h: config-target.h-timestamp
config-target.h-timestamp: config-target.mak
@@ -48,10 +52,7 @@ else
TARGET_TYPE=system
endif
tracetool-y = $(SRC_PATH)/scripts/tracetool.py
tracetool-y += $(shell find $(SRC_PATH)/scripts/tracetool -name "*.py")
$(QEMU_PROG).stp-installed: $(BUILD_DIR)/trace-events-all $(tracetool-y)
$(QEMU_PROG).stp-installed: $(BUILD_DIR)/trace-events-all
$(call quiet-command,$(TRACETOOL) \
--group=all \
--format=stap \
@@ -61,7 +62,7 @@ $(QEMU_PROG).stp-installed: $(BUILD_DIR)/trace-events-all $(tracetool-y)
--target-type=$(TARGET_TYPE) \
$< > $@,"GEN","$(TARGET_DIR)$(QEMU_PROG).stp-installed")
$(QEMU_PROG).stp: $(BUILD_DIR)/trace-events-all $(tracetool-y)
$(QEMU_PROG).stp: $(BUILD_DIR)/trace-events-all
$(call quiet-command,$(TRACETOOL) \
--group=all \
--format=stap \
@@ -71,7 +72,7 @@ $(QEMU_PROG).stp: $(BUILD_DIR)/trace-events-all $(tracetool-y)
--target-type=$(TARGET_TYPE) \
$< > $@,"GEN","$(TARGET_DIR)$(QEMU_PROG).stp")
$(QEMU_PROG)-simpletrace.stp: $(BUILD_DIR)/trace-events-all $(tracetool-y)
$(QEMU_PROG)-simpletrace.stp: $(BUILD_DIR)/trace-events-all
$(call quiet-command,$(TRACETOOL) \
--group=all \
--format=simpletrace-stap \
@@ -91,16 +92,26 @@ all: $(PROGS) stap
#########################################################
# cpu emulator library
obj-y += exec.o
obj-y += accel/
obj-$(CONFIG_TCG) += tcg/tcg.o tcg/tcg-op.o tcg/tcg-op-vec.o tcg/tcg-op-gvec.o
obj-$(CONFIG_TCG) += tcg/tcg-common.o tcg/optimize.o
obj-$(CONFIG_TCG_INTERPRETER) += tcg/tci.o
obj-y = exec.o translate-all.o cpu-exec.o
obj-y += translate-common.o
obj-y += cpu-exec-common.o
obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
obj-$(CONFIG_TCG_INTERPRETER) += tci.o
obj-y += tcg/tcg-common.o
obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
obj-y += fpu/softfloat.o
obj-y += target/$(TARGET_BASE_ARCH)/
obj-y += disas.o
obj-y += tcg-runtime.o
obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
obj-$(call lnot,$(CONFIG_HAX)) += hax-stub.o
obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decContext.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decNumber.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal32.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal64.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal128.o
#########################################################
# Linux user emulator target
@@ -112,7 +123,9 @@ QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) \
-I$(SRC_PATH)/linux-user
obj-y += linux-user/
obj-y += gdbstub.o thunk.o
obj-y += gdbstub.o thunk.o user-exec.o user-exec-stub.o
obj-binfmt-y += linux-user/
endif #CONFIG_LINUX_USER
@@ -125,7 +138,7 @@ QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ABI_DIR) \
-I$(SRC_PATH)/bsd-user/$(HOST_VARIANT_DIR)
obj-y += bsd-user/
obj-y += gdbstub.o
obj-y += gdbstub.o user-exec.o user-exec-stub.o
endif #CONFIG_BSD_USER
@@ -133,14 +146,21 @@ endif #CONFIG_BSD_USER
# System emulator target
ifdef CONFIG_SOFTMMU
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
obj-y += qtest.o
obj-y += qtest.o bootdevice.o
obj-y += hw/
obj-y += memory.o
obj-$(CONFIG_KVM) += kvm-all.o
obj-y += memory.o cputlb.o
obj-y += memory_mapping.o
obj-y += dump.o
obj-y += migration/ram.o
obj-y += migration/ram.o migration/savevm.o
LIBS := $(libs_softmmu) $(LIBS)
# xen support
obj-$(CONFIG_XEN) += xen-common.o
obj-$(CONFIG_XEN_I386) += xen-hvm.o xen-mapcache.o
obj-$(call lnot,$(CONFIG_XEN)) += xen-common-stub.o
obj-$(call lnot,$(CONFIG_XEN_I386)) += xen-hvm-stub.o
# Hardware support
ifeq ($(TARGET_NAME), sparc64)
obj-y += hw/sparc64/
@@ -155,7 +175,11 @@ endif # CONFIG_SOFTMMU
# Workaround for http://gcc.gnu.org/PR55489, see configure.
%/translate.o: QEMU_CFLAGS += $(TRANSLATE_OPT_CFLAGS)
ifdef CONFIG_LINUX_USER
dummy := $(call unnest-vars,,obj-y obj-binfmt-y)
else
dummy := $(call unnest-vars,,obj-y)
endif
all-obj-y := $(obj-y)
target-obj-y :=
@@ -174,7 +198,8 @@ dummy := $(call unnest-vars,.., \
qom-obj-y \
io-obj-y \
common-obj-y \
common-obj-m)
common-obj-m \
trace-obj-y)
target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
@@ -186,7 +211,7 @@ all-obj-$(CONFIG_SOFTMMU) += $(io-obj-y)
$(QEMU_PROG_BUILD): config-devices.mak
COMMON_LDADDS = ../libqemuutil.a
COMMON_LDADDS = $(trace-obj-y) ../libqemuutil.a ../libqemustub.a
# build either PROG or PROGW
$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS)
@@ -196,6 +221,9 @@ ifdef CONFIG_DARWIN
$(call quiet-command,SetFile -a C $@,"SETFILE","$(TARGET_DIR)$@")
endif
$(QEMU_PROG)-binfmt: $(obj-binfmt-y)
$(call LINK,$^)
gdbstub-xml.c: $(TARGET_XML_FILES) $(SRC_PATH)/scripts/feature_to_c.sh
$(call quiet-command,rm -f $@ && $(SHELL) $(SRC_PATH)/scripts/feature_to_c.sh $@ $(TARGET_XML_FILES),"GEN","$(TARGET_DIR)$@")

53
README
View File

@@ -44,9 +44,9 @@ of other UNIX targets. The simple steps to build QEMU are:
Additional information can also be found online via the QEMU website:
https://qemu.org/Hosts/Linux
https://qemu.org/Hosts/Mac
https://qemu.org/Hosts/W32
http://qemu-project.org/Hosts/Linux
http://qemu-project.org/Hosts/Mac
http://qemu-project.org/Hosts/W32
Submitting patches
@@ -54,9 +54,9 @@ Submitting patches
The QEMU source code is maintained under the GIT version control system.
git clone git://git.qemu.org/qemu.git
git clone git://git.qemu-project.org/qemu.git
When submitting patches, one common approach is to use 'git
When submitting patches, the preferred approach is to use 'git
format-patch' and/or 'git send-email' to format & send the mail to the
qemu-devel@nongnu.org mailing list. All patches submitted must contain
a 'Signed-off-by' line from the author. Patches should follow the
@@ -65,42 +65,9 @@ guidelines set out in the HACKING and CODING_STYLE files.
Additional information on submitting patches can be found online via
the QEMU website
https://qemu.org/Contribute/SubmitAPatch
https://qemu.org/Contribute/TrivialPatches
http://qemu-project.org/Contribute/SubmitAPatch
http://qemu-project.org/Contribute/TrivialPatches
The QEMU website is also maintained under source control.
git clone git://git.qemu.org/qemu-web.git
https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/
A 'git-publish' utility was created to make above process less
cumbersome, and is highly recommended for making regular contributions,
or even just for sending consecutive patch series revisions. It also
requires a working 'git send-email' setup, and by default doesn't
automate everything, so you may want to go through the above steps
manually for once.
For installation instructions, please go to
https://github.com/stefanha/git-publish
The workflow with 'git-publish' is:
$ git checkout master -b my-feature
$ # work on new commits, add your 'Signed-off-by' lines to each
$ git publish
Your patch series will be sent and tagged as my-feature-v1 if you need to refer
back to it in the future.
Sending v2:
$ git checkout my-feature # same topic branch
$ # making changes to the commits (using 'git rebase', for example)
$ git publish
Your patch series will be sent with 'v2' tag in the subject and the git tip
will be tagged as my-feature-v2.
Bug reporting
=============
@@ -118,7 +85,7 @@ reported via launchpad.
For additional information on bug reporting consult:
https://qemu.org/Contribute/ReportABug
http://qemu-project.org/Contribute/ReportABug
Contact
@@ -128,12 +95,12 @@ The QEMU community can be contacted in a number of ways, with the two
main methods being email and IRC
- qemu-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/qemu-devel
http://lists.nongnu.org/mailman/listinfo/qemu-devel
- #qemu on irc.oftc.net
Information on additional methods of contacting the community can be
found online via the QEMU website:
https://qemu.org/Contribute/StartHere
http://qemu-project.org/Contribute/StartHere
-- End

View File

@@ -1 +1 @@
2.11.50
2.9.1

View File

@@ -26,14 +26,22 @@
#include "qemu/osdep.h"
#include "sysemu/accel.h"
#include "hw/boards.h"
#include "qemu-common.h"
#include "sysemu/arch_init.h"
#include "sysemu/sysemu.h"
#include "sysemu/kvm.h"
#include "sysemu/qtest.h"
#include "hw/xen/xen.h"
#include "qom/object.h"
#include "qemu/error-report.h"
#include "qemu/option.h"
int tcg_tb_size;
static bool tcg_allowed = true;
static int tcg_init(MachineState *ms)
{
tcg_exec_init(tcg_tb_size * 1024 * 1024);
return 0;
}
static const TypeInfo accel_type = {
.name = TYPE_ACCEL,
@@ -70,20 +78,19 @@ static int accel_init_machine(AccelClass *acc, MachineState *ms)
void configure_accelerator(MachineState *ms)
{
const char *accel, *p;
const char *p;
char buf[10];
int ret;
bool accel_initialised = false;
bool init_failed = false;
AccelClass *acc = NULL;
accel = qemu_opt_get(qemu_get_machine_opts(), "accel");
if (accel == NULL) {
p = qemu_opt_get(qemu_get_machine_opts(), "accel");
if (p == NULL) {
/* Use the default "accelerator", tcg */
accel = "tcg";
p = "tcg";
}
p = accel;
while (!accel_initialised && *p != '\0') {
if (*p == ':') {
p++;
@@ -91,6 +98,7 @@ void configure_accelerator(MachineState *ms)
p = get_opt_name(buf, sizeof(buf), p, ':');
acc = accel_find(buf);
if (!acc) {
fprintf(stderr, "\"%s\" accelerator not found.\n", buf);
continue;
}
if (acc->available && !acc->available()) {
@@ -101,8 +109,9 @@ void configure_accelerator(MachineState *ms)
ret = accel_init_machine(acc, ms);
if (ret < 0) {
init_failed = true;
error_report("failed to initialize %s: %s",
acc->name, strerror(-ret));
fprintf(stderr, "failed to initialize %s: %s\n",
acc->name,
strerror(-ret));
} else {
accel_initialised = true;
}
@@ -110,25 +119,37 @@ void configure_accelerator(MachineState *ms)
if (!accel_initialised) {
if (!init_failed) {
error_report("-machine accel=%s: No accelerator found", accel);
fprintf(stderr, "No accelerator found!\n");
}
exit(1);
}
if (init_failed) {
error_report("Back to %s accelerator", acc->name);
fprintf(stderr, "Back to %s accelerator.\n", acc->name);
}
}
void accel_register_compat_props(AccelState *accel)
static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *class = ACCEL_GET_CLASS(accel);
register_compat_props_array(class->global_props);
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
ac->init_machine = tcg_init;
ac->allowed = &tcg_allowed;
}
#define TYPE_TCG_ACCEL ACCEL_CLASS_NAME("tcg")
static const TypeInfo tcg_accel_type = {
.name = TYPE_TCG_ACCEL,
.parent = TYPE_ACCEL,
.class_init = tcg_accel_class_init,
};
static void register_accel_types(void)
{
type_register_static(&accel_type);
type_register_static(&tcg_accel_type);
}
type_init(register_accel_types);

View File

@@ -1,4 +0,0 @@
obj-$(CONFIG_SOFTMMU) += accel.o
obj-y += kvm/
obj-$(CONFIG_TCG) += tcg/
obj-y += stubs/

View File

@@ -1 +0,0 @@
obj-$(CONFIG_KVM) += kvm-all.o

View File

@@ -1,16 +0,0 @@
# Trace events for debugging and performance instrumentation
# kvm-all.c
kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p"
kvm_vcpu_ioctl(int cpu_index, int type, void *arg) "cpu_index %d, type 0x%x, arg %p"
kvm_run_exit(int cpu_index, uint32_t reason) "cpu_index %d, reason %d"
kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p"
kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s"
kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s"
kvm_irqchip_commit_routes(void) ""
kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d"
kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
kvm_irqchip_release_virq(int virq) "virq %d"
kvm_set_user_memory(uint32_t slot, uint32_t flags, uint64_t guest_phys_addr, uint64_t memory_size, uint64_t userspace_addr, int ret) "Slot#%d flags=0x%x gpa=0x%"PRIx64 " size=0x%"PRIx64 " ua=0x%"PRIx64 " ret=%d"

View File

@@ -1,5 +0,0 @@
obj-$(call lnot,$(CONFIG_HAX)) += hax-stub.o
obj-$(call lnot,$(CONFIG_HVF)) += hvf-stub.o
obj-$(call lnot,$(CONFIG_WHPX)) += whpx-stub.o
obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
obj-$(call lnot,$(CONFIG_TCG)) += tcg-stub.o

View File

@@ -1,31 +0,0 @@
/*
* QEMU HVF support
*
* Copyright 2017 Red Hat, Inc.
*
* This software is licensed under the terms of the GNU General Public
* License version 2 or later, as published by the Free Software Foundation,
* and may be copied, distributed, and modified under those terms.
*
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "sysemu/hvf.h"
int hvf_init_vcpu(CPUState *cpu)
{
return -ENOSYS;
}
int hvf_vcpu_exec(CPUState *cpu)
{
return -ENOSYS;
}
void hvf_vcpu_destroy(CPUState *cpu)
{
}

View File

@@ -1,30 +0,0 @@
/*
* QEMU TCG accelerator stub
*
* Copyright Red Hat, Inc. 2013
*
* Author: Paolo Bonzini <pbonzini@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "tcg/tcg.h"
#include "exec/cpu-common.h"
#include "exec/exec-all.h"
void tb_flush(CPUState *cpu)
{
}
void tb_unlock(void)
{
}
void tlb_set_dirty(CPUState *cpu, target_ulong vaddr)
{
}

View File

@@ -1,48 +0,0 @@
/*
* QEMU Windows Hypervisor Platform accelerator (WHPX) stub
*
* Copyright Microsoft Corp. 2017
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "sysemu/whpx.h"
int whpx_init_vcpu(CPUState *cpu)
{
return -1;
}
int whpx_vcpu_exec(CPUState *cpu)
{
return -1;
}
void whpx_destroy_vcpu(CPUState *cpu)
{
}
void whpx_vcpu_kick(CPUState *cpu)
{
}
void whpx_cpu_synchronize_state(CPUState *cpu)
{
}
void whpx_cpu_synchronize_post_reset(CPUState *cpu)
{
}
void whpx_cpu_synchronize_post_init(CPUState *cpu)
{
}
void whpx_cpu_synchronize_pre_loadvm(CPUState *cpu)
{
}

View File

@@ -1,8 +0,0 @@
obj-$(CONFIG_SOFTMMU) += tcg-all.o
obj-$(CONFIG_SOFTMMU) += cputlb.o
obj-y += tcg-runtime.o tcg-runtime-gvec.o
obj-y += cpu-exec.o cpu-exec-common.o translate-all.o
obj-y += translator.o
obj-$(CONFIG_USER_ONLY) += user-exec.o
obj-$(call lnot,$(CONFIG_SOFTMMU)) += user-exec-stub.o

View File

@@ -1,92 +0,0 @@
/*
* QEMU System Emulator, accelerator interfaces
*
* Copyright (c) 2003-2008 Fabrice Bellard
* Copyright (c) 2014 Red Hat Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/accel.h"
#include "sysemu/sysemu.h"
#include "qom/object.h"
#include "qemu-common.h"
#include "qom/cpu.h"
#include "sysemu/cpus.h"
#include "qemu/main-loop.h"
unsigned long tcg_tb_size;
#ifndef CONFIG_USER_ONLY
/* mask must never be zero, except for A20 change call */
static void tcg_handle_interrupt(CPUState *cpu, int mask)
{
int old_mask;
g_assert(qemu_mutex_iothread_locked());
old_mask = cpu->interrupt_request;
cpu->interrupt_request |= mask;
/*
* If called from iothread context, wake the target cpu in
* case its halted.
*/
if (!qemu_cpu_is_self(cpu)) {
qemu_cpu_kick(cpu);
} else {
cpu->icount_decr.u16.high = -1;
if (use_icount &&
!cpu->can_do_io
&& (mask & ~old_mask) != 0) {
cpu_abort(cpu, "Raised interrupt while not in I/O function");
}
}
}
#endif
static int tcg_init(MachineState *ms)
{
tcg_exec_init(tcg_tb_size * 1024 * 1024);
cpu_interrupt_handler = tcg_handle_interrupt;
return 0;
}
static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
ac->init_machine = tcg_init;
ac->allowed = &tcg_allowed;
}
#define TYPE_TCG_ACCEL ACCEL_CLASS_NAME("tcg")
static const TypeInfo tcg_accel_type = {
.name = TYPE_TCG_ACCEL,
.parent = TYPE_ACCEL,
.class_init = tcg_accel_class_init,
};
static void register_accel_types(void)
{
type_register_static(&tcg_accel_type);
}
type_init(register_accel_types);

View File

@@ -1,997 +0,0 @@
/*
* Generic vectorized operation runtime
*
* Copyright (c) 2018 Linaro
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "qemu/host-utils.h"
#include "cpu.h"
#include "exec/helper-proto.h"
#include "tcg-gvec-desc.h"
/* Virtually all hosts support 16-byte vectors. Those that don't can emulate
* them via GCC's generic vector extension. This turns out to be simpler and
* more reliable than getting the compiler to autovectorize.
*
* In tcg-op-gvec.c, we asserted that both the size and alignment of the data
* are multiples of 16.
*
* When the compiler does not support all of the operations we require, the
* loops are written so that we can always fall back on the base types.
*/
#ifdef CONFIG_VECTOR16
typedef uint8_t vec8 __attribute__((vector_size(16)));
typedef uint16_t vec16 __attribute__((vector_size(16)));
typedef uint32_t vec32 __attribute__((vector_size(16)));
typedef uint64_t vec64 __attribute__((vector_size(16)));
typedef int8_t svec8 __attribute__((vector_size(16)));
typedef int16_t svec16 __attribute__((vector_size(16)));
typedef int32_t svec32 __attribute__((vector_size(16)));
typedef int64_t svec64 __attribute__((vector_size(16)));
#define DUP16(X) { X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X }
#define DUP8(X) { X, X, X, X, X, X, X, X }
#define DUP4(X) { X, X, X, X }
#define DUP2(X) { X, X }
#else
typedef uint8_t vec8;
typedef uint16_t vec16;
typedef uint32_t vec32;
typedef uint64_t vec64;
typedef int8_t svec8;
typedef int16_t svec16;
typedef int32_t svec32;
typedef int64_t svec64;
#define DUP16(X) X
#define DUP8(X) X
#define DUP4(X) X
#define DUP2(X) X
#endif /* CONFIG_VECTOR16 */
static inline void clear_high(void *d, intptr_t oprsz, uint32_t desc)
{
intptr_t maxsz = simd_maxsz(desc);
intptr_t i;
if (unlikely(maxsz > oprsz)) {
for (i = oprsz; i < maxsz; i += sizeof(uint64_t)) {
*(uint64_t *)(d + i) = 0;
}
}
}
void HELPER(gvec_add8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) + *(vec8 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_add16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) + *(vec16 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_add32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) + *(vec32 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_add64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) + *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_adds8)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec8 vecb = (vec8)DUP16(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) + vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_adds16)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec16 vecb = (vec16)DUP8(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) + vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_adds32)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec32 vecb = (vec32)DUP4(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) + vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_adds64)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) + vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sub8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) - *(vec8 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sub16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) - *(vec16 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sub32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) - *(vec32 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sub64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) - *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_subs8)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec8 vecb = (vec8)DUP16(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) - vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_subs16)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec16 vecb = (vec16)DUP8(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) - vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_subs32)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec32 vecb = (vec32)DUP4(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) - vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_subs64)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) - vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mul8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) * *(vec8 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mul16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) * *(vec16 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mul32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) * *(vec32 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mul64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) * *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_muls8)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec8 vecb = (vec8)DUP16(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) * vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_muls16)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec16 vecb = (vec16)DUP8(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) * vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_muls32)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec32 vecb = (vec32)DUP4(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) * vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_muls64)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) * vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_neg8)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = -*(vec8 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_neg16)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = -*(vec16 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_neg32)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = -*(vec32 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_neg64)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = -*(vec64 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mov)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
memcpy(d, a, oprsz);
clear_high(d, oprsz, desc);
}
void HELPER(gvec_dup64)(void *d, uint32_t desc, uint64_t c)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
if (c == 0) {
oprsz = 0;
} else {
for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
*(uint64_t *)(d + i) = c;
}
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_dup32)(void *d, uint32_t desc, uint32_t c)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
if (c == 0) {
oprsz = 0;
} else {
for (i = 0; i < oprsz; i += sizeof(uint32_t)) {
*(uint32_t *)(d + i) = c;
}
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_dup16)(void *d, uint32_t desc, uint32_t c)
{
HELPER(gvec_dup32)(d, desc, 0x00010001 * (c & 0xffff));
}
void HELPER(gvec_dup8)(void *d, uint32_t desc, uint32_t c)
{
HELPER(gvec_dup32)(d, desc, 0x01010101 * (c & 0xff));
}
void HELPER(gvec_not)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = ~*(vec64 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_and)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) & *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_or)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) | *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_xor)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) ^ *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_andc)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) &~ *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_orc)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) |~ *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ands)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) & vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_xors)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) ^ vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ors)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) | vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shl8i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) << shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shl16i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) << shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shl32i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) << shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shl64i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) << shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shr8i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shr16i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shr32i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shr64i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sar8i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(svec8 *)(d + i) = *(svec8 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sar16i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(svec16 *)(d + i) = *(svec16 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sar32i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(svec32 *)(d + i) = *(svec32 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sar64i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(svec64 *)(d + i) = *(svec64 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
/* If vectors are enabled, the compiler fills in -1 for true.
Otherwise, we must take care of this by hand. */
#ifdef CONFIG_VECTOR16
# define DO_CMP0(X) X
#else
# define DO_CMP0(X) -(X)
#endif
#define DO_CMP1(NAME, TYPE, OP) \
void HELPER(NAME)(void *d, void *a, void *b, uint32_t desc) \
{ \
intptr_t oprsz = simd_oprsz(desc); \
intptr_t i; \
for (i = 0; i < oprsz; i += sizeof(vec64)) { \
*(TYPE *)(d + i) = DO_CMP0(*(TYPE *)(a + i) OP *(TYPE *)(b + i)); \
} \
clear_high(d, oprsz, desc); \
}
#define DO_CMP2(SZ) \
DO_CMP1(gvec_eq##SZ, vec##SZ, ==) \
DO_CMP1(gvec_ne##SZ, vec##SZ, !=) \
DO_CMP1(gvec_lt##SZ, svec##SZ, <) \
DO_CMP1(gvec_le##SZ, svec##SZ, <=) \
DO_CMP1(gvec_ltu##SZ, vec##SZ, <) \
DO_CMP1(gvec_leu##SZ, vec##SZ, <=)
DO_CMP2(8)
DO_CMP2(16)
DO_CMP2(32)
DO_CMP2(64)
#undef DO_CMP0
#undef DO_CMP1
#undef DO_CMP2
void HELPER(gvec_ssadd8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int8_t)) {
int r = *(int8_t *)(a + i) + *(int8_t *)(b + i);
if (r > INT8_MAX) {
r = INT8_MAX;
} else if (r < INT8_MIN) {
r = INT8_MIN;
}
*(int8_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ssadd16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int16_t)) {
int r = *(int16_t *)(a + i) + *(int16_t *)(b + i);
if (r > INT16_MAX) {
r = INT16_MAX;
} else if (r < INT16_MIN) {
r = INT16_MIN;
}
*(int16_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ssadd32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int32_t)) {
int32_t ai = *(int32_t *)(a + i);
int32_t bi = *(int32_t *)(b + i);
int32_t di = ai + bi;
if (((di ^ ai) &~ (ai ^ bi)) < 0) {
/* Signed overflow. */
di = (di < 0 ? INT32_MAX : INT32_MIN);
}
*(int32_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ssadd64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int64_t)) {
int64_t ai = *(int64_t *)(a + i);
int64_t bi = *(int64_t *)(b + i);
int64_t di = ai + bi;
if (((di ^ ai) &~ (ai ^ bi)) < 0) {
/* Signed overflow. */
di = (di < 0 ? INT64_MAX : INT64_MIN);
}
*(int64_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sssub8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
int r = *(int8_t *)(a + i) - *(int8_t *)(b + i);
if (r > INT8_MAX) {
r = INT8_MAX;
} else if (r < INT8_MIN) {
r = INT8_MIN;
}
*(uint8_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sssub16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int16_t)) {
int r = *(int16_t *)(a + i) - *(int16_t *)(b + i);
if (r > INT16_MAX) {
r = INT16_MAX;
} else if (r < INT16_MIN) {
r = INT16_MIN;
}
*(int16_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sssub32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int32_t)) {
int32_t ai = *(int32_t *)(a + i);
int32_t bi = *(int32_t *)(b + i);
int32_t di = ai - bi;
if (((di ^ ai) & (ai ^ bi)) < 0) {
/* Signed overflow. */
di = (di < 0 ? INT32_MAX : INT32_MIN);
}
*(int32_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sssub64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int64_t)) {
int64_t ai = *(int64_t *)(a + i);
int64_t bi = *(int64_t *)(b + i);
int64_t di = ai - bi;
if (((di ^ ai) & (ai ^ bi)) < 0) {
/* Signed overflow. */
di = (di < 0 ? INT64_MAX : INT64_MIN);
}
*(int64_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_usadd8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
unsigned r = *(uint8_t *)(a + i) + *(uint8_t *)(b + i);
if (r > UINT8_MAX) {
r = UINT8_MAX;
}
*(uint8_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_usadd16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint16_t)) {
unsigned r = *(uint16_t *)(a + i) + *(uint16_t *)(b + i);
if (r > UINT16_MAX) {
r = UINT16_MAX;
}
*(uint16_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_usadd32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint32_t)) {
uint32_t ai = *(uint32_t *)(a + i);
uint32_t bi = *(uint32_t *)(b + i);
uint32_t di = ai + bi;
if (di < ai) {
di = UINT32_MAX;
}
*(uint32_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_usadd64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
uint64_t ai = *(uint64_t *)(a + i);
uint64_t bi = *(uint64_t *)(b + i);
uint64_t di = ai + bi;
if (di < ai) {
di = UINT64_MAX;
}
*(uint64_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ussub8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
int r = *(uint8_t *)(a + i) - *(uint8_t *)(b + i);
if (r < 0) {
r = 0;
}
*(uint8_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ussub16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint16_t)) {
int r = *(uint16_t *)(a + i) - *(uint16_t *)(b + i);
if (r < 0) {
r = 0;
}
*(uint16_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ussub32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint32_t)) {
uint32_t ai = *(uint32_t *)(a + i);
uint32_t bi = *(uint32_t *)(b + i);
uint32_t di = ai - bi;
if (ai < bi) {
di = 0;
}
*(uint32_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ussub64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
uint64_t ai = *(uint64_t *)(a + i);
uint64_t bi = *(uint64_t *)(b + i);
uint64_t di = ai - bi;
if (ai < bi) {
di = 0;
}
*(uint64_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}

View File

@@ -1,254 +0,0 @@
DEF_HELPER_FLAGS_2(div_i32, TCG_CALL_NO_RWG_SE, s32, s32, s32)
DEF_HELPER_FLAGS_2(rem_i32, TCG_CALL_NO_RWG_SE, s32, s32, s32)
DEF_HELPER_FLAGS_2(divu_i32, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(remu_i32, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(div_i64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
DEF_HELPER_FLAGS_2(rem_i64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
DEF_HELPER_FLAGS_2(divu_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(remu_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(shl_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(shr_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(sar_i64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
DEF_HELPER_FLAGS_2(mulsh_i64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
DEF_HELPER_FLAGS_2(muluh_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(clz_i32, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(ctz_i32, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(clz_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(ctz_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_1(clrsb_i32, TCG_CALL_NO_RWG_SE, i32, i32)
DEF_HELPER_FLAGS_1(clrsb_i64, TCG_CALL_NO_RWG_SE, i64, i64)
DEF_HELPER_FLAGS_1(ctpop_i32, TCG_CALL_NO_RWG_SE, i32, i32)
DEF_HELPER_FLAGS_1(ctpop_i64, TCG_CALL_NO_RWG_SE, i64, i64)
DEF_HELPER_FLAGS_1(lookup_tb_ptr, TCG_CALL_NO_WG_SE, ptr, env)
DEF_HELPER_FLAGS_1(exit_atomic, TCG_CALL_NO_WG, noreturn, env)
#ifdef CONFIG_SOFTMMU
DEF_HELPER_FLAGS_5(atomic_cmpxchgb, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgw_be, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgw_le, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgl_be, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgl_le, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
#ifdef CONFIG_ATOMIC64
DEF_HELPER_FLAGS_5(atomic_cmpxchgq_be, TCG_CALL_NO_WG,
i64, env, tl, i64, i64, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgq_le, TCG_CALL_NO_WG,
i64, env, tl, i64, i64, i32)
#endif
#ifdef CONFIG_ATOMIC64
#define GEN_ATOMIC_HELPERS(NAME) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), b), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), w_le), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), w_be), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), l_le), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), l_be), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), q_le), \
TCG_CALL_NO_WG, i64, env, tl, i64, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), q_be), \
TCG_CALL_NO_WG, i64, env, tl, i64, i32)
#else
#define GEN_ATOMIC_HELPERS(NAME) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), b), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), w_le), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), w_be), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), l_le), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), l_be), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32)
#endif /* CONFIG_ATOMIC64 */
#else
DEF_HELPER_FLAGS_4(atomic_cmpxchgb, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
DEF_HELPER_FLAGS_4(atomic_cmpxchgw_be, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
DEF_HELPER_FLAGS_4(atomic_cmpxchgw_le, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
DEF_HELPER_FLAGS_4(atomic_cmpxchgl_be, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
DEF_HELPER_FLAGS_4(atomic_cmpxchgl_le, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
#ifdef CONFIG_ATOMIC64
DEF_HELPER_FLAGS_4(atomic_cmpxchgq_be, TCG_CALL_NO_WG, i64, env, tl, i64, i64)
DEF_HELPER_FLAGS_4(atomic_cmpxchgq_le, TCG_CALL_NO_WG, i64, env, tl, i64, i64)
#endif
#ifdef CONFIG_ATOMIC64
#define GEN_ATOMIC_HELPERS(NAME) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), b), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), w_le), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), w_be), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), l_le), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), l_be), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), q_le), \
TCG_CALL_NO_WG, i64, env, tl, i64) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), q_be), \
TCG_CALL_NO_WG, i64, env, tl, i64)
#else
#define GEN_ATOMIC_HELPERS(NAME) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), b), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), w_le), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), w_be), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), l_le), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), l_be), \
TCG_CALL_NO_WG, i32, env, tl, i32)
#endif /* CONFIG_ATOMIC64 */
#endif /* CONFIG_SOFTMMU */
GEN_ATOMIC_HELPERS(fetch_add)
GEN_ATOMIC_HELPERS(fetch_and)
GEN_ATOMIC_HELPERS(fetch_or)
GEN_ATOMIC_HELPERS(fetch_xor)
GEN_ATOMIC_HELPERS(add_fetch)
GEN_ATOMIC_HELPERS(and_fetch)
GEN_ATOMIC_HELPERS(or_fetch)
GEN_ATOMIC_HELPERS(xor_fetch)
GEN_ATOMIC_HELPERS(xchg)
#undef GEN_ATOMIC_HELPERS
DEF_HELPER_FLAGS_3(gvec_mov, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_dup8, TCG_CALL_NO_RWG, void, ptr, i32, i32)
DEF_HELPER_FLAGS_3(gvec_dup16, TCG_CALL_NO_RWG, void, ptr, i32, i32)
DEF_HELPER_FLAGS_3(gvec_dup32, TCG_CALL_NO_RWG, void, ptr, i32, i32)
DEF_HELPER_FLAGS_3(gvec_dup64, TCG_CALL_NO_RWG, void, ptr, i32, i64)
DEF_HELPER_FLAGS_4(gvec_add8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_add16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_add32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_add64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_adds8, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_adds16, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_adds32, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_adds64, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_sub8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sub16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sub32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sub64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_subs8, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_subs16, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_subs32, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_subs64, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_mul8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_mul16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_mul32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_mul64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_muls8, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_muls16, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_muls32, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_muls64, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_ssadd8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ssadd16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ssadd32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ssadd64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sssub8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sssub16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sssub32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sssub64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_usadd8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_usadd16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_usadd32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_usadd64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ussub8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ussub16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ussub32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ussub64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_neg8, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_neg16, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_neg32, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_neg64, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_not, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_and, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_or, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_xor, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_andc, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_orc, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ands, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_xors, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_ors, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_3(gvec_shl8i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shl16i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shl32i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shl64i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shr8i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shr16i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shr32i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shr64i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_sar8i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_sar16i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_sar32i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_sar64i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_eq8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_eq16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_eq32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_eq64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ne8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ne16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ne32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ne64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_lt8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_lt16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_lt32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_lt64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_le8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_le16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_le32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_le64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ltu8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ltu16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ltu32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ltu64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_leu8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_leu16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_leu32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_leu64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)

View File

@@ -1,10 +0,0 @@
# Trace events for debugging and performance instrumentation
# TCG related tracing (mostly disabled by default)
# cpu-exec.c
disable exec_tb(void *tb, uintptr_t pc) "tb:%p pc=0x%"PRIxPTR
disable exec_tb_nocache(void *tb, uintptr_t pc) "tb:%p pc=0x%"PRIxPTR
disable exec_tb_exit(void *last_tb, unsigned int flags) "tb:%p flags=0x%x"
# translate-all.c
translate_block(void *tb, uintptr_t pc, uint8_t *tb_code) "tb:%p, pc:0x%"PRIxPTR", tb_code:%p"

View File

@@ -1,138 +0,0 @@
/*
* Generic intermediate code generation.
*
* Copyright (C) 2016-2017 Lluís Vilanova <vilanova@ac.upc.edu>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/error-report.h"
#include "cpu.h"
#include "tcg/tcg.h"
#include "tcg/tcg-op.h"
#include "exec/exec-all.h"
#include "exec/gen-icount.h"
#include "exec/log.h"
#include "exec/translator.h"
/* Pairs with tcg_clear_temp_count.
To be called by #TranslatorOps.{translate_insn,tb_stop} if
(1) the target is sufficiently clean to support reporting,
(2) as and when all temporaries are known to be consumed.
For most targets, (2) is at the end of translate_insn. */
void translator_loop_temp_check(DisasContextBase *db)
{
if (tcg_check_temp_count()) {
qemu_log("warning: TCG temporary leaks before "
TARGET_FMT_lx "\n", db->pc_next);
}
}
void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
CPUState *cpu, TranslationBlock *tb)
{
int max_insns;
/* Initialize DisasContext */
db->tb = tb;
db->pc_first = tb->pc;
db->pc_next = db->pc_first;
db->is_jmp = DISAS_NEXT;
db->num_insns = 0;
db->singlestep_enabled = cpu->singlestep_enabled;
/* Instruction counting */
max_insns = tb_cflags(db->tb) & CF_COUNT_MASK;
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
if (max_insns > TCG_MAX_INSNS) {
max_insns = TCG_MAX_INSNS;
}
if (db->singlestep_enabled || singlestep) {
max_insns = 1;
}
max_insns = ops->init_disas_context(db, cpu, max_insns);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
/* Reset the temp count so that we can identify leaks */
tcg_clear_temp_count();
/* Start translating. */
gen_tb_start(db->tb);
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
while (true) {
db->num_insns++;
ops->insn_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
/* Pass breakpoint hits to target for further processing */
if (unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) {
CPUBreakpoint *bp;
QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
if (bp->pc == db->pc_next) {
if (ops->breakpoint_check(db, cpu, bp)) {
break;
}
}
}
/* The breakpoint_check hook may use DISAS_TOO_MANY to indicate
that only one more instruction is to be executed. Otherwise
it should use DISAS_NORETURN when generating an exception,
but may use a DISAS_TARGET_* value for Something Else. */
if (db->is_jmp > DISAS_TOO_MANY) {
break;
}
}
/* Disassemble one instruction. The translate_insn hook should
update db->pc_next and db->is_jmp to indicate what should be
done next -- either exiting this loop or locate the start of
the next instruction. */
if (db->num_insns == max_insns && (tb_cflags(db->tb) & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
gen_io_start();
ops->translate_insn(db, cpu);
gen_io_end();
} else {
ops->translate_insn(db, cpu);
}
/* Stop translation if translate_insn so indicated. */
if (db->is_jmp != DISAS_NEXT) {
break;
}
/* Stop translation if the output buffer is full,
or we have executed all of the allowed instructions. */
if (tcg_op_buf_full() || db->num_insns >= max_insns) {
db->is_jmp = DISAS_TOO_MANY;
break;
}
}
/* Emit code to exit the TB, as indicated by db->is_jmp. */
ops->tb_stop(db, cpu);
gen_tb_end(db->tb, db->num_insns);
/* The disas_log hook may use these values rather than recompute. */
db->tb->size = db->pc_next - db->pc_first;
db->tb->icount = db->num_insns;
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
&& qemu_log_in_addr_range(db->pc_first)) {
qemu_log_lock();
qemu_log("----------------\n");
ops->disas_log(db, cpu);
qemu_log("\n");
qemu_log_unlock();
}
#endif
}

View File

@@ -27,10 +27,10 @@
#include "sysemu/sysemu.h"
#include "sysemu/arch_init.h"
#include "hw/pci/pci.h"
#include "hw/audio/soundhw.h"
#include "qapi/qapi-commands-misc.h"
#include "hw/audio/audio.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
#include "qmp-commands.h"
#include "hw/acpi/acpi.h"
#include "qemu/help_option.h"
@@ -53,8 +53,6 @@ int graphic_depth = 32;
#define QEMU_ARCH QEMU_ARCH_CRIS
#elif defined(TARGET_I386)
#define QEMU_ARCH QEMU_ARCH_I386
#elif defined(TARGET_HPPA)
#define QEMU_ARCH QEMU_ARCH_HPPA
#elif defined(TARGET_M68K)
#define QEMU_ARCH QEMU_ARCH_M68K
#elif defined(TARGET_LM32)
@@ -71,8 +69,6 @@ int graphic_depth = 32;
#define QEMU_ARCH QEMU_ARCH_OPENRISC
#elif defined(TARGET_PPC)
#define QEMU_ARCH QEMU_ARCH_PPC
#elif defined(TARGET_RISCV)
#define QEMU_ARCH QEMU_ARCH_RISCV
#elif defined(TARGET_S390X)
#define QEMU_ARCH QEMU_ARCH_S390X
#elif defined(TARGET_SH4)
@@ -89,6 +85,130 @@ int graphic_depth = 32;
const uint32_t arch_type = QEMU_ARCH;
struct soundhw {
const char *name;
const char *descr;
int enabled;
int isa;
union {
int (*init_isa) (ISABus *bus);
int (*init_pci) (PCIBus *bus);
} init;
};
static struct soundhw soundhw[9];
static int soundhw_count;
void isa_register_soundhw(const char *name, const char *descr,
int (*init_isa)(ISABus *bus))
{
assert(soundhw_count < ARRAY_SIZE(soundhw) - 1);
soundhw[soundhw_count].name = name;
soundhw[soundhw_count].descr = descr;
soundhw[soundhw_count].isa = 1;
soundhw[soundhw_count].init.init_isa = init_isa;
soundhw_count++;
}
void pci_register_soundhw(const char *name, const char *descr,
int (*init_pci)(PCIBus *bus))
{
assert(soundhw_count < ARRAY_SIZE(soundhw) - 1);
soundhw[soundhw_count].name = name;
soundhw[soundhw_count].descr = descr;
soundhw[soundhw_count].isa = 0;
soundhw[soundhw_count].init.init_pci = init_pci;
soundhw_count++;
}
void select_soundhw(const char *optarg)
{
struct soundhw *c;
if (is_help_option(optarg)) {
show_valid_cards:
if (soundhw_count) {
printf("Valid sound card names (comma separated):\n");
for (c = soundhw; c->name; ++c) {
printf ("%-11s %s\n", c->name, c->descr);
}
printf("\n-soundhw all will enable all of the above\n");
} else {
printf("Machine has no user-selectable audio hardware "
"(it may or may not have always-present audio hardware).\n");
}
exit(!is_help_option(optarg));
}
else {
size_t l;
const char *p;
char *e;
int bad_card = 0;
if (!strcmp(optarg, "all")) {
for (c = soundhw; c->name; ++c) {
c->enabled = 1;
}
return;
}
p = optarg;
while (*p) {
e = strchr(p, ',');
l = !e ? strlen(p) : (size_t) (e - p);
for (c = soundhw; c->name; ++c) {
if (!strncmp(c->name, p, l) && !c->name[l]) {
c->enabled = 1;
break;
}
}
if (!c->name) {
if (l > 80) {
error_report("Unknown sound card name (too big to show)");
}
else {
error_report("Unknown sound card name `%.*s'",
(int) l, p);
}
bad_card = 1;
}
p += l + (e != NULL);
}
if (bad_card) {
goto show_valid_cards;
}
}
}
void audio_init(void)
{
struct soundhw *c;
ISABus *isa_bus = (ISABus *) object_resolve_path_type("", TYPE_ISA_BUS, NULL);
PCIBus *pci_bus = (PCIBus *) object_resolve_path_type("", TYPE_PCI_BUS, NULL);
for (c = soundhw; c->name; ++c) {
if (c->enabled) {
if (c->isa) {
if (!isa_bus) {
error_report("ISA bus not available for %s", c->name);
exit(1);
}
c->init.init_isa(isa_bus);
} else {
if (!pci_bus) {
error_report("PCI bus not available for %s", c->name);
exit(1);
}
c->init.init_pci(pci_bus);
}
}
}
}
int kvm_available(void)
{
#ifdef CONFIG_KVM

View File

@@ -61,52 +61,39 @@
ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
ABI_TYPE cmpv, ABI_TYPE newv EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ret = atomic_cmpxchg__nocheck(haddr, cmpv, newv);
ATOMIC_MMU_CLEANUP;
return ret;
return atomic_cmpxchg__nocheck(haddr, cmpv, newv);
}
#if DATA_SIZE >= 16
ABI_TYPE ATOMIC_NAME(ld)(CPUArchState *env, target_ulong addr EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE val, *haddr = ATOMIC_MMU_LOOKUP;
__atomic_load(haddr, &val, __ATOMIC_RELAXED);
ATOMIC_MMU_CLEANUP;
return val;
}
void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
__atomic_store(haddr, &val, __ATOMIC_RELAXED);
ATOMIC_MMU_CLEANUP;
}
#else
ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ret = atomic_xchg__nocheck(haddr, val);
ATOMIC_MMU_CLEANUP;
return ret;
return atomic_xchg__nocheck(haddr, val);
}
#define GEN_ATOMIC_HELPER(X) \
ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr, \
ABI_TYPE val EXTRA_ARGS) \
{ \
ATOMIC_MMU_DECLS; \
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP; \
DATA_TYPE ret = atomic_##X(haddr, val); \
ATOMIC_MMU_CLEANUP; \
return ret; \
}
return atomic_##X(haddr, val); \
} \
GEN_ATOMIC_HELPER(fetch_add)
GEN_ATOMIC_HELPER(fetch_and)
@@ -135,52 +122,39 @@ GEN_ATOMIC_HELPER(xor_fetch)
ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
ABI_TYPE cmpv, ABI_TYPE newv EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ret = atomic_cmpxchg__nocheck(haddr, BSWAP(cmpv), BSWAP(newv));
ATOMIC_MMU_CLEANUP;
return BSWAP(ret);
return BSWAP(atomic_cmpxchg__nocheck(haddr, BSWAP(cmpv), BSWAP(newv)));
}
#if DATA_SIZE >= 16
ABI_TYPE ATOMIC_NAME(ld)(CPUArchState *env, target_ulong addr EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE val, *haddr = ATOMIC_MMU_LOOKUP;
__atomic_load(haddr, &val, __ATOMIC_RELAXED);
ATOMIC_MMU_CLEANUP;
return BSWAP(val);
}
void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
val = BSWAP(val);
__atomic_store(haddr, &val, __ATOMIC_RELAXED);
ATOMIC_MMU_CLEANUP;
}
#else
ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
ABI_TYPE ret = atomic_xchg__nocheck(haddr, BSWAP(val));
ATOMIC_MMU_CLEANUP;
return BSWAP(ret);
return BSWAP(atomic_xchg__nocheck(haddr, BSWAP(val)));
}
#define GEN_ATOMIC_HELPER(X) \
ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr, \
ABI_TYPE val EXTRA_ARGS) \
{ \
ATOMIC_MMU_DECLS; \
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP; \
DATA_TYPE ret = atomic_##X(haddr, BSWAP(val)); \
ATOMIC_MMU_CLEANUP; \
return BSWAP(ret); \
return BSWAP(atomic_##X(haddr, BSWAP(val))); \
}
GEN_ATOMIC_HELPER(fetch_and)
@@ -197,7 +171,6 @@ GEN_ATOMIC_HELPER(xor_fetch)
ABI_TYPE ATOMIC_NAME(fetch_add)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ldo, ldn, ret, sto;
@@ -207,7 +180,6 @@ ABI_TYPE ATOMIC_NAME(fetch_add)(CPUArchState *env, target_ulong addr,
sto = BSWAP(ret + val);
ldn = atomic_cmpxchg__nocheck(haddr, ldo, sto);
if (ldn == ldo) {
ATOMIC_MMU_CLEANUP;
return ret;
}
ldo = ldn;
@@ -217,7 +189,6 @@ ABI_TYPE ATOMIC_NAME(fetch_add)(CPUArchState *env, target_ulong addr,
ABI_TYPE ATOMIC_NAME(add_fetch)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ldo, ldn, ret, sto;
@@ -227,7 +198,6 @@ ABI_TYPE ATOMIC_NAME(add_fetch)(CPUArchState *env, target_ulong addr,
sto = BSWAP(ret);
ldn = atomic_cmpxchg__nocheck(haddr, ldo, sto);
if (ldn == ldo) {
ATOMIC_MMU_CLEANUP;
return ret;
}
ldo = ldn;

View File

@@ -1,31 +1,13 @@
common-obj-y = audio.o noaudio.o wavaudio.o mixeng.o
common-obj-$(CONFIG_SDL) += sdlaudio.o
common-obj-$(CONFIG_OSS) += ossaudio.o
common-obj-$(CONFIG_SPICE) += spiceaudio.o
common-obj-$(CONFIG_AUDIO_COREAUDIO) += coreaudio.o
common-obj-$(CONFIG_AUDIO_DSOUND) += dsoundaudio.o
common-obj-$(CONFIG_COREAUDIO) += coreaudio.o
common-obj-$(CONFIG_ALSA) += alsaaudio.o
common-obj-$(CONFIG_DSOUND) += dsoundaudio.o
common-obj-$(CONFIG_PA) += paaudio.o
common-obj-$(CONFIG_AUDIO_PT_INT) += audio_pt_int.o
common-obj-$(CONFIG_AUDIO_WIN_INT) += audio_win_int.o
common-obj-y += wavcapture.o
coreaudio.o-libs := $(COREAUDIO_LIBS)
dsoundaudio.o-libs := $(DSOUND_LIBS)
# alsa module
common-obj-$(CONFIG_AUDIO_ALSA) += alsa.mo
alsa.mo-objs = alsaaudio.o
alsa.mo-libs := $(ALSA_LIBS)
# oss module
common-obj-$(CONFIG_AUDIO_OSS) += oss.mo
oss.mo-objs = ossaudio.o
oss.mo-libs := $(OSS_LIBS)
# pulseaudio module
common-obj-$(CONFIG_AUDIO_PA) += pa.mo
pa.mo-objs = paaudio.o
pa.mo-libs := $(PULSE_LIBS)
# sdl module
common-obj-$(CONFIG_AUDIO_SDL) += sdl.mo
sdl.mo-objs = sdlaudio.o
sdl.mo-cflags := $(SDL_CFLAGS)
sdl.mo-libs := $(SDL_LIBS)
sdlaudio.o-cflags := $(SDL_CFLAGS)

View File

@@ -823,7 +823,7 @@ static int alsa_init_out(HWVoiceOut *hw, struct audsettings *as,
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = obt.samples;
alsa->pcm_buf = audio_calloc(__func__, obt.samples, 1 << hw->info.shift);
alsa->pcm_buf = audio_calloc (AUDIO_FUNC, obt.samples, 1 << hw->info.shift);
if (!alsa->pcm_buf) {
dolog ("Could not allocate DAC buffer (%d samples, each %d bytes)\n",
hw->samples, 1 << hw->info.shift);
@@ -934,7 +934,7 @@ static int alsa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = obt.samples;
alsa->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
alsa->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!alsa->pcm_buf) {
dolog ("Could not allocate ADC buffer (%d samples, each %d bytes)\n",
hw->samples, 1 << hw->info.shift);
@@ -1213,7 +1213,7 @@ static struct audio_pcm_ops alsa_pcm_ops = {
.ctl_in = alsa_ctl_in,
};
static struct audio_driver alsa_audio_driver = {
struct audio_driver alsa_audio_driver = {
.name = "alsa",
.descr = "ALSA http://www.alsa-project.org",
.options = alsa_options,
@@ -1226,9 +1226,3 @@ static struct audio_driver alsa_audio_driver = {
.voice_size_out = sizeof (ALSAVoiceOut),
.voice_size_in = sizeof (ALSAVoiceIn)
};
static void register_audio_alsa(void)
{
audio_driver_register(&alsa_audio_driver);
}
type_init(register_audio_alsa);

View File

@@ -45,49 +45,15 @@
The 1st one is the one used by default, that is the reason
that we generate the list.
*/
static const char *audio_prio_list[] = {
"spice",
static struct audio_driver *drvtab[] = {
#ifdef CONFIG_SPICE
&spice_audio_driver,
#endif
CONFIG_AUDIO_DRIVERS
"none",
"wav",
&no_audio_driver,
&wav_audio_driver
};
static QLIST_HEAD(, audio_driver) audio_drivers;
void audio_driver_register(audio_driver *drv)
{
QLIST_INSERT_HEAD(&audio_drivers, drv, next);
}
audio_driver *audio_driver_lookup(const char *name)
{
struct audio_driver *d;
QLIST_FOREACH(d, &audio_drivers, next) {
if (strcmp(name, d->name) == 0) {
return d;
}
}
audio_module_load_one(name);
QLIST_FOREACH(d, &audio_drivers, next) {
if (strcmp(name, d->name) == 0) {
return d;
}
}
return NULL;
}
static void audio_module_load_all(void)
{
int i;
for (i = 0; i < ARRAY_SIZE(audio_prio_list); i++) {
audio_driver_lookup(audio_prio_list[i]);
}
}
struct fixed_settings {
int enabled;
int nb_voices;
@@ -458,12 +424,12 @@ static void audio_process_options (const char *prefix,
const char qemu_prefix[] = "QEMU_";
size_t preflen, optlen;
if (audio_bug(__func__, !prefix)) {
if (audio_bug (AUDIO_FUNC, !prefix)) {
dolog ("prefix = NULL\n");
return;
}
if (audio_bug(__func__, !opt)) {
if (audio_bug (AUDIO_FUNC, !opt)) {
dolog ("opt = NULL\n");
return;
}
@@ -826,7 +792,7 @@ static int audio_attach_capture (HWVoiceOut *hw)
SWVoiceOut *sw;
HWVoiceOut *hw_cap = &cap->hw;
sc = audio_calloc(__func__, 1, sizeof(*sc));
sc = audio_calloc (AUDIO_FUNC, 1, sizeof (*sc));
if (!sc) {
dolog ("Could not allocate soft capture voice (%zu bytes)\n",
sizeof (*sc));
@@ -882,7 +848,7 @@ static int audio_pcm_hw_find_min_in (HWVoiceIn *hw)
int audio_pcm_hw_get_live_in (HWVoiceIn *hw)
{
int live = hw->total_samples_captured - audio_pcm_hw_find_min_in (hw);
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live=%d hw->samples=%d\n", live, hw->samples);
return 0;
}
@@ -920,7 +886,7 @@ static int audio_pcm_sw_get_rpos_in (SWVoiceIn *sw)
int live = hw->total_samples_captured - sw->total_hw_samples_acquired;
int rpos;
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live=%d hw->samples=%d\n", live, hw->samples);
return 0;
}
@@ -943,7 +909,7 @@ int audio_pcm_sw_read (SWVoiceIn *sw, void *buf, int size)
rpos = audio_pcm_sw_get_rpos_in (sw) % hw->samples;
live = hw->total_samples_captured - sw->total_hw_samples_acquired;
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live_in=%d hw->samples=%d\n", live, hw->samples);
return 0;
}
@@ -969,7 +935,7 @@ int audio_pcm_sw_read (SWVoiceIn *sw, void *buf, int size)
}
osamp = swlim;
if (audio_bug(__func__, osamp < 0)) {
if (audio_bug (AUDIO_FUNC, osamp < 0)) {
dolog ("osamp=%d\n", osamp);
return 0;
}
@@ -1024,7 +990,7 @@ static int audio_pcm_hw_get_live_out (HWVoiceOut *hw, int *nb_live)
if (nb_live1) {
int live = smin;
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live=%d hw->samples=%d\n", live, hw->samples);
return 0;
}
@@ -1048,7 +1014,7 @@ int audio_pcm_sw_write (SWVoiceOut *sw, void *buf, int size)
hwsamples = sw->hw->samples;
live = sw->total_hw_samples_mixed;
if (audio_bug(__func__, live < 0 || live > hwsamples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hwsamples)){
dolog ("live=%d hw->samples=%d\n", live, hwsamples);
return 0;
}
@@ -1297,7 +1263,7 @@ static int audio_get_avail (SWVoiceIn *sw)
}
live = sw->hw->total_samples_captured - sw->total_hw_samples_acquired;
if (audio_bug(__func__, live < 0 || live > sw->hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > sw->hw->samples)) {
dolog ("live=%d sw->hw->samples=%d\n", live, sw->hw->samples);
return 0;
}
@@ -1321,7 +1287,7 @@ static int audio_get_free (SWVoiceOut *sw)
live = sw->total_hw_samples_mixed;
if (audio_bug(__func__, live < 0 || live > sw->hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > sw->hw->samples)) {
dolog ("live=%d sw->hw->samples=%d\n", live, sw->hw->samples);
return 0;
}
@@ -1388,7 +1354,7 @@ static void audio_run_out (AudioState *s)
live = 0;
}
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live=%d hw->samples=%d\n", live, hw->samples);
continue;
}
@@ -1423,7 +1389,7 @@ static void audio_run_out (AudioState *s)
prev_rpos = hw->rpos;
played = hw->pcm_ops->run_out (hw, live);
replay_audio_out(&played);
if (audio_bug(__func__, hw->rpos >= hw->samples)) {
if (audio_bug (AUDIO_FUNC, hw->rpos >= hw->samples)) {
dolog ("hw->rpos=%d hw->samples=%d played=%d\n",
hw->rpos, hw->samples, played);
hw->rpos = 0;
@@ -1444,7 +1410,7 @@ static void audio_run_out (AudioState *s)
continue;
}
if (audio_bug(__func__, played > sw->total_hw_samples_mixed)) {
if (audio_bug (AUDIO_FUNC, played > sw->total_hw_samples_mixed)) {
dolog ("played=%d sw->total_hw_samples_mixed=%d\n",
played, sw->total_hw_samples_mixed);
played = sw->total_hw_samples_mixed;
@@ -1547,7 +1513,7 @@ static void audio_run_capture (AudioState *s)
continue;
}
if (audio_bug(__func__, captured > sw->total_hw_samples_mixed)) {
if (audio_bug (AUDIO_FUNC, captured > sw->total_hw_samples_mixed)) {
dolog ("captured=%d sw->total_hw_samples_mixed=%d\n",
captured, sw->total_hw_samples_mixed);
captured = sw->total_hw_samples_mixed;
@@ -1690,13 +1656,11 @@ static void audio_pp_nb_voices (const char *typ, int nb)
void AUD_help (void)
{
struct audio_driver *d;
/* make sure we print the help text for modular drivers too */
audio_module_load_all();
size_t i;
audio_process_options ("AUDIO", audio_options);
QLIST_FOREACH(d, &audio_drivers, next) {
for (i = 0; i < ARRAY_SIZE (drvtab); i++) {
struct audio_driver *d = drvtab[i];
if (d->options) {
audio_process_options (d->name, d->options);
}
@@ -1708,7 +1672,8 @@ void AUD_help (void)
printf ("Available drivers:\n");
QLIST_FOREACH(d, &audio_drivers, next) {
for (i = 0; i < ARRAY_SIZE (drvtab); i++) {
struct audio_driver *d = drvtab[i];
printf ("Name: %s\n", d->name);
printf ("Description: %s\n", d->descr);
@@ -1842,7 +1807,6 @@ static void audio_init (void)
const char *drvname;
VMChangeStateEntry *e;
AudioState *s = &glob_audio_state;
struct audio_driver *driver;
if (s->drv) {
return;
@@ -1878,27 +1842,32 @@ static void audio_init (void)
}
if (drvname) {
driver = audio_driver_lookup(drvname);
if (driver) {
done = !audio_driver_init(s, driver);
} else {
int found = 0;
for (i = 0; i < ARRAY_SIZE (drvtab); i++) {
if (!strcmp (drvname, drvtab[i]->name)) {
done = !audio_driver_init (s, drvtab[i]);
found = 1;
break;
}
}
if (!found) {
dolog ("Unknown audio driver `%s'\n", drvname);
dolog ("Run with -audio-help to list available drivers\n");
}
}
if (!done) {
for (i = 0; !done && i < ARRAY_SIZE(audio_prio_list); i++) {
driver = audio_driver_lookup(audio_prio_list[i]);
if (driver && driver->can_be_default) {
done = !audio_driver_init(s, driver);
for (i = 0; !done && i < ARRAY_SIZE (drvtab); i++) {
if (drvtab[i]->can_be_default) {
done = !audio_driver_init (s, drvtab[i]);
}
}
}
if (!done) {
driver = audio_driver_lookup("none");
done = !audio_driver_init(s, driver);
done = !audio_driver_init (s, &no_audio_driver);
assert(done);
dolog("warning: Using timer based audio emulation\n");
}
@@ -1955,7 +1924,7 @@ CaptureVoiceOut *AUD_add_capture (
goto err0;
}
cb = audio_calloc(__func__, 1, sizeof(*cb));
cb = audio_calloc (AUDIO_FUNC, 1, sizeof (*cb));
if (!cb) {
dolog ("Could not allocate capture callback information, size %zu\n",
sizeof (*cb));
@@ -1973,7 +1942,7 @@ CaptureVoiceOut *AUD_add_capture (
HWVoiceOut *hw;
CaptureVoiceOut *cap;
cap = audio_calloc(__func__, 1, sizeof(*cap));
cap = audio_calloc (AUDIO_FUNC, 1, sizeof (*cap));
if (!cap) {
dolog ("Could not allocate capture voice, size %zu\n",
sizeof (*cap));
@@ -1986,8 +1955,8 @@ CaptureVoiceOut *AUD_add_capture (
/* XXX find a more elegant way */
hw->samples = 4096 * 4;
hw->mix_buf = audio_calloc(__func__, hw->samples,
sizeof(struct st_sample));
hw->mix_buf = audio_calloc (AUDIO_FUNC, hw->samples,
sizeof (struct st_sample));
if (!hw->mix_buf) {
dolog ("Could not allocate capture mix buffer (%d samples)\n",
hw->samples);
@@ -1996,7 +1965,7 @@ CaptureVoiceOut *AUD_add_capture (
audio_pcm_init_info (&hw->info, as);
cap->buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
cap->buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!cap->buf) {
dolog ("Could not allocate capture buffer "
"(%d samples, each %d bytes)\n",

View File

@@ -25,7 +25,7 @@
#ifndef QEMU_AUDIO_INT_H
#define QEMU_AUDIO_INT_H
#ifdef CONFIG_AUDIO_COREAUDIO
#ifdef CONFIG_COREAUDIO
#define FLOAT_MIXENG
/* #define RECIPROCAL */
#endif
@@ -141,7 +141,6 @@ struct SWVoiceIn {
QLIST_ENTRY (SWVoiceIn) entries;
};
typedef struct audio_driver audio_driver;
struct audio_driver {
const char *name;
const char *descr;
@@ -155,7 +154,6 @@ struct audio_driver {
int voice_size_out;
int voice_size_in;
int ctl_caps;
QLIST_ENTRY(audio_driver) next;
};
struct audio_pcm_ops {
@@ -205,11 +203,17 @@ struct AudioState {
int vm_running;
};
extern struct audio_driver no_audio_driver;
extern struct audio_driver oss_audio_driver;
extern struct audio_driver sdl_audio_driver;
extern struct audio_driver wav_audio_driver;
extern struct audio_driver alsa_audio_driver;
extern struct audio_driver coreaudio_audio_driver;
extern struct audio_driver dsound_audio_driver;
extern struct audio_driver pa_audio_driver;
extern struct audio_driver spice_audio_driver;
extern const struct mixeng_volume nominal_volume;
void audio_driver_register(audio_driver *drv);
audio_driver *audio_driver_lookup(const char *name);
void audio_pcm_init_info (struct audio_pcm_info *info, struct audsettings *as);
void audio_pcm_info_clear_buf (struct audio_pcm_info *info, void *buf, int len);
@@ -248,4 +252,10 @@ static inline int audio_ring_dist (int dst, int src, int len)
#define AUDIO_STRINGIFY_(n) #n
#define AUDIO_STRINGIFY(n) AUDIO_STRINGIFY_(n)
#if defined _MSC_VER || defined __GNUC__
#define AUDIO_FUNC __FUNCTION__
#else
#define AUDIO_FUNC __FILE__ ":" AUDIO_STRINGIFY (__LINE__)
#endif
#endif /* QEMU_AUDIO_INT_H */

View File

@@ -31,7 +31,7 @@ int audio_pt_init (struct audio_pt *p, void *(*func) (void *),
err = sigfillset (&set);
if (err) {
logerr(p, errno, "%s(%s): sigfillset failed", cap, __func__);
logerr (p, errno, "%s(%s): sigfillset failed", cap, AUDIO_FUNC);
return -1;
}
@@ -57,8 +57,8 @@ int audio_pt_init (struct audio_pt *p, void *(*func) (void *),
err2 = pthread_sigmask (SIG_SETMASK, &old_set, NULL);
if (err2) {
logerr(p, err2, "%s(%s): pthread_sigmask (restore) failed",
cap, __func__);
logerr (p, err2, "%s(%s): pthread_sigmask (restore) failed",
cap, AUDIO_FUNC);
/* We have failed to restore original signal mask, all bets are off,
so terminate the process */
exit (EXIT_FAILURE);
@@ -74,17 +74,17 @@ int audio_pt_init (struct audio_pt *p, void *(*func) (void *),
err2:
err2 = pthread_cond_destroy (&p->cond);
if (err2) {
logerr(p, err2, "%s(%s): pthread_cond_destroy failed", cap, __func__);
logerr (p, err2, "%s(%s): pthread_cond_destroy failed", cap, AUDIO_FUNC);
}
err1:
err2 = pthread_mutex_destroy (&p->mutex);
if (err2) {
logerr(p, err2, "%s(%s): pthread_mutex_destroy failed", cap, __func__);
logerr (p, err2, "%s(%s): pthread_mutex_destroy failed", cap, AUDIO_FUNC);
}
err0:
logerr(p, err, "%s(%s): %s failed", cap, __func__, efunc);
logerr (p, err, "%s(%s): %s failed", cap, AUDIO_FUNC, efunc);
return -1;
}
@@ -94,13 +94,13 @@ int audio_pt_fini (struct audio_pt *p, const char *cap)
err = pthread_cond_destroy (&p->cond);
if (err) {
logerr(p, err, "%s(%s): pthread_cond_destroy failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_cond_destroy failed", cap, AUDIO_FUNC);
ret = -1;
}
err = pthread_mutex_destroy (&p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_mutex_destroy failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_mutex_destroy failed", cap, AUDIO_FUNC);
ret = -1;
}
return ret;
@@ -112,7 +112,7 @@ int audio_pt_lock (struct audio_pt *p, const char *cap)
err = pthread_mutex_lock (&p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_mutex_lock failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_mutex_lock failed", cap, AUDIO_FUNC);
return -1;
}
return 0;
@@ -124,7 +124,7 @@ int audio_pt_unlock (struct audio_pt *p, const char *cap)
err = pthread_mutex_unlock (&p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_mutex_unlock failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_mutex_unlock failed", cap, AUDIO_FUNC);
return -1;
}
return 0;
@@ -136,7 +136,7 @@ int audio_pt_wait (struct audio_pt *p, const char *cap)
err = pthread_cond_wait (&p->cond, &p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_cond_wait failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_cond_wait failed", cap, AUDIO_FUNC);
return -1;
}
return 0;
@@ -148,12 +148,12 @@ int audio_pt_unlock_and_signal (struct audio_pt *p, const char *cap)
err = pthread_mutex_unlock (&p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_mutex_unlock failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_mutex_unlock failed", cap, AUDIO_FUNC);
return -1;
}
err = pthread_cond_signal (&p->cond);
if (err) {
logerr(p, err, "%s(%s): pthread_cond_signal failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_cond_signal failed", cap, AUDIO_FUNC);
return -1;
}
return 0;
@@ -166,7 +166,7 @@ int audio_pt_join (struct audio_pt *p, void **arg, const char *cap)
err = pthread_join (p->thread, &ret);
if (err) {
logerr(p, err, "%s(%s): pthread_join failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_join failed", cap, AUDIO_FUNC);
return -1;
}
*arg = ret;

View File

@@ -57,13 +57,13 @@ static void glue (audio_init_nb_voices_, TYPE) (struct audio_driver *drv)
glue (s->nb_hw_voices_, TYPE) = max_voices;
}
if (audio_bug(__func__, !voice_size && max_voices)) {
if (audio_bug (AUDIO_FUNC, !voice_size && max_voices)) {
dolog ("drv=`%s' voice_size=0 max_voices=%d\n",
drv->name, max_voices);
glue (s->nb_hw_voices_, TYPE) = 0;
}
if (audio_bug(__func__, voice_size && !max_voices)) {
if (audio_bug (AUDIO_FUNC, voice_size && !max_voices)) {
dolog ("drv=`%s' voice_size=%d max_voices=0\n",
drv->name, voice_size);
}
@@ -77,7 +77,7 @@ static void glue (audio_pcm_hw_free_resources_, TYPE) (HW *hw)
static int glue (audio_pcm_hw_alloc_resources_, TYPE) (HW *hw)
{
HWBUF = audio_calloc(__func__, hw->samples, sizeof(struct st_sample));
HWBUF = audio_calloc (AUDIO_FUNC, hw->samples, sizeof (struct st_sample));
if (!HWBUF) {
dolog ("Could not allocate " NAME " buffer (%d samples)\n",
hw->samples);
@@ -105,7 +105,7 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW *sw)
samples = ((int64_t) sw->hw->samples << 32) / sw->ratio;
sw->buf = audio_calloc(__func__, samples, sizeof(struct st_sample));
sw->buf = audio_calloc (AUDIO_FUNC, samples, sizeof (struct st_sample));
if (!sw->buf) {
dolog ("Could not allocate buffer for `%s' (%d samples)\n",
SW_NAME (sw), samples);
@@ -238,17 +238,17 @@ static HW *glue (audio_pcm_hw_add_new_, TYPE) (struct audsettings *as)
return NULL;
}
if (audio_bug(__func__, !drv)) {
if (audio_bug (AUDIO_FUNC, !drv)) {
dolog ("No host audio driver\n");
return NULL;
}
if (audio_bug(__func__, !drv->pcm_ops)) {
if (audio_bug (AUDIO_FUNC, !drv->pcm_ops)) {
dolog ("Host audio driver without pcm_ops\n");
return NULL;
}
hw = audio_calloc(__func__, 1, glue(drv->voice_size_, TYPE));
hw = audio_calloc (AUDIO_FUNC, 1, glue (drv->voice_size_, TYPE));
if (!hw) {
dolog ("Can not allocate voice `%s' size %d\n",
drv->name, glue (drv->voice_size_, TYPE));
@@ -266,7 +266,7 @@ static HW *glue (audio_pcm_hw_add_new_, TYPE) (struct audsettings *as)
goto err0;
}
if (audio_bug(__func__, hw->samples <= 0)) {
if (audio_bug (AUDIO_FUNC, hw->samples <= 0)) {
dolog ("hw->samples=%d\n", hw->samples);
goto err1;
}
@@ -339,7 +339,7 @@ static SW *glue (audio_pcm_create_voice_pair_, TYPE) (
hw_as = *as;
}
sw = audio_calloc(__func__, 1, sizeof(*sw));
sw = audio_calloc (AUDIO_FUNC, 1, sizeof (*sw));
if (!sw) {
dolog ("Could not allocate soft voice `%s' (%zu bytes)\n",
sw_name ? sw_name : "unknown", sizeof (*sw));
@@ -379,7 +379,7 @@ static void glue (audio_close_, TYPE) (SW *sw)
void glue (AUD_close_, TYPE) (QEMUSoundCard *card, SW *sw)
{
if (sw) {
if (audio_bug(__func__, !card)) {
if (audio_bug (AUDIO_FUNC, !card)) {
dolog ("card=%p\n", card);
return;
}
@@ -399,7 +399,7 @@ SW *glue (AUD_open_, TYPE) (
{
AudioState *s = &glob_audio_state;
if (audio_bug(__func__, !card || !name || !callback_fn || !as)) {
if (audio_bug (AUDIO_FUNC, !card || !name || !callback_fn || !as)) {
dolog ("card=%p name=%p callback_fn=%p as=%p\n",
card, name, callback_fn, as);
goto fail;
@@ -408,12 +408,12 @@ SW *glue (AUD_open_, TYPE) (
ldebug ("open %s, freq %d, nchannels %d, fmt %d\n",
name, as->freq, as->nchannels, as->fmt);
if (audio_bug(__func__, audio_validate_settings(as))) {
if (audio_bug (AUDIO_FUNC, audio_validate_settings (as))) {
audio_print_settings (as);
goto fail;
}
if (audio_bug(__func__, !s->drv)) {
if (audio_bug (AUDIO_FUNC, !s->drv)) {
dolog ("Can not open `%s' (no host audio driver)\n", name);
goto fail;
}

View File

@@ -722,7 +722,7 @@ static struct audio_pcm_ops coreaudio_pcm_ops = {
.ctl_out = coreaudio_ctl_out
};
static struct audio_driver coreaudio_audio_driver = {
struct audio_driver coreaudio_audio_driver = {
.name = "coreaudio",
.descr = "CoreAudio http://developer.apple.com/audio/coreaudio.html",
.options = coreaudio_options,
@@ -735,9 +735,3 @@ static struct audio_driver coreaudio_audio_driver = {
.voice_size_out = sizeof (coreaudioVoiceOut),
.voice_size_in = 0
};
static void register_audio_coreaudio(void)
{
audio_driver_register(&coreaudio_audio_driver);
}
type_init(register_audio_coreaudio);

View File

@@ -543,7 +543,7 @@ static int dsound_run_out (HWVoiceOut *hw, int live)
}
}
if (audio_bug(__func__, len < 0 || len > bufsize)) {
if (audio_bug (AUDIO_FUNC, len < 0 || len > bufsize)) {
dolog ("len=%d bufsize=%d old_pos=%ld ppos=%ld\n",
len, bufsize, old_pos, ppos);
return 0;
@@ -890,7 +890,7 @@ static struct audio_pcm_ops dsound_pcm_ops = {
.ctl_in = dsound_ctl_in
};
static struct audio_driver dsound_audio_driver = {
struct audio_driver dsound_audio_driver = {
.name = "dsound",
.descr = "DirectSound http://wikipedia.org/wiki/DirectSound",
.options = dsound_options,
@@ -903,9 +903,3 @@ static struct audio_driver dsound_audio_driver = {
.voice_size_out = sizeof (DSoundVoiceOut),
.voice_size_in = sizeof (DSoundVoiceIn)
};
static void register_audio_dsound(void)
{
audio_driver_register(&dsound_audio_driver);
}
type_init(register_audio_dsound);

View File

@@ -344,7 +344,7 @@ struct rate {
*/
void *st_rate_start (int inrate, int outrate)
{
struct rate *rate = audio_calloc(__func__, 1, sizeof(*rate));
struct rate *rate = audio_calloc (AUDIO_FUNC, 1, sizeof (*rate));
if (!rate) {
dolog ("Could not allocate resampler (%zu bytes)\n", sizeof (*rate));

View File

@@ -160,7 +160,7 @@ static struct audio_pcm_ops no_pcm_ops = {
.ctl_in = no_ctl_in
};
static struct audio_driver no_audio_driver = {
struct audio_driver no_audio_driver = {
.name = "none",
.descr = "Timer based audio emulation",
.options = NULL,
@@ -173,9 +173,3 @@ static struct audio_driver no_audio_driver = {
.voice_size_out = sizeof (NoVoiceOut),
.voice_size_in = sizeof (NoVoiceIn)
};
static void register_audio_none(void)
{
audio_driver_register(&no_audio_driver);
}
type_init(register_audio_none);

View File

@@ -582,9 +582,11 @@ static int oss_init_out(HWVoiceOut *hw, struct audsettings *as,
}
if (!oss->mmapped) {
oss->pcm_buf = audio_calloc(__func__,
hw->samples,
1 << hw->info.shift);
oss->pcm_buf = audio_calloc (
AUDIO_FUNC,
hw->samples,
1 << hw->info.shift
);
if (!oss->pcm_buf) {
dolog (
"Could not allocate DAC buffer (%d samples, each %d bytes)\n",
@@ -703,7 +705,7 @@ static int oss_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
}
hw->samples = (obt.nfrags * obt.fragsize) >> hw->info.shift;
oss->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
oss->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!oss->pcm_buf) {
dolog ("Could not allocate ADC buffer (%d samples, each %d bytes)\n",
hw->samples, 1 << hw->info.shift);
@@ -922,7 +924,7 @@ static struct audio_pcm_ops oss_pcm_ops = {
.ctl_in = oss_ctl_in
};
static struct audio_driver oss_audio_driver = {
struct audio_driver oss_audio_driver = {
.name = "oss",
.descr = "OSS http://www.opensound.com",
.options = oss_options,
@@ -935,9 +937,3 @@ static struct audio_driver oss_audio_driver = {
.voice_size_out = sizeof (OSSVoiceOut),
.voice_size_in = sizeof (OSSVoiceIn)
};
static void register_audio_oss(void)
{
audio_driver_register(&oss_audio_driver);
}
type_init(register_audio_oss);

View File

@@ -89,7 +89,7 @@ static inline int PA_STREAM_IS_GOOD(pa_stream_state_t x)
} \
goto label; \
} \
} while (0)
} while (0);
#define CHECK_DEAD_GOTO(c, stream, rerror, label) \
do { \
@@ -107,7 +107,7 @@ static inline int PA_STREAM_IS_GOOD(pa_stream_state_t x)
} \
goto label; \
} \
} while (0)
} while (0);
static int qpa_simple_read (PAVoiceIn *p, void *data, size_t length, int *rerror)
{
@@ -206,7 +206,7 @@ static void *qpa_thread_out (void *arg)
PAVoiceOut *pa = arg;
HWVoiceOut *hw = &pa->hw;
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -222,7 +222,7 @@ static void *qpa_thread_out (void *arg)
break;
}
if (audio_pt_wait(&pa->pt, __func__)) {
if (audio_pt_wait (&pa->pt, AUDIO_FUNC)) {
goto exit;
}
}
@@ -230,7 +230,7 @@ static void *qpa_thread_out (void *arg)
decr = to_mix = audio_MIN (pa->live, pa->g->conf.samples >> 2);
rpos = pa->rpos;
if (audio_pt_unlock(&pa->pt, __func__)) {
if (audio_pt_unlock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -251,7 +251,7 @@ static void *qpa_thread_out (void *arg)
to_mix -= chunk;
}
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -261,7 +261,7 @@ static void *qpa_thread_out (void *arg)
}
exit:
audio_pt_unlock(&pa->pt, __func__);
audio_pt_unlock (&pa->pt, AUDIO_FUNC);
return NULL;
}
@@ -270,7 +270,7 @@ static int qpa_run_out (HWVoiceOut *hw, int live)
int decr;
PAVoiceOut *pa = (PAVoiceOut *) hw;
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return 0;
}
@@ -279,10 +279,10 @@ static int qpa_run_out (HWVoiceOut *hw, int live)
pa->live = live - decr;
hw->rpos = pa->rpos;
if (pa->live > 0) {
audio_pt_unlock_and_signal(&pa->pt, __func__);
audio_pt_unlock_and_signal (&pa->pt, AUDIO_FUNC);
}
else {
audio_pt_unlock(&pa->pt, __func__);
audio_pt_unlock (&pa->pt, AUDIO_FUNC);
}
return decr;
}
@@ -298,7 +298,7 @@ static void *qpa_thread_in (void *arg)
PAVoiceIn *pa = arg;
HWVoiceIn *hw = &pa->hw;
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -314,7 +314,7 @@ static void *qpa_thread_in (void *arg)
break;
}
if (audio_pt_wait(&pa->pt, __func__)) {
if (audio_pt_wait (&pa->pt, AUDIO_FUNC)) {
goto exit;
}
}
@@ -322,7 +322,7 @@ static void *qpa_thread_in (void *arg)
incr = to_grab = audio_MIN (pa->dead, pa->g->conf.samples >> 2);
wpos = pa->wpos;
if (audio_pt_unlock(&pa->pt, __func__)) {
if (audio_pt_unlock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -342,7 +342,7 @@ static void *qpa_thread_in (void *arg)
to_grab -= chunk;
}
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -352,7 +352,7 @@ static void *qpa_thread_in (void *arg)
}
exit:
audio_pt_unlock(&pa->pt, __func__);
audio_pt_unlock (&pa->pt, AUDIO_FUNC);
return NULL;
}
@@ -361,7 +361,7 @@ static int qpa_run_in (HWVoiceIn *hw)
int live, incr, dead;
PAVoiceIn *pa = (PAVoiceIn *) hw;
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return 0;
}
@@ -372,10 +372,10 @@ static int qpa_run_in (HWVoiceIn *hw)
pa->dead = dead - incr;
hw->wpos = pa->wpos;
if (pa->dead > 0) {
audio_pt_unlock_and_signal(&pa->pt, __func__);
audio_pt_unlock_and_signal (&pa->pt, AUDIO_FUNC);
}
else {
audio_pt_unlock(&pa->pt, __func__);
audio_pt_unlock (&pa->pt, AUDIO_FUNC);
}
return incr;
}
@@ -579,7 +579,7 @@ static int qpa_init_out(HWVoiceOut *hw, struct audsettings *as,
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = g->conf.samples;
pa->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
pa->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
pa->rpos = hw->rpos;
if (!pa->pcm_buf) {
dolog ("Could not allocate buffer (%d bytes)\n",
@@ -587,7 +587,7 @@ static int qpa_init_out(HWVoiceOut *hw, struct audsettings *as,
goto fail2;
}
if (audio_pt_init(&pa->pt, qpa_thread_out, hw, AUDIO_CAP, __func__)) {
if (audio_pt_init (&pa->pt, qpa_thread_out, hw, AUDIO_CAP, AUDIO_FUNC)) {
goto fail3;
}
@@ -636,7 +636,7 @@ static int qpa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = g->conf.samples;
pa->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
pa->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
pa->wpos = hw->wpos;
if (!pa->pcm_buf) {
dolog ("Could not allocate buffer (%d bytes)\n",
@@ -644,7 +644,7 @@ static int qpa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
goto fail2;
}
if (audio_pt_init(&pa->pt, qpa_thread_in, hw, AUDIO_CAP, __func__)) {
if (audio_pt_init (&pa->pt, qpa_thread_in, hw, AUDIO_CAP, AUDIO_FUNC)) {
goto fail3;
}
@@ -667,17 +667,17 @@ static void qpa_fini_out (HWVoiceOut *hw)
void *ret;
PAVoiceOut *pa = (PAVoiceOut *) hw;
audio_pt_lock(&pa->pt, __func__);
audio_pt_lock (&pa->pt, AUDIO_FUNC);
pa->done = 1;
audio_pt_unlock_and_signal(&pa->pt, __func__);
audio_pt_join(&pa->pt, &ret, __func__);
audio_pt_unlock_and_signal (&pa->pt, AUDIO_FUNC);
audio_pt_join (&pa->pt, &ret, AUDIO_FUNC);
if (pa->stream) {
pa_stream_unref (pa->stream);
pa->stream = NULL;
}
audio_pt_fini(&pa->pt, __func__);
audio_pt_fini (&pa->pt, AUDIO_FUNC);
g_free (pa->pcm_buf);
pa->pcm_buf = NULL;
}
@@ -687,17 +687,17 @@ static void qpa_fini_in (HWVoiceIn *hw)
void *ret;
PAVoiceIn *pa = (PAVoiceIn *) hw;
audio_pt_lock(&pa->pt, __func__);
audio_pt_lock (&pa->pt, AUDIO_FUNC);
pa->done = 1;
audio_pt_unlock_and_signal(&pa->pt, __func__);
audio_pt_join(&pa->pt, &ret, __func__);
audio_pt_unlock_and_signal (&pa->pt, AUDIO_FUNC);
audio_pt_join (&pa->pt, &ret, AUDIO_FUNC);
if (pa->stream) {
pa_stream_unref (pa->stream);
pa->stream = NULL;
}
audio_pt_fini(&pa->pt, __func__);
audio_pt_fini (&pa->pt, AUDIO_FUNC);
g_free (pa->pcm_buf);
pa->pcm_buf = NULL;
}
@@ -937,7 +937,7 @@ static struct audio_pcm_ops qpa_pcm_ops = {
.ctl_in = qpa_ctl_in
};
static struct audio_driver pa_audio_driver = {
struct audio_driver pa_audio_driver = {
.name = "pa",
.descr = "http://www.pulseaudio.org/",
.options = qpa_options,
@@ -951,9 +951,3 @@ static struct audio_driver pa_audio_driver = {
.voice_size_in = sizeof (PAVoiceIn),
.ctl_caps = VOICE_VOLUME_CAP
};
static void register_audio_pa(void)
{
audio_driver_register(&pa_audio_driver);
}
type_init(register_audio_pa);

View File

@@ -71,12 +71,6 @@ void NAME (void *opaque, struct st_sample *ibuf, struct st_sample *obuf,
while (rate->ipos <= (rate->opos >> 32)) {
ilast = *ibuf++;
rate->ipos++;
/* if ipos overflow, there is a infinite loop */
if (rate->ipos == 0xffffffff) {
rate->ipos = 1;
rate->opos = rate->opos & 0xffffffff;
}
/* See if we finished the input buffer yet */
if (ibuf >= iend) {
goto the_end;

View File

@@ -277,7 +277,7 @@ static void sdl_callback (void *opaque, Uint8 *buf, int len)
return;
}
if (audio_bug(__func__, sdl->live < 0 || sdl->live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, sdl->live < 0 || sdl->live > hw->samples)) {
dolog ("sdl->live=%d hw->samples=%d\n",
sdl->live, hw->samples);
return;
@@ -500,7 +500,7 @@ static struct audio_pcm_ops sdl_pcm_ops = {
.ctl_out = sdl_ctl_out,
};
static struct audio_driver sdl_audio_driver = {
struct audio_driver sdl_audio_driver = {
.name = "sdl",
.descr = "SDL http://www.libsdl.org",
.options = sdl_options,
@@ -513,9 +513,3 @@ static struct audio_driver sdl_audio_driver = {
.voice_size_out = sizeof (SDLVoiceOut),
.voice_size_in = 0
};
static void register_audio_sdl(void)
{
audio_driver_register(&sdl_audio_driver);
}
type_init(register_audio_sdl);

View File

@@ -391,7 +391,7 @@ static struct audio_pcm_ops audio_callbacks = {
.ctl_in = line_in_ctl,
};
static struct audio_driver spice_audio_driver = {
struct audio_driver spice_audio_driver = {
.name = "spice",
.descr = "spice audio driver",
.options = audio_options,
@@ -411,9 +411,3 @@ void qemu_spice_audio_init (void)
{
spice_audio_driver.can_be_default = 1;
}
static void register_audio_spice(void)
{
audio_driver_register(&spice_audio_driver);
}
type_init(register_audio_spice);

View File

@@ -1,9 +1,9 @@
# See docs/devel/tracing.txt for syntax documentation.
# See docs/tracing.txt for syntax documentation.
# audio/alsaaudio.c
alsa_revents(int revents) "revents = %d"
alsa_pollout(int i, int fd) "i = %d fd = %d"
alsa_set_handler(int events, int index, int fd, int err) "events=0x%x index=%d fd=%d err=%d"
alsa_set_handler(int events, int index, int fd, int err) "events=%#x index=%d fd=%d err=%d"
alsa_wrote_zero(int len) "Failed to write %d frames (wrote zero)"
alsa_read_zero(long len) "Failed to read %ld frames (read zero)"
alsa_xrun_out(void) "Recovering from playback xrun"
@@ -13,5 +13,5 @@ alsa_resume_in(void) "Resuming suspended input stream"
alsa_no_frames(int state) "No frames available and ALSA state is %d"
# audio/ossaudio.c
oss_version(int version) "OSS version = 0x%x"
oss_version(int version) "OSS version = %#x"
oss_invalid_available_size(int size, int bufsize) "Invalid available size, size=%d bufsize=%d"

View File

@@ -139,7 +139,7 @@ static int wav_init_out(HWVoiceOut *hw, struct audsettings *as,
audio_pcm_init_info (&hw->info, &wav_as);
hw->samples = 1024;
wav->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
wav->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!wav->pcm_buf) {
dolog ("Could not allocate buffer (%d bytes)\n",
hw->samples << hw->info.shift);
@@ -278,7 +278,7 @@ static struct audio_pcm_ops wav_pcm_ops = {
.ctl_out = wav_ctl_out,
};
static struct audio_driver wav_audio_driver = {
struct audio_driver wav_audio_driver = {
.name = "wav",
.descr = "WAV renderer http://wikipedia.org/wiki/WAV",
.options = wav_options,
@@ -291,9 +291,3 @@ static struct audio_driver wav_audio_driver = {
.voice_size_out = sizeof (WAVVoiceOut),
.voice_size_in = 0
};
static void register_audio_wav(void)
{
audio_driver_register(&wav_audio_driver);
}
type_init(register_audio_wav);

View File

@@ -1,7 +1,6 @@
#include "qemu/osdep.h"
#include "hw/hw.h"
#include "monitor/monitor.h"
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "audio.h"
@@ -89,7 +88,6 @@ static void wav_capture_destroy (void *opaque)
WAVState *wav = opaque;
AUD_del_capture (wav->cap, wav);
g_free (wav);
}
static void wav_capture_info (void *opaque)

View File

@@ -1,6 +1,10 @@
common-obj-y += rng.o rng-egd.o
common-obj-$(CONFIG_POSIX) += rng-random.o
common-obj-y += msmouse.o wctablet.o testdev.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)
common-obj-$(CONFIG_TPM) += tpm.o
common-obj-y += hostmem.o hostmem-ram.o
@@ -8,11 +12,3 @@ common-obj-$(CONFIG_LINUX) += hostmem-file.o
common-obj-y += cryptodev.o
common-obj-y += cryptodev-builtin.o
ifeq ($(CONFIG_VIRTIO),y)
common-obj-y += cryptodev-vhost.o
common-obj-$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX)) += \
cryptodev-vhost-user.o
endif
common-obj-$(CONFIG_LINUX) += hostmem-memfd.o

View File

@@ -1,7 +1,7 @@
/*
* QEMU Baum Braille Device
*
* Copyright (c) 2008, 2010-2011, 2016-2017 Samuel Thibault
* Copyright (c) 2008, 2010-2011, 2016 Samuel Thibault
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -24,7 +24,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "chardev/char.h"
#include "sysemu/char.h"
#include "qemu/timer.h"
#include "hw/usb.h"
#include "ui/console.h"
@@ -239,12 +239,6 @@ static int baum_deferred_init(BaumChardev *baum)
brlapi_perror("baum: brlapi__getDisplaySize");
return 0;
}
if (baum->y > 1) {
baum->y = 1;
}
if (baum->x > 84) {
baum->x = 84;
}
con = qemu_console_lookup_by_index(0);
if (con && qemu_console_is_graphic(con)) {
@@ -649,7 +643,6 @@ static void baum_chr_open(Chardev *chr,
error_setg(errp, "brlapi__openConnection: %s",
brlapi_strerror(brlapi_error_location()));
g_free(handle);
baum->brlapi = NULL;
return;
}
baum->deferred_init = 0;

View File

@@ -78,7 +78,6 @@ static void cryptodev_builtin_init(
"cryptodev-builtin", NULL);
cc->info_str = g_strdup_printf("cryptodev-builtin0");
cc->queue_index = 0;
cc->type = CRYPTODEV_BACKEND_TYPE_BUILTIN;
backend->conf.peers.ccs[0] = cc;
backend->conf.crypto_services =

View File

@@ -1,377 +0,0 @@
/*
* QEMU Cryptodev backend for QEMU cipher APIs
*
* Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
*
* Authors:
* Gonglei <arei.gonglei@huawei.com>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "hw/boards.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "qemu/error-report.h"
#include "standard-headers/linux/virtio_crypto.h"
#include "sysemu/cryptodev-vhost.h"
#include "chardev/char-fe.h"
#include "sysemu/cryptodev-vhost-user.h"
/**
* @TYPE_CRYPTODEV_BACKEND_VHOST_USER:
* name of backend that uses vhost user server
*/
#define TYPE_CRYPTODEV_BACKEND_VHOST_USER "cryptodev-vhost-user"
#define CRYPTODEV_BACKEND_VHOST_USER(obj) \
OBJECT_CHECK(CryptoDevBackendVhostUser, \
(obj), TYPE_CRYPTODEV_BACKEND_VHOST_USER)
typedef struct CryptoDevBackendVhostUser {
CryptoDevBackend parent_obj;
CharBackend chr;
char *chr_name;
bool opened;
CryptoDevBackendVhost *vhost_crypto[MAX_CRYPTO_QUEUE_NUM];
} CryptoDevBackendVhostUser;
static int
cryptodev_vhost_user_running(
CryptoDevBackendVhost *crypto)
{
return crypto ? 1 : 0;
}
CryptoDevBackendVhost *
cryptodev_vhost_user_get_vhost(
CryptoDevBackendClient *cc,
CryptoDevBackend *b,
uint16_t queue)
{
CryptoDevBackendVhostUser *s =
CRYPTODEV_BACKEND_VHOST_USER(b);
assert(cc->type == CRYPTODEV_BACKEND_TYPE_VHOST_USER);
assert(queue < MAX_CRYPTO_QUEUE_NUM);
return s->vhost_crypto[queue];
}
static void cryptodev_vhost_user_stop(int queues,
CryptoDevBackendVhostUser *s)
{
size_t i;
for (i = 0; i < queues; i++) {
if (!cryptodev_vhost_user_running(s->vhost_crypto[i])) {
continue;
}
cryptodev_vhost_cleanup(s->vhost_crypto[i]);
s->vhost_crypto[i] = NULL;
}
}
static int
cryptodev_vhost_user_start(int queues,
CryptoDevBackendVhostUser *s)
{
CryptoDevBackendVhostOptions options;
CryptoDevBackend *b = CRYPTODEV_BACKEND(s);
int max_queues;
size_t i;
for (i = 0; i < queues; i++) {
if (cryptodev_vhost_user_running(s->vhost_crypto[i])) {
continue;
}
options.opaque = &s->chr;
options.backend_type = VHOST_BACKEND_TYPE_USER;
options.cc = b->conf.peers.ccs[i];
s->vhost_crypto[i] = cryptodev_vhost_init(&options);
if (!s->vhost_crypto[i]) {
error_report("failed to init vhost_crypto for queue %zu", i);
goto err;
}
if (i == 0) {
max_queues =
cryptodev_vhost_get_max_queues(s->vhost_crypto[i]);
if (queues > max_queues) {
error_report("you are asking more queues than supported: %d",
max_queues);
goto err;
}
}
}
return 0;
err:
cryptodev_vhost_user_stop(i + 1, s);
return -1;
}
static Chardev *
cryptodev_vhost_claim_chardev(CryptoDevBackendVhostUser *s,
Error **errp)
{
Chardev *chr;
if (s->chr_name == NULL) {
error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
"chardev", "a valid character device");
return NULL;
}
chr = qemu_chr_find(s->chr_name);
if (chr == NULL) {
error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
"Device '%s' not found", s->chr_name);
return NULL;
}
return chr;
}
static void cryptodev_vhost_user_event(void *opaque, int event)
{
CryptoDevBackendVhostUser *s = opaque;
CryptoDevBackend *b = CRYPTODEV_BACKEND(s);
Error *err = NULL;
int queues = b->conf.peers.queues;
assert(queues < MAX_CRYPTO_QUEUE_NUM);
switch (event) {
case CHR_EVENT_OPENED:
if (cryptodev_vhost_user_start(queues, s) < 0) {
exit(1);
}
b->ready = true;
break;
case CHR_EVENT_CLOSED:
b->ready = false;
cryptodev_vhost_user_stop(queues, s);
break;
}
if (err) {
error_report_err(err);
}
}
static void cryptodev_vhost_user_init(
CryptoDevBackend *backend, Error **errp)
{
int queues = backend->conf.peers.queues;
size_t i;
Error *local_err = NULL;
Chardev *chr;
CryptoDevBackendClient *cc;
CryptoDevBackendVhostUser *s =
CRYPTODEV_BACKEND_VHOST_USER(backend);
chr = cryptodev_vhost_claim_chardev(s, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
s->opened = true;
for (i = 0; i < queues; i++) {
cc = cryptodev_backend_new_client(
"cryptodev-vhost-user", NULL);
cc->info_str = g_strdup_printf("cryptodev-vhost-user%zu to %s ",
i, chr->label);
cc->queue_index = i;
cc->type = CRYPTODEV_BACKEND_TYPE_VHOST_USER;
backend->conf.peers.ccs[i] = cc;
if (i == 0) {
if (!qemu_chr_fe_init(&s->chr, chr, &local_err)) {
error_propagate(errp, local_err);
return;
}
}
}
qemu_chr_fe_set_handlers(&s->chr, NULL, NULL,
cryptodev_vhost_user_event, NULL, s, NULL, true);
backend->conf.crypto_services =
1u << VIRTIO_CRYPTO_SERVICE_CIPHER |
1u << VIRTIO_CRYPTO_SERVICE_HASH |
1u << VIRTIO_CRYPTO_SERVICE_MAC;
backend->conf.cipher_algo_l = 1u << VIRTIO_CRYPTO_CIPHER_AES_CBC;
backend->conf.hash_algo = 1u << VIRTIO_CRYPTO_HASH_SHA1;
backend->conf.max_size = UINT64_MAX;
backend->conf.max_cipher_key_len = VHOST_USER_MAX_CIPHER_KEY_LEN;
backend->conf.max_auth_key_len = VHOST_USER_MAX_AUTH_KEY_LEN;
}
static int64_t cryptodev_vhost_user_sym_create_session(
CryptoDevBackend *backend,
CryptoDevBackendSymSessionInfo *sess_info,
uint32_t queue_index, Error **errp)
{
CryptoDevBackendClient *cc =
backend->conf.peers.ccs[queue_index];
CryptoDevBackendVhost *vhost_crypto;
uint64_t session_id = 0;
int ret;
vhost_crypto = cryptodev_vhost_user_get_vhost(cc, backend, queue_index);
if (vhost_crypto) {
struct vhost_dev *dev = &(vhost_crypto->dev);
ret = dev->vhost_ops->vhost_crypto_create_session(dev,
sess_info,
&session_id);
if (ret < 0) {
return -1;
} else {
return session_id;
}
}
return -1;
}
static int cryptodev_vhost_user_sym_close_session(
CryptoDevBackend *backend,
uint64_t session_id,
uint32_t queue_index, Error **errp)
{
CryptoDevBackendClient *cc =
backend->conf.peers.ccs[queue_index];
CryptoDevBackendVhost *vhost_crypto;
int ret;
vhost_crypto = cryptodev_vhost_user_get_vhost(cc, backend, queue_index);
if (vhost_crypto) {
struct vhost_dev *dev = &(vhost_crypto->dev);
ret = dev->vhost_ops->vhost_crypto_close_session(dev,
session_id);
if (ret < 0) {
return -1;
} else {
return 0;
}
}
return -1;
}
static void cryptodev_vhost_user_cleanup(
CryptoDevBackend *backend,
Error **errp)
{
CryptoDevBackendVhostUser *s =
CRYPTODEV_BACKEND_VHOST_USER(backend);
size_t i;
int queues = backend->conf.peers.queues;
CryptoDevBackendClient *cc;
cryptodev_vhost_user_stop(queues, s);
for (i = 0; i < queues; i++) {
cc = backend->conf.peers.ccs[i];
if (cc) {
cryptodev_backend_free_client(cc);
backend->conf.peers.ccs[i] = NULL;
}
}
}
static void cryptodev_vhost_user_set_chardev(Object *obj,
const char *value, Error **errp)
{
CryptoDevBackendVhostUser *s =
CRYPTODEV_BACKEND_VHOST_USER(obj);
if (s->opened) {
error_setg(errp, QERR_PERMISSION_DENIED);
} else {
g_free(s->chr_name);
s->chr_name = g_strdup(value);
}
}
static char *
cryptodev_vhost_user_get_chardev(Object *obj, Error **errp)
{
CryptoDevBackendVhostUser *s =
CRYPTODEV_BACKEND_VHOST_USER(obj);
Chardev *chr = qemu_chr_fe_get_driver(&s->chr);
if (chr && chr->label) {
return g_strdup(chr->label);
}
return NULL;
}
static void cryptodev_vhost_user_instance_int(Object *obj)
{
object_property_add_str(obj, "chardev",
cryptodev_vhost_user_get_chardev,
cryptodev_vhost_user_set_chardev,
NULL);
}
static void cryptodev_vhost_user_finalize(Object *obj)
{
CryptoDevBackendVhostUser *s =
CRYPTODEV_BACKEND_VHOST_USER(obj);
qemu_chr_fe_deinit(&s->chr, false);
g_free(s->chr_name);
}
static void
cryptodev_vhost_user_class_init(ObjectClass *oc, void *data)
{
CryptoDevBackendClass *bc = CRYPTODEV_BACKEND_CLASS(oc);
bc->init = cryptodev_vhost_user_init;
bc->cleanup = cryptodev_vhost_user_cleanup;
bc->create_session = cryptodev_vhost_user_sym_create_session;
bc->close_session = cryptodev_vhost_user_sym_close_session;
bc->do_sym_op = NULL;
}
static const TypeInfo cryptodev_vhost_user_info = {
.name = TYPE_CRYPTODEV_BACKEND_VHOST_USER,
.parent = TYPE_CRYPTODEV_BACKEND,
.class_init = cryptodev_vhost_user_class_init,
.instance_init = cryptodev_vhost_user_instance_int,
.instance_finalize = cryptodev_vhost_user_finalize,
.instance_size = sizeof(CryptoDevBackendVhostUser),
};
static void
cryptodev_vhost_user_register_types(void)
{
type_register_static(&cryptodev_vhost_user_info);
}
type_init(cryptodev_vhost_user_register_types);

View File

@@ -1,347 +0,0 @@
/*
* QEMU Cryptodev backend for QEMU cipher APIs
*
* Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
*
* Authors:
* Gonglei <arei.gonglei@huawei.com>
* Jay Zhou <jianjay.zhou@huawei.com>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "hw/virtio/virtio-bus.h"
#include "sysemu/cryptodev-vhost.h"
#ifdef CONFIG_VHOST_CRYPTO
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "qemu/error-report.h"
#include "hw/virtio/virtio-crypto.h"
#include "sysemu/cryptodev-vhost-user.h"
uint64_t
cryptodev_vhost_get_max_queues(
CryptoDevBackendVhost *crypto)
{
return crypto->dev.max_queues;
}
void cryptodev_vhost_cleanup(CryptoDevBackendVhost *crypto)
{
vhost_dev_cleanup(&crypto->dev);
g_free(crypto);
}
struct CryptoDevBackendVhost *
cryptodev_vhost_init(
CryptoDevBackendVhostOptions *options)
{
int r;
CryptoDevBackendVhost *crypto;
crypto = g_new(CryptoDevBackendVhost, 1);
crypto->dev.max_queues = 1;
crypto->dev.nvqs = 1;
crypto->dev.vqs = crypto->vqs;
crypto->cc = options->cc;
crypto->dev.protocol_features = 0;
crypto->backend = -1;
/* vhost-user needs vq_index to initiate a specific queue pair */
crypto->dev.vq_index = crypto->cc->queue_index * crypto->dev.nvqs;
r = vhost_dev_init(&crypto->dev, options->opaque, options->backend_type, 0);
if (r < 0) {
goto fail;
}
return crypto;
fail:
g_free(crypto);
return NULL;
}
static int
cryptodev_vhost_start_one(CryptoDevBackendVhost *crypto,
VirtIODevice *dev)
{
int r;
crypto->dev.nvqs = 1;
crypto->dev.vqs = crypto->vqs;
r = vhost_dev_enable_notifiers(&crypto->dev, dev);
if (r < 0) {
goto fail_notifiers;
}
r = vhost_dev_start(&crypto->dev, dev);
if (r < 0) {
goto fail_start;
}
return 0;
fail_start:
vhost_dev_disable_notifiers(&crypto->dev, dev);
fail_notifiers:
return r;
}
static void
cryptodev_vhost_stop_one(CryptoDevBackendVhost *crypto,
VirtIODevice *dev)
{
vhost_dev_stop(&crypto->dev, dev);
vhost_dev_disable_notifiers(&crypto->dev, dev);
}
CryptoDevBackendVhost *
cryptodev_get_vhost(CryptoDevBackendClient *cc,
CryptoDevBackend *b,
uint16_t queue)
{
CryptoDevBackendVhost *vhost_crypto = NULL;
if (!cc) {
return NULL;
}
switch (cc->type) {
#if defined(CONFIG_VHOST_USER) && defined(CONFIG_LINUX)
case CRYPTODEV_BACKEND_TYPE_VHOST_USER:
vhost_crypto = cryptodev_vhost_user_get_vhost(cc, b, queue);
break;
#endif
default:
break;
}
return vhost_crypto;
}
static void
cryptodev_vhost_set_vq_index(CryptoDevBackendVhost *crypto,
int vq_index)
{
crypto->dev.vq_index = vq_index;
}
static int
vhost_set_vring_enable(CryptoDevBackendClient *cc,
CryptoDevBackend *b,
uint16_t queue, int enable)
{
CryptoDevBackendVhost *crypto =
cryptodev_get_vhost(cc, b, queue);
const VhostOps *vhost_ops;
cc->vring_enable = enable;
if (!crypto) {
return 0;
}
vhost_ops = crypto->dev.vhost_ops;
if (vhost_ops->vhost_set_vring_enable) {
return vhost_ops->vhost_set_vring_enable(&crypto->dev, enable);
}
return 0;
}
int cryptodev_vhost_start(VirtIODevice *dev, int total_queues)
{
VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(dev);
BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
VirtioBusState *vbus = VIRTIO_BUS(qbus);
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
int r, e;
int i;
CryptoDevBackend *b = vcrypto->cryptodev;
CryptoDevBackendVhost *vhost_crypto;
CryptoDevBackendClient *cc;
if (!k->set_guest_notifiers) {
error_report("binding does not support guest notifiers");
return -ENOSYS;
}
for (i = 0; i < total_queues; i++) {
cc = b->conf.peers.ccs[i];
vhost_crypto = cryptodev_get_vhost(cc, b, i);
cryptodev_vhost_set_vq_index(vhost_crypto, i);
/* Suppress the masking guest notifiers on vhost user
* because vhost user doesn't interrupt masking/unmasking
* properly.
*/
if (cc->type == CRYPTODEV_BACKEND_TYPE_VHOST_USER) {
dev->use_guest_notifier_mask = false;
}
}
r = k->set_guest_notifiers(qbus->parent, total_queues, true);
if (r < 0) {
error_report("error binding guest notifier: %d", -r);
goto err;
}
for (i = 0; i < total_queues; i++) {
cc = b->conf.peers.ccs[i];
vhost_crypto = cryptodev_get_vhost(cc, b, i);
r = cryptodev_vhost_start_one(vhost_crypto, dev);
if (r < 0) {
goto err_start;
}
if (cc->vring_enable) {
/* restore vring enable state */
r = vhost_set_vring_enable(cc, b, i, cc->vring_enable);
if (r < 0) {
goto err_start;
}
}
}
return 0;
err_start:
while (--i >= 0) {
cc = b->conf.peers.ccs[i];
vhost_crypto = cryptodev_get_vhost(cc, b, i);
cryptodev_vhost_stop_one(vhost_crypto, dev);
}
e = k->set_guest_notifiers(qbus->parent, total_queues, false);
if (e < 0) {
error_report("vhost guest notifier cleanup failed: %d", e);
}
err:
return r;
}
void cryptodev_vhost_stop(VirtIODevice *dev, int total_queues)
{
BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
VirtioBusState *vbus = VIRTIO_BUS(qbus);
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(dev);
CryptoDevBackend *b = vcrypto->cryptodev;
CryptoDevBackendVhost *vhost_crypto;
CryptoDevBackendClient *cc;
size_t i;
int r;
for (i = 0; i < total_queues; i++) {
cc = b->conf.peers.ccs[i];
vhost_crypto = cryptodev_get_vhost(cc, b, i);
cryptodev_vhost_stop_one(vhost_crypto, dev);
}
r = k->set_guest_notifiers(qbus->parent, total_queues, false);
if (r < 0) {
error_report("vhost guest notifier cleanup failed: %d", r);
}
assert(r >= 0);
}
void cryptodev_vhost_virtqueue_mask(VirtIODevice *dev,
int queue,
int idx, bool mask)
{
VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(dev);
CryptoDevBackend *b = vcrypto->cryptodev;
CryptoDevBackendVhost *vhost_crypto;
CryptoDevBackendClient *cc;
assert(queue < MAX_CRYPTO_QUEUE_NUM);
cc = b->conf.peers.ccs[queue];
vhost_crypto = cryptodev_get_vhost(cc, b, queue);
vhost_virtqueue_mask(&vhost_crypto->dev, dev, idx, mask);
}
bool cryptodev_vhost_virtqueue_pending(VirtIODevice *dev,
int queue, int idx)
{
VirtIOCrypto *vcrypto = VIRTIO_CRYPTO(dev);
CryptoDevBackend *b = vcrypto->cryptodev;
CryptoDevBackendVhost *vhost_crypto;
CryptoDevBackendClient *cc;
assert(queue < MAX_CRYPTO_QUEUE_NUM);
cc = b->conf.peers.ccs[queue];
vhost_crypto = cryptodev_get_vhost(cc, b, queue);
return vhost_virtqueue_pending(&vhost_crypto->dev, idx);
}
#else
uint64_t
cryptodev_vhost_get_max_queues(CryptoDevBackendVhost *crypto)
{
return 0;
}
void cryptodev_vhost_cleanup(CryptoDevBackendVhost *crypto)
{
}
struct CryptoDevBackendVhost *
cryptodev_vhost_init(CryptoDevBackendVhostOptions *options)
{
return NULL;
}
CryptoDevBackendVhost *
cryptodev_get_vhost(CryptoDevBackendClient *cc,
CryptoDevBackend *b,
uint16_t queue)
{
return NULL;
}
int cryptodev_vhost_start(VirtIODevice *dev, int total_queues)
{
return -1;
}
void cryptodev_vhost_stop(VirtIODevice *dev, int total_queues)
{
}
void cryptodev_vhost_virtqueue_mask(VirtIODevice *dev,
int queue,
int idx, bool mask)
{
}
bool cryptodev_vhost_virtqueue_pending(VirtIODevice *dev,
int queue, int idx)
{
return false;
}
#endif

View File

@@ -26,6 +26,8 @@
#include "hw/boards.h"
#include "qapi/error.h"
#include "qapi/visitor.h"
#include "qapi-types.h"
#include "qapi-visit.h"
#include "qemu/config-file.h"
#include "qom/object_interfaces.h"
#include "hw/virtio/virtio-crypto.h"
@@ -213,14 +215,14 @@ bool cryptodev_backend_is_ready(CryptoDevBackend *backend)
}
static bool
cryptodev_backend_can_be_deleted(UserCreatable *uc)
cryptodev_backend_can_be_deleted(UserCreatable *uc, Error **errp)
{
return !cryptodev_backend_is_used(CRYPTODEV_BACKEND(uc));
}
static void cryptodev_backend_instance_init(Object *obj)
{
object_property_add(obj, "queues", "uint32",
object_property_add(obj, "queues", "int",
cryptodev_backend_get_queues,
cryptodev_backend_set_queues,
NULL, NULL, NULL);

View File

@@ -31,9 +31,8 @@ typedef struct HostMemoryBackendFile HostMemoryBackendFile;
struct HostMemoryBackendFile {
HostMemoryBackend parent_obj;
bool discard_data;
bool share;
char *mem_path;
uint64_t align;
};
static void
@@ -52,13 +51,13 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
#ifndef CONFIG_LINUX
error_setg(errp, "-mem-path not supported on this host");
#else
if (!host_memory_backend_mr_inited(backend)) {
if (!memory_region_size(&backend->mr)) {
gchar *path;
backend->force_prealloc = mem_prealloc;
path = object_get_canonical_path(OBJECT(backend));
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
path,
backend->size, fb->align, backend->share,
backend->size, fb->share,
fb->mem_path, errp);
g_free(path);
}
@@ -77,7 +76,7 @@ static void set_mem_path(Object *o, const char *str, Error **errp)
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (host_memory_backend_mr_inited(backend)) {
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
@@ -85,62 +84,23 @@ static void set_mem_path(Object *o, const char *str, Error **errp)
fb->mem_path = g_strdup(str);
}
static bool file_memory_backend_get_discard_data(Object *o, Error **errp)
{
return MEMORY_BACKEND_FILE(o)->discard_data;
}
static void file_memory_backend_set_discard_data(Object *o, bool value,
Error **errp)
{
MEMORY_BACKEND_FILE(o)->discard_data = value;
}
static void file_memory_backend_get_align(Object *o, Visitor *v,
const char *name, void *opaque,
Error **errp)
static bool file_memory_backend_get_share(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
uint64_t val = fb->align;
visit_type_size(v, name, &val, errp);
return fb->share;
}
static void file_memory_backend_set_align(Object *o, Visitor *v,
const char *name, void *opaque,
Error **errp)
static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
Error *local_err = NULL;
uint64_t val;
if (host_memory_backend_mr_inited(backend)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
visit_type_size(v, name, &val, &local_err);
if (local_err) {
goto out;
}
fb->align = val;
out:
error_propagate(errp, local_err);
}
static void file_backend_unparent(Object *obj)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
if (host_memory_backend_mr_inited(backend) && fb->discard_data) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz, QEMU_MADV_REMOVE);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
fb->share = value;
}
static void
@@ -149,18 +109,13 @@ file_backend_class_init(ObjectClass *oc, void *data)
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = file_backend_memory_alloc;
oc->unparent = file_backend_unparent;
object_class_property_add_bool(oc, "discard-data",
file_memory_backend_get_discard_data, file_memory_backend_set_discard_data,
object_class_property_add_bool(oc, "share",
file_memory_backend_get_share, file_memory_backend_set_share,
&error_abort);
object_class_property_add_str(oc, "mem-path",
get_mem_path, set_mem_path,
&error_abort);
object_class_property_add(oc, "align", "int",
file_memory_backend_get_align,
file_memory_backend_set_align,
NULL, NULL, &error_abort);
}
static void file_backend_instance_finalize(Object *o)

View File

@@ -1,170 +0,0 @@
/*
* QEMU host memfd memory backend
*
* Copyright (C) 2018 Red Hat Inc
*
* Authors:
* Marc-André Lureau <marcandre.lureau@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/hostmem.h"
#include "sysemu/sysemu.h"
#include "qom/object_interfaces.h"
#include "qemu/memfd.h"
#include "qapi/error.h"
#define TYPE_MEMORY_BACKEND_MEMFD "memory-backend-memfd"
#define MEMORY_BACKEND_MEMFD(obj) \
OBJECT_CHECK(HostMemoryBackendMemfd, (obj), TYPE_MEMORY_BACKEND_MEMFD)
typedef struct HostMemoryBackendMemfd HostMemoryBackendMemfd;
struct HostMemoryBackendMemfd {
HostMemoryBackend parent_obj;
bool hugetlb;
uint64_t hugetlbsize;
bool seal;
};
static void
memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
HostMemoryBackendMemfd *m = MEMORY_BACKEND_MEMFD(backend);
char *name;
int fd;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
if (host_memory_backend_mr_inited(backend)) {
return;
}
backend->force_prealloc = mem_prealloc;
fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
m->hugetlb, m->hugetlbsize, m->seal ?
F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
errp);
if (fd == -1) {
return;
}
name = object_get_canonical_path(OBJECT(backend));
memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
name, backend->size, true, fd, errp);
g_free(name);
}
static bool
memfd_backend_get_hugetlb(Object *o, Error **errp)
{
return MEMORY_BACKEND_MEMFD(o)->hugetlb;
}
static void
memfd_backend_set_hugetlb(Object *o, bool value, Error **errp)
{
MEMORY_BACKEND_MEMFD(o)->hugetlb = value;
}
static void
memfd_backend_set_hugetlbsize(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
HostMemoryBackendMemfd *m = MEMORY_BACKEND_MEMFD(obj);
Error *local_err = NULL;
uint64_t value;
if (host_memory_backend_mr_inited(MEMORY_BACKEND(obj))) {
error_setg(&local_err, "cannot change property value");
goto out;
}
visit_type_size(v, name, &value, &local_err);
if (local_err) {
goto out;
}
if (!value) {
error_setg(&local_err, "Property '%s.%s' doesn't take value '%"
PRIu64 "'", object_get_typename(obj), name, value);
goto out;
}
m->hugetlbsize = value;
out:
error_propagate(errp, local_err);
}
static void
memfd_backend_get_hugetlbsize(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
HostMemoryBackendMemfd *m = MEMORY_BACKEND_MEMFD(obj);
uint64_t value = m->hugetlbsize;
visit_type_size(v, name, &value, errp);
}
static bool
memfd_backend_get_seal(Object *o, Error **errp)
{
return MEMORY_BACKEND_MEMFD(o)->seal;
}
static void
memfd_backend_set_seal(Object *o, bool value, Error **errp)
{
MEMORY_BACKEND_MEMFD(o)->seal = value;
}
static void
memfd_backend_instance_init(Object *obj)
{
HostMemoryBackendMemfd *m = MEMORY_BACKEND_MEMFD(obj);
/* default to sealed file */
m->seal = true;
}
static void
memfd_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = memfd_backend_memory_alloc;
object_class_property_add_bool(oc, "hugetlb",
memfd_backend_get_hugetlb,
memfd_backend_set_hugetlb,
&error_abort);
object_class_property_add(oc, "hugetlbsize", "int",
memfd_backend_get_hugetlbsize,
memfd_backend_set_hugetlbsize,
NULL, NULL, &error_abort);
object_class_property_add_bool(oc, "seal",
memfd_backend_get_seal,
memfd_backend_set_seal,
&error_abort);
}
static const TypeInfo memfd_backend_info = {
.name = TYPE_MEMORY_BACKEND_MEMFD,
.parent = TYPE_MEMORY_BACKEND,
.instance_init = memfd_backend_instance_init,
.class_init = memfd_backend_class_init,
.instance_size = sizeof(HostMemoryBackendMemfd),
};
static void register_types(void)
{
type_register_static(&memfd_backend_info);
}
type_init(register_types);

View File

@@ -28,8 +28,8 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
}
path = object_get_canonical_path_component(OBJECT(backend));
memory_region_init_ram_shared_nomigrate(&backend->mr, OBJECT(backend), path,
backend->size, backend->share, errp);
memory_region_init_ram(&backend->mr, OBJECT(backend), path,
backend->size, errp);
g_free(path);
}

View File

@@ -9,13 +9,13 @@
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "sysemu/hostmem.h"
#include "hw/boards.h"
#include "qapi/error.h"
#include "qapi/qapi-builtin-visit.h"
#include "qapi/visitor.h"
#include "qapi-types.h"
#include "qapi-visit.h"
#include "qemu/config-file.h"
#include "qom/object_interfaces.h"
@@ -45,7 +45,7 @@ host_memory_backend_set_size(Object *obj, Visitor *v, const char *name,
Error *local_err = NULL;
uint64_t value;
if (host_memory_backend_mr_inited(backend)) {
if (memory_region_size(&backend->mr)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
@@ -146,7 +146,7 @@ static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!host_memory_backend_mr_inited(backend)) {
if (!memory_region_size(&backend->mr)) {
backend->merge = value;
return;
}
@@ -172,7 +172,7 @@ static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!host_memory_backend_mr_inited(backend)) {
if (!memory_region_size(&backend->mr)) {
backend->dump = value;
return;
}
@@ -208,7 +208,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
}
}
if (!host_memory_backend_mr_inited(backend)) {
if (!memory_region_size(&backend->mr)) {
backend->prealloc = value;
return;
}
@@ -237,19 +237,10 @@ static void host_memory_backend_init(Object *obj)
backend->prealloc = mem_prealloc;
}
bool host_memory_backend_mr_inited(HostMemoryBackend *backend)
{
/*
* NOTE: We forbid zero-length memory backend, so here zero means
* "we haven't inited the backend memory region yet".
*/
return memory_region_size(&backend->mr) != 0;
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
return host_memory_backend_mr_inited(backend) ? &backend->mr : NULL;
return memory_region_size(&backend->mr) ? &backend->mr : NULL;
}
void host_memory_backend_set_mapped(HostMemoryBackend *backend, bool mapped)
@@ -304,7 +295,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
return;
} else if (maxnode == 0 && backend->policy != MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be set for policy %s",
HostMemPolicy_str(backend->policy));
HostMemPolicy_lookup[backend->policy]);
return;
}
@@ -342,7 +333,7 @@ out:
}
static bool
host_memory_backend_can_be_deleted(UserCreatable *uc)
host_memory_backend_can_be_deleted(UserCreatable *uc, Error **errp)
{
if (host_memory_backend_is_mapped(MEMORY_BACKEND(uc))) {
return false;
@@ -369,24 +360,6 @@ static void set_id(Object *o, const char *str, Error **errp)
backend->id = g_strdup(str);
}
static bool host_memory_backend_get_share(Object *o, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
return backend->share;
}
static void host_memory_backend_set_share(Object *o, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
if (host_memory_backend_mr_inited(backend)) {
error_setg(errp, "cannot change property value");
return;
}
backend->share = value;
}
static void
host_memory_backend_class_init(ObjectClass *oc, void *data)
{
@@ -413,13 +386,10 @@ host_memory_backend_class_init(ObjectClass *oc, void *data)
host_memory_backend_set_host_nodes,
NULL, NULL, &error_abort);
object_class_property_add_enum(oc, "policy", "HostMemPolicy",
&HostMemPolicy_lookup,
HostMemPolicy_lookup,
host_memory_backend_get_policy,
host_memory_backend_set_policy, &error_abort);
object_class_property_add_str(oc, "id", get_id, set_id, &error_abort);
object_class_property_add_bool(oc, "share",
host_memory_backend_get_share, host_memory_backend_set_share,
&error_abort);
}
static void host_memory_backend_finalize(Object *o)

View File

@@ -23,7 +23,7 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "chardev/char.h"
#include "sysemu/char.h"
#include "ui/console.h"
#include "ui/input.h"

View File

@@ -12,7 +12,7 @@
#include "qemu/osdep.h"
#include "sysemu/rng.h"
#include "chardev/char-fe.h"
#include "sysemu/char.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
@@ -106,7 +106,7 @@ static void rng_egd_opened(RngBackend *b, Error **errp)
/* FIXME we should resubmit pending requests when the CDS reconnects. */
qemu_chr_fe_set_handlers(&s->chr, rng_egd_chr_can_read,
rng_egd_chr_read, NULL, NULL, s, NULL, true);
rng_egd_chr_read, NULL, s, NULL, true);
}
static void rng_egd_set_chardev(Object *obj, const char *value, Error **errp)
@@ -145,7 +145,7 @@ static void rng_egd_finalize(Object *obj)
{
RngEgd *s = RNG_EGD(obj);
qemu_chr_fe_deinit(&s->chr, false);
qemu_chr_fe_deinit(&s->chr);
g_free(s->chr_name);
}

View File

@@ -25,7 +25,7 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "chardev/char.h"
#include "sysemu/char.h"
#define BUF_SIZE 32

View File

@@ -15,193 +15,184 @@
#include "qemu/osdep.h"
#include "sysemu/tpm_backend.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "sysemu/tpm.h"
#include "qemu/thread.h"
#include "qemu/main-loop.h"
#include "block/thread-pool.h"
#include "qemu/error-report.h"
static void tpm_backend_request_completed(void *opaque, int ret)
{
TPMBackend *s = TPM_BACKEND(opaque);
TPMIfClass *tic = TPM_IF_GET_CLASS(s->tpmif);
tic->request_completed(s->tpmif, ret);
/* no need for atomic, as long the BQL is taken */
s->cmd = NULL;
object_unref(OBJECT(s));
}
static int tpm_backend_worker_thread(gpointer data)
{
TPMBackend *s = TPM_BACKEND(data);
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
Error *err = NULL;
k->handle_request(s, s->cmd, &err);
if (err) {
error_report_err(err);
return -1;
}
return 0;
}
void tpm_backend_finish_sync(TPMBackend *s)
{
while (s->cmd) {
aio_poll(qemu_get_aio_context(), true);
}
}
#include "sysemu/tpm_backend_int.h"
enum TpmType tpm_backend_get_type(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->type;
return k->ops->type;
}
int tpm_backend_init(TPMBackend *s, TPMIf *tpmif, Error **errp)
const char *tpm_backend_get_desc(TPMBackend *s)
{
if (s->tpmif) {
error_setg(errp, "TPM backend '%s' is already initialized", s->id);
return -1;
}
s->tpmif = tpmif;
object_ref(OBJECT(tpmif));
s->had_startup_error = false;
return 0;
}
int tpm_backend_startup_tpm(TPMBackend *s, size_t buffersize)
{
int res = 0;
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
/* terminate a running TPM */
tpm_backend_finish_sync(s);
return k->ops->desc();
}
res = k->startup_tpm ? k->startup_tpm(s, buffersize) : 0;
void tpm_backend_destroy(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
s->had_startup_error = (res != 0);
k->ops->destroy(s);
}
return res;
int tpm_backend_init(TPMBackend *s, TPMState *state,
TPMRecvDataCB *datacb)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->init(s, state, datacb);
}
int tpm_backend_startup_tpm(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->startup_tpm(s);
}
bool tpm_backend_had_startup_error(TPMBackend *s)
{
return s->had_startup_error;
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->had_startup_error(s);
}
void tpm_backend_deliver_request(TPMBackend *s, TPMBackendCmd *cmd)
size_t tpm_backend_realloc_buffer(TPMBackend *s, TPMSizedBuffer *sb)
{
ThreadPool *pool = aio_get_thread_pool(qemu_get_aio_context());
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
if (s->cmd != NULL) {
error_report("There is a TPM request pending");
return;
}
return k->ops->realloc_buffer(sb);
}
s->cmd = cmd;
object_ref(OBJECT(s));
thread_pool_submit_aio(pool, tpm_backend_worker_thread, s,
tpm_backend_request_completed, s);
void tpm_backend_deliver_request(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->deliver_request(s);
}
void tpm_backend_reset(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
if (k->reset) {
k->reset(s);
}
tpm_backend_finish_sync(s);
s->had_startup_error = false;
k->ops->reset(s);
}
void tpm_backend_cancel_cmd(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->cancel_cmd(s);
k->ops->cancel_cmd(s);
}
bool tpm_backend_get_tpm_established_flag(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->get_tpm_established_flag ?
k->get_tpm_established_flag(s) : false;
return k->ops->get_tpm_established_flag(s);
}
int tpm_backend_reset_tpm_established_flag(TPMBackend *s, uint8_t locty)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->reset_tpm_established_flag ?
k->reset_tpm_established_flag(s, locty) : 0;
return k->ops->reset_tpm_established_flag(s, locty);
}
TPMVersion tpm_backend_get_tpm_version(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->get_tpm_version(s);
return k->ops->get_tpm_version(s);
}
size_t tpm_backend_get_buffer_size(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->get_buffer_size(s);
}
TPMInfo *tpm_backend_query_tpm(TPMBackend *s)
{
TPMInfo *info = g_new0(TPMInfo, 1);
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
TPMIfClass *tic = TPM_IF_GET_CLASS(s->tpmif);
info->id = g_strdup(s->id);
info->model = tic->model;
info->options = k->get_tpm_options(s);
return info;
}
static void tpm_backend_instance_finalize(Object *obj)
static bool tpm_backend_prop_get_opened(Object *obj, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
object_unref(OBJECT(s->tpmif));
g_free(s->id);
return s->opened;
}
void tpm_backend_open(TPMBackend *s, Error **errp)
{
object_property_set_bool(OBJECT(s), true, "opened", errp);
}
static void tpm_backend_prop_set_opened(Object *obj, bool value, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
Error *local_err = NULL;
if (value == s->opened) {
return;
}
if (!value && s->opened) {
error_setg(errp, QERR_PERMISSION_DENIED);
return;
}
if (k->opened) {
k->opened(s, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
s->opened = true;
}
static void tpm_backend_instance_init(Object *obj)
{
object_property_add_bool(obj, "opened",
tpm_backend_prop_get_opened,
tpm_backend_prop_set_opened,
NULL);
}
void tpm_backend_thread_deliver_request(TPMBackendThread *tbt)
{
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_PROCESS_CMD, NULL);
}
void tpm_backend_thread_create(TPMBackendThread *tbt,
GFunc func, gpointer user_data)
{
if (!tbt->pool) {
tbt->pool = g_thread_pool_new(func, user_data, 1, TRUE, NULL);
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_INIT, NULL);
}
}
void tpm_backend_thread_end(TPMBackendThread *tbt)
{
if (tbt->pool) {
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_END, NULL);
g_thread_pool_free(tbt->pool, FALSE, TRUE);
tbt->pool = NULL;
}
}
static const TypeInfo tpm_backend_info = {
.name = TYPE_TPM_BACKEND,
.parent = TYPE_OBJECT,
.instance_size = sizeof(TPMBackend),
.instance_finalize = tpm_backend_instance_finalize,
.instance_init = tpm_backend_instance_init,
.class_size = sizeof(TPMBackendClass),
.abstract = true,
};
static const TypeInfo tpm_if_info = {
.name = TYPE_TPM_IF,
.parent = TYPE_INTERFACE,
.class_size = sizeof(TPMIfClass),
};
static void register_types(void)
{
type_register_static(&tpm_backend_info);
type_register_static(&tpm_if_info);
}
type_init(register_types);

10
backends/trace-events Normal file
View File

@@ -0,0 +1,10 @@
# See docs/tracing.txt for syntax documentation.
# backends/wctablet.c
wct_init(void) ""
wct_cmd_re(void) ""
wct_cmd_st(void) ""
wct_cmd_sp(void) ""
wct_cmd_ts(int input) "0x%02x"
wct_cmd_other(const char *cmd) "%s"
wct_speed(int speed) "%d"

View File

@@ -25,10 +25,14 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "chardev/char-serial.h"
#include "sysemu/char.h"
#include "ui/console.h"
#include "ui/input.h"
#include "trace.h"

View File

@@ -30,9 +30,9 @@
#include "sysemu/kvm.h"
#include "sysemu/balloon.h"
#include "trace-root.h"
#include "qapi/error.h"
#include "qapi/qapi-commands-misc.h"
#include "qmp-commands.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qjson.h"
static QEMUBalloonEvent *balloon_event_fn;
static QEMUBalloonStatus *balloon_stat_fn;

1301
block.c

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,6 @@
block-obj-y += raw-format.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o dmg.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-bitmap.o
block-obj-y += qed.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
block-obj-y += quorum.o
@@ -9,9 +9,8 @@ block-obj-y += block-backend.o snapshot.o qapi.o
block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += file-posix.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
block-obj-y += null.o mirror.o commit.o io.o create.o
block-obj-y += null.o mirror.o commit.o io.o
block-obj-y += throttle-groups.o
block-obj-$(CONFIG_LINUX) += nvme.o
block-obj-y += nbd.o nbd-client.o sheepdog.o
block-obj-$(CONFIG_LIBISCSI) += iscsi.o
@@ -20,13 +19,13 @@ block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_VXHS) += vxhs.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
block-obj-y += accounting.o dirty-bitmap.o
block-obj-y += dictzip.o
block-obj-y += tar.o
block-obj-y += write-threshold.o
block-obj-y += backup.o
block-obj-$(CONFIG_REPLICATION) += replication.o
block-obj-y += throttle.o
block-obj-y += crypto.o
@@ -41,12 +40,9 @@ rbd.o-cflags := $(RBD_CFLAGS)
rbd.o-libs := $(RBD_LIBS)
gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
vxhs.o-libs := $(VXHS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
block-obj-$(if $(CONFIG_BZIP2),m,n) += dmg-bz2.o
dmg-bz2.o-libs := $(BZIP2_LIBS)
qcow.o-libs := -lz
linux-aio.o-libs := -laio
parallels.o-cflags := $(LIBXML2_CFLAGS)
parallels.o-libs := $(LIBXML2_LIBS)

View File

@@ -32,19 +32,15 @@
static QEMUClockType clock_type = QEMU_CLOCK_REALTIME;
static const int qtest_latency_ns = NANOSECONDS_PER_SECOND / 1000;
void block_acct_init(BlockAcctStats *stats)
{
qemu_mutex_init(&stats->lock);
if (qtest_enabled()) {
clock_type = QEMU_CLOCK_VIRTUAL;
}
}
void block_acct_setup(BlockAcctStats *stats, bool account_invalid,
bool account_failed)
void block_acct_init(BlockAcctStats *stats, bool account_invalid,
bool account_failed)
{
stats->account_invalid = account_invalid;
stats->account_failed = account_failed;
if (qtest_enabled()) {
clock_type = QEMU_CLOCK_VIRTUAL;
}
}
void block_acct_cleanup(BlockAcctStats *stats)
@@ -53,7 +49,6 @@ void block_acct_cleanup(BlockAcctStats *stats)
QSLIST_FOREACH_SAFE(s, &stats->intervals, entries, next) {
g_free(s);
}
qemu_mutex_destroy(&stats->lock);
}
void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length)
@@ -63,15 +58,12 @@ void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length)
s = g_new0(BlockAcctTimedStats, 1);
s->interval_length = interval_length;
s->stats = stats;
qemu_mutex_lock(&stats->lock);
QSLIST_INSERT_HEAD(&stats->intervals, s, entries);
for (i = 0; i < BLOCK_MAX_IOTYPE; i++) {
timed_average_init(&s->latency[i], clock_type,
(uint64_t) interval_length * NANOSECONDS_PER_SECOND);
}
qemu_mutex_unlock(&stats->lock);
}
BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats,
@@ -94,8 +86,7 @@ void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
cookie->type = type;
}
static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
bool failed)
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
BlockAcctTimedStats *s;
int64_t time_ns = qemu_clock_get_ns(clock_type);
@@ -107,16 +98,31 @@ static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
assert(cookie->type < BLOCK_MAX_IOTYPE);
qemu_mutex_lock(&stats->lock);
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
stats->total_time_ns[cookie->type] += latency_ns;
stats->last_access_time_ns = time_ns;
if (failed) {
stats->failed_ops[cookie->type]++;
} else {
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
QSLIST_FOREACH(s, &stats->intervals, entries) {
timed_average_account(&s->latency[cookie->type], latency_ns);
}
}
void block_acct_failed(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
assert(cookie->type < BLOCK_MAX_IOTYPE);
stats->failed_ops[cookie->type]++;
if (stats->account_failed) {
BlockAcctTimedStats *s;
int64_t time_ns = qemu_clock_get_ns(clock_type);
int64_t latency_ns = time_ns - cookie->start_time_ns;
if (qtest_enabled()) {
latency_ns = qtest_latency_ns;
}
if (!failed || stats->account_failed) {
stats->total_time_ns[cookie->type] += latency_ns;
stats->last_access_time_ns = time_ns;
@@ -124,45 +130,29 @@ static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
timed_average_account(&s->latency[cookie->type], latency_ns);
}
}
qemu_mutex_unlock(&stats->lock);
}
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
block_account_one_io(stats, cookie, false);
}
void block_acct_failed(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
block_account_one_io(stats, cookie, true);
}
void block_acct_invalid(BlockAcctStats *stats, enum BlockAcctType type)
{
assert(type < BLOCK_MAX_IOTYPE);
/* block_account_one_io() updates total_time_ns[], but this one does
* not. The reason is that invalid requests are accounted during their
* submission, therefore there's no actual I/O involved.
*/
qemu_mutex_lock(&stats->lock);
/* block_acct_done() and block_acct_failed() update
* total_time_ns[], but this one does not. The reason is that
* invalid requests are accounted during their submission,
* therefore there's no actual I/O involved. */
stats->invalid_ops[type]++;
if (stats->account_invalid) {
stats->last_access_time_ns = qemu_clock_get_ns(clock_type);
}
qemu_mutex_unlock(&stats->lock);
}
void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
int num_requests)
{
assert(type < BLOCK_MAX_IOTYPE);
qemu_mutex_lock(&stats->lock);
stats->merged[type] += num_requests;
qemu_mutex_unlock(&stats->lock);
}
int64_t block_acct_idle_time_ns(BlockAcctStats *stats)
@@ -177,9 +167,7 @@ double block_acct_queue_depth(BlockAcctTimedStats *stats,
assert(type < BLOCK_MAX_IOTYPE);
qemu_mutex_lock(&stats->stats->lock);
sum = timed_average_sum(&stats->latency[type], &elapsed);
qemu_mutex_unlock(&stats->stats->lock);
return (double) sum / elapsed;
}

View File

@@ -39,15 +39,20 @@ typedef struct BackupBlockJob {
BlockdevOnError on_source_error;
BlockdevOnError on_target_error;
CoRwlock flush_rwlock;
uint64_t bytes_read;
uint64_t sectors_read;
unsigned long *done_bitmap;
int64_t cluster_size;
bool compress;
NotifierWithReturn before_write;
QLIST_HEAD(, CowRequest) inflight_reqs;
HBitmap *copy_bitmap;
} BackupBlockJob;
/* Size of a cluster in sectors, instead of bytes. */
static inline int64_t cluster_size_sectors(BackupBlockJob *job)
{
return job->cluster_size / BDRV_SECTOR_SIZE;
}
/* See if in-flight requests overlap and wait for them to complete */
static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
int64_t start,
@@ -59,7 +64,7 @@ static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
do {
retry = false;
QLIST_FOREACH(req, &job->inflight_reqs, list) {
if (end > req->start_byte && start < req->end_byte) {
if (end > req->start && start < req->end) {
qemu_co_queue_wait(&req->wait_queue, NULL);
retry = true;
break;
@@ -70,10 +75,10 @@ static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
/* Keep track of an in-flight request */
static void cow_request_begin(CowRequest *req, BackupBlockJob *job,
int64_t start, int64_t end)
int64_t start, int64_t end)
{
req->start_byte = start;
req->end_byte = end;
req->start = start;
req->end = end;
qemu_co_queue_init(&req->wait_queue);
QLIST_INSERT_HEAD(&job->inflight_reqs, req, list);
}
@@ -86,7 +91,7 @@ static void cow_request_end(CowRequest *req)
}
static int coroutine_fn backup_do_cow(BackupBlockJob *job,
int64_t offset, uint64_t bytes,
int64_t sector_num, int nb_sectors,
bool *error_is_read,
bool is_write_notifier)
{
@@ -96,53 +101,55 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
QEMUIOVector bounce_qiov;
void *bounce_buffer = NULL;
int ret = 0;
int64_t start, end; /* bytes */
int n; /* bytes */
int64_t sectors_per_cluster = cluster_size_sectors(job);
int64_t start, end;
int n;
qemu_co_rwlock_rdlock(&job->flush_rwlock);
start = QEMU_ALIGN_DOWN(offset, job->cluster_size);
end = QEMU_ALIGN_UP(bytes + offset, job->cluster_size);
start = sector_num / sectors_per_cluster;
end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
trace_backup_do_cow_enter(job, start, offset, bytes);
trace_backup_do_cow_enter(job, start, sector_num, nb_sectors);
wait_for_overlapping_requests(job, start, end);
cow_request_begin(&cow_request, job, start, end);
for (; start < end; start += job->cluster_size) {
if (!hbitmap_get(job->copy_bitmap, start / job->cluster_size)) {
for (; start < end; start++) {
if (test_bit(start, job->done_bitmap)) {
trace_backup_do_cow_skip(job, start);
continue; /* already copied */
}
hbitmap_reset(job->copy_bitmap, start / job->cluster_size, 1);
trace_backup_do_cow_process(job, start);
n = MIN(job->cluster_size, job->common.len - start);
n = MIN(sectors_per_cluster,
job->common.len / BDRV_SECTOR_SIZE -
start * sectors_per_cluster);
if (!bounce_buffer) {
bounce_buffer = blk_blockalign(blk, job->cluster_size);
}
iov.iov_base = bounce_buffer;
iov.iov_len = n;
iov.iov_len = n * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&bounce_qiov, &iov, 1);
ret = blk_co_preadv(blk, start, bounce_qiov.size, &bounce_qiov,
ret = blk_co_preadv(blk, start * job->cluster_size,
bounce_qiov.size, &bounce_qiov,
is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
if (ret < 0) {
trace_backup_do_cow_read_fail(job, start, ret);
if (error_is_read) {
*error_is_read = true;
}
hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
goto out;
}
if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
ret = blk_co_pwrite_zeroes(job->target, start,
ret = blk_co_pwrite_zeroes(job->target, start * job->cluster_size,
bounce_qiov.size, BDRV_REQ_MAY_UNMAP);
} else {
ret = blk_co_pwritev(job->target, start,
ret = blk_co_pwritev(job->target, start * job->cluster_size,
bounce_qiov.size, &bounce_qiov,
job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
}
@@ -151,15 +158,16 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
if (error_is_read) {
*error_is_read = false;
}
hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
goto out;
}
set_bit(start, job->done_bitmap);
/* Publish progress, guest I/O counts as progress too. Note that the
* offset field is an opaque progress value, it is not a disk offset.
*/
job->bytes_read += n;
job->common.offset += n;
job->sectors_read += n;
job->common.offset += n * BDRV_SECTOR_SIZE;
}
out:
@@ -169,7 +177,7 @@ out:
cow_request_end(&cow_request);
trace_backup_do_cow_return(job, offset, bytes, ret);
trace_backup_do_cow_return(job, sector_num, nb_sectors, ret);
qemu_co_rwlock_unlock(&job->flush_rwlock);
@@ -182,12 +190,14 @@ static int coroutine_fn backup_before_write_notify(
{
BackupBlockJob *job = container_of(notifier, BackupBlockJob, before_write);
BdrvTrackedRequest *req = opaque;
int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
assert(req->bs == blk_bs(job->common.blk));
assert(QEMU_IS_ALIGNED(req->offset, BDRV_SECTOR_SIZE));
assert(QEMU_IS_ALIGNED(req->bytes, BDRV_SECTOR_SIZE));
assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
return backup_do_cow(job, req->offset, req->bytes, NULL, true);
return backup_do_cow(job, sector_num, nb_sectors, NULL, true);
}
static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -198,7 +208,7 @@ static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
error_setg(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
@@ -262,32 +272,35 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
}
len = DIV_ROUND_UP(backup_job->common.len, backup_job->cluster_size);
hbitmap_set(backup_job->copy_bitmap, 0, len);
bitmap_zero(backup_job->done_bitmap, len);
}
void backup_wait_for_overlapping_requests(BlockJob *job, int64_t offset,
uint64_t bytes)
void backup_wait_for_overlapping_requests(BlockJob *job, int64_t sector_num,
int nb_sectors)
{
BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
int64_t sectors_per_cluster = cluster_size_sectors(backup_job);
int64_t start, end;
assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
start = QEMU_ALIGN_DOWN(offset, backup_job->cluster_size);
end = QEMU_ALIGN_UP(offset + bytes, backup_job->cluster_size);
start = sector_num / sectors_per_cluster;
end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
wait_for_overlapping_requests(backup_job, start, end);
}
void backup_cow_request_begin(CowRequest *req, BlockJob *job,
int64_t offset, uint64_t bytes)
int64_t sector_num,
int nb_sectors)
{
BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
int64_t sectors_per_cluster = cluster_size_sectors(backup_job);
int64_t start, end;
assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
start = QEMU_ALIGN_DOWN(offset, backup_job->cluster_size);
end = QEMU_ALIGN_UP(offset + bytes, backup_job->cluster_size);
start = sector_num / sectors_per_cluster;
end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
cow_request_begin(req, backup_job, start, end);
}
@@ -346,11 +359,11 @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
*/
if (job->common.speed) {
uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
job->bytes_read);
job->bytes_read = 0;
block_job_sleep_ns(&job->common, delay_ns);
job->sectors_read);
job->sectors_read = 0;
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
} else {
block_job_sleep_ns(&job->common, 0);
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
}
if (block_job_is_cancelled(&job->common)) {
@@ -362,68 +375,65 @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
{
int ret;
bool error_is_read;
int ret = 0;
int clusters_per_iter;
uint32_t granularity;
int64_t sector;
int64_t cluster;
HBitmapIter hbi;
hbitmap_iter_init(&hbi, job->copy_bitmap, 0);
while ((cluster = hbitmap_iter_next(&hbi)) != -1) {
do {
if (yield_and_check(job)) {
return 0;
}
ret = backup_do_cow(job, cluster * job->cluster_size,
job->cluster_size, &error_is_read, false);
if (ret < 0 && backup_error_action(job, error_is_read, -ret) ==
BLOCK_ERROR_ACTION_REPORT)
{
return ret;
}
} while (ret < 0);
}
return 0;
}
/* init copy_bitmap from sync_bitmap */
static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
{
int64_t end;
int64_t last_cluster = -1;
int64_t sectors_per_cluster = cluster_size_sectors(job);
BdrvDirtyBitmapIter *dbi;
int64_t offset;
int64_t end = DIV_ROUND_UP(bdrv_dirty_bitmap_size(job->sync_bitmap),
job->cluster_size);
dbi = bdrv_dirty_iter_new(job->sync_bitmap);
while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
int64_t cluster = offset / job->cluster_size;
int64_t next_cluster;
granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
clusters_per_iter = MAX((granularity / job->cluster_size), 1);
dbi = bdrv_dirty_iter_new(job->sync_bitmap, 0);
offset += bdrv_dirty_bitmap_granularity(job->sync_bitmap);
if (offset >= bdrv_dirty_bitmap_size(job->sync_bitmap)) {
hbitmap_set(job->copy_bitmap, cluster, end - cluster);
break;
/* Find the next dirty sector(s) */
while ((sector = bdrv_dirty_iter_next(dbi)) != -1) {
cluster = sector / sectors_per_cluster;
/* Fake progress updates for any clusters we skipped */
if (cluster != last_cluster + 1) {
job->common.offset += ((cluster - last_cluster - 1) *
job->cluster_size);
}
offset = bdrv_dirty_bitmap_next_zero(job->sync_bitmap, offset);
if (offset == -1) {
hbitmap_set(job->copy_bitmap, cluster, end - cluster);
break;
for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
do {
if (yield_and_check(job)) {
goto out;
}
ret = backup_do_cow(job, cluster * sectors_per_cluster,
sectors_per_cluster, &error_is_read,
false);
if ((ret < 0) &&
backup_error_action(job, error_is_read, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
goto out;
}
} while (ret < 0);
}
next_cluster = DIV_ROUND_UP(offset, job->cluster_size);
hbitmap_set(job->copy_bitmap, cluster, next_cluster - cluster);
if (next_cluster >= end) {
break;
/* If the bitmap granularity is smaller than the backup granularity,
* we need to advance the iterator pointer to the next cluster. */
if (granularity < job->cluster_size) {
bdrv_set_dirty_iter(dbi, cluster * sectors_per_cluster);
}
bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
last_cluster = cluster - 1;
}
job->common.offset = job->common.len -
hbitmap_count(job->copy_bitmap) * job->cluster_size;
/* Play some final catchup with the progress meter */
end = DIV_ROUND_UP(job->common.len, job->cluster_size);
if (last_cluster + 1 < end) {
job->common.offset += ((end - last_cluster - 1) * job->cluster_size);
}
out:
bdrv_dirty_iter_free(dbi);
return ret;
}
static void coroutine_fn backup_run(void *opaque)
@@ -431,27 +441,22 @@ static void coroutine_fn backup_run(void *opaque)
BackupBlockJob *job = opaque;
BackupCompleteData *data;
BlockDriverState *bs = blk_bs(job->common.blk);
int64_t offset, nb_clusters;
int64_t start, end;
int64_t sectors_per_cluster = cluster_size_sectors(job);
int ret = 0;
QLIST_INIT(&job->inflight_reqs);
qemu_co_rwlock_init(&job->flush_rwlock);
nb_clusters = DIV_ROUND_UP(job->common.len, job->cluster_size);
job->copy_bitmap = hbitmap_alloc(nb_clusters, 0);
if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
backup_incremental_init_copy_bitmap(job);
} else {
hbitmap_set(job->copy_bitmap, 0, nb_clusters);
}
start = 0;
end = DIV_ROUND_UP(job->common.len, job->cluster_size);
job->done_bitmap = bitmap_new(end);
job->before_write.notify = backup_before_write_notify;
bdrv_add_before_write_notifier(bs, &job->before_write);
if (job->sync_mode == MIRROR_SYNC_MODE_NONE) {
/* All bits are set in copy_bitmap to allow any cluster to be copied.
* This does not actually require them to be copied. */
while (!block_job_is_cancelled(&job->common)) {
/* Yield until the job is cancelled. We just let our before_write
* notify callback service CoW requests. */
@@ -461,8 +466,7 @@ static void coroutine_fn backup_run(void *opaque)
ret = backup_run_incremental(job);
} else {
/* Both FULL and TOP SYNC_MODE's require copying.. */
for (offset = 0; offset < job->common.len;
offset += job->cluster_size) {
for (; start < end; start++) {
bool error_is_read;
int alloced = 0;
@@ -471,13 +475,12 @@ static void coroutine_fn backup_run(void *opaque)
}
if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
int i;
int64_t n;
int i, n;
/* Check to see if these blocks are already in the
* backing file. */
for (i = 0; i < job->cluster_size;) {
for (i = 0; i < sectors_per_cluster;) {
/* bdrv_is_allocated() only returns true/false based
* on the first set of sectors it comes across that
* are are all in the same state.
@@ -485,8 +488,9 @@ static void coroutine_fn backup_run(void *opaque)
* backup cluster length. We end up copying more than
* needed but at some point that is always the case. */
alloced =
bdrv_is_allocated(bs, offset + i,
job->cluster_size - i, &n);
bdrv_is_allocated(bs,
start * sectors_per_cluster + i,
sectors_per_cluster - i, &n);
i += n;
if (alloced || n == 0) {
@@ -504,8 +508,9 @@ static void coroutine_fn backup_run(void *opaque)
if (alloced < 0) {
ret = alloced;
} else {
ret = backup_do_cow(job, offset, job->cluster_size,
&error_is_read, false);
ret = backup_do_cow(job, start * sectors_per_cluster,
sectors_per_cluster, &error_is_read,
false);
}
if (ret < 0) {
/* Depending on error action, fail now or retry cluster */
@@ -514,7 +519,7 @@ static void coroutine_fn backup_run(void *opaque)
if (action == BLOCK_ERROR_ACTION_REPORT) {
break;
} else {
offset -= job->cluster_size;
start--;
continue;
}
}
@@ -526,7 +531,7 @@ static void coroutine_fn backup_run(void *opaque)
/* wait until pending backup_do_cow() calls have completed */
qemu_co_rwlock_wrlock(&job->flush_rwlock);
qemu_co_rwlock_unlock(&job->flush_rwlock);
hbitmap_free(job->copy_bitmap);
g_free(job->done_bitmap);
data = g_malloc(sizeof(*data));
data->ret = ret;
@@ -609,7 +614,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
error_setg(errp,
"a sync_bitmap was provided to backup_run, "
"but received an incompatible sync_mode (%s)",
MirrorSyncMode_str(sync_mode));
MirrorSyncMode_lookup[sync_mode]);
return NULL;
}
@@ -652,12 +657,12 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
ret = bdrv_get_info(target, &bdi);
if (ret == -ENOTSUP && !target->backing) {
/* Cluster size is not defined */
warn_report("The target block device doesn't provide "
"information about the block size and it doesn't have a "
"backing file. The default block size of %u bytes is "
"used. If the actual block size of the target exceeds "
"this default, the backup may be unusable",
BACKUP_CLUSTER_SIZE_DEFAULT);
error_report("WARNING: The target block device doesn't provide "
"information about the block size and it doesn't have a "
"backing file. The default block size of %u bytes is "
"used. If the actual block size of the target exceeds "
"this default, the backup may be unusable",
BACKUP_CLUSTER_SIZE_DEFAULT);
job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
} else if (ret < 0 && !target->backing) {
error_setg_errno(errp, -ret,
@@ -687,7 +692,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
}
if (job) {
backup_clean(&job->common);
block_job_early_fail(&job->common);
block_job_unref(&job->common);
}
return NULL;

View File

@@ -29,8 +29,9 @@
#include "qemu/config-file.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/option.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include "sysemu/qtest.h"
@@ -149,6 +150,20 @@ static QemuOptsList *config_groups[] = {
NULL
};
static int get_event_by_name(const char *name, BlkdebugEvent *event)
{
int i;
for (i = 0; i < BLKDBG__MAX; i++) {
if (!strcmp(BlkdebugEvent_lookup[i], name)) {
*event = i;
return 0;
}
}
return -1;
}
struct add_rule_data {
BDRVBlkdebugState *s;
int action;
@@ -159,7 +174,7 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
struct add_rule_data *d = opaque;
BDRVBlkdebugState *s = d->s;
const char* event_name;
int event;
BlkdebugEvent event;
struct BlkdebugRule *rule;
int64_t sector;
@@ -168,9 +183,8 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
if (!event_name) {
error_setg(errp, "Missing event name for rule");
return -1;
}
event = qapi_enum_parse(&BlkdebugEvent_lookup, event_name, -1, errp);
if (event < 0) {
} else if (get_event_by_name(event_name, &event) < 0) {
error_setg(errp, "Invalid event name \"%s\"", event_name);
return -1;
}
@@ -244,6 +258,7 @@ static int read_config(BDRVBlkdebugState *s, const char *filename,
ret = qemu_config_parse(f, config_groups, filename);
if (ret < 0) {
error_setg(errp, "Could not parse blkdebug config file");
ret = -EINVAL;
goto fail;
}
}
@@ -561,7 +576,7 @@ static int blkdebug_co_flush(BlockDriverState *bs)
}
static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int bytes,
int64_t offset, int count,
BdrvRequestFlags flags)
{
uint32_t align = MAX(bs->bl.request_alignment,
@@ -572,29 +587,29 @@ static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
* preferred alignment (so that we test the fallback to writes on
* unaligned portions), and check that the block layer never hands
* us anything unaligned that crosses an alignment boundary. */
if (bytes < align) {
if (count < align) {
assert(QEMU_IS_ALIGNED(offset, align) ||
QEMU_IS_ALIGNED(offset + bytes, align) ||
QEMU_IS_ALIGNED(offset + count, align) ||
DIV_ROUND_UP(offset, align) ==
DIV_ROUND_UP(offset + bytes, align));
DIV_ROUND_UP(offset + count, align));
return -ENOTSUP;
}
assert(QEMU_IS_ALIGNED(offset, align));
assert(QEMU_IS_ALIGNED(bytes, align));
assert(QEMU_IS_ALIGNED(count, align));
if (bs->bl.max_pwrite_zeroes) {
assert(bytes <= bs->bl.max_pwrite_zeroes);
assert(count <= bs->bl.max_pwrite_zeroes);
}
err = rule_check(bs, offset, bytes);
err = rule_check(bs, offset, count);
if (err) {
return err;
}
return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
}
static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
int64_t offset, int bytes)
int64_t offset, int count)
{
uint32_t align = bs->bl.pdiscard_alignment;
int err;
@@ -602,42 +617,29 @@ static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
/* Only pass through requests that are larger than requested
* minimum alignment, and ensure that unaligned requests do not
* cross optimum discard boundaries. */
if (bytes < bs->bl.request_alignment) {
if (count < bs->bl.request_alignment) {
assert(QEMU_IS_ALIGNED(offset, align) ||
QEMU_IS_ALIGNED(offset + bytes, align) ||
QEMU_IS_ALIGNED(offset + count, align) ||
DIV_ROUND_UP(offset, align) ==
DIV_ROUND_UP(offset + bytes, align));
DIV_ROUND_UP(offset + count, align));
return -ENOTSUP;
}
assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
if (align && bytes >= align) {
assert(QEMU_IS_ALIGNED(count, bs->bl.request_alignment));
if (align && count >= align) {
assert(QEMU_IS_ALIGNED(offset, align));
assert(QEMU_IS_ALIGNED(bytes, align));
assert(QEMU_IS_ALIGNED(count, align));
}
if (bs->bl.max_pdiscard) {
assert(bytes <= bs->bl.max_pdiscard);
assert(count <= bs->bl.max_pdiscard);
}
err = rule_check(bs, offset, bytes);
err = rule_check(bs, offset, count);
if (err) {
return err;
}
return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
}
static int coroutine_fn blkdebug_co_block_status(BlockDriverState *bs,
bool want_zero,
int64_t offset,
int64_t bytes,
int64_t *pnum,
int64_t *map,
BlockDriverState **file)
{
assert(QEMU_IS_ALIGNED(offset | bytes, bs->bl.request_alignment));
return bdrv_co_block_status_from_file(bs, want_zero, offset, bytes,
pnum, map, file);
return bdrv_co_pdiscard(bs->file->bs, offset, count);
}
static void blkdebug_close(BlockDriverState *bs)
@@ -732,13 +734,13 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
{
BDRVBlkdebugState *s = bs->opaque;
struct BlkdebugRule *rule;
int blkdebug_event;
BlkdebugEvent blkdebug_event;
blkdebug_event = qapi_enum_parse(&BlkdebugEvent_lookup, event, -1, NULL);
if (blkdebug_event < 0) {
if (get_event_by_name(event, &blkdebug_event) < 0) {
return -ENOENT;
}
rule = g_malloc(sizeof(*rule));
*rule = (struct BlkdebugRule) {
.event = blkdebug_event,
@@ -810,6 +812,11 @@ static int64_t blkdebug_getlength(BlockDriverState *bs)
return bdrv_getlength(bs->file->bs);
}
static int blkdebug_truncate(BlockDriverState *bs, int64_t offset)
{
return bdrv_truncate(bs->file, offset, NULL);
}
static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
{
BDRVBlkdebugState *s = bs->opaque;
@@ -892,7 +899,6 @@ static BlockDriver bdrv_blkdebug = {
.format_name = "blkdebug",
.protocol_name = "blkdebug",
.instance_size = sizeof(BDRVBlkdebugState),
.is_filter = true,
.bdrv_parse_filename = blkdebug_parse_filename,
.bdrv_file_open = blkdebug_open,
@@ -901,6 +907,7 @@ static BlockDriver bdrv_blkdebug = {
.bdrv_child_perm = bdrv_filter_default_perms,
.bdrv_getlength = blkdebug_getlength,
.bdrv_truncate = blkdebug_truncate,
.bdrv_refresh_filename = blkdebug_refresh_filename,
.bdrv_refresh_limits = blkdebug_refresh_limits,
@@ -909,7 +916,6 @@ static BlockDriver bdrv_blkdebug = {
.bdrv_co_flush_to_disk = blkdebug_co_flush,
.bdrv_co_pwrite_zeroes = blkdebug_co_pwrite_zeroes,
.bdrv_co_pdiscard = blkdebug_co_pdiscard,
.bdrv_co_block_status = blkdebug_co_block_status,
.bdrv_debug_event = blkdebug_debug_event,
.bdrv_debug_breakpoint = blkdebug_debug_breakpoint,

View File

@@ -96,10 +96,10 @@ static int coroutine_fn blkreplay_co_pwritev(BlockDriverState *bs,
}
static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int bytes, BdrvRequestFlags flags)
int64_t offset, int count, BdrvRequestFlags flags)
{
uint64_t reqid = blkreplay_next_id();
int ret = bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
int ret = bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
@@ -107,10 +107,10 @@ static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs,
}
static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs,
int64_t offset, int bytes)
int64_t offset, int count)
{
uint64_t reqid = blkreplay_next_id();
int ret = bdrv_co_pdiscard(bs->file->bs, offset, bytes);
int ret = bdrv_co_pdiscard(bs->file->bs, offset, count);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
@@ -129,9 +129,10 @@ static int coroutine_fn blkreplay_co_flush(BlockDriverState *bs)
static BlockDriver bdrv_blkreplay = {
.format_name = "blkreplay",
.protocol_name = "blkreplay",
.instance_size = 0,
.bdrv_open = blkreplay_open,
.bdrv_file_open = blkreplay_open,
.bdrv_close = blkreplay_close,
.bdrv_child_perm = bdrv_filter_default_perms,
.bdrv_getlength = blkreplay_getlength,

View File

@@ -14,7 +14,6 @@
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "qemu/cutils.h"
#include "qemu/option.h"
typedef struct {
BdrvChild *test_file;

View File

@@ -17,12 +17,9 @@
#include "block/throttle-groups.h"
#include "sysemu/blockdev.h"
#include "sysemu/sysemu.h"
#include "qapi/error.h"
#include "qapi/qapi-events-block.h"
#include "qapi-event.h"
#include "qemu/id.h"
#include "qemu/option.h"
#include "trace.h"
#include "migration/misc.h"
/* Number of coroutines to reserve per attached device model */
#define COROUTINE_POOL_RESERVATION 64
@@ -71,16 +68,6 @@ struct BlockBackend {
NotifierList remove_bs_notifiers, insert_bs_notifiers;
int quiesce_counter;
VMChangeStateEntry *vmsh;
bool force_allow_inactivate;
/* Number of in-flight aio requests. BlockDriverState also counts
* in-flight requests but aio requests can exist even when blk->root is
* NULL, so we cannot rely on its counter for that case.
* Accessed with atomic ops.
*/
unsigned int in_flight;
AioWait wait;
};
typedef struct BlockBackendAIOCB {
@@ -96,6 +83,7 @@ static const AIOCBInfo block_backend_aiocb_info = {
static void drive_info_del(DriveInfo *dinfo);
static BlockBackend *bdrv_first_blk(BlockDriverState *bs);
static char *blk_get_attached_dev_id(BlockBackend *blk);
/* All BlockBackends */
static QTAILQ_HEAD(, BlockBackend) block_backends =
@@ -142,111 +130,6 @@ static const char *blk_root_get_name(BdrvChild *child)
return blk_name(child->opaque);
}
static void blk_vm_state_changed(void *opaque, int running, RunState state)
{
Error *local_err = NULL;
BlockBackend *blk = opaque;
if (state == RUN_STATE_INMIGRATE) {
return;
}
qemu_del_vm_change_state_handler(blk->vmsh);
blk->vmsh = NULL;
blk_set_perm(blk, blk->perm, blk->shared_perm, &local_err);
if (local_err) {
error_report_err(local_err);
}
}
/*
* Notifies the user of the BlockBackend that migration has completed. qdev
* devices can tighten their permissions in response (specifically revoke
* shared write permissions that we needed for storage migration).
*
* If an error is returned, the VM cannot be allowed to be resumed.
*/
static void blk_root_activate(BdrvChild *child, Error **errp)
{
BlockBackend *blk = child->opaque;
Error *local_err = NULL;
if (!blk->disable_perm) {
return;
}
blk->disable_perm = false;
blk_set_perm(blk, blk->perm, BLK_PERM_ALL, &local_err);
if (local_err) {
error_propagate(errp, local_err);
blk->disable_perm = true;
return;
}
if (runstate_check(RUN_STATE_INMIGRATE)) {
/* Activation can happen when migration process is still active, for
* example when nbd_server_add is called during non-shared storage
* migration. Defer the shared_perm update to migration completion. */
if (!blk->vmsh) {
blk->vmsh = qemu_add_vm_change_state_handler(blk_vm_state_changed,
blk);
}
return;
}
blk_set_perm(blk, blk->perm, blk->shared_perm, &local_err);
if (local_err) {
error_propagate(errp, local_err);
blk->disable_perm = true;
return;
}
}
void blk_set_force_allow_inactivate(BlockBackend *blk)
{
blk->force_allow_inactivate = true;
}
static bool blk_can_inactivate(BlockBackend *blk)
{
/* If it is a guest device, inactivate is ok. */
if (blk->dev || blk_name(blk)[0]) {
return true;
}
/* Inactivating means no more writes to the image can be done,
* even if those writes would be changes invisible to the
* guest. For block job BBs that satisfy this, we can just allow
* it. This is the case for mirror job source, which is required
* by libvirt non-shared block migration. */
if (!(blk->perm & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED))) {
return true;
}
return blk->force_allow_inactivate;
}
static int blk_root_inactivate(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
if (blk->disable_perm) {
return 0;
}
if (!blk_can_inactivate(blk)) {
return -EPERM;
}
blk->disable_perm = true;
if (blk->root) {
bdrv_child_try_set_perm(blk->root, 0, BLK_PERM_ALL, &error_abort);
}
return 0;
}
static const BdrvChildRole child_root = {
.inherit_options = blk_root_inherit_options,
@@ -257,9 +140,6 @@ static const BdrvChildRole child_root = {
.drained_begin = blk_root_drained_begin,
.drained_end = blk_root_drained_end,
.activate = blk_root_activate,
.inactivate = blk_root_inactivate,
};
/*
@@ -283,7 +163,8 @@ BlockBackend *blk_new(uint64_t perm, uint64_t shared_perm)
blk->shared_perm = shared_perm;
blk_set_enable_write_cache(blk, true);
block_acct_init(&blk->stats);
qemu_co_queue_init(&blk->public.throttled_reqs[0]);
qemu_co_queue_init(&blk->public.throttled_reqs[1]);
notifier_list_init(&blk->remove_bs_notifiers);
notifier_list_init(&blk->insert_bs_notifiers);
@@ -309,7 +190,7 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
{
BlockBackend *blk;
BlockDriverState *bs;
uint64_t perm = 0;
uint64_t perm;
/* blk_new_open() is mainly used in .bdrv_create implementations and the
* tools where sharing isn't a concern because the BDS stays private, so we
@@ -319,11 +200,9 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
* caller of blk_new_open() doesn't make use of the permissions, but they
* shouldn't hurt either. We can still share everything here because the
* guest devices will add their own blockers if they can't share. */
if ((flags & BDRV_O_NO_IO) == 0) {
perm |= BLK_PERM_CONSISTENT_READ;
if (flags & BDRV_O_RDWR) {
perm |= BLK_PERM_WRITE;
}
perm = BLK_PERM_CONSISTENT_READ;
if (flags & BDRV_O_RDWR) {
perm |= BLK_PERM_WRITE;
}
if (flags & BDRV_O_RESIZE) {
perm |= BLK_PERM_RESIZE;
@@ -352,16 +231,12 @@ static void blk_delete(BlockBackend *blk)
assert(!blk->refcnt);
assert(!blk->name);
assert(!blk->dev);
if (blk->public.throttle_group_member.throttle_state) {
if (blk->public.throttle_state) {
blk_io_limits_disable(blk);
}
if (blk->root) {
blk_remove_bs(blk);
}
if (blk->vmsh) {
qemu_del_vm_change_state_handler(blk->vmsh);
blk->vmsh = NULL;
}
assert(QLIST_EMPTY(&blk->remove_bs_notifiers.notifiers));
assert(QLIST_EMPTY(&blk->insert_bs_notifiers.notifiers));
QTAILQ_REMOVE(&block_backends, blk, link);
@@ -413,7 +288,7 @@ void blk_unref(BlockBackend *blk)
* Behaves similarly to blk_next() but iterates over all BlockBackends, even the
* ones which are hidden (i.e. are not referenced by the monitor).
*/
BlockBackend *blk_all_next(BlockBackend *blk)
static BlockBackend *blk_all_next(BlockBackend *blk)
{
return blk ? QTAILQ_NEXT(blk, link)
: QTAILQ_FIRST(&block_backends);
@@ -454,37 +329,21 @@ BlockBackend *blk_next(BlockBackend *blk)
* the monitor or attached to a BlockBackend */
BlockDriverState *bdrv_next(BdrvNextIterator *it)
{
BlockDriverState *bs, *old_bs;
/* Must be called from the main loop */
assert(qemu_get_current_aio_context() == qemu_get_aio_context());
BlockDriverState *bs;
/* First, return all root nodes of BlockBackends. In order to avoid
* returning a BDS twice when multiple BBs refer to it, we only return it
* if the BB is the first one in the parent list of the BDS. */
if (it->phase == BDRV_NEXT_BACKEND_ROOTS) {
BlockBackend *old_blk = it->blk;
old_bs = old_blk ? blk_bs(old_blk) : NULL;
do {
it->blk = blk_all_next(it->blk);
bs = it->blk ? blk_bs(it->blk) : NULL;
} while (it->blk && (bs == NULL || bdrv_first_blk(bs) != it->blk));
if (it->blk) {
blk_ref(it->blk);
}
blk_unref(old_blk);
if (bs) {
bdrv_ref(bs);
bdrv_unref(old_bs);
return bs;
}
it->phase = BDRV_NEXT_MONITOR_OWNED;
} else {
old_bs = it->bs;
}
/* Then return the monitor-owned BDSes without a BB attached. Ignore all
@@ -495,46 +354,18 @@ BlockDriverState *bdrv_next(BdrvNextIterator *it)
bs = it->bs;
} while (bs && bdrv_has_blk(bs));
if (bs) {
bdrv_ref(bs);
}
bdrv_unref(old_bs);
return bs;
}
static void bdrv_next_reset(BdrvNextIterator *it)
{
*it = (BdrvNextIterator) {
.phase = BDRV_NEXT_BACKEND_ROOTS,
};
}
BlockDriverState *bdrv_first(BdrvNextIterator *it)
{
bdrv_next_reset(it);
*it = (BdrvNextIterator) {
.phase = BDRV_NEXT_BACKEND_ROOTS,
};
return bdrv_next(it);
}
/* Must be called when aborting a bdrv_next() iteration before
* bdrv_next() returns NULL */
void bdrv_next_cleanup(BdrvNextIterator *it)
{
/* Must be called from the main loop */
assert(qemu_get_current_aio_context() == qemu_get_aio_context());
if (it->phase == BDRV_NEXT_BACKEND_ROOTS) {
if (it->blk) {
bdrv_unref(blk_bs(it->blk));
blk_unref(it->blk);
}
} else {
bdrv_unref(it->bs);
}
bdrv_next_reset(it);
}
/*
* Add a BlockBackend into the list of backends referenced by the monitor, with
* the given @name acting as the handle for the monitor.
@@ -589,7 +420,7 @@ void monitor_remove_blk(BlockBackend *blk)
* Return @blk's name, a non-null string.
* Returns an empty string iff @blk is not referenced by the monitor.
*/
const char *blk_name(const BlockBackend *blk)
const char *blk_name(BlockBackend *blk)
{
return blk->name ?: "";
}
@@ -711,16 +542,9 @@ BlockBackend *blk_by_public(BlockBackendPublic *public)
*/
void blk_remove_bs(BlockBackend *blk)
{
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
BlockDriverState *bs;
notifier_list_notify(&blk->remove_bs_notifiers, blk);
if (tgm->throttle_state) {
bs = blk_bs(blk);
bdrv_drained_begin(bs);
throttle_group_detach_aio_context(tgm);
throttle_group_attach_aio_context(tgm, qemu_get_aio_context());
bdrv_drained_end(bs);
if (blk->public.throttle_state) {
throttle_timers_detach_aio_context(&blk->public.throttle_timers);
}
blk_update_root_state(blk);
@@ -734,7 +558,6 @@ void blk_remove_bs(BlockBackend *blk)
*/
int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp)
{
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
blk->root = bdrv_root_attach_child(bs, "root", &child_root,
blk->perm, blk->shared_perm, blk, errp);
if (blk->root == NULL) {
@@ -743,9 +566,9 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp)
bdrv_ref(bs);
notifier_list_notify(&blk->insert_bs_notifiers, blk);
if (tgm->throttle_state) {
throttle_group_detach_aio_context(tgm);
throttle_group_attach_aio_context(tgm, bdrv_get_aio_context(bs));
if (blk->public.throttle_state) {
throttle_timers_attach_aio_context(
&blk->public.throttle_timers, bdrv_get_aio_context(bs));
}
return 0;
@@ -778,6 +601,34 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm)
*shared_perm = blk->shared_perm;
}
/*
* Notifies the user of all BlockBackends that migration has completed. qdev
* devices can tighten their permissions in response (specifically revoke
* shared write permissions that we needed for storage migration).
*
* If an error is returned, the VM cannot be allowed to be resumed.
*/
void blk_resume_after_migration(Error **errp)
{
BlockBackend *blk;
Error *local_err = NULL;
for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) {
if (!blk->disable_perm) {
continue;
}
blk->disable_perm = false;
blk_set_perm(blk, blk->perm, blk->shared_perm, &local_err);
if (local_err) {
error_propagate(errp, local_err);
blk->disable_perm = true;
return;
}
}
}
static int blk_do_attach_dev(BlockBackend *blk, void *dev)
{
if (blk->dev) {
@@ -848,7 +699,7 @@ void *blk_get_attached_dev(BlockBackend *blk)
/* Return the qdev ID, or if no ID is assigned the QOM path, of the block
* device attached to the BlockBackend. */
char *blk_get_attached_dev_id(BlockBackend *blk)
static char *blk_get_attached_dev_id(BlockBackend *blk)
{
DeviceState *dev;
@@ -1107,9 +958,8 @@ int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
bdrv_inc_in_flight(bs);
/* throttling disk I/O */
if (blk->public.throttle_group_member.throttle_state) {
throttle_group_co_io_limits_intercept(&blk->public.throttle_group_member,
bytes, false);
if (blk->public.throttle_state) {
throttle_group_co_io_limits_intercept(blk, bytes, false);
}
ret = bdrv_co_preadv(blk->root, offset, bytes, qiov, flags);
@@ -1132,10 +982,10 @@ int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
}
bdrv_inc_in_flight(bs);
/* throttling disk I/O */
if (blk->public.throttle_group_member.throttle_state) {
throttle_group_co_io_limits_intercept(&blk->public.throttle_group_member,
bytes, true);
if (blk->public.throttle_state) {
throttle_group_co_io_limits_intercept(blk, bytes, true);
}
if (!blk->enable_write_cache) {
@@ -1150,7 +1000,7 @@ int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
typedef struct BlkRwCo {
BlockBackend *blk;
int64_t offset;
void *iobuf;
QEMUIOVector *qiov;
int ret;
BdrvRequestFlags flags;
} BlkRwCo;
@@ -1158,19 +1008,17 @@ typedef struct BlkRwCo {
static void blk_read_entry(void *opaque)
{
BlkRwCo *rwco = opaque;
QEMUIOVector *qiov = rwco->iobuf;
rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, qiov->size,
qiov, rwco->flags);
rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, rwco->qiov->size,
rwco->qiov, rwco->flags);
}
static void blk_write_entry(void *opaque)
{
BlkRwCo *rwco = opaque;
QEMUIOVector *qiov = rwco->iobuf;
rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, qiov->size,
qiov, rwco->flags);
rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, rwco->qiov->size,
rwco->qiov, rwco->flags);
}
static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
@@ -1190,7 +1038,7 @@ static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
rwco = (BlkRwCo) {
.blk = blk,
.offset = offset,
.iobuf = &qiov,
.qiov = &qiov,
.flags = flags,
.ret = NOT_DONE,
};
@@ -1224,9 +1072,9 @@ int blk_pread_unthrottled(BlockBackend *blk, int64_t offset, uint8_t *buf,
}
int blk_pwrite_zeroes(BlockBackend *blk, int64_t offset,
int bytes, BdrvRequestFlags flags)
int count, BdrvRequestFlags flags)
{
return blk_prw(blk, offset, NULL, bytes, blk_write_entry,
return blk_prw(blk, offset, NULL, count, blk_write_entry,
flags | BDRV_REQ_ZERO_WRITE);
}
@@ -1235,22 +1083,11 @@ int blk_make_zero(BlockBackend *blk, BdrvRequestFlags flags)
return bdrv_make_zero(blk->root, flags);
}
static void blk_inc_in_flight(BlockBackend *blk)
{
atomic_inc(&blk->in_flight);
}
static void blk_dec_in_flight(BlockBackend *blk)
{
atomic_dec(&blk->in_flight);
aio_wait_kick(&blk->wait);
}
static void error_callback_bh(void *opaque)
{
struct BlockBackendAIOCB *acb = opaque;
blk_dec_in_flight(acb->blk);
bdrv_dec_in_flight(acb->common.bs);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
}
@@ -1261,7 +1098,7 @@ BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
{
struct BlockBackendAIOCB *acb;
blk_inc_in_flight(blk);
bdrv_inc_in_flight(blk_bs(blk));
acb = blk_aio_get(&block_backend_aiocb_info, blk, cb, opaque);
acb->blk = blk;
acb->ret = ret;
@@ -1284,7 +1121,7 @@ static const AIOCBInfo blk_aio_em_aiocb_info = {
static void blk_aio_complete(BlkAioEmAIOCB *acb)
{
if (acb->has_returned) {
blk_dec_in_flight(acb->rwco.blk);
bdrv_dec_in_flight(acb->common.bs);
acb->common.cb(acb->common.opaque, acb->rwco.ret);
qemu_aio_unref(acb);
}
@@ -1298,19 +1135,19 @@ static void blk_aio_complete_bh(void *opaque)
}
static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
void *iobuf, CoroutineEntry co_entry,
QEMUIOVector *qiov, CoroutineEntry co_entry,
BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
{
BlkAioEmAIOCB *acb;
Coroutine *co;
blk_inc_in_flight(blk);
bdrv_inc_in_flight(blk_bs(blk));
acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
acb->rwco = (BlkRwCo) {
.blk = blk,
.offset = offset,
.iobuf = iobuf,
.qiov = qiov,
.flags = flags,
.ret = NOT_DONE,
};
@@ -1333,11 +1170,10 @@ static void blk_aio_read_entry(void *opaque)
{
BlkAioEmAIOCB *acb = opaque;
BlkRwCo *rwco = &acb->rwco;
QEMUIOVector *qiov = rwco->iobuf;
assert(qiov->size == acb->bytes);
assert(rwco->qiov->size == acb->bytes);
rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, acb->bytes,
qiov, rwco->flags);
rwco->qiov, rwco->flags);
blk_aio_complete(acb);
}
@@ -1345,11 +1181,10 @@ static void blk_aio_write_entry(void *opaque)
{
BlkAioEmAIOCB *acb = opaque;
BlkRwCo *rwco = &acb->rwco;
QEMUIOVector *qiov = rwco->iobuf;
assert(!qiov || qiov->size == acb->bytes);
assert(!rwco->qiov || rwco->qiov->size == acb->bytes);
rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, acb->bytes,
qiov, rwco->flags);
rwco->qiov, rwco->flags);
blk_aio_complete(acb);
}
@@ -1449,10 +1284,10 @@ static void blk_aio_pdiscard_entry(void *opaque)
}
BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk,
int64_t offset, int bytes,
int64_t offset, int count,
BlockCompletionFunc *cb, void *opaque)
{
return blk_aio_prwv(blk, offset, bytes, NULL, blk_aio_pdiscard_entry, 0,
return blk_aio_prwv(blk, offset, count, NULL, blk_aio_pdiscard_entry, 0,
cb, opaque);
}
@@ -1478,10 +1313,8 @@ int blk_co_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
static void blk_ioctl_entry(void *opaque)
{
BlkRwCo *rwco = opaque;
QEMUIOVector *qiov = rwco->iobuf;
rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
qiov->iov[0].iov_base);
rwco->qiov->iov[0].iov_base);
}
int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
@@ -1494,25 +1327,34 @@ static void blk_aio_ioctl_entry(void *opaque)
BlkAioEmAIOCB *acb = opaque;
BlkRwCo *rwco = &acb->rwco;
rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->iobuf);
rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
rwco->qiov->iov[0].iov_base);
blk_aio_complete(acb);
}
BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
BlockCompletionFunc *cb, void *opaque)
{
return blk_aio_prwv(blk, req, 0, buf, blk_aio_ioctl_entry, 0, cb, opaque);
QEMUIOVector qiov;
struct iovec iov;
iov = (struct iovec) {
.iov_base = buf,
.iov_len = 0,
};
qemu_iovec_init_external(&qiov, &iov, 1);
return blk_aio_prwv(blk, req, 0, &qiov, blk_aio_ioctl_entry, 0, cb, opaque);
}
int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int count)
{
int ret = blk_check_byte_request(blk, offset, bytes);
int ret = blk_check_byte_request(blk, offset, count);
if (ret < 0) {
return ret;
}
return bdrv_co_pdiscard(blk_bs(blk), offset, bytes);
return bdrv_co_pdiscard(blk_bs(blk), offset, count);
}
int blk_co_flush(BlockBackend *blk)
@@ -1537,41 +1379,14 @@ int blk_flush(BlockBackend *blk)
void blk_drain(BlockBackend *blk)
{
BlockDriverState *bs = blk_bs(blk);
if (bs) {
bdrv_drained_begin(bs);
}
/* We may have -ENOMEDIUM completions in flight */
AIO_WAIT_WHILE(&blk->wait,
blk_get_aio_context(blk),
atomic_mb_read(&blk->in_flight) > 0);
if (bs) {
bdrv_drained_end(bs);
if (blk_bs(blk)) {
bdrv_drain(blk_bs(blk));
}
}
void blk_drain_all(void)
{
BlockBackend *blk = NULL;
bdrv_drain_all_begin();
while ((blk = blk_all_next(blk)) != NULL) {
AioContext *ctx = blk_get_aio_context(blk);
aio_context_acquire(ctx);
/* We may have -ENOMEDIUM completions in flight */
AIO_WAIT_WHILE(&blk->wait, ctx,
atomic_mb_read(&blk->in_flight) > 0);
aio_context_release(ctx);
}
bdrv_drain_all_end();
bdrv_drain_all();
}
void blk_set_on_error(BlockBackend *blk, BlockdevOnError on_read_error,
@@ -1612,11 +1427,10 @@ static void send_qmp_error_event(BlockBackend *blk,
bool is_read, int error)
{
IoOperationType optype;
BlockDriverState *bs = blk_bs(blk);
optype = is_read ? IO_OPERATION_TYPE_READ : IO_OPERATION_TYPE_WRITE;
qapi_event_send_block_io_error(blk_name(blk), !!bs,
bs ? bdrv_get_node_name(bs) : NULL, optype,
qapi_event_send_block_io_error(blk_name(blk),
bdrv_get_node_name(blk_bs(blk)), optype,
action, blk_iostatus_is_enabled(blk),
error == ENOSPC, strerror(error),
&error_abort);
@@ -1840,16 +1654,16 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb)
void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
{
BlockDriverState *bs = blk_bs(blk);
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
if (bs) {
if (tgm->throttle_state) {
bdrv_drained_begin(bs);
throttle_group_detach_aio_context(tgm);
throttle_group_attach_aio_context(tgm, new_context);
bdrv_drained_end(bs);
if (blk->public.throttle_state) {
throttle_timers_detach_aio_context(&blk->public.throttle_timers);
}
bdrv_set_aio_context(bs, new_context);
if (blk->public.throttle_state) {
throttle_timers_attach_aio_context(&blk->public.throttle_timers,
new_context);
}
}
}
@@ -1919,9 +1733,9 @@ void *blk_aio_get(const AIOCBInfo *aiocb_info, BlockBackend *blk,
}
int coroutine_fn blk_co_pwrite_zeroes(BlockBackend *blk, int64_t offset,
int bytes, BdrvRequestFlags flags)
int count, BdrvRequestFlags flags)
{
return blk_co_pwritev(blk, offset, bytes, NULL,
return blk_co_pwritev(blk, offset, count, NULL,
flags | BDRV_REQ_ZERO_WRITE);
}
@@ -1932,28 +1746,32 @@ int blk_pwrite_compressed(BlockBackend *blk, int64_t offset, const void *buf,
BDRV_REQ_WRITE_COMPRESSED);
}
int blk_truncate(BlockBackend *blk, int64_t offset, PreallocMode prealloc,
Error **errp)
int blk_truncate(BlockBackend *blk, int64_t offset, Error **errp)
{
if (!blk_is_available(blk)) {
error_setg(errp, "No medium inserted");
return -ENOMEDIUM;
}
return bdrv_truncate(blk->root, offset, prealloc, errp);
return bdrv_truncate(blk->root, offset, errp);
}
void blk_legacy_resize_cb(BlockBackend *blk)
{
if (blk->legacy_dev) {
xen_blk_resize_cb(blk->dev);
}
}
static void blk_pdiscard_entry(void *opaque)
{
BlkRwCo *rwco = opaque;
QEMUIOVector *qiov = rwco->iobuf;
rwco->ret = blk_co_pdiscard(rwco->blk, rwco->offset, qiov->size);
rwco->ret = blk_co_pdiscard(rwco->blk, rwco->offset, rwco->qiov->size);
}
int blk_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
int blk_pdiscard(BlockBackend *blk, int64_t offset, int count)
{
return blk_prw(blk, offset, NULL, bytes, blk_pdiscard_entry, 0);
return blk_prw(blk, offset, NULL, count, blk_pdiscard_entry, 0);
}
int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
@@ -2069,41 +1887,33 @@ int blk_commit_all(void)
/* throttling disk I/O limits */
void blk_set_io_limits(BlockBackend *blk, ThrottleConfig *cfg)
{
throttle_group_config(&blk->public.throttle_group_member, cfg);
throttle_group_config(blk, cfg);
}
void blk_io_limits_disable(BlockBackend *blk)
{
BlockDriverState *bs = blk_bs(blk);
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
assert(tgm->throttle_state);
if (bs) {
bdrv_drained_begin(bs);
}
throttle_group_unregister_tgm(tgm);
if (bs) {
bdrv_drained_end(bs);
}
assert(blk->public.throttle_state);
bdrv_drained_begin(blk_bs(blk));
throttle_group_unregister_blk(blk);
bdrv_drained_end(blk_bs(blk));
}
/* should be called before blk_set_io_limits if a limit is set */
void blk_io_limits_enable(BlockBackend *blk, const char *group)
{
assert(!blk->public.throttle_group_member.throttle_state);
throttle_group_register_tgm(&blk->public.throttle_group_member,
group, blk_get_aio_context(blk));
assert(!blk->public.throttle_state);
throttle_group_register_blk(blk, group);
}
void blk_io_limits_update_group(BlockBackend *blk, const char *group)
{
/* this BB is not part of any group */
if (!blk->public.throttle_group_member.throttle_state) {
if (!blk->public.throttle_state) {
return;
}
/* this BB is a part of the same group than the one we want */
if (!g_strcmp0(throttle_group_get_name(&blk->public.throttle_group_member),
group)) {
if (!g_strcmp0(throttle_group_get_name(blk), group)) {
return;
}
@@ -2125,8 +1935,8 @@ static void blk_root_drained_begin(BdrvChild *child)
/* Note that blk->root may not be accessible here yet if we are just
* attaching to a BlockDriverState that is drained. Use child instead. */
if (atomic_fetch_inc(&blk->public.throttle_group_member.io_limits_disabled) == 0) {
throttle_group_restart_tgm(&blk->public.throttle_group_member);
if (blk->public.io_limits_disabled++ == 0) {
throttle_group_restart_blk(blk);
}
}
@@ -2135,8 +1945,8 @@ static void blk_root_drained_end(BdrvChild *child)
BlockBackend *blk = child->opaque;
assert(blk->quiesce_counter);
assert(blk->public.throttle_group_member.io_limits_disabled);
atomic_dec(&blk->public.throttle_group_member.io_limits_disabled);
assert(blk->public.io_limits_disabled);
--blk->public.io_limits_disabled;
if (--blk->quiesce_counter == 0) {
if (blk->dev_ops && blk->dev_ops->drained_end) {
@@ -2144,13 +1954,3 @@ static void blk_root_drained_end(BdrvChild *child)
}
}
}
void blk_register_buf(BlockBackend *blk, void *host, size_t size)
{
bdrv_register_buf(blk_bs(blk), host, size);
}
void blk_unregister_buf(BlockBackend *blk, void *host)
{
bdrv_unregister_buf(blk_bs(blk), host);
}

View File

@@ -28,7 +28,6 @@
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "qemu/error-report.h"
/**************************************************************/
@@ -111,16 +110,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
if (!bdrv_is_read_only(bs)) {
error_report("Opening bochs images without an explicit read-only=on "
"option is deprecated. Future versions will refuse to "
"open the image instead of automatically marking the "
"image read-only.");
ret = bdrv_set_read_only(bs, true, errp); /* no write support yet */
if (ret < 0) {
return ret;
}
}
bs->read_only = true; /* no write support yet */
ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
if (ret < 0) {

View File

@@ -23,7 +23,6 @@
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
@@ -73,16 +72,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
if (!bdrv_is_read_only(bs)) {
error_report("Opening cloop images without an explicit read-only=on "
"option is deprecated. Future versions will refuse to "
"open the image instead of automatically marking the "
"image read-only.");
ret = bdrv_set_read_only(bs, true, errp);
if (ret < 0) {
return ret;
}
}
bs->read_only = true;
/* read header */
ret = bdrv_pread(bs->file, 128, &s->block_size, 4);

View File

@@ -36,34 +36,37 @@ enum {
typedef struct CommitBlockJob {
BlockJob common;
RateLimit limit;
BlockDriverState *active;
BlockDriverState *commit_top_bs;
BlockBackend *top;
BlockBackend *base;
BlockdevOnError on_error;
int base_flags;
int orig_overlay_flags;
char *backing_file_str;
} CommitBlockJob;
static int coroutine_fn commit_populate(BlockBackend *bs, BlockBackend *base,
int64_t offset, uint64_t bytes,
int64_t sector_num, int nb_sectors,
void *buf)
{
int ret = 0;
QEMUIOVector qiov;
struct iovec iov = {
.iov_base = buf,
.iov_len = bytes,
.iov_len = nb_sectors * BDRV_SECTOR_SIZE,
};
assert(bytes < SIZE_MAX);
qemu_iovec_init_external(&qiov, &iov, 1);
ret = blk_co_preadv(bs, offset, qiov.size, &qiov, 0);
ret = blk_co_preadv(bs, sector_num * BDRV_SECTOR_SIZE,
qiov.size, &qiov, 0);
if (ret < 0) {
return ret;
}
ret = blk_co_pwritev(base, offset, qiov.size, &qiov, 0);
ret = blk_co_pwritev(base, sector_num * BDRV_SECTOR_SIZE,
qiov.size, &qiov, 0);
if (ret < 0) {
return ret;
}
@@ -79,15 +82,18 @@ static void commit_complete(BlockJob *job, void *opaque)
{
CommitBlockJob *s = container_of(job, CommitBlockJob, common);
CommitCompleteData *data = opaque;
BlockDriverState *active = s->active;
BlockDriverState *top = blk_bs(s->top);
BlockDriverState *base = blk_bs(s->base);
BlockDriverState *commit_top_bs = s->commit_top_bs;
BlockDriverState *overlay_bs = bdrv_find_overlay(active, s->commit_top_bs);
int ret = data->ret;
bool remove_commit_top_bs = false;
/* Make sure commit_top_bs and top stay around until bdrv_replace_node() */
/* Make sure overlay_bs and top stay around until bdrv_set_backing_hd() */
bdrv_ref(top);
bdrv_ref(commit_top_bs);
if (overlay_bs) {
bdrv_ref(overlay_bs);
}
/* Remove base node parent that still uses BLK_PERM_WRITE/RESIZE before
* the normal backing chain can be restored. */
@@ -95,9 +101,9 @@ static void commit_complete(BlockJob *job, void *opaque)
if (!block_job_is_cancelled(&s->common) && ret == 0) {
/* success */
ret = bdrv_drop_intermediate(s->commit_top_bs, base,
ret = bdrv_drop_intermediate(active, s->commit_top_bs, base,
s->backing_file_str);
} else {
} else if (overlay_bs) {
/* XXX Can (or should) we somehow keep 'consistent read' blocked even
* after the failed/cancelled commit job is gone? If we already wrote
* something to base, the intermediate images aren't valid any more. */
@@ -110,6 +116,9 @@ static void commit_complete(BlockJob *job, void *opaque)
if (s->base_flags != bdrv_get_flags(base)) {
bdrv_reopen(base, s->base_flags, NULL);
}
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
blk_unref(s->top);
@@ -126,13 +135,10 @@ static void commit_complete(BlockJob *job, void *opaque)
* filter driver from the backing chain. Do this as the final step so that
* the 'consistent read' permission can be granted. */
if (remove_commit_top_bs) {
bdrv_child_try_set_perm(commit_top_bs->backing, 0, BLK_PERM_ALL,
&error_abort);
bdrv_replace_node(commit_top_bs, backing_bs(commit_top_bs),
&error_abort);
bdrv_set_backing_hd(overlay_bs, top, &error_abort);
}
bdrv_unref(commit_top_bs);
bdrv_unref(overlay_bs);
bdrv_unref(top);
}
@@ -140,16 +146,17 @@ static void coroutine_fn commit_run(void *opaque)
{
CommitBlockJob *s = opaque;
CommitCompleteData *data;
int64_t offset;
int64_t sector_num, end;
uint64_t delay_ns = 0;
int ret = 0;
int64_t n = 0; /* bytes */
int n = 0;
void *buf = NULL;
int bytes_written = 0;
int64_t base_len;
ret = s->common.len = blk_getlength(s->top);
if (s->common.len < 0) {
goto out;
}
@@ -160,32 +167,35 @@ static void coroutine_fn commit_run(void *opaque)
}
if (base_len < s->common.len) {
ret = blk_truncate(s->base, s->common.len, PREALLOC_MODE_OFF, NULL);
ret = blk_truncate(s->base, s->common.len, NULL);
if (ret) {
goto out;
}
}
end = s->common.len >> BDRV_SECTOR_BITS;
buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
for (offset = 0; offset < s->common.len; offset += n) {
for (sector_num = 0; sector_num < end; sector_num += n) {
bool copy;
/* Note that even when no rate limit is applied we need to yield
* with no pending I/O here so that bdrv_drain_all() returns.
*/
block_job_sleep_ns(&s->common, delay_ns);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
/* Copy if allocated above the base */
ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base),
offset, COMMIT_BUFFER_SIZE, &n);
sector_num,
COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
&n);
copy = (ret == 1);
trace_commit_one_iteration(s, offset, n, ret);
trace_commit_one_iteration(s, sector_num, n, ret);
if (copy) {
ret = commit_populate(s->top, s->base, offset, n, buf);
bytes_written += n;
ret = commit_populate(s->top, s->base, sector_num, n, buf);
bytes_written += n * BDRV_SECTOR_SIZE;
}
if (ret < 0) {
BlockErrorAction action =
@@ -198,7 +208,7 @@ static void coroutine_fn commit_run(void *opaque)
}
}
/* Publish progress */
s->common.offset += n;
s->common.offset += n * BDRV_SECTOR_SIZE;
if (copy && s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, n);
@@ -223,7 +233,7 @@ static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
error_setg(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static const BlockJobDriver commit_job_driver = {
@@ -239,6 +249,16 @@ static int coroutine_fn bdrv_commit_top_preadv(BlockDriverState *bs,
return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
}
static int64_t coroutine_fn bdrv_commit_top_get_block_status(
BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
BlockDriverState **file)
{
*pnum = nb_sectors;
*file = bs->backing->bs;
return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
}
static void bdrv_commit_top_refresh_filename(BlockDriverState *bs, QDict *opts)
{
bdrv_refresh_filename(bs->backing->bs);
@@ -252,7 +272,6 @@ static void bdrv_commit_top_close(BlockDriverState *bs)
static void bdrv_commit_top_child_perm(BlockDriverState *bs, BdrvChild *c,
const BdrvChildRole *role,
BlockReopenQueue *reopen_queue,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
{
@@ -265,7 +284,7 @@ static void bdrv_commit_top_child_perm(BlockDriverState *bs, BdrvChild *c,
static BlockDriver bdrv_commit_top = {
.format_name = "commit_top",
.bdrv_co_preadv = bdrv_commit_top_preadv,
.bdrv_co_block_status = bdrv_co_block_status_from_backing,
.bdrv_co_get_block_status = bdrv_commit_top_get_block_status,
.bdrv_refresh_filename = bdrv_commit_top_refresh_filename,
.bdrv_close = bdrv_commit_top_close,
.bdrv_child_perm = bdrv_commit_top_child_perm,
@@ -277,8 +296,11 @@ void commit_start(const char *job_id, BlockDriverState *bs,
const char *filter_node_name, Error **errp)
{
CommitBlockJob *s;
BlockReopenQueue *reopen_queue = NULL;
int orig_overlay_flags;
int orig_base_flags;
BlockDriverState *iter;
BlockDriverState *overlay_bs;
BlockDriverState *commit_top_bs = NULL;
Error *local_err = NULL;
int ret;
@@ -289,16 +311,33 @@ void commit_start(const char *job_id, BlockDriverState *bs,
return;
}
overlay_bs = bdrv_find_overlay(bs, top);
if (overlay_bs == NULL) {
error_setg(errp, "Could not find overlay image for %s:", top->filename);
return;
}
s = block_job_create(job_id, &commit_job_driver, bs, 0, BLK_PERM_ALL,
speed, BLOCK_JOB_DEFAULT, NULL, NULL, errp);
if (!s) {
return;
}
/* convert base to r/w, if necessary */
orig_base_flags = bdrv_get_flags(base);
orig_base_flags = bdrv_get_flags(base);
orig_overlay_flags = bdrv_get_flags(overlay_bs);
/* convert base & overlay_bs to r/w, if necessary */
if (!(orig_base_flags & BDRV_O_RDWR)) {
bdrv_reopen(base, orig_base_flags | BDRV_O_RDWR, &local_err);
reopen_queue = bdrv_reopen_queue(reopen_queue, base, NULL,
orig_base_flags | BDRV_O_RDWR);
}
if (!(orig_overlay_flags & BDRV_O_RDWR)) {
reopen_queue = bdrv_reopen_queue(reopen_queue, overlay_bs, NULL,
orig_overlay_flags | BDRV_O_RDWR);
}
if (reopen_queue) {
bdrv_reopen_multiple(bdrv_get_aio_context(bs), reopen_queue, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
goto fail;
@@ -325,7 +364,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
error_propagate(errp, local_err);
goto fail;
}
bdrv_replace_node(top, commit_top_bs, &local_err);
bdrv_set_backing_hd(overlay_bs, commit_top_bs, &local_err);
if (local_err) {
bdrv_unref(commit_top_bs);
commit_top_bs = NULL;
@@ -357,6 +396,14 @@ void commit_start(const char *job_id, BlockDriverState *bs,
goto fail;
}
/* overlay_bs must be blocked because it needs to be modified to
* update the backing image string. */
ret = block_job_add_bdrv(&s->common, "overlay of top", overlay_bs,
BLK_PERM_GRAPH_MOD, BLK_PERM_ALL, errp);
if (ret < 0) {
goto fail;
}
s->base = blk_new(BLK_PERM_CONSISTENT_READ
| BLK_PERM_WRITE
| BLK_PERM_RESIZE,
@@ -375,8 +422,13 @@ void commit_start(const char *job_id, BlockDriverState *bs,
goto fail;
}
s->base_flags = orig_base_flags;
s->active = bs;
s->base_flags = orig_base_flags;
s->orig_overlay_flags = orig_overlay_flags;
s->backing_file_str = g_strdup(backing_file_str);
s->on_error = on_error;
trace_commit_start(bs, base, top, s);
@@ -391,13 +443,13 @@ fail:
blk_unref(s->top);
}
if (commit_top_bs) {
bdrv_replace_node(commit_top_bs, top, &error_abort);
bdrv_set_backing_hd(overlay_bs, top, &error_abort);
}
block_job_early_fail(&s->common);
block_job_unref(&s->common);
}
#define COMMIT_BUF_SIZE (2048 * BDRV_SECTOR_SIZE)
#define COMMIT_BUF_SECTORS 2048
/* commit COW file into the raw image */
int bdrv_commit(BlockDriverState *bs)
@@ -406,9 +458,8 @@ int bdrv_commit(BlockDriverState *bs)
BlockDriverState *backing_file_bs = NULL;
BlockDriverState *commit_top_bs = NULL;
BlockDriver *drv = bs->drv;
int64_t offset, length, backing_length;
int ro, open_flags;
int64_t n;
int64_t sector, total_sectors, length, backing_length;
int n, ro, open_flags;
int ret = 0;
uint8_t *buf = NULL;
Error *local_err = NULL;
@@ -479,33 +530,37 @@ int bdrv_commit(BlockDriverState *bs)
* grow the backing file image if possible. If not possible,
* we must return an error */
if (length > backing_length) {
ret = blk_truncate(backing, length, PREALLOC_MODE_OFF, &local_err);
ret = blk_truncate(backing, length, &local_err);
if (ret < 0) {
error_report_err(local_err);
goto ro_cleanup;
}
}
total_sectors = length >> BDRV_SECTOR_BITS;
/* blk_try_blockalign() for src will choose an alignment that works for
* backing as well, so no need to compare the alignment manually. */
buf = blk_try_blockalign(src, COMMIT_BUF_SIZE);
buf = blk_try_blockalign(src, COMMIT_BUF_SECTORS * BDRV_SECTOR_SIZE);
if (buf == NULL) {
ret = -ENOMEM;
goto ro_cleanup;
}
for (offset = 0; offset < length; offset += n) {
ret = bdrv_is_allocated(bs, offset, COMMIT_BUF_SIZE, &n);
for (sector = 0; sector < total_sectors; sector += n) {
ret = bdrv_is_allocated(bs, sector, COMMIT_BUF_SECTORS, &n);
if (ret < 0) {
goto ro_cleanup;
}
if (ret) {
ret = blk_pread(src, offset, buf, n);
ret = blk_pread(src, sector * BDRV_SECTOR_SIZE, buf,
n * BDRV_SECTOR_SIZE);
if (ret < 0) {
goto ro_cleanup;
}
ret = blk_pwrite(backing, offset, buf, n, 0);
ret = blk_pwrite(backing, sector * BDRV_SECTOR_SIZE, buf,
n * BDRV_SECTOR_SIZE, 0);
if (ret < 0) {
goto ro_cleanup;
}

View File

@@ -1,76 +0,0 @@
/*
* Block layer code related to image creation
*
* Copyright (c) 2018 Kevin Wolf <kwolf@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "block/block_int.h"
#include "qapi/qapi-commands-block-core.h"
#include "qapi/error.h"
typedef struct BlockdevCreateCo {
BlockDriver *drv;
BlockdevCreateOptions *opts;
int ret;
Error **errp;
} BlockdevCreateCo;
static void coroutine_fn bdrv_co_create_co_entry(void *opaque)
{
BlockdevCreateCo *cco = opaque;
cco->ret = cco->drv->bdrv_co_create(cco->opts, cco->errp);
}
void qmp_x_blockdev_create(BlockdevCreateOptions *options, Error **errp)
{
const char *fmt = BlockdevDriver_str(options->driver);
BlockDriver *drv = bdrv_find_format(fmt);
Coroutine *co;
BlockdevCreateCo cco;
/* If the driver is in the schema, we know that it exists. But it may not
* be whitelisted. */
assert(drv);
if (bdrv_uses_whitelist() && !bdrv_is_whitelisted(drv, false)) {
error_setg(errp, "Driver is not whitelisted");
return;
}
/* Call callback if it exists */
if (!drv->bdrv_co_create) {
error_setg(errp, "Driver does not support blockdev-create");
return;
}
cco = (BlockdevCreateCo) {
.drv = drv,
.opts = options,
.ret = -EINPROGRESS,
.errp = errp,
};
co = qemu_coroutine_create(bdrv_co_create_co_entry, &cco);
qemu_coroutine_enter(co);
while (cco.ret == -EINPROGRESS) {
aio_poll(qemu_get_aio_context(), true);
}
}

View File

@@ -24,12 +24,16 @@
#include "sysemu/block-backend.h"
#include "crypto/block.h"
#include "qapi/opts-visitor.h"
#include "qapi/qapi-visit-crypto.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi-visit.h"
#include "qapi/error.h"
#include "qemu/option.h"
#include "block/crypto.h"
#define BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET "key-secret"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG "cipher-alg"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE "cipher-mode"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG "ivgen-alg"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG "ivgen-hash-alg"
#define BLOCK_CRYPTO_OPT_LUKS_HASH_ALG "hash-alg"
#define BLOCK_CRYPTO_OPT_LUKS_ITER_TIME "iter-time"
typedef struct BlockCrypto BlockCrypto;
@@ -55,8 +59,8 @@ static ssize_t block_crypto_read_func(QCryptoBlock *block,
size_t offset,
uint8_t *buf,
size_t buflen,
void *opaque,
Error **errp)
Error **errp,
void *opaque)
{
BlockDriverState *bs = opaque;
ssize_t ret;
@@ -82,8 +86,8 @@ static ssize_t block_crypto_write_func(QCryptoBlock *block,
size_t offset,
const uint8_t *buf,
size_t buflen,
void *opaque,
Error **errp)
Error **errp,
void *opaque)
{
struct BlockCryptoCreateData *data = opaque;
ssize_t ret;
@@ -99,8 +103,8 @@ static ssize_t block_crypto_write_func(QCryptoBlock *block,
static ssize_t block_crypto_init_func(QCryptoBlock *block,
size_t headerlen,
void *opaque,
Error **errp)
Error **errp,
void *opaque)
{
struct BlockCryptoCreateData *data = opaque;
int ret;
@@ -131,7 +135,11 @@ static QemuOptsList block_crypto_runtime_opts_luks = {
.name = "crypto",
.head = QTAILQ_HEAD_INITIALIZER(block_crypto_runtime_opts_luks.head),
.desc = {
BLOCK_CRYPTO_OPT_DEF_LUKS_KEY_SECRET(""),
{
.name = BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of the secret that provides the encryption key",
},
{ /* end of list */ }
},
};
@@ -146,21 +154,49 @@ static QemuOptsList block_crypto_create_opts_luks = {
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
BLOCK_CRYPTO_OPT_DEF_LUKS_KEY_SECRET(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_ALG(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_MODE(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_ALG(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME(""),
{
.name = BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of the secret that provides the encryption key",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of encryption cipher algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE,
.type = QEMU_OPT_STRING,
.help = "Name of encryption cipher mode",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of IV generator algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of IV generator hash algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_HASH_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of encryption hash algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_ITER_TIME,
.type = QEMU_OPT_NUMBER,
.help = "Time to spend in PBKDF in milliseconds",
},
{ /* end of list */ }
},
};
QCryptoBlockOpenOptions *
static QCryptoBlockOpenOptions *
block_crypto_open_opts_init(QCryptoBlockFormat format,
QDict *opts,
QemuOpts *opts,
Error **errp)
{
Visitor *v;
@@ -170,7 +206,7 @@ block_crypto_open_opts_init(QCryptoBlockFormat format,
ret = g_new0(QCryptoBlockOpenOptions, 1);
ret->format = format;
v = qobject_input_visitor_new_keyval(QOBJECT(opts));
v = opts_visitor_new(opts);
visit_start_struct(v, NULL, NULL, 0, &local_err);
if (local_err) {
@@ -183,11 +219,6 @@ block_crypto_open_opts_init(QCryptoBlockFormat format,
v, &ret->u.luks, &local_err);
break;
case Q_CRYPTO_BLOCK_FORMAT_QCOW:
visit_type_QCryptoBlockOptionsQCow_members(
v, &ret->u.qcow, &local_err);
break;
default:
error_setg(&local_err, "Unsupported block format %d", format);
break;
@@ -209,9 +240,9 @@ block_crypto_open_opts_init(QCryptoBlockFormat format,
}
QCryptoBlockCreateOptions *
static QCryptoBlockCreateOptions *
block_crypto_create_opts_init(QCryptoBlockFormat format,
QDict *opts,
QemuOpts *opts,
Error **errp)
{
Visitor *v;
@@ -221,7 +252,7 @@ block_crypto_create_opts_init(QCryptoBlockFormat format,
ret = g_new0(QCryptoBlockCreateOptions, 1);
ret->format = format;
v = qobject_input_visitor_new_keyval(QOBJECT(opts));
v = opts_visitor_new(opts);
visit_start_struct(v, NULL, NULL, 0, &local_err);
if (local_err) {
@@ -234,11 +265,6 @@ block_crypto_create_opts_init(QCryptoBlockFormat format,
v, &ret->u.luks, &local_err);
break;
case Q_CRYPTO_BLOCK_FORMAT_QCOW:
visit_type_QCryptoBlockOptionsQCow_members(
v, &ret->u.qcow, &local_err);
break;
default:
error_setg(&local_err, "Unsupported block format %d", format);
break;
@@ -273,7 +299,6 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
int ret = -EINVAL;
QCryptoBlockOpenOptions *open_opts = NULL;
unsigned int cflags = 0;
QDict *cryptoopts = NULL;
bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file,
false, errp);
@@ -281,9 +306,6 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
return -EINVAL;
}
bs->supported_write_flags = BDRV_REQ_FUA &
bs->file->bs->supported_write_flags;
opts = qemu_opts_create(opts_spec, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
@@ -291,9 +313,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
goto cleanup;
}
cryptoopts = qemu_opts_to_qdict(opts, NULL);
open_opts = block_crypto_open_opts_init(format, cryptoopts, errp);
open_opts = block_crypto_open_opts_init(format, opts, errp);
if (!open_opts) {
goto cleanup;
}
@@ -301,7 +321,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
if (flags & BDRV_O_NO_IO) {
cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
}
crypto->block = qcrypto_block_open(open_opts, NULL,
crypto->block = qcrypto_block_open(open_opts,
block_crypto_read_func,
bs,
cflags,
@@ -313,10 +333,10 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
}
bs->encrypted = true;
bs->valid_key = true;
ret = 0;
cleanup:
QDECREF(cryptoopts);
qapi_free_QCryptoBlockOpenOptions(open_opts);
return ret;
}
@@ -336,16 +356,13 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
.opts = opts,
.filename = filename,
};
QDict *cryptoopts;
cryptoopts = qemu_opts_to_qdict(opts, NULL);
create_opts = block_crypto_create_opts_init(format, cryptoopts, errp);
create_opts = block_crypto_create_opts_init(format, opts, errp);
if (!create_opts) {
return -1;
}
crypto = qcrypto_block_create(create_opts, NULL,
crypto = qcrypto_block_create(create_opts,
block_crypto_init_func,
block_crypto_write_func,
&data,
@@ -358,24 +375,21 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
ret = 0;
cleanup:
QDECREF(cryptoopts);
qcrypto_block_free(crypto);
blk_unref(data.blk);
qapi_free_QCryptoBlockCreateOptions(create_opts);
return ret;
}
static int block_crypto_truncate(BlockDriverState *bs, int64_t offset,
PreallocMode prealloc, Error **errp)
static int block_crypto_truncate(BlockDriverState *bs, int64_t offset)
{
BlockCrypto *crypto = bs->opaque;
uint64_t payload_offset =
size_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block);
assert(payload_offset < (INT64_MAX - offset));
offset += payload_offset;
return bdrv_truncate(bs->file, offset, prealloc, errp);
return bdrv_truncate(bs->file, offset, NULL);
}
static void block_crypto_close(BlockDriverState *bs)
@@ -384,72 +398,67 @@ static void block_crypto_close(BlockDriverState *bs)
qcrypto_block_free(crypto->block);
}
static int block_crypto_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
{
/* nothing needs checking */
return 0;
}
/*
* 1 MB bounce buffer gives good performance / memory tradeoff
* when using cache=none|directsync.
*/
#define BLOCK_CRYPTO_MAX_IO_SIZE (1024 * 1024)
#define BLOCK_CRYPTO_MAX_SECTORS 32
static coroutine_fn int
block_crypto_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
block_crypto_co_readv(BlockDriverState *bs, int64_t sector_num,
int remaining_sectors, QEMUIOVector *qiov)
{
BlockCrypto *crypto = bs->opaque;
uint64_t cur_bytes; /* number of bytes in current iteration */
int cur_nr_sectors; /* number of sectors in current iteration */
uint64_t bytes_done = 0;
uint8_t *cipher_data = NULL;
QEMUIOVector hd_qiov;
int ret = 0;
uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
assert(!flags);
assert(payload_offset < INT64_MAX);
assert(QEMU_IS_ALIGNED(offset, sector_size));
assert(QEMU_IS_ALIGNED(bytes, sector_size));
size_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block) / 512;
qemu_iovec_init(&hd_qiov, qiov->niov);
/* Bounce buffer because we don't wish to expose cipher text
* in qiov which points to guest memory.
/* Bounce buffer so we have a linear mem region for
* entire sector. XXX optimize so we avoid bounce
* buffer in case that qiov->niov == 1
*/
cipher_data =
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_IO_SIZE,
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_SECTORS * 512,
qiov->size));
if (cipher_data == NULL) {
ret = -ENOMEM;
goto cleanup;
}
while (bytes) {
cur_bytes = MIN(bytes, BLOCK_CRYPTO_MAX_IO_SIZE);
while (remaining_sectors) {
cur_nr_sectors = remaining_sectors;
if (cur_nr_sectors > BLOCK_CRYPTO_MAX_SECTORS) {
cur_nr_sectors = BLOCK_CRYPTO_MAX_SECTORS;
}
qemu_iovec_reset(&hd_qiov);
qemu_iovec_add(&hd_qiov, cipher_data, cur_bytes);
qemu_iovec_add(&hd_qiov, cipher_data, cur_nr_sectors * 512);
ret = bdrv_co_preadv(bs->file, payload_offset + offset + bytes_done,
cur_bytes, &hd_qiov, 0);
ret = bdrv_co_readv(bs->file,
payload_offset + sector_num,
cur_nr_sectors, &hd_qiov);
if (ret < 0) {
goto cleanup;
}
if (qcrypto_block_decrypt(crypto->block, offset + bytes_done,
cipher_data, cur_bytes, NULL) < 0) {
if (qcrypto_block_decrypt(crypto->block,
sector_num,
cipher_data, cur_nr_sectors * 512,
NULL) < 0) {
ret = -EIO;
goto cleanup;
}
qemu_iovec_from_buf(qiov, bytes_done, cipher_data, cur_bytes);
qemu_iovec_from_buf(qiov, bytes_done,
cipher_data, cur_nr_sectors * 512);
bytes -= cur_bytes;
bytes_done += cur_bytes;
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
}
cleanup:
@@ -461,58 +470,63 @@ block_crypto_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
static coroutine_fn int
block_crypto_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
block_crypto_co_writev(BlockDriverState *bs, int64_t sector_num,
int remaining_sectors, QEMUIOVector *qiov)
{
BlockCrypto *crypto = bs->opaque;
uint64_t cur_bytes; /* number of bytes in current iteration */
int cur_nr_sectors; /* number of sectors in current iteration */
uint64_t bytes_done = 0;
uint8_t *cipher_data = NULL;
QEMUIOVector hd_qiov;
int ret = 0;
uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
assert(!(flags & ~BDRV_REQ_FUA));
assert(payload_offset < INT64_MAX);
assert(QEMU_IS_ALIGNED(offset, sector_size));
assert(QEMU_IS_ALIGNED(bytes, sector_size));
size_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block) / 512;
qemu_iovec_init(&hd_qiov, qiov->niov);
/* Bounce buffer because we're not permitted to touch
* contents of qiov - it points to guest memory.
/* Bounce buffer so we have a linear mem region for
* entire sector. XXX optimize so we avoid bounce
* buffer in case that qiov->niov == 1
*/
cipher_data =
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_IO_SIZE,
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_SECTORS * 512,
qiov->size));
if (cipher_data == NULL) {
ret = -ENOMEM;
goto cleanup;
}
while (bytes) {
cur_bytes = MIN(bytes, BLOCK_CRYPTO_MAX_IO_SIZE);
while (remaining_sectors) {
cur_nr_sectors = remaining_sectors;
qemu_iovec_to_buf(qiov, bytes_done, cipher_data, cur_bytes);
if (cur_nr_sectors > BLOCK_CRYPTO_MAX_SECTORS) {
cur_nr_sectors = BLOCK_CRYPTO_MAX_SECTORS;
}
if (qcrypto_block_encrypt(crypto->block, offset + bytes_done,
cipher_data, cur_bytes, NULL) < 0) {
qemu_iovec_to_buf(qiov, bytes_done,
cipher_data, cur_nr_sectors * 512);
if (qcrypto_block_encrypt(crypto->block,
sector_num,
cipher_data, cur_nr_sectors * 512,
NULL) < 0) {
ret = -EIO;
goto cleanup;
}
qemu_iovec_reset(&hd_qiov);
qemu_iovec_add(&hd_qiov, cipher_data, cur_bytes);
qemu_iovec_add(&hd_qiov, cipher_data, cur_nr_sectors * 512);
ret = bdrv_co_pwritev(bs->file, payload_offset + offset + bytes_done,
cur_bytes, &hd_qiov, flags);
ret = bdrv_co_writev(bs->file,
payload_offset + sector_num,
cur_nr_sectors, &hd_qiov);
if (ret < 0) {
goto cleanup;
}
bytes -= cur_bytes;
bytes_done += cur_bytes;
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
}
cleanup:
@@ -522,22 +536,13 @@ block_crypto_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
return ret;
}
static void block_crypto_refresh_limits(BlockDriverState *bs, Error **errp)
{
BlockCrypto *crypto = bs->opaque;
uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
bs->bl.request_alignment = sector_size; /* No sub-sector I/O */
}
static int64_t block_crypto_getlength(BlockDriverState *bs)
{
BlockCrypto *crypto = bs->opaque;
int64_t len = bdrv_getlength(bs->file->bs);
uint64_t offset = qcrypto_block_get_payload_offset(crypto->block);
assert(offset < INT64_MAX);
assert(offset < len);
ssize_t offset = qcrypto_block_get_payload_offset(crypto->block);
len -= offset;
@@ -562,9 +567,9 @@ static int block_crypto_open_luks(BlockDriverState *bs,
bs, options, flags, errp);
}
static int coroutine_fn block_crypto_co_create_opts_luks(const char *filename,
QemuOpts *opts,
Error **errp)
static int block_crypto_create_luks(const char *filename,
QemuOpts *opts,
Error **errp)
{
return block_crypto_create_generic(Q_CRYPTO_BLOCK_FORMAT_LUKS,
filename, opts, errp);
@@ -582,6 +587,7 @@ static int block_crypto_get_info_luks(BlockDriverState *bs,
}
bdi->unallocated_blocks_are_zero = false;
bdi->can_write_zeroes_with_unmap = false;
bdi->cluster_size = subbdi.cluster_size;
return 0;
@@ -623,14 +629,12 @@ BlockDriver bdrv_crypto_luks = {
.bdrv_open = block_crypto_open_luks,
.bdrv_close = block_crypto_close,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_create_opts = block_crypto_co_create_opts_luks,
.bdrv_create = block_crypto_create_luks,
.bdrv_truncate = block_crypto_truncate,
.create_opts = &block_crypto_create_opts_luks,
.bdrv_reopen_prepare = block_crypto_reopen_prepare,
.bdrv_refresh_limits = block_crypto_refresh_limits,
.bdrv_co_preadv = block_crypto_co_preadv,
.bdrv_co_pwritev = block_crypto_co_pwritev,
.bdrv_co_readv = block_crypto_co_readv,
.bdrv_co_writev = block_crypto_co_writev,
.bdrv_getlength = block_crypto_getlength,
.bdrv_get_info = block_crypto_get_info_luks,
.bdrv_get_specific_info = block_crypto_get_specific_info_luks,

View File

@@ -1,101 +0,0 @@
/*
* QEMU block full disk encryption
*
* Copyright (c) 2015-2017 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#ifndef BLOCK_CRYPTO_H__
#define BLOCK_CRYPTO_H__
#define BLOCK_CRYPTO_OPT_DEF_KEY_SECRET(prefix, helpstr) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_QCOW_KEY_SECRET, \
.type = QEMU_OPT_STRING, \
.help = helpstr, \
}
#define BLOCK_CRYPTO_OPT_QCOW_KEY_SECRET "key-secret"
#define BLOCK_CRYPTO_OPT_DEF_QCOW_KEY_SECRET(prefix) \
BLOCK_CRYPTO_OPT_DEF_KEY_SECRET(prefix, \
"ID of the secret that provides the AES encryption key")
#define BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET "key-secret"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG "cipher-alg"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE "cipher-mode"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG "ivgen-alg"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG "ivgen-hash-alg"
#define BLOCK_CRYPTO_OPT_LUKS_HASH_ALG "hash-alg"
#define BLOCK_CRYPTO_OPT_LUKS_ITER_TIME "iter-time"
#define BLOCK_CRYPTO_OPT_DEF_LUKS_KEY_SECRET(prefix) \
BLOCK_CRYPTO_OPT_DEF_KEY_SECRET(prefix, \
"ID of the secret that provides the keyslot passphrase")
#define BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_ALG(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG, \
.type = QEMU_OPT_STRING, \
.help = "Name of encryption cipher algorithm", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_MODE(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE, \
.type = QEMU_OPT_STRING, \
.help = "Name of encryption cipher mode", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_ALG(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG, \
.type = QEMU_OPT_STRING, \
.help = "Name of IV generator algorithm", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG, \
.type = QEMU_OPT_STRING, \
.help = "Name of IV generator hash algorithm", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_HASH_ALG, \
.type = QEMU_OPT_STRING, \
.help = "Name of encryption hash algorithm", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_ITER_TIME, \
.type = QEMU_OPT_NUMBER, \
.help = "Time to spend in PBKDF in milliseconds", \
}
QCryptoBlockCreateOptions *
block_crypto_create_opts_init(QCryptoBlockFormat format,
QDict *opts,
Error **errp);
QCryptoBlockOpenOptions *
block_crypto_open_opts_init(QCryptoBlockFormat format,
QDict *opts,
Error **errp);
#endif /* BLOCK_CRYPTO_H__ */

View File

@@ -21,13 +21,12 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "qemu/error-report.h"
#include "qemu/option.h"
#include "block/block_int.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qstring.h"
#include "crypto/secret.h"
#include <curl/curl.h>
@@ -77,12 +76,15 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_TIMEOUT_DEFAULT 5
#define CURL_TIMEOUT_MAX 10000
#define FIND_RET_NONE 0
#define FIND_RET_OK 1
#define FIND_RET_WAIT 2
#define CURL_BLOCK_OPT_URL "url"
#define CURL_BLOCK_OPT_READAHEAD "readahead"
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
#define CURL_BLOCK_OPT_COOKIE_SECRET "cookie-secret"
#define CURL_BLOCK_OPT_USERNAME "username"
#define CURL_BLOCK_OPT_PASSWORD_SECRET "password-secret"
#define CURL_BLOCK_OPT_PROXY_USERNAME "proxy-username"
@@ -90,15 +92,12 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
struct BDRVCURLState;
static bool libcurl_initialized;
typedef struct CURLAIOCB {
Coroutine *co;
BlockAIOCB common;
QEMUIOVector *qiov;
uint64_t offset;
uint64_t bytes;
int ret;
int64_t sector_num;
int nb_sectors;
size_t start;
size_t end;
@@ -116,7 +115,7 @@ typedef struct CURLState
CURL *curl;
QLIST_HEAD(, CURLSocket) sockets;
char *orig_buf;
uint64_t buf_start;
size_t buf_start;
size_t buf_off;
size_t buf_len;
char range[128];
@@ -127,7 +126,7 @@ typedef struct CURLState
typedef struct BDRVCURLState {
CURLM *multi;
QEMUTimer timer;
uint64_t len;
size_t len;
CURLState states[CURL_NUM_STATES];
char *url;
size_t readahead_size;
@@ -137,7 +136,6 @@ typedef struct BDRVCURLState {
bool accept_range;
AioContext *aio_context;
QemuMutex mutex;
CoQueue free_state_waitq;
char *username;
char *password;
char *proxyusername;
@@ -259,7 +257,7 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
continue;
if ((s->buf_off >= acb->end)) {
size_t request_length = acb->bytes;
size_t request_length = acb->nb_sectors * BDRV_SECTOR_SIZE;
qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
acb->end - acb->start);
@@ -270,11 +268,11 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
request_length - offset);
}
acb->ret = 0;
s->acb[i] = NULL;
qemu_mutex_unlock(&s->s->mutex);
aio_co_wake(acb->co);
acb->common.cb(acb->common.opaque, 0);
qemu_mutex_lock(&s->s->mutex);
qemu_aio_unref(acb);
s->acb[i] = NULL;
}
}
@@ -284,18 +282,18 @@ read_end:
}
/* Called with s->mutex held. */
static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
CURLAIOCB *acb)
static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
CURLAIOCB *acb)
{
int i;
uint64_t end = start + len;
uint64_t clamped_end = MIN(end, s->len);
uint64_t clamped_len = clamped_end - start;
size_t end = start + len;
size_t clamped_end = MIN(end, s->len);
size_t clamped_len = clamped_end - start;
for (i=0; i<CURL_NUM_STATES; i++) {
CURLState *state = &s->states[i];
uint64_t buf_end = (state->buf_start + state->buf_off);
uint64_t buf_fend = (state->buf_start + state->buf_len);
size_t buf_end = (state->buf_start + state->buf_off);
size_t buf_fend = (state->buf_start + state->buf_len);
if (!state->orig_buf)
continue;
@@ -314,8 +312,7 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
if (clamped_len < len) {
qemu_iovec_memset(acb->qiov, clamped_len, 0, len - clamped_len);
}
acb->ret = 0;
return true;
return FIND_RET_OK;
}
// Wait for unfinished chunks
@@ -333,13 +330,13 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
for (j=0; j<CURL_NUM_ACB; j++) {
if (!state->acb[j]) {
state->acb[j] = acb;
return true;
return FIND_RET_WAIT;
}
}
}
}
return false;
return FIND_RET_NONE;
}
/* Called with s->mutex held. */
@@ -384,11 +381,11 @@ static void curl_multi_check_completion(BDRVCURLState *s)
continue;
}
acb->ret = -EIO;
state->acb[i] = NULL;
qemu_mutex_unlock(&s->mutex);
aio_co_wake(acb->co);
acb->common.cb(acb->common.opaque, -EIO);
qemu_mutex_lock(&s->mutex);
qemu_aio_unref(acb);
state->acb[i] = NULL;
}
}
@@ -458,27 +455,34 @@ static void curl_multi_timeout_do(void *arg)
}
/* Called with s->mutex held. */
static CURLState *curl_find_state(BDRVCURLState *s)
static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
{
CURLState *state = NULL;
int i;
int i, j;
do {
for (i=0; i<CURL_NUM_STATES; i++) {
for (j=0; j<CURL_NUM_ACB; j++)
if (s->states[i].acb[j])
continue;
if (s->states[i].in_use)
continue;
for (i = 0; i < CURL_NUM_STATES; i++) {
if (!s->states[i].in_use) {
state = &s->states[i];
state->in_use = 1;
break;
}
}
return state;
}
if (!state) {
qemu_mutex_unlock(&s->mutex);
aio_poll(bdrv_get_aio_context(bs), true);
qemu_mutex_lock(&s->mutex);
}
} while(!state);
static int curl_init_state(BDRVCURLState *s, CURLState *state)
{
if (!state->curl) {
state->curl = curl_easy_init();
if (!state->curl) {
return -EIO;
return NULL;
}
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
@@ -531,7 +535,7 @@ static int curl_init_state(BDRVCURLState *s, CURLState *state)
QLIST_INIT(&state->sockets);
state->s = s;
return 0;
return state;
}
/* Called with s->mutex held. */
@@ -553,8 +557,6 @@ static void curl_clean_state(CURLState *s)
}
s->in_use = 0;
qemu_co_enter_next(&s->s->free_state_waitq, &s->s->mutex);
}
static void curl_parse_filename(const char *filename, QDict *options,
@@ -637,11 +639,6 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{
.name = CURL_BLOCK_OPT_COOKIE_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of secret used as cookie passed with each request"
},
{
.name = CURL_BLOCK_OPT_USERNAME,
.type = QEMU_OPT_STRING,
@@ -676,27 +673,17 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
Error *local_err = NULL;
const char *file;
const char *cookie;
const char *cookie_secret;
double d;
const char *secretid;
const char *protocol_delimiter;
int ret;
static int inited = 0;
if (flags & BDRV_O_RDWR) {
error_setg(errp, "curl block device does not support writes");
return -EROFS;
}
if (!libcurl_initialized) {
ret = curl_global_init(CURL_GLOBAL_ALL);
if (ret) {
error_setg(errp, "libcurl initialization failed with %d", ret);
return -EIO;
}
libcurl_initialized = true;
}
qemu_mutex_init(&s->mutex);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
@@ -723,22 +710,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
s->sslverify = qemu_opt_get_bool(opts, CURL_BLOCK_OPT_SSLVERIFY, true);
cookie = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE);
cookie_secret = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE_SECRET);
if (cookie && cookie_secret) {
error_setg(errp,
"curl driver cannot handle both cookie and cookie secret");
goto out_noclean;
}
if (cookie_secret) {
s->cookie = qcrypto_secret_lookup_as_utf8(cookie_secret, errp);
if (!s->cookie) {
goto out_noclean;
}
} else {
s->cookie = g_strdup(cookie);
}
s->cookie = g_strdup(cookie);
file = qemu_opt_get(opts, CURL_BLOCK_OPT_URL);
if (file == NULL) {
@@ -775,23 +747,22 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
}
}
if (!inited) {
curl_global_init(CURL_GLOBAL_ALL);
inited = 1;
}
DPRINTF("CURL: Opening %s\n", file);
qemu_co_queue_init(&s->free_state_waitq);
s->aio_context = bdrv_get_aio_context(bs);
s->url = g_strdup(file);
qemu_mutex_lock(&s->mutex);
state = curl_find_state(s);
state = curl_init_state(bs, s);
qemu_mutex_unlock(&s->mutex);
if (!state) {
if (!state)
goto out_noclean;
}
// Get file size
if (curl_init_state(s, state) < 0) {
goto out;
}
s->accept_range = false;
curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1);
curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION,
@@ -819,7 +790,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
}
#endif
s->len = d;
s->len = (size_t)d;
if ((!strncasecmp(s->url, "http://", strlen("http://"))
|| !strncasecmp(s->url, "https://", strlen("https://")))
@@ -828,7 +799,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
"Server does not support 'range' (byte ranges).");
goto out;
}
DPRINTF("CURL: Size = %" PRIu64 "\n", s->len);
DPRINTF("CURL: Size = %zd\n", s->len);
qemu_mutex_lock(&s->mutex);
curl_clean_state(state);
@@ -849,48 +820,51 @@ out_noclean:
qemu_mutex_destroy(&s->mutex);
g_free(s->cookie);
g_free(s->url);
g_free(s->username);
g_free(s->proxyusername);
g_free(s->proxypassword);
qemu_opts_del(opts);
return -EINVAL;
}
static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
static const AIOCBInfo curl_aiocb_info = {
.aiocb_size = sizeof(CURLAIOCB),
};
static void curl_readv_bh_cb(void *p)
{
CURLState *state;
int running;
int ret = -EINPROGRESS;
CURLAIOCB *acb = p;
BlockDriverState *bs = acb->common.bs;
BDRVCURLState *s = bs->opaque;
uint64_t start = acb->offset;
uint64_t end;
size_t start = acb->sector_num * BDRV_SECTOR_SIZE;
size_t end;
qemu_mutex_lock(&s->mutex);
// In case we have the requested data already (e.g. read-ahead),
// we can just call the callback and be done.
if (curl_find_buf(s, start, acb->bytes, acb)) {
goto out;
switch (curl_find_buf(s, start, acb->nb_sectors * BDRV_SECTOR_SIZE, acb)) {
case FIND_RET_OK:
ret = 0;
goto out;
case FIND_RET_WAIT:
goto out;
default:
break;
}
// No cache found, so let's start a new request
for (;;) {
state = curl_find_state(s);
if (state) {
break;
}
qemu_co_queue_wait(&s->free_state_waitq, &s->mutex);
}
if (curl_init_state(s, state) < 0) {
curl_clean_state(state);
acb->ret = -EIO;
state = curl_init_state(acb->common.bs, s);
if (!state) {
ret = -EIO;
goto out;
}
acb->start = 0;
acb->end = MIN(acb->bytes, s->len - start);
acb->end = MIN(acb->nb_sectors * BDRV_SECTOR_SIZE, s->len - start);
state->buf_off = 0;
g_free(state->orig_buf);
@@ -900,14 +874,14 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
state->orig_buf = g_try_malloc(state->buf_len);
if (state->buf_len && state->orig_buf == NULL) {
curl_clean_state(state);
acb->ret = -ENOMEM;
ret = -ENOMEM;
goto out;
}
state->acb[0] = acb;
snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64, start, end);
DPRINTF("CURL (AIO): Reading %" PRIu64 " at %" PRIu64 " (%s)\n",
acb->bytes, start, state->range);
snprintf(state->range, 127, "%zd-%zd", start, end);
DPRINTF("CURL (AIO): Reading %llu at %zd (%s)\n",
(acb->nb_sectors * BDRV_SECTOR_SIZE), start, state->range);
curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
curl_multi_add_handle(s->multi, state->curl);
@@ -917,24 +891,26 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
out:
qemu_mutex_unlock(&s->mutex);
if (ret != -EINPROGRESS) {
acb->common.cb(acb->common.opaque, ret);
qemu_aio_unref(acb);
}
}
static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
CURLAIOCB acb = {
.co = qemu_coroutine_self(),
.ret = -EINPROGRESS,
.qiov = qiov,
.offset = offset,
.bytes = bytes
};
CURLAIOCB *acb;
curl_setup_preadv(bs, &acb);
while (acb.ret == -EINPROGRESS) {
qemu_coroutine_yield();
}
return acb.ret;
acb = qemu_aio_get(&curl_aiocb_info, bs, cb, opaque);
acb->qiov = qiov;
acb->sector_num = sector_num;
acb->nb_sectors = nb_sectors;
aio_bh_schedule_oneshot(bdrv_get_aio_context(bs), curl_readv_bh_cb, acb);
return &acb->common;
}
static void curl_close(BlockDriverState *bs)
@@ -947,9 +923,6 @@ static void curl_close(BlockDriverState *bs)
g_free(s->cookie);
g_free(s->url);
g_free(s->username);
g_free(s->proxyusername);
g_free(s->proxypassword);
}
static int64_t curl_getlength(BlockDriverState *bs)
@@ -968,7 +941,7 @@ static BlockDriver bdrv_http = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -984,7 +957,7 @@ static BlockDriver bdrv_https = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -1000,7 +973,7 @@ static BlockDriver bdrv_ftp = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -1016,7 +989,7 @@ static BlockDriver bdrv_ftps = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,

586
block/dictzip.c Normal file
View File

@@ -0,0 +1,586 @@
/*
* DictZip Block driver for dictzip enabled gzip files
*
* Use the "dictzip" tool from the "dictd" package to create gzip files that
* contain the extra DictZip headers.
*
* dictzip(1) is a compression program which creates compressed files in the
* gzip format (see RFC 1952). However, unlike gzip(1), dictzip(1) compresses
* the file in pieces and stores an index to the pieces in the gzip header.
* This allows random access to the file at the granularity of the compressed
* pieces (currently about 64kB) while maintaining good compression ratios
* (within 5% of the expected ratio for dictionary data).
* dictd(8) uses files stored in this format.
*
* For details on DictZip see http://dict.org/.
*
* Copyright (c) 2009 Alexander Graf <agraf@suse.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include <zlib.h>
// #define DEBUG
#ifdef DEBUG
#define dprintf(fmt, ...) do { printf("dzip: " fmt, ## __VA_ARGS__); } while (0)
#else
#define dprintf(fmt, ...) do { } while (0)
#endif
#define SECTOR_SIZE 512
#define Z_STREAM_COUNT 4
#define CACHE_COUNT 20
/* magic values */
#define GZ_MAGIC1 0x1f
#define GZ_MAGIC2 0x8b
#define DZ_MAGIC1 'R'
#define DZ_MAGIC2 'A'
#define GZ_FEXTRA 0x04 /* Optional field (random access index) */
#define GZ_FNAME 0x08 /* Original name */
#define GZ_COMMENT 0x10 /* Zero-terminated, human-readable comment */
#define GZ_FHCRC 0x02 /* Header CRC16 */
/* offsets */
#define GZ_ID 0 /* GZ_MAGIC (16bit) */
#define GZ_FLG 3 /* FLaGs (see above) */
#define GZ_XLEN 10 /* eXtra LENgth (16bit) */
#define GZ_SI 12 /* Subfield ID (16bit) */
#define GZ_VERSION 16 /* Version for subfield format */
#define GZ_CHUNKSIZE 18 /* Chunk size (16bit) */
#define GZ_CHUNKCNT 20 /* Number of chunks (16bit) */
#define GZ_RNDDATA 22 /* Random access data (16bit) */
#define GZ_99_CHUNKSIZE 18 /* Chunk size (32bit) */
#define GZ_99_CHUNKCNT 22 /* Number of chunks (32bit) */
#define GZ_99_FILESIZE 26 /* Size of unpacked file (64bit) */
#define GZ_99_RNDDATA 34 /* Random access data (32bit) */
struct BDRVDictZipState;
typedef struct DictZipAIOCB {
BlockAIOCB common;
struct BDRVDictZipState *s;
QEMUIOVector *qiov; /* QIOV of the original request */
QEMUIOVector *qiov_gz; /* QIOV of the gz subrequest */
QEMUBH *bh; /* BH for cache */
z_stream *zStream; /* stream to use for decoding */
int zStream_id; /* stream id of the above pointer */
size_t start; /* offset into the uncompressed file */
size_t len; /* uncompressed bytes to read */
uint8_t *gzipped; /* the gzipped data */
uint8_t *buf; /* cached result */
size_t gz_len; /* amount of gzip data */
size_t gz_start; /* uncompressed starting point of gzip data */
uint64_t offset; /* offset for "start" into the uncompressed chunk */
int chunks_len; /* amount of uncompressed data in all gzip data */
} DictZipAIOCB;
typedef struct dict_cache {
size_t start;
size_t len;
uint8_t *buf;
} DictCache;
typedef struct BDRVDictZipState {
BlockDriverState *hd;
z_stream zStream[Z_STREAM_COUNT];
DictCache cache[CACHE_COUNT];
int cache_index;
uint8_t stream_in_use;
uint64_t chunk_len;
uint32_t chunk_cnt;
uint16_t *chunks;
uint32_t *chunks32;
uint64_t *offsets;
int64_t file_len;
} BDRVDictZipState;
static int start_zStream(z_stream *zStream)
{
zStream->zalloc = NULL;
zStream->zfree = NULL;
zStream->opaque = NULL;
zStream->next_in = 0;
zStream->avail_in = 0;
zStream->next_out = NULL;
zStream->avail_out = 0;
return inflateInit2( zStream, -15 );
}
static QemuOptsList runtime_opts = {
.name = "dzip",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "URL to the dictzip file",
},
{ /* end of list */ }
},
};
static int dictzip_open(BlockDriverState *bs, QDict *options, int flags, Error **errp)
{
BDRVDictZipState *s = bs->opaque;
const char *err = "Unknown (read error?)";
uint8_t magic[2];
char buf[100];
uint8_t header_flags;
uint16_t chunk_len16;
uint16_t chunk_cnt16;
uint32_t chunk_len32;
uint16_t header_ver;
uint16_t tmp_short;
uint64_t offset;
int chunks_len;
int headerLength = GZ_XLEN - 1;
int rnd_offs;
int ret;
int i;
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
filename = qemu_opt_get(opts, "filename");
if (!strncmp(filename, "dzip://", 7))
filename += 7;
else if (!strncmp(filename, "dzip:", 5))
filename += 5;
s->hd = bdrv_open(filename, NULL, NULL, flags | BDRV_O_PROTOCOL, errp);
if (!s->hd) {
ret = -EINVAL;
qemu_opts_del(opts);
return ret;
}
/* initialize zlib streams */
for (i = 0; i < Z_STREAM_COUNT; i++) {
if (start_zStream( &s->zStream[i] ) != Z_OK) {
err = s->zStream[i].msg;
goto fail;
}
}
/* gzip header */
if (bdrv_pread(s->hd->file, GZ_ID, &magic, sizeof(magic)) != sizeof(magic))
goto fail;
if (!((magic[0] == GZ_MAGIC1) && (magic[1] == GZ_MAGIC2))) {
err = "No gzip file";
goto fail;
}
/* dzip header */
if (bdrv_pread(s->hd->file, GZ_FLG, &header_flags, 1) != 1)
goto fail;
if (!(header_flags & GZ_FEXTRA)) {
err = "Not a dictzip file (wrong flags)";
goto fail;
}
/* extra length */
if (bdrv_pread(s->hd->file, GZ_XLEN, &tmp_short, 2) != 2)
goto fail;
headerLength += le16_to_cpu(tmp_short) + 2;
/* DictZip magic */
if (bdrv_pread(s->hd->file, GZ_SI, &magic, 2) != 2)
goto fail;
if (magic[0] != DZ_MAGIC1 || magic[1] != DZ_MAGIC2) {
err = "Not a dictzip file (missing extra magic)";
goto fail;
}
/* DictZip version */
if (bdrv_pread(s->hd->file, GZ_VERSION, &header_ver, 2) != 2)
goto fail;
header_ver = le16_to_cpu(header_ver);
switch (header_ver) {
case 1: /* Normal DictZip */
/* number of chunks */
if (bdrv_pread(s->hd->file, GZ_CHUNKSIZE, &chunk_len16, 2) != 2)
goto fail;
s->chunk_len = le16_to_cpu(chunk_len16);
/* chunk count */
if (bdrv_pread(s->hd->file, GZ_CHUNKCNT, &chunk_cnt16, 2) != 2)
goto fail;
s->chunk_cnt = le16_to_cpu(chunk_cnt16);
chunks_len = sizeof(short) * s->chunk_cnt;
rnd_offs = GZ_RNDDATA;
break;
case 99: /* Special Alex pigz version */
/* number of chunks */
if (bdrv_pread(s->hd->file, GZ_99_CHUNKSIZE, &chunk_len32, 4) != 4)
goto fail;
dprintf("chunk len [%#x] = %d\n", GZ_99_CHUNKSIZE, chunk_len32);
s->chunk_len = le32_to_cpu(chunk_len32);
/* chunk count */
if (bdrv_pread(s->hd->file, GZ_99_CHUNKCNT, &s->chunk_cnt, 4) != 4)
goto fail;
s->chunk_cnt = le32_to_cpu(s->chunk_cnt);
dprintf("chunk len | count = %"PRId64" | %d\n", s->chunk_len, s->chunk_cnt);
/* file size */
if (bdrv_pread(s->hd->file, GZ_99_FILESIZE, &s->file_len, 8) != 8)
goto fail;
s->file_len = le64_to_cpu(s->file_len);
chunks_len = sizeof(int) * s->chunk_cnt;
rnd_offs = GZ_99_RNDDATA;
break;
default:
err = "Invalid DictZip version";
goto fail;
}
/* random access data */
s->chunks = g_malloc(chunks_len);
if (header_ver == 99)
s->chunks32 = (uint32_t *)s->chunks;
if (bdrv_pread(s->hd->file, rnd_offs, s->chunks, chunks_len) != chunks_len)
goto fail;
/* orig filename */
if (header_flags & GZ_FNAME) {
if (bdrv_pread(s->hd->file, headerLength + 1, buf, sizeof(buf)) != sizeof(buf))
goto fail;
buf[sizeof(buf) - 1] = '\0';
headerLength += strlen(buf) + 1;
if (strlen(buf) == sizeof(buf))
goto fail;
dprintf("filename: %s\n", buf);
}
/* comment field */
if (header_flags & GZ_COMMENT) {
if (bdrv_pread(s->hd->file, headerLength, buf, sizeof(buf)) != sizeof(buf))
goto fail;
buf[sizeof(buf) - 1] = '\0';
headerLength += strlen(buf) + 1;
if (strlen(buf) == sizeof(buf))
goto fail;
dprintf("comment: %s\n", buf);
}
if (header_flags & GZ_FHCRC)
headerLength += 2;
/* uncompressed file length*/
if (!s->file_len) {
uint32_t file_len;
if (bdrv_pread(s->hd->file, bdrv_getlength(s->hd) - 4, &file_len, 4) != 4)
goto fail;
s->file_len = le32_to_cpu(file_len);
}
/* compute offsets */
s->offsets = g_malloc(sizeof( *s->offsets ) * s->chunk_cnt);
for (offset = headerLength + 1, i = 0; i < s->chunk_cnt; i++) {
s->offsets[i] = offset;
switch (header_ver) {
case 1:
offset += le16_to_cpu(s->chunks[i]);
break;
case 99:
offset += le32_to_cpu(s->chunks32[i]);
break;
}
dprintf("chunk %#"PRIx64" - %#"PRIx64" = offset %#"PRIx64" -> %#"PRIx64"\n", i * s->chunk_len, (i+1) * s->chunk_len, s->offsets[i], offset);
}
qemu_opts_del(opts);
return 0;
fail:
fprintf(stderr, "DictZip: Error opening file: %s\n", err);
bdrv_unref(s->hd);
if (s->chunks)
g_free(s->chunks);
qemu_opts_del(opts);
return -EINVAL;
}
/* This callback gets invoked when we have the result in cache already */
static void dictzip_cache_cb(void *opaque)
{
DictZipAIOCB *acb = (DictZipAIOCB *)opaque;
qemu_iovec_from_buf(acb->qiov, 0, acb->buf, acb->len);
acb->common.cb(acb->common.opaque, 0);
qemu_bh_delete(acb->bh);
qemu_aio_unref(acb);
}
/* This callback gets invoked by the underlying block reader when we have
* all compressed data. We uncompress in here. */
static void dictzip_read_cb(void *opaque, int ret)
{
DictZipAIOCB *acb = (DictZipAIOCB *)opaque;
struct BDRVDictZipState *s = acb->s;
uint8_t *buf;
DictCache *cache;
int r, i;
buf = g_malloc(acb->chunks_len);
/* try to find zlib stream for decoding */
do {
for (i = 0; i < Z_STREAM_COUNT; i++) {
if (!(s->stream_in_use & (1 << i))) {
s->stream_in_use |= (1 << i);
acb->zStream_id = i;
acb->zStream = &s->zStream[i];
break;
}
}
} while(!acb->zStream);
/* sure, we could handle more streams, but this callback should be single
threaded and when it's not, we really want to know! */
assert(i == 0);
/* uncompress the chunk */
acb->zStream->next_in = acb->gzipped;
acb->zStream->avail_in = acb->gz_len;
acb->zStream->next_out = buf;
acb->zStream->avail_out = acb->chunks_len;
r = inflate( acb->zStream, Z_PARTIAL_FLUSH );
if ( (r != Z_OK) && (r != Z_STREAM_END) )
fprintf(stderr, "Error inflating: [%d] %s\n", r, acb->zStream->msg);
if ( r == Z_STREAM_END )
inflateReset(acb->zStream);
dprintf("inflating [%d] left: %d | %d bytes\n", r, acb->zStream->avail_in, acb->zStream->avail_out);
s->stream_in_use &= ~(1 << acb->zStream_id);
/* nofity the caller */
qemu_iovec_from_buf(acb->qiov, 0, buf + acb->offset, acb->len);
acb->common.cb(acb->common.opaque, 0);
/* fill the cache */
cache = &s->cache[s->cache_index];
s->cache_index++;
if (s->cache_index == CACHE_COUNT)
s->cache_index = 0;
cache->len = 0;
if (cache->buf)
g_free(cache->buf);
cache->start = acb->gz_start;
cache->buf = buf;
cache->len = acb->chunks_len;
/* free occupied ressources */
g_free(acb->qiov_gz);
qemu_aio_unref(acb);
}
static const AIOCBInfo dictzip_aiocb_info = {
.aiocb_size = sizeof(DictZipAIOCB),
};
/* This is where we get a request from a caller to read something */
static BlockAIOCB *dictzip_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
BDRVDictZipState *s = bs->opaque;
DictZipAIOCB *acb;
QEMUIOVector *qiov_gz;
struct iovec *iov;
uint8_t *buf;
size_t start = sector_num * SECTOR_SIZE;
size_t len = nb_sectors * SECTOR_SIZE;
size_t end = start + len;
size_t gz_start;
size_t gz_len;
int64_t gz_sector_num;
int gz_nb_sectors;
int first_chunk, last_chunk;
int first_offset;
int i;
acb = qemu_aio_get(&dictzip_aiocb_info, bs, cb, opaque);
if (!acb)
return NULL;
/* Search Cache */
for (i = 0; i < CACHE_COUNT; i++) {
if (!s->cache[i].len)
continue;
if ((start >= s->cache[i].start) &&
(end <= (s->cache[i].start + s->cache[i].len))) {
acb->buf = s->cache[i].buf + (start - s->cache[i].start);
acb->len = len;
acb->qiov = qiov;
acb->bh = qemu_bh_new(dictzip_cache_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
}
/* No cache, so let's decode */
/* We need to read these chunks */
first_chunk = start / s->chunk_len;
first_offset = start - first_chunk * s->chunk_len;
last_chunk = end / s->chunk_len;
gz_start = s->offsets[first_chunk];
gz_len = 0;
for (i = first_chunk; i <= last_chunk; i++) {
if (s->chunks32)
gz_len += le32_to_cpu(s->chunks32[i]);
else
gz_len += le16_to_cpu(s->chunks[i]);
}
gz_sector_num = gz_start / SECTOR_SIZE;
gz_nb_sectors = (gz_len / SECTOR_SIZE);
/* account for tail and heads */
while ((gz_start + gz_len) > ((gz_sector_num + gz_nb_sectors) * SECTOR_SIZE))
gz_nb_sectors++;
/* Allocate qiov, iov and buf in one chunk so we only need to free qiov */
qiov_gz = g_malloc0(sizeof(QEMUIOVector) + sizeof(struct iovec) +
(gz_nb_sectors * SECTOR_SIZE));
iov = (struct iovec *)(((char *)qiov_gz) + sizeof(QEMUIOVector));
buf = ((uint8_t *)iov) + sizeof(struct iovec *);
/* Kick off the read by the backing file, so we can start decompressing */
iov->iov_base = (void *)buf;
iov->iov_len = gz_nb_sectors * 512;
qemu_iovec_init_external(qiov_gz, iov, 1);
dprintf("read %zd - %zd => %zd - %zd\n", start, end, gz_start, gz_start + gz_len);
acb->s = s;
acb->qiov = qiov;
acb->qiov_gz = qiov_gz;
acb->start = start;
acb->len = len;
acb->gzipped = buf + (gz_start % SECTOR_SIZE);
acb->gz_len = gz_len;
acb->gz_start = first_chunk * s->chunk_len;
acb->offset = first_offset;
acb->chunks_len = (last_chunk - first_chunk + 1) * s->chunk_len;
return bdrv_aio_readv(s->hd->file, gz_sector_num, qiov_gz, gz_nb_sectors,
dictzip_read_cb, acb);
}
static void dictzip_close(BlockDriverState *bs)
{
BDRVDictZipState *s = bs->opaque;
int i;
for (i = 0; i < CACHE_COUNT; i++) {
if (!s->cache[i].len)
continue;
g_free(s->cache[i].buf);
}
for (i = 0; i < Z_STREAM_COUNT; i++) {
inflateEnd(&s->zStream[i]);
}
if (s->chunks)
g_free(s->chunks);
if (s->offsets)
g_free(s->offsets);
dprintf("Close\n");
}
static int64_t dictzip_getlength(BlockDriverState *bs)
{
BDRVDictZipState *s = bs->opaque;
dprintf("getlength -> %ld\n", s->file_len);
return s->file_len;
}
static BlockDriver bdrv_dictzip = {
.format_name = "dzip",
.protocol_name = "dzip",
.instance_size = sizeof(BDRVDictZipState),
.bdrv_file_open = dictzip_open,
.bdrv_close = dictzip_close,
.bdrv_getlength = dictzip_getlength,
.bdrv_aio_readv = dictzip_aio_readv,
};
static void dictzip_block_init(void)
{
bdrv_register(&bdrv_dictzip);
}
block_init(dictzip_block_init);

View File

@@ -1,7 +1,7 @@
/*
* Block Dirty Bitmap
*
* Copyright (c) 2016-2017 Red Hat. Inc
* Copyright (c) 2016 Red Hat. Inc
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -37,22 +37,13 @@
* or enabled. A frozen bitmap can only abdicate() or reclaim().
*/
struct BdrvDirtyBitmap {
QemuMutex *mutex;
HBitmap *bitmap; /* Dirty bitmap implementation */
HBitmap *bitmap; /* Dirty sector bitmap implementation */
HBitmap *meta; /* Meta dirty bitmap */
BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
char *name; /* Optional non-empty unique ID */
int64_t size; /* Size of the bitmap, in bytes */
bool disabled; /* Bitmap is disabled. It ignores all writes to
the device */
int64_t size; /* Size of the bitmap (Number of sectors) */
bool disabled; /* Bitmap is read-only */
int active_iterators; /* How many iterators are active */
bool readonly; /* Bitmap is read-only. This field also
prevents the respective image from being
modified (i.e. blocks writes and discards).
Such operations must fail and both the image
and this bitmap must remain unchanged while
this flag is set. */
bool persistent; /* bitmap must be saved to owner disk image */
QLIST_ENTRY(BdrvDirtyBitmap) list;
};
@@ -61,27 +52,6 @@ struct BdrvDirtyBitmapIter {
BdrvDirtyBitmap *bitmap;
};
static inline void bdrv_dirty_bitmaps_lock(BlockDriverState *bs)
{
qemu_mutex_lock(&bs->dirty_bitmap_mutex);
}
static inline void bdrv_dirty_bitmaps_unlock(BlockDriverState *bs)
{
qemu_mutex_unlock(&bs->dirty_bitmap_mutex);
}
void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap)
{
qemu_mutex_lock(bitmap->mutex);
}
void bdrv_dirty_bitmap_unlock(BdrvDirtyBitmap *bitmap)
{
qemu_mutex_unlock(bitmap->mutex);
}
/* Called with BQL or dirty_bitmap lock taken. */
BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
{
BdrvDirtyBitmap *bm;
@@ -95,16 +65,13 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
return NULL;
}
/* Called with BQL taken. */
void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
g_free(bitmap->name);
bitmap->name = NULL;
bitmap->persistent = false;
}
/* Called with BQL taken. */
BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
uint32_t granularity,
const char *name,
@@ -112,28 +79,28 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
{
int64_t bitmap_size;
BdrvDirtyBitmap *bitmap;
uint32_t sector_granularity;
assert(is_power_of_2(granularity) && granularity >= BDRV_SECTOR_SIZE);
assert((granularity & (granularity - 1)) == 0);
if (name && bdrv_find_dirty_bitmap(bs, name)) {
error_setg(errp, "Bitmap already exists: %s", name);
return NULL;
}
bitmap_size = bdrv_getlength(bs);
sector_granularity = granularity >> BDRV_SECTOR_BITS;
assert(sector_granularity);
bitmap_size = bdrv_nb_sectors(bs);
if (bitmap_size < 0) {
error_setg_errno(errp, -bitmap_size, "could not get length of device");
errno = -bitmap_size;
return NULL;
}
bitmap = g_new0(BdrvDirtyBitmap, 1);
bitmap->mutex = &bs->dirty_bitmap_mutex;
bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(granularity));
bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
bitmap->size = bitmap_size;
bitmap->name = g_strdup(name);
bitmap->disabled = false;
bdrv_dirty_bitmaps_lock(bs);
QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
bdrv_dirty_bitmaps_unlock(bs);
return bitmap;
}
@@ -152,19 +119,39 @@ void bdrv_create_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int chunk_size)
{
assert(!bitmap->meta);
qemu_mutex_lock(bitmap->mutex);
bitmap->meta = hbitmap_create_meta(bitmap->bitmap,
chunk_size * BITS_PER_BYTE);
qemu_mutex_unlock(bitmap->mutex);
}
void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(bitmap->meta);
qemu_mutex_lock(bitmap->mutex);
hbitmap_free_meta(bitmap->bitmap);
bitmap->meta = NULL;
qemu_mutex_unlock(bitmap->mutex);
}
int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, int64_t sector,
int nb_sectors)
{
uint64_t i;
int sectors_per_bit = 1 << hbitmap_granularity(bitmap->meta);
/* To optimize: we can make hbitmap to internally check the range in a
* coarse level, or at least do it word by word. */
for (i = sector; i < sector + nb_sectors; i += sectors_per_bit) {
if (hbitmap_get(bitmap->meta, i)) {
return true;
}
}
return false;
}
void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, int64_t sector,
int nb_sectors)
{
hbitmap_reset(bitmap->meta, sector, nb_sectors);
}
int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
@@ -177,19 +164,16 @@ const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap)
return bitmap->name;
}
/* Called with BQL taken. */
bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
{
return bitmap->successor;
}
/* Called with BQL taken. */
bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
{
return !(bitmap->disabled || bitmap->successor);
}
/* Called with BQL taken. */
DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
{
if (bdrv_dirty_bitmap_frozen(bitmap)) {
@@ -204,7 +188,6 @@ DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
/**
* Create a successor bitmap destined to replace this bitmap after an operation.
* Requires that the bitmap is not frozen and has no successor.
* Called with BQL taken.
*/
int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, Error **errp)
@@ -237,7 +220,6 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
/**
* For a bitmap with a successor, yield our name to the successor,
* delete the old bitmap, and return a handle to the new bitmap.
* Called with BQL taken.
*/
BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
@@ -256,8 +238,6 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
bitmap->name = NULL;
successor->name = name;
bitmap->successor = NULL;
successor->persistent = bitmap->persistent;
bitmap->persistent = false;
bdrv_release_dirty_bitmap(bs, bitmap);
return successor;
@@ -267,7 +247,6 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
* In cases of failure where we can no longer safely delete the parent,
* we may wish to re-join the parent and child/successor.
* The merged parent will be un-frozen, but not explicitly re-enabled.
* Called with BQL taken.
*/
BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
@@ -292,36 +271,27 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
/**
* Truncates _all_ bitmaps attached to a BDS.
* Called with BQL taken.
*/
void bdrv_dirty_bitmap_truncate(BlockDriverState *bs, int64_t bytes)
void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
{
BdrvDirtyBitmap *bitmap;
uint64_t size = bdrv_nb_sectors(bs);
bdrv_dirty_bitmaps_lock(bs);
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
assert(!bdrv_dirty_bitmap_frozen(bitmap));
assert(!bitmap->active_iterators);
hbitmap_truncate(bitmap->bitmap, bytes);
bitmap->size = bytes;
hbitmap_truncate(bitmap->bitmap, size);
bitmap->size = size;
}
bdrv_dirty_bitmaps_unlock(bs);
}
static bool bdrv_dirty_bitmap_has_name(BdrvDirtyBitmap *bitmap)
{
return !!bdrv_dirty_bitmap_name(bitmap);
}
/* Called with BQL taken. */
static void bdrv_do_release_matching_dirty_bitmap(
BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
bool (*cond)(BdrvDirtyBitmap *bitmap))
static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
bool only_named)
{
BdrvDirtyBitmap *bm, *next;
bdrv_dirty_bitmaps_lock(bs);
QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
if ((!bitmap || bm == bitmap) && (!cond || cond(bm))) {
if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
assert(!bm->active_iterators);
assert(!bdrv_dirty_bitmap_frozen(bm));
assert(!bm->meta);
@@ -331,72 +301,35 @@ static void bdrv_do_release_matching_dirty_bitmap(
g_free(bm);
if (bitmap) {
goto out;
return;
}
}
}
if (bitmap) {
abort();
}
out:
bdrv_dirty_bitmaps_unlock(bs);
}
/* Called with BQL taken. */
void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
{
bdrv_do_release_matching_dirty_bitmap(bs, bitmap, NULL);
bdrv_do_release_matching_dirty_bitmap(bs, bitmap, false);
}
/**
* Release all named dirty bitmaps attached to a BDS (for use in bdrv_close()).
* There must not be any frozen bitmaps attached.
* This function does not remove persistent bitmaps from the storage.
* Called with BQL taken.
*/
void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs)
{
bdrv_do_release_matching_dirty_bitmap(bs, NULL, bdrv_dirty_bitmap_has_name);
bdrv_do_release_matching_dirty_bitmap(bs, NULL, true);
}
/**
* Release all persistent dirty bitmaps attached to a BDS (for use in
* bdrv_inactivate_recurse()).
* There must not be any frozen bitmaps attached.
* This function does not remove persistent bitmaps from the storage.
*/
void bdrv_release_persistent_dirty_bitmaps(BlockDriverState *bs)
{
bdrv_do_release_matching_dirty_bitmap(bs, NULL,
bdrv_dirty_bitmap_get_persistance);
}
/**
* Remove persistent dirty bitmap from the storage if it exists.
* Absence of bitmap is not an error, because we have the following scenario:
* BdrvDirtyBitmap can have .persistent = true but not yet saved and have no
* stored version. For such bitmap bdrv_remove_persistent_dirty_bitmap() should
* not fail.
* This function doesn't release corresponding BdrvDirtyBitmap.
*/
void bdrv_remove_persistent_dirty_bitmap(BlockDriverState *bs,
const char *name,
Error **errp)
{
if (bs->drv && bs->drv->bdrv_remove_persistent_dirty_bitmap) {
bs->drv->bdrv_remove_persistent_dirty_bitmap(bs, name, errp);
}
}
/* Called with BQL taken. */
void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
bitmap->disabled = true;
}
/* Called with BQL taken. */
void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
@@ -409,11 +342,10 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
BlockDirtyInfoList *list = NULL;
BlockDirtyInfoList **plist = &list;
bdrv_dirty_bitmaps_lock(bs);
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
info->count = bdrv_get_dirty_count(bm);
info->count = bdrv_get_dirty_count(bm) << BDRV_SECTOR_BITS;
info->granularity = bdrv_dirty_bitmap_granularity(bm);
info->has_name = !!bm->name;
info->name = g_strdup(bm->name);
@@ -422,19 +354,17 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
*plist = entry;
plist = &entry->next;
}
bdrv_dirty_bitmaps_unlock(bs);
return list;
}
/* Called within bdrv_dirty_bitmap_lock..unlock */
bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t offset)
int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t sector)
{
if (bitmap) {
return hbitmap_get(bitmap->bitmap, offset);
return hbitmap_get(bitmap->bitmap, sector);
} else {
return false;
return 0;
}
}
@@ -458,15 +388,21 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
return granularity;
}
uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
{
return 1U << hbitmap_granularity(bitmap->bitmap);
return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
}
BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap)
uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap)
{
return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->meta);
}
BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
uint64_t first_sector)
{
BdrvDirtyBitmapIter *iter = g_new(BdrvDirtyBitmapIter, 1);
hbitmap_iter_init(&iter->hbi, bitmap->bitmap, 0);
hbitmap_iter_init(&iter->hbi, bitmap->bitmap, first_sector);
iter->bitmap = bitmap;
bitmap->active_iterators++;
return iter;
@@ -496,45 +432,23 @@ int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter)
return hbitmap_iter_next(&iter->hbi);
}
/* Called within bdrv_dirty_bitmap_lock..unlock */
void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
int64_t offset, int64_t bytes)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
assert(!bdrv_dirty_bitmap_readonly(bitmap));
hbitmap_set(bitmap->bitmap, offset, bytes);
}
void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t offset, int64_t bytes)
{
bdrv_dirty_bitmap_lock(bitmap);
bdrv_set_dirty_bitmap_locked(bitmap, offset, bytes);
bdrv_dirty_bitmap_unlock(bitmap);
}
/* Called within bdrv_dirty_bitmap_lock..unlock */
void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
int64_t offset, int64_t bytes)
int64_t cur_sector, int64_t nr_sectors)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
assert(!bdrv_dirty_bitmap_readonly(bitmap));
hbitmap_reset(bitmap->bitmap, offset, bytes);
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t offset, int64_t bytes)
int64_t cur_sector, int64_t nr_sectors)
{
bdrv_dirty_bitmap_lock(bitmap);
bdrv_reset_dirty_bitmap_locked(bitmap, offset, bytes);
bdrv_dirty_bitmap_unlock(bitmap);
assert(bdrv_dirty_bitmap_enabled(bitmap));
hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
assert(!bdrv_dirty_bitmap_readonly(bitmap));
bdrv_dirty_bitmap_lock(bitmap);
if (!out) {
hbitmap_reset_all(bitmap->bitmap);
} else {
@@ -543,55 +457,46 @@ void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
hbitmap_granularity(backup));
*out = backup;
}
bdrv_dirty_bitmap_unlock(bitmap);
}
void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
{
HBitmap *tmp = bitmap->bitmap;
assert(bdrv_dirty_bitmap_enabled(bitmap));
assert(!bdrv_dirty_bitmap_readonly(bitmap));
bitmap->bitmap = in;
hbitmap_free(tmp);
}
uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
uint64_t offset, uint64_t bytes)
uint64_t start, uint64_t count)
{
return hbitmap_serialization_size(bitmap->bitmap, offset, bytes);
return hbitmap_serialization_size(bitmap->bitmap, start, count);
}
uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap)
{
return hbitmap_serialization_align(bitmap->bitmap);
return hbitmap_serialization_granularity(bitmap->bitmap);
}
void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
uint8_t *buf, uint64_t offset,
uint64_t bytes)
uint8_t *buf, uint64_t start,
uint64_t count)
{
hbitmap_serialize_part(bitmap->bitmap, buf, offset, bytes);
hbitmap_serialize_part(bitmap->bitmap, buf, start, count);
}
void bdrv_dirty_bitmap_deserialize_part(BdrvDirtyBitmap *bitmap,
uint8_t *buf, uint64_t offset,
uint64_t bytes, bool finish)
uint8_t *buf, uint64_t start,
uint64_t count, bool finish)
{
hbitmap_deserialize_part(bitmap->bitmap, buf, offset, bytes, finish);
hbitmap_deserialize_part(bitmap->bitmap, buf, start, count, finish);
}
void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
uint64_t offset, uint64_t bytes,
uint64_t start, uint64_t count,
bool finish)
{
hbitmap_deserialize_zeroes(bitmap->bitmap, offset, bytes, finish);
}
void bdrv_dirty_bitmap_deserialize_ones(BdrvDirtyBitmap *bitmap,
uint64_t offset, uint64_t bytes,
bool finish)
{
hbitmap_deserialize_ones(bitmap->bitmap, offset, bytes, finish);
hbitmap_deserialize_zeroes(bitmap->bitmap, start, count, finish);
}
void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap)
@@ -599,31 +504,24 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap)
hbitmap_deserialize_finish(bitmap->bitmap);
}
void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, int64_t bytes)
void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
int64_t nr_sectors)
{
BdrvDirtyBitmap *bitmap;
if (QLIST_EMPTY(&bs->dirty_bitmaps)) {
return;
}
bdrv_dirty_bitmaps_lock(bs);
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
if (!bdrv_dirty_bitmap_enabled(bitmap)) {
continue;
}
assert(!bdrv_dirty_bitmap_readonly(bitmap));
hbitmap_set(bitmap->bitmap, offset, bytes);
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
bdrv_dirty_bitmaps_unlock(bs);
}
/**
* Advance a BdrvDirtyBitmapIter to an arbitrary offset.
*/
void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *iter, int64_t offset)
void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *iter, int64_t sector_num)
{
hbitmap_iter_init(&iter->hbi, iter->hbi.hb, offset);
hbitmap_iter_init(&iter->hbi, iter->hbi.hb, sector_num);
}
int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
@@ -635,70 +533,3 @@ int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap)
{
return hbitmap_count(bitmap->meta);
}
bool bdrv_dirty_bitmap_readonly(const BdrvDirtyBitmap *bitmap)
{
return bitmap->readonly;
}
/* Called with BQL taken. */
void bdrv_dirty_bitmap_set_readonly(BdrvDirtyBitmap *bitmap, bool value)
{
qemu_mutex_lock(bitmap->mutex);
bitmap->readonly = value;
qemu_mutex_unlock(bitmap->mutex);
}
bool bdrv_has_readonly_bitmaps(BlockDriverState *bs)
{
BdrvDirtyBitmap *bm;
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
if (bm->readonly) {
return true;
}
}
return false;
}
/* Called with BQL taken. */
void bdrv_dirty_bitmap_set_persistance(BdrvDirtyBitmap *bitmap, bool persistent)
{
qemu_mutex_lock(bitmap->mutex);
bitmap->persistent = persistent;
qemu_mutex_unlock(bitmap->mutex);
}
bool bdrv_dirty_bitmap_get_persistance(BdrvDirtyBitmap *bitmap)
{
return bitmap->persistent;
}
bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs)
{
BdrvDirtyBitmap *bm;
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
if (bm->persistent && !bm->readonly) {
return true;
}
}
return false;
}
BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap)
{
return bitmap == NULL ? QLIST_FIRST(&bs->dirty_bitmaps) :
QLIST_NEXT(bitmap, list);
}
char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
{
return hbitmap_sha256(bitmap->bitmap, errp);
}
int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t offset)
{
return hbitmap_next_zero(bitmap->bitmap, offset);
}

View File

@@ -111,7 +111,7 @@ static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
uncompressed_sectors = s->sectorcounts[chunk];
break;
case 1: /* copy */
uncompressed_sectors = DIV_ROUND_UP(s->lengths[chunk], 512);
uncompressed_sectors = (s->lengths[chunk] + 511) / 512;
break;
case 2: /* zero */
/* as the all-zeroes block may be large, it is treated specially: the
@@ -419,18 +419,8 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
if (!bdrv_is_read_only(bs)) {
error_report("Opening dmg images without an explicit read-only=on "
"option is deprecated. Future versions will refuse to "
"open the image instead of automatically marking the "
"image read-only.");
ret = bdrv_set_read_only(bs, true, errp);
if (ret < 0) {
return ret;
}
}
block_module_load_one("dmg-bz2");
bs->read_only = true;
s->n_chunks = 0;
s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;

View File

@@ -26,6 +26,7 @@
#ifndef BLOCK_DMG_H
#define BLOCK_DMG_H
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include <zlib.h>

File diff suppressed because it is too large Load Diff

View File

@@ -21,19 +21,18 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/cutils.h"
#include "qemu/timer.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/option.h"
#include "block/raw-aio.h"
#include "trace.h"
#include "block/thread-pool.h"
#include "qemu/iov.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "qapi/util.h"
#include <windows.h>
#include <winioctl.h>
@@ -278,7 +277,12 @@ static void raw_parse_flags(int flags, bool use_aio, int *access_flags,
static void raw_parse_filename(const char *filename, QDict *options,
Error **errp)
{
bdrv_parse_filename_strip_prefix(filename, "file:", options);
/* The filename does not have to be prefixed by the protocol name, since
* "file" is the default protocol; therefore, the return value of this
* function call can be ignored. */
strstart(filename, "file:", &filename);
qdict_put_str(options, "filename", filename);
}
static QemuOptsList raw_runtime_opts = {
@@ -305,8 +309,8 @@ static bool get_aio_option(QemuOpts *opts, int flags, Error **errp)
aio_default = (flags & BDRV_O_NATIVE_AIO) ? BLOCKDEV_AIO_OPTIONS_NATIVE
: BLOCKDEV_AIO_OPTIONS_THREADS;
aio = qapi_enum_parse(&BlockdevAioOptions_lookup, qemu_opt_get(opts, "aio"),
aio_default, errp);
aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, "aio"),
BLOCKDEV_AIO_OPTIONS__MAX, aio_default, errp);
switch (aio) {
case BLOCKDEV_AIO_OPTIONS_NATIVE:
@@ -341,12 +345,6 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
if (qdict_get_try_bool(options, "locking", false)) {
error_setg(errp, "locking=on is not supported on Windows");
ret = -EINVAL;
goto fail;
}
filename = qemu_opt_get(opts, "filename");
use_aio = get_aio_option(opts, flags, &local_err);
@@ -463,19 +461,12 @@ static void raw_close(BlockDriverState *bs)
}
}
static int raw_truncate(BlockDriverState *bs, int64_t offset,
PreallocMode prealloc, Error **errp)
static int raw_truncate(BlockDriverState *bs, int64_t offset)
{
BDRVRawState *s = bs->opaque;
LONG low, high;
DWORD dwPtrLow;
if (prealloc != PREALLOC_MODE_OFF) {
error_setg(errp, "Unsupported preallocation mode '%s'",
PreallocMode_str(prealloc));
return -ENOTSUP;
}
low = offset;
high = offset >> 32;
@@ -485,11 +476,11 @@ static int raw_truncate(BlockDriverState *bs, int64_t offset,
*/
dwPtrLow = SetFilePointer(s->hfile, low, &high, FILE_BEGIN);
if (dwPtrLow == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR) {
error_setg_win32(errp, GetLastError(), "SetFilePointer error");
fprintf(stderr, "SetFilePointer error: %lu\n", GetLastError());
return -EIO;
}
if (SetEndOfFile(s->hfile) == 0) {
error_setg_win32(errp, GetLastError(), "SetEndOfFile error");
fprintf(stderr, "SetEndOfFile error: %lu\n", GetLastError());
return -EIO;
}
return 0;
@@ -553,40 +544,9 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
return st.st_size;
}
static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
{
BlockdevCreateOptionsFile *file_opts;
int fd;
assert(options->driver == BLOCKDEV_DRIVER_FILE);
file_opts = &options->u.file;
if (file_opts->has_preallocation) {
error_setg(errp, "Preallocation is not supported on Windows");
return -EINVAL;
}
if (file_opts->has_nocow) {
error_setg(errp, "nocow is not supported on Windows");
return -EINVAL;
}
fd = qemu_open(file_opts->filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
if (fd < 0) {
error_setg_errno(errp, errno, "Could not create file");
return -EIO;
}
set_sparse(fd);
ftruncate(fd, file_opts->size);
qemu_close(fd);
return 0;
}
static int coroutine_fn raw_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions options;
int64_t total_size = 0;
strstart(filename, "file:", &filename);
@@ -595,18 +555,19 @@ static int coroutine_fn raw_co_create_opts(const char *filename, QemuOpts *opts,
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
options = (BlockdevCreateOptions) {
.driver = BLOCKDEV_DRIVER_FILE,
.u.file = {
.filename = (char *) filename,
.size = total_size,
.has_preallocation = false,
.has_nocow = false,
},
};
return raw_co_create(&options, errp);
fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
0644);
if (fd < 0) {
error_setg_errno(errp, errno, "Could not create file");
return -EIO;
}
set_sparse(fd);
ftruncate(fd, total_size);
qemu_close(fd);
return 0;
}
static QemuOptsList raw_create_opts = {
.name = "raw-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
@@ -629,7 +590,7 @@ BlockDriver bdrv_file = {
.bdrv_file_open = raw_open,
.bdrv_refresh_limits = raw_probe_alignment,
.bdrv_close = raw_close,
.bdrv_co_create_opts = raw_co_create_opts,
.bdrv_create = raw_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_aio_readv = raw_aio_readv,
@@ -705,7 +666,10 @@ static int hdev_probe_device(const char *filename)
static void hdev_parse_filename(const char *filename, QDict *options,
Error **errp)
{
bdrv_parse_filename_strip_prefix(filename, "host_device:", options);
/* The prefix is optional, just as for "file". */
strstart(filename, "host_device:", &filename);
qdict_put_str(options, "filename", filename);
}
static int hdev_open(BlockDriverState *bs, QDict *options, int flags,

View File

@@ -7,16 +7,14 @@
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include <glusterfs/api/glfs.h>
#include "block/block_int.h"
#include "qapi/error.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qerror.h"
#include "qapi/util.h"
#include "qemu/uri.h"
#include "qemu/error-report.h"
#include "qemu/option.h"
#include "qemu/cutils.h"
#define GLUSTER_OPT_FILENAME "filename"
@@ -323,7 +321,7 @@ static int parse_volume_options(BlockdevOptionsGluster *gconf, char *path)
static int qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf,
const char *filename)
{
SocketAddress *gsconf;
SocketAddressFlat *gsconf;
URI *uri;
QueryParams *qp = NULL;
bool is_unix = false;
@@ -334,20 +332,21 @@ static int qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf,
return -EINVAL;
}
gconf->server = g_new0(SocketAddressList, 1);
gconf->server->value = gsconf = g_new0(SocketAddress, 1);
gconf->server = g_new0(SocketAddressFlatList, 1);
gconf->server->value = gsconf = g_new0(SocketAddressFlat, 1);
/* transport */
if (!uri->scheme || !strcmp(uri->scheme, "gluster")) {
gsconf->type = SOCKET_ADDRESS_TYPE_INET;
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_INET;
} else if (!strcmp(uri->scheme, "gluster+tcp")) {
gsconf->type = SOCKET_ADDRESS_TYPE_INET;
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_INET;
} else if (!strcmp(uri->scheme, "gluster+unix")) {
gsconf->type = SOCKET_ADDRESS_TYPE_UNIX;
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_UNIX;
is_unix = true;
} else if (!strcmp(uri->scheme, "gluster+rdma")) {
gsconf->type = SOCKET_ADDRESS_TYPE_INET;
warn_report("rdma feature is not supported, falling back to tcp");
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_INET;
error_report("Warning: rdma feature is not supported, falling "
"back to tcp");
} else {
ret = -EINVAL;
goto out;
@@ -397,7 +396,7 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
struct glfs *glfs;
int ret;
int old_errno;
SocketAddressList *server;
SocketAddressFlatList *server;
unsigned long long port;
glfs = glfs_find_preopened(gconf->volume);
@@ -414,11 +413,11 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
for (server = gconf->server; server; server = server->next) {
switch (server->value->type) {
case SOCKET_ADDRESS_TYPE_UNIX:
case SOCKET_ADDRESS_FLAT_TYPE_UNIX:
ret = glfs_set_volfile_server(glfs, "unix",
server->value->u.q_unix.path, 0);
break;
case SOCKET_ADDRESS_TYPE_INET:
case SOCKET_ADDRESS_FLAT_TYPE_INET:
if (parse_uint_full(server->value->u.inet.port, &port, 10) < 0 ||
port > 65535) {
error_setg(errp, "'%s' is not a valid port number",
@@ -430,8 +429,8 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
server->value->u.inet.host,
(int)port);
break;
case SOCKET_ADDRESS_TYPE_VSOCK:
case SOCKET_ADDRESS_TYPE_FD:
case SOCKET_ADDRESS_FLAT_TYPE_VSOCK:
case SOCKET_ADDRESS_FLAT_TYPE_FD:
default:
abort();
}
@@ -451,7 +450,7 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
error_setg(errp, "Gluster connection for volume %s, path %s failed"
" to connect", gconf->volume, gconf->path);
for (server = gconf->server; server; server = server->next) {
if (server->value->type == SOCKET_ADDRESS_TYPE_UNIX) {
if (server->value->type == SOCKET_ADDRESS_FLAT_TYPE_UNIX) {
error_append_hint(errp, "hint: failed on socket %s ",
server->value->u.q_unix.path);
} else {
@@ -488,13 +487,14 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
QDict *options, Error **errp)
{
QemuOpts *opts;
SocketAddress *gsconf = NULL;
SocketAddressList *curr = NULL;
SocketAddressFlat *gsconf = NULL;
SocketAddressFlatList *curr = NULL;
QDict *backing_options = NULL;
Error *local_err = NULL;
char *str = NULL;
const char *ptr;
int i, type, num_servers;
size_t num_servers;
int i, type;
/* create opts info from runtime_json_opts list */
opts = qemu_opts_create(&runtime_json_opts, NULL, 0, &error_abort);
@@ -542,13 +542,14 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
goto out;
}
gsconf = g_new0(SocketAddress, 1);
gsconf = g_new0(SocketAddressFlat, 1);
if (!strcmp(ptr, "tcp")) {
ptr = "inet"; /* accept legacy "tcp" */
}
type = qapi_enum_parse(&SocketAddressType_lookup, ptr, -1, NULL);
if (type != SOCKET_ADDRESS_TYPE_INET
&& type != SOCKET_ADDRESS_TYPE_UNIX) {
type = qapi_enum_parse(SocketAddressFlatType_lookup, ptr,
SOCKET_ADDRESS_FLAT_TYPE__MAX, -1, NULL);
if (type != SOCKET_ADDRESS_FLAT_TYPE_INET
&& type != SOCKET_ADDRESS_FLAT_TYPE_UNIX) {
error_setg(&local_err,
"Parameter '%s' may be 'inet' or 'unix'",
GLUSTER_OPT_TYPE);
@@ -558,7 +559,7 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
gsconf->type = type;
qemu_opts_del(opts);
if (gsconf->type == SOCKET_ADDRESS_TYPE_INET) {
if (gsconf->type == SOCKET_ADDRESS_FLAT_TYPE_INET) {
/* create opts info from runtime_inet_opts list */
opts = qemu_opts_create(&runtime_inet_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, backing_options, &local_err);
@@ -627,11 +628,11 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
}
if (gconf->server == NULL) {
gconf->server = g_new0(SocketAddressList, 1);
gconf->server = g_new0(SocketAddressFlatList, 1);
gconf->server->value = gsconf;
curr = gconf->server;
} else {
curr->next = g_new0(SocketAddressList, 1);
curr->next = g_new0(SocketAddressFlatList, 1);
curr->next->value = gsconf;
curr = curr->next;
}
@@ -647,7 +648,7 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
out:
error_propagate(errp, local_err);
qapi_free_SocketAddress(gsconf);
qapi_free_SocketAddressFlat(gsconf);
qemu_opts_del(opts);
g_free(str);
QDECREF(backing_options);
@@ -655,11 +656,9 @@ out:
return -errno;
}
/* Converts options given in @filename and the @options QDict into the QAPI
* object @gconf. */
static int qemu_gluster_parse(BlockdevOptionsGluster *gconf,
const char *filename,
QDict *options, Error **errp)
static struct glfs *qemu_gluster_init(BlockdevOptionsGluster *gconf,
const char *filename,
QDict *options, Error **errp)
{
int ret;
if (filename) {
@@ -670,7 +669,8 @@ static int qemu_gluster_parse(BlockdevOptionsGluster *gconf,
"[host[:port]]volume/path[?socket=...]"
"[,file.debug=N]"
"[,file.logfile=/path/filename.log]\n");
return ret;
errno = -ret;
return NULL;
}
} else {
ret = qemu_gluster_parse_json(gconf, options, errp);
@@ -686,23 +686,10 @@ static int qemu_gluster_parse(BlockdevOptionsGluster *gconf,
"file.server.1.transport=unix,"
"file.server.1.socket=/var/run/glusterd.socket ..."
"\n");
return ret;
errno = -ret;
return NULL;
}
}
return 0;
}
static struct glfs *qemu_gluster_init(BlockdevOptionsGluster *gconf,
const char *filename,
QDict *options, Error **errp)
{
int ret;
ret = qemu_gluster_parse(gconf, filename, options, errp);
if (ret < 0) {
errno = -ret;
return NULL;
}
return qemu_gluster_glfs_init(gconf, errp);
@@ -977,130 +964,43 @@ static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
qemu_coroutine_yield();
return acb.ret;
}
#endif
static int qemu_gluster_do_truncate(struct glfs_fd *fd, int64_t offset,
PreallocMode prealloc, Error **errp)
static inline bool gluster_supports_zerofill(void)
{
int64_t current_length;
return 1;
}
current_length = glfs_lseek(fd, 0, SEEK_END);
if (current_length < 0) {
error_setg_errno(errp, errno, "Failed to determine current size");
return -errno;
}
if (current_length > offset && prealloc != PREALLOC_MODE_OFF) {
error_setg(errp, "Cannot use preallocation for shrinking files");
return -ENOTSUP;
}
if (current_length == offset) {
return 0;
}
switch (prealloc) {
#ifdef CONFIG_GLUSTERFS_FALLOCATE
case PREALLOC_MODE_FALLOC:
if (glfs_fallocate(fd, 0, current_length, offset - current_length)) {
error_setg_errno(errp, errno, "Could not preallocate data");
return -errno;
}
break;
#endif /* CONFIG_GLUSTERFS_FALLOCATE */
#ifdef CONFIG_GLUSTERFS_ZEROFILL
case PREALLOC_MODE_FULL:
if (glfs_ftruncate(fd, offset)) {
error_setg_errno(errp, errno, "Could not resize file");
return -errno;
}
if (glfs_zerofill(fd, current_length, offset - current_length)) {
error_setg_errno(errp, errno, "Could not zerofill the new area");
return -errno;
}
break;
#endif /* CONFIG_GLUSTERFS_ZEROFILL */
case PREALLOC_MODE_OFF:
if (glfs_ftruncate(fd, offset)) {
error_setg_errno(errp, errno, "Could not resize file");
return -errno;
}
break;
default:
error_setg(errp, "Unsupported preallocation mode: %s",
PreallocMode_str(prealloc));
return -EINVAL;
}
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
return glfs_zerofill(fd, offset, size);
}
#else
static inline bool gluster_supports_zerofill(void)
{
return 0;
}
static int qemu_gluster_co_create(BlockdevCreateOptions *options,
Error **errp)
static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
int64_t size)
{
BlockdevCreateOptionsGluster *opts = &options->u.gluster;
struct glfs *glfs;
struct glfs_fd *fd = NULL;
int ret = 0;
assert(options->driver == BLOCKDEV_DRIVER_GLUSTER);
glfs = qemu_gluster_glfs_init(opts->location, errp);
if (!glfs) {
ret = -errno;
goto out;
}
fd = glfs_creat(glfs, opts->location->path,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | S_IWUSR);
if (!fd) {
ret = -errno;
goto out;
}
ret = qemu_gluster_do_truncate(fd, opts->size, opts->preallocation, errp);
out:
if (fd) {
if (glfs_close(fd) != 0 && ret == 0) {
ret = -errno;
}
}
glfs_clear_preopened(glfs);
return ret;
return 0;
}
#endif
static int coroutine_fn qemu_gluster_co_create_opts(const char *filename,
QemuOpts *opts,
Error **errp)
static int qemu_gluster_create(const char *filename,
QemuOpts *opts, Error **errp)
{
BlockdevCreateOptions *options;
BlockdevCreateOptionsGluster *gopts;
BlockdevOptionsGluster *gconf;
struct glfs *glfs;
struct glfs_fd *fd;
int ret = 0;
int prealloc = 0;
int64_t total_size = 0;
char *tmp = NULL;
Error *local_err = NULL;
int ret;
options = g_new0(BlockdevCreateOptions, 1);
options->driver = BLOCKDEV_DRIVER_GLUSTER;
gopts = &options->u.gluster;
gconf = g_new0(BlockdevOptionsGluster, 1);
gopts->location = gconf;
gopts->size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
gopts->preallocation = qapi_enum_parse(&PreallocMode_lookup, tmp,
PREALLOC_MODE_OFF, &local_err);
g_free(tmp);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
gconf->debug = qemu_opt_get_number_del(opts, GLUSTER_OPT_DEBUG,
GLUSTER_DEBUG_DEFAULT);
if (gconf->debug < 0) {
@@ -1116,19 +1016,48 @@ static int coroutine_fn qemu_gluster_co_create_opts(const char *filename,
}
gconf->has_logfile = true;
ret = qemu_gluster_parse(gconf, filename, NULL, errp);
if (ret < 0) {
goto fail;
glfs = qemu_gluster_init(gconf, filename, NULL, errp);
if (!glfs) {
ret = -errno;
goto out;
}
ret = qemu_gluster_co_create(options, errp);
if (ret < 0) {
goto fail;
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
if (!tmp || !strcmp(tmp, "off")) {
prealloc = 0;
} else if (!strcmp(tmp, "full") && gluster_supports_zerofill()) {
prealloc = 1;
} else {
error_setg(errp, "Invalid preallocation mode: '%s'"
" or GlusterFS doesn't support zerofill API", tmp);
ret = -EINVAL;
goto out;
}
ret = 0;
fail:
qapi_free_BlockdevCreateOptions(options);
fd = glfs_creat(glfs, gconf->path,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | S_IWUSR);
if (!fd) {
ret = -errno;
} else {
if (!glfs_ftruncate(fd, total_size)) {
if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
ret = -errno;
}
} else {
ret = -errno;
}
if (glfs_close(fd) != 0) {
ret = -errno;
}
}
out:
g_free(tmp);
qapi_free_BlockdevOptionsGluster(gconf);
glfs_clear_preopened(glfs);
return ret;
}
@@ -1163,11 +1092,17 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
return acb.ret;
}
static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset,
PreallocMode prealloc, Error **errp)
static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset)
{
int ret;
BDRVGlusterState *s = bs->opaque;
return qemu_gluster_do_truncate(s->fd, offset, prealloc, errp);
ret = glfs_ftruncate(s->fd, offset);
if (ret < 0) {
return -errno;
}
return 0;
}
static coroutine_fn int qemu_gluster_co_readv(BlockDriverState *bs,
@@ -1337,14 +1272,7 @@ static int find_allocation(BlockDriverState *bs, off_t start,
if (offs < 0) {
return -errno; /* D3 or D4 */
}
if (offs < start) {
/* This is not a valid return by lseek(). We are safe to just return
* -EIO in this case, and we'll treat it like D4. Unfortunately some
* versions of gluster server will return offs < start, so an assert
* here will unnecessarily abort QEMU. */
return -EIO;
}
assert(offs >= start);
if (offs > start) {
/* D2: in hole, next data at offs */
@@ -1376,14 +1304,7 @@ static int find_allocation(BlockDriverState *bs, off_t start,
if (offs < 0) {
return -errno; /* D1 and (H3 or H4) */
}
if (offs < start) {
/* This is not a valid return by lseek(). We are safe to just return
* -EIO in this case, and we'll treat it like H4. Unfortunately some
* versions of gluster server will return offs < start, so an assert
* here will unnecessarily abort QEMU. */
return -EIO;
}
assert(offs >= start);
if (offs > start) {
/*
@@ -1406,66 +1327,68 @@ exit:
}
/*
* Returns the allocation status of the specified offset.
* Returns the allocation status of the specified sectors.
*
* The block layer guarantees 'offset' and 'bytes' are within bounds.
* If 'sector_num' is beyond the end of the disk image the return value is 0
* and 'pnum' is set to 0.
*
* 'pnum' is set to the number of bytes (including and immediately following
* the specified offset) that are known to be in the same
* 'pnum' is set to the number of sectors (including and immediately following
* the specified sector) that are known to be in the same
* allocated/unallocated state.
*
* 'bytes' is the max value 'pnum' should be set to.
* 'nb_sectors' is the max value 'pnum' should be set to. If nb_sectors goes
* beyond the end of the disk image it will be clamped.
*
* (Based on raw_co_block_status() from file-posix.c.)
* (Based on raw_co_get_block_status() from file-posix.c.)
*/
static int coroutine_fn qemu_gluster_co_block_status(BlockDriverState *bs,
bool want_zero,
int64_t offset,
int64_t bytes,
int64_t *pnum,
int64_t *map,
BlockDriverState **file)
static int64_t coroutine_fn qemu_gluster_co_get_block_status(
BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
BlockDriverState **file)
{
BDRVGlusterState *s = bs->opaque;
off_t data = 0, hole = 0;
off_t start, data = 0, hole = 0;
int64_t total_size;
int ret = -EINVAL;
if (!s->fd) {
return ret;
}
if (!want_zero) {
*pnum = bytes;
*map = offset;
*file = bs;
return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
start = sector_num * BDRV_SECTOR_SIZE;
total_size = bdrv_getlength(bs);
if (total_size < 0) {
return total_size;
} else if (start >= total_size) {
*pnum = 0;
return 0;
} else if (start + nb_sectors * BDRV_SECTOR_SIZE > total_size) {
nb_sectors = DIV_ROUND_UP(total_size - start, BDRV_SECTOR_SIZE);
}
ret = find_allocation(bs, offset, &data, &hole);
ret = find_allocation(bs, start, &data, &hole);
if (ret == -ENXIO) {
/* Trailing hole */
*pnum = bytes;
*pnum = nb_sectors;
ret = BDRV_BLOCK_ZERO;
} else if (ret < 0) {
/* No info available, so pretend there are no holes */
*pnum = bytes;
*pnum = nb_sectors;
ret = BDRV_BLOCK_DATA;
} else if (data == offset) {
/* On a data extent, compute bytes to the end of the extent,
} else if (data == start) {
/* On a data extent, compute sectors to the end of the extent,
* possibly including a partial sector at EOF. */
*pnum = MIN(bytes, hole - offset);
*pnum = MIN(nb_sectors, DIV_ROUND_UP(hole - start, BDRV_SECTOR_SIZE));
ret = BDRV_BLOCK_DATA;
} else {
/* On a hole, compute bytes to the beginning of the next extent. */
assert(hole == offset);
*pnum = MIN(bytes, data - offset);
/* On a hole, compute sectors to the beginning of the next extent. */
assert(hole == start);
*pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
ret = BDRV_BLOCK_ZERO;
}
*map = offset;
*file = bs;
return ret | BDRV_BLOCK_OFFSET_VALID;
return ret | BDRV_BLOCK_OFFSET_VALID | start;
}
@@ -1479,8 +1402,7 @@ static BlockDriver bdrv_gluster = {
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_co_create = qemu_gluster_co_create,
.bdrv_co_create_opts = qemu_gluster_co_create_opts,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
@@ -1494,7 +1416,7 @@ static BlockDriver bdrv_gluster = {
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_pwrite_zeroes = qemu_gluster_co_pwrite_zeroes,
#endif
.bdrv_co_block_status = qemu_gluster_co_block_status,
.bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
.create_opts = &qemu_gluster_create_opts,
};
@@ -1508,8 +1430,7 @@ static BlockDriver bdrv_gluster_tcp = {
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_co_create = qemu_gluster_co_create,
.bdrv_co_create_opts = qemu_gluster_co_create_opts,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
@@ -1523,7 +1444,7 @@ static BlockDriver bdrv_gluster_tcp = {
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_pwrite_zeroes = qemu_gluster_co_pwrite_zeroes,
#endif
.bdrv_co_block_status = qemu_gluster_co_block_status,
.bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
.create_opts = &qemu_gluster_create_opts,
};
@@ -1537,8 +1458,7 @@ static BlockDriver bdrv_gluster_unix = {
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_co_create = qemu_gluster_co_create,
.bdrv_co_create_opts = qemu_gluster_co_create_opts,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
@@ -1552,7 +1472,7 @@ static BlockDriver bdrv_gluster_unix = {
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_pwrite_zeroes = qemu_gluster_co_pwrite_zeroes,
#endif
.bdrv_co_block_status = qemu_gluster_co_block_status,
.bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
.create_opts = &qemu_gluster_create_opts,
};
@@ -1572,8 +1492,7 @@ static BlockDriver bdrv_gluster_rdma = {
.bdrv_reopen_commit = qemu_gluster_reopen_commit,
.bdrv_reopen_abort = qemu_gluster_reopen_abort,
.bdrv_close = qemu_gluster_close,
.bdrv_co_create = qemu_gluster_co_create,
.bdrv_co_create_opts = qemu_gluster_co_create_opts,
.bdrv_create = qemu_gluster_create,
.bdrv_getlength = qemu_gluster_getlength,
.bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
.bdrv_truncate = qemu_gluster_truncate,
@@ -1587,7 +1506,7 @@ static BlockDriver bdrv_gluster_rdma = {
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_pwrite_zeroes = qemu_gluster_co_pwrite_zeroes,
#endif
.bdrv_co_block_status = qemu_gluster_co_block_status,
.bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
.create_opts = &qemu_gluster_create_opts,
};

1161
block/io.c

File diff suppressed because it is too large Load Diff

View File

@@ -25,7 +25,6 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/config-file.h"
#include "qemu/option.h"
static QemuOptsList qemu_iscsi_opts = {
.name = "iscsi",

View File

@@ -2,7 +2,7 @@
* QEMU Block driver for iSCSI images
*
* Copyright (c) 2010-2011 Ronnie Sahlberg <ronniesahlberg@gmail.com>
* Copyright (c) 2012-2017 Peter Lieven <pl@kamp.de>
* Copyright (c) 2012-2016 Peter Lieven <pl@kamp.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -28,28 +28,21 @@
#include <poll.h>
#include <math.h>
#include <arpa/inet.h>
#include "qemu-common.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
#include "qemu/bitops.h"
#include "qemu/bitmap.h"
#include "block/block_int.h"
#include "scsi/constants.h"
#include "block/scsi.h"
#include "qemu/iov.h"
#include "qemu/option.h"
#include "qemu/uuid.h"
#include "qapi/error.h"
#include "qapi/qapi-commands-misc.h"
#include "qapi/qmp/qdict.h"
#include "qmp-commands.h"
#include "qapi/qmp/qstring.h"
#include "crypto/secret.h"
#include "scsi/utils.h"
/* Conflict between scsi/utils.h and libiscsi! :( */
#define SCSI_XFER_NONE ISCSI_XFER_NONE
#include <iscsi/iscsi.h>
#include <iscsi/scsi-lowlevel.h>
#undef SCSI_XFER_NONE
QEMU_BUILD_BUG_ON((int)SCSI_XFER_NONE != (int)ISCSI_XFER_NONE);
#ifdef __linux__
#include <scsi/sg.h>
@@ -86,7 +79,7 @@ typedef struct IscsiLun {
unsigned long *allocmap;
unsigned long *allocmap_valid;
long allocmap_size;
int cluster_size;
int cluster_sectors;
bool use_16_for_rw;
bool write_protected;
bool lbpme;
@@ -106,7 +99,6 @@ typedef struct IscsiTask {
IscsiLun *iscsilun;
QEMUTimer retry_timer;
int err_code;
char *err_str;
} IscsiTask;
typedef struct IscsiAIOCB {
@@ -217,9 +209,47 @@ static inline unsigned exp_random(double mean)
static int iscsi_translate_sense(struct scsi_sense *sense)
{
return - scsi_sense_to_errno(sense->key,
(sense->ascq & 0xFF00) >> 8,
sense->ascq & 0xFF);
int ret;
switch (sense->key) {
case SCSI_SENSE_NOT_READY:
return -EBUSY;
case SCSI_SENSE_DATA_PROTECTION:
return -EACCES;
case SCSI_SENSE_COMMAND_ABORTED:
return -ECANCELED;
case SCSI_SENSE_ILLEGAL_REQUEST:
/* Parse ASCQ */
break;
default:
return -EIO;
}
switch (sense->ascq) {
case SCSI_SENSE_ASCQ_PARAMETER_LIST_LENGTH_ERROR:
case SCSI_SENSE_ASCQ_INVALID_OPERATION_CODE:
case SCSI_SENSE_ASCQ_INVALID_FIELD_IN_CDB:
case SCSI_SENSE_ASCQ_INVALID_FIELD_IN_PARAMETER_LIST:
ret = -EINVAL;
break;
case SCSI_SENSE_ASCQ_LBA_OUT_OF_RANGE:
ret = -ENOSPC;
break;
case SCSI_SENSE_ASCQ_LOGICAL_UNIT_NOT_SUPPORTED:
ret = -ENOTSUP;
break;
case SCSI_SENSE_ASCQ_MEDIUM_NOT_PRESENT:
case SCSI_SENSE_ASCQ_MEDIUM_NOT_PRESENT_TRAY_CLOSED:
case SCSI_SENSE_ASCQ_MEDIUM_NOT_PRESENT_TRAY_OPEN:
ret = -ENOMEDIUM;
break;
case SCSI_SENSE_ASCQ_WRITE_PROTECTED:
ret = -EACCES;
break;
default:
ret = -EIO;
break;
}
return ret;
}
/* Called (via iscsi_service) with QemuMutex held. */
@@ -268,7 +298,7 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
}
}
iTask->err_code = iscsi_translate_sense(&task->sense);
iTask->err_str = g_strdup(iscsi_get_error(iscsi));
error_report("iSCSI Failure: %s", iscsi_get_error(iscsi));
}
out:
@@ -430,10 +460,9 @@ static int iscsi_allocmap_init(IscsiLun *iscsilun, int open_flags)
{
iscsi_allocmap_free(iscsilun);
assert(iscsilun->cluster_size);
iscsilun->allocmap_size =
DIV_ROUND_UP(iscsilun->num_blocks * iscsilun->block_size,
iscsilun->cluster_size);
DIV_ROUND_UP(sector_lun2qemu(iscsilun->num_blocks, iscsilun),
iscsilun->cluster_sectors);
iscsilun->allocmap = bitmap_try_new(iscsilun->allocmap_size);
if (!iscsilun->allocmap) {
@@ -441,7 +470,7 @@ static int iscsi_allocmap_init(IscsiLun *iscsilun, int open_flags)
}
if (open_flags & BDRV_O_NOCACHE) {
/* when cache.direct = on all allocmap entries are
/* in case that cache.direct = on all allocmap entries are
* treated as invalid to force a relookup of the block
* status on every read request */
return 0;
@@ -458,8 +487,8 @@ static int iscsi_allocmap_init(IscsiLun *iscsilun, int open_flags)
}
static void
iscsi_allocmap_update(IscsiLun *iscsilun, int64_t offset,
int64_t bytes, bool allocated, bool valid)
iscsi_allocmap_update(IscsiLun *iscsilun, int64_t sector_num,
int nb_sectors, bool allocated, bool valid)
{
int64_t cl_num_expanded, nb_cls_expanded, cl_num_shrunk, nb_cls_shrunk;
@@ -467,13 +496,13 @@ iscsi_allocmap_update(IscsiLun *iscsilun, int64_t offset,
return;
}
/* expand to entirely contain all affected clusters */
assert(iscsilun->cluster_size);
cl_num_expanded = offset / iscsilun->cluster_size;
nb_cls_expanded = DIV_ROUND_UP(offset + bytes,
iscsilun->cluster_size) - cl_num_expanded;
cl_num_expanded = sector_num / iscsilun->cluster_sectors;
nb_cls_expanded = DIV_ROUND_UP(sector_num + nb_sectors,
iscsilun->cluster_sectors) - cl_num_expanded;
/* shrink to touch only completely contained clusters */
cl_num_shrunk = DIV_ROUND_UP(offset, iscsilun->cluster_size);
nb_cls_shrunk = (offset + bytes) / iscsilun->cluster_size - cl_num_shrunk;
cl_num_shrunk = DIV_ROUND_UP(sector_num, iscsilun->cluster_sectors);
nb_cls_shrunk = (sector_num + nb_sectors) / iscsilun->cluster_sectors
- cl_num_shrunk;
if (allocated) {
bitmap_set(iscsilun->allocmap, cl_num_expanded, nb_cls_expanded);
} else {
@@ -496,26 +525,26 @@ iscsi_allocmap_update(IscsiLun *iscsilun, int64_t offset,
}
static void
iscsi_allocmap_set_allocated(IscsiLun *iscsilun, int64_t offset,
int64_t bytes)
iscsi_allocmap_set_allocated(IscsiLun *iscsilun, int64_t sector_num,
int nb_sectors)
{
iscsi_allocmap_update(iscsilun, offset, bytes, true, true);
iscsi_allocmap_update(iscsilun, sector_num, nb_sectors, true, true);
}
static void
iscsi_allocmap_set_unallocated(IscsiLun *iscsilun, int64_t offset,
int64_t bytes)
iscsi_allocmap_set_unallocated(IscsiLun *iscsilun, int64_t sector_num,
int nb_sectors)
{
/* Note: if cache.direct=on the fifth argument to iscsi_allocmap_update
* is ignored, so this will in effect be an iscsi_allocmap_set_invalid.
*/
iscsi_allocmap_update(iscsilun, offset, bytes, false, true);
iscsi_allocmap_update(iscsilun, sector_num, nb_sectors, false, true);
}
static void iscsi_allocmap_set_invalid(IscsiLun *iscsilun, int64_t offset,
int64_t bytes)
static void iscsi_allocmap_set_invalid(IscsiLun *iscsilun, int64_t sector_num,
int nb_sectors)
{
iscsi_allocmap_update(iscsilun, offset, bytes, false, false);
iscsi_allocmap_update(iscsilun, sector_num, nb_sectors, false, false);
}
static void iscsi_allocmap_invalidate(IscsiLun *iscsilun)
@@ -529,30 +558,28 @@ static void iscsi_allocmap_invalidate(IscsiLun *iscsilun)
}
static inline bool
iscsi_allocmap_is_allocated(IscsiLun *iscsilun, int64_t offset,
int64_t bytes)
iscsi_allocmap_is_allocated(IscsiLun *iscsilun, int64_t sector_num,
int nb_sectors)
{
unsigned long size;
if (iscsilun->allocmap == NULL) {
return true;
}
assert(iscsilun->cluster_size);
size = DIV_ROUND_UP(offset + bytes, iscsilun->cluster_size);
size = DIV_ROUND_UP(sector_num + nb_sectors, iscsilun->cluster_sectors);
return !(find_next_bit(iscsilun->allocmap, size,
offset / iscsilun->cluster_size) == size);
sector_num / iscsilun->cluster_sectors) == size);
}
static inline bool iscsi_allocmap_is_valid(IscsiLun *iscsilun,
int64_t offset, int64_t bytes)
int64_t sector_num, int nb_sectors)
{
unsigned long size;
if (iscsilun->allocmap_valid == NULL) {
return false;
}
assert(iscsilun->cluster_size);
size = DIV_ROUND_UP(offset + bytes, iscsilun->cluster_size);
size = DIV_ROUND_UP(sector_num + nb_sectors, iscsilun->cluster_sectors);
return (find_next_zero_bit(iscsilun->allocmap_valid, size,
offset / iscsilun->cluster_size) == size);
sector_num / iscsilun->cluster_sectors) == size);
}
static int coroutine_fn
@@ -634,60 +661,54 @@ retry:
}
if (iTask.status != SCSI_STATUS_GOOD) {
iscsi_allocmap_set_invalid(iscsilun, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE);
error_report("iSCSI WRITE10/16 failed at lba %" PRIu64 ": %s", lba,
iTask.err_str);
iscsi_allocmap_set_invalid(iscsilun, sector_num, nb_sectors);
r = iTask.err_code;
goto out_unlock;
}
iscsi_allocmap_set_allocated(iscsilun, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE);
iscsi_allocmap_set_allocated(iscsilun, sector_num, nb_sectors);
out_unlock:
qemu_mutex_unlock(&iscsilun->mutex);
g_free(iTask.err_str);
return r;
}
static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs,
bool want_zero, int64_t offset,
int64_t bytes, int64_t *pnum,
int64_t *map,
BlockDriverState **file)
static int64_t coroutine_fn iscsi_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum,
BlockDriverState **file)
{
IscsiLun *iscsilun = bs->opaque;
struct scsi_get_lba_status *lbas = NULL;
struct scsi_lba_status_descriptor *lbasd = NULL;
struct IscsiTask iTask;
uint64_t lba;
int ret;
int64_t ret, max_sector;
iscsi_co_init_iscsitask(iscsilun, &iTask);
assert(QEMU_IS_ALIGNED(offset | bytes, iscsilun->block_size));
if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
ret = -EINVAL;
goto out;
}
/* default to all sectors allocated */
ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
if (map) {
*map = offset;
}
*pnum = bytes;
ret = BDRV_BLOCK_DATA;
ret |= (sector_num << BDRV_SECTOR_BITS) | BDRV_BLOCK_OFFSET_VALID;
*pnum = nb_sectors;
/* LUN does not support logical block provisioning */
if (!iscsilun->lbpme) {
goto out;
}
lba = offset / iscsilun->block_size;
max_sector = iscsilun->num_blocks - sector_num;
qemu_mutex_lock(&iscsilun->mutex);
retry:
if (iscsi_get_lba_status_task(iscsilun->iscsi, iscsilun->lun,
lba, 8 + 16, iscsi_co_generic_cb,
sector_qemu2lun(sector_num, iscsilun),
8 + 16, iscsi_co_generic_cb,
&iTask) == NULL) {
ret = -ENOMEM;
goto out_unlock;
@@ -714,8 +735,6 @@ retry:
* because the device is busy or the cmd is not
* supported) we pretend all blocks are allocated
* for backwards compatibility */
error_report("iSCSI GET_LBA_STATUS failed at lba %" PRIu64 ": %s",
lba, iTask.err_str);
goto out_unlock;
}
@@ -727,12 +746,12 @@ retry:
lbasd = &lbas->descriptors[0];
if (lba != lbasd->lba) {
if (sector_qemu2lun(sector_num, iscsilun) != lbasd->lba) {
ret = -EIO;
goto out_unlock;
}
*pnum = lbasd->num_blocks * iscsilun->block_size;
*pnum = MIN(sector_lun2qemu(lbasd->num_blocks, iscsilun), max_sector);
if (lbasd->provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED ||
lbasd->provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) {
@@ -743,22 +762,21 @@ retry:
}
if (ret & BDRV_BLOCK_ZERO) {
iscsi_allocmap_set_unallocated(iscsilun, offset, *pnum);
iscsi_allocmap_set_unallocated(iscsilun, sector_num, *pnum);
} else {
iscsi_allocmap_set_allocated(iscsilun, offset, *pnum);
iscsi_allocmap_set_allocated(iscsilun, sector_num, *pnum);
}
if (*pnum > bytes) {
*pnum = bytes;
if (*pnum > nb_sectors) {
*pnum = nb_sectors;
}
out_unlock:
qemu_mutex_unlock(&iscsilun->mutex);
g_free(iTask.err_str);
out:
if (iTask.task != NULL) {
scsi_free_scsi_task(iTask.task);
}
if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID && file) {
if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID) {
*file = bs;
}
return ret;
@@ -772,7 +790,6 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
struct IscsiTask iTask;
uint64_t lba;
uint32_t num_sectors;
int r = 0;
if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
return -EINVAL;
@@ -785,37 +802,29 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
/* if cache.direct is off and we have a valid entry in our allocation map
* we can skip checking the block status and directly return zeroes if
* the request falls within an unallocated area */
if (iscsi_allocmap_is_valid(iscsilun, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE) &&
!iscsi_allocmap_is_allocated(iscsilun, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE)) {
if (iscsi_allocmap_is_valid(iscsilun, sector_num, nb_sectors) &&
!iscsi_allocmap_is_allocated(iscsilun, sector_num, nb_sectors)) {
qemu_iovec_memset(iov, 0, 0x00, iov->size);
return 0;
}
if (nb_sectors >= ISCSI_CHECKALLOC_THRES &&
!iscsi_allocmap_is_valid(iscsilun, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE) &&
!iscsi_allocmap_is_allocated(iscsilun, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE)) {
int64_t pnum;
!iscsi_allocmap_is_valid(iscsilun, sector_num, nb_sectors) &&
!iscsi_allocmap_is_allocated(iscsilun, sector_num, nb_sectors)) {
int pnum;
BlockDriverState *file;
/* check the block status from the beginning of the cluster
* containing the start sector */
int64_t head;
int ret;
assert(iscsilun->cluster_size);
head = (sector_num * BDRV_SECTOR_SIZE) % iscsilun->cluster_size;
ret = iscsi_co_block_status(bs, true,
sector_num * BDRV_SECTOR_SIZE - head,
BDRV_REQUEST_MAX_BYTES, &pnum, NULL, NULL);
int64_t ret = iscsi_co_get_block_status(bs,
sector_num - sector_num % iscsilun->cluster_sectors,
BDRV_REQUEST_MAX_SECTORS, &pnum, &file);
if (ret < 0) {
return ret;
}
/* if the whole request falls into an unallocated area we can avoid
* reading and directly return zeroes instead */
* to read and directly return zeroes instead */
if (ret & BDRV_BLOCK_ZERO &&
pnum >= nb_sectors * BDRV_SECTOR_SIZE + head) {
pnum >= nb_sectors + sector_num % iscsilun->cluster_sectors) {
qemu_iovec_memset(iov, 0, 0x00, iov->size);
return 0;
}
@@ -878,23 +887,19 @@ retry:
iTask.complete = 0;
goto retry;
}
qemu_mutex_unlock(&iscsilun->mutex);
if (iTask.status != SCSI_STATUS_GOOD) {
error_report("iSCSI READ10/16 failed at lba %" PRIu64 ": %s",
lba, iTask.err_str);
r = iTask.err_code;
return iTask.err_code;
}
qemu_mutex_unlock(&iscsilun->mutex);
g_free(iTask.err_str);
return r;
return 0;
}
static int coroutine_fn iscsi_co_flush(BlockDriverState *bs)
{
IscsiLun *iscsilun = bs->opaque;
struct IscsiTask iTask;
int r = 0;
iscsi_co_init_iscsitask(iscsilun, &iTask);
qemu_mutex_lock(&iscsilun->mutex);
@@ -921,15 +926,13 @@ retry:
iTask.complete = 0;
goto retry;
}
qemu_mutex_unlock(&iscsilun->mutex);
if (iTask.status != SCSI_STATUS_GOOD) {
error_report("iSCSI SYNCHRONIZECACHE10 failed: %s", iTask.err_str);
r = iTask.err_code;
return iTask.err_code;
}
qemu_mutex_unlock(&iscsilun->mutex);
g_free(iTask.err_str);
return r;
return 0;
}
#ifdef __linux__
@@ -963,8 +966,7 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
acb->ioh->driver_status |= SG_ERR_DRIVER_SENSE;
acb->ioh->sb_len_wr = acb->task->datain.size - 2;
ss = (acb->ioh->mx_sb_len >= acb->ioh->sb_len_wr) ?
acb->ioh->mx_sb_len : acb->ioh->sb_len_wr;
ss = MIN(acb->ioh->mx_sb_len, acb->ioh->sb_len_wr);
memcpy(acb->ioh->sbp, &acb->task->datain.data[2], ss);
}
@@ -1114,14 +1116,14 @@ iscsi_getlength(BlockDriverState *bs)
}
static int
coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
{
IscsiLun *iscsilun = bs->opaque;
struct IscsiTask iTask;
struct unmap_list list;
int r = 0;
if (!is_byte_request_lun_aligned(offset, bytes, iscsilun)) {
if (!is_byte_request_lun_aligned(offset, count, iscsilun)) {
return -ENOTSUP;
}
@@ -1131,7 +1133,7 @@ coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
}
list.lba = offset / iscsilun->block_size;
list.num = bytes / iscsilun->block_size;
list.num = count / iscsilun->block_size;
iscsi_co_init_iscsitask(iscsilun, &iTask);
qemu_mutex_lock(&iscsilun->mutex);
@@ -1159,8 +1161,6 @@ retry:
goto retry;
}
iscsi_allocmap_set_invalid(iscsilun, offset, bytes);
if (iTask.status == SCSI_STATUS_CHECK_CONDITION) {
/* the target might fail with a check condition if it
is not happy with the alignment of the UNMAP request
@@ -1169,21 +1169,21 @@ retry:
}
if (iTask.status != SCSI_STATUS_GOOD) {
error_report("iSCSI UNMAP failed at lba %" PRIu64 ": %s",
list.lba, iTask.err_str);
r = iTask.err_code;
goto out_unlock;
}
iscsi_allocmap_set_invalid(iscsilun, offset >> BDRV_SECTOR_BITS,
count >> BDRV_SECTOR_BITS);
out_unlock:
qemu_mutex_unlock(&iscsilun->mutex);
g_free(iTask.err_str);
return r;
}
static int
coroutine_fn iscsi_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
int bytes, BdrvRequestFlags flags)
int count, BdrvRequestFlags flags)
{
IscsiLun *iscsilun = bs->opaque;
struct IscsiTask iTask;
@@ -1192,7 +1192,7 @@ coroutine_fn iscsi_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
bool use_16_for_ws = iscsilun->use_16_for_rw;
int r = 0;
if (!is_byte_request_lun_aligned(offset, bytes, iscsilun)) {
if (!is_byte_request_lun_aligned(offset, count, iscsilun)) {
return -ENOTSUP;
}
@@ -1215,7 +1215,7 @@ coroutine_fn iscsi_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
}
lba = offset / iscsilun->block_size;
nb_blocks = bytes / iscsilun->block_size;
nb_blocks = count / iscsilun->block_size;
if (iscsilun->zeroblock == NULL) {
iscsilun->zeroblock = g_try_malloc0(iscsilun->block_size);
@@ -1272,22 +1272,22 @@ retry:
}
if (iTask.status != SCSI_STATUS_GOOD) {
iscsi_allocmap_set_invalid(iscsilun, offset, bytes);
error_report("iSCSI WRITESAME10/16 failed at lba %" PRIu64 ": %s",
lba, iTask.err_str);
iscsi_allocmap_set_invalid(iscsilun, offset >> BDRV_SECTOR_BITS,
count >> BDRV_SECTOR_BITS);
r = iTask.err_code;
goto out_unlock;
}
if (flags & BDRV_REQ_MAY_UNMAP) {
iscsi_allocmap_set_invalid(iscsilun, offset, bytes);
iscsi_allocmap_set_invalid(iscsilun, offset >> BDRV_SECTOR_BITS,
count >> BDRV_SECTOR_BITS);
} else {
iscsi_allocmap_set_allocated(iscsilun, offset, bytes);
iscsi_allocmap_set_allocated(iscsilun, offset >> BDRV_SECTOR_BITS,
count >> BDRV_SECTOR_BITS);
}
out_unlock:
qemu_mutex_unlock(&iscsilun->mutex);
g_free(iTask.err_str);
return r;
}
@@ -1732,10 +1732,6 @@ static QemuOptsList runtime_opts = {
.name = "timeout",
.type = QEMU_OPT_NUMBER,
},
{
.name = "filename",
.type = QEMU_OPT_STRING,
},
{ /* end of list */ }
},
};
@@ -1751,27 +1747,12 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
char *initiator_name = NULL;
QemuOpts *opts;
Error *local_err = NULL;
const char *transport_name, *portal, *target, *filename;
const char *transport_name, *portal, *target;
#if LIBISCSI_API_VERSION >= (20160603)
enum iscsi_transport_type transport;
#endif
int i, ret = 0, timeout = 0, lun;
/* If we are given a filename, parse the filename, with precedence given to
* filename encoded options */
filename = qdict_get_try_str(options, "filename");
if (filename) {
warn_report("'filename' option specified. "
"This is an unsupported option, and may be deprecated "
"in the future");
iscsi_parse_filename(filename, options, &local_err);
if (local_err) {
ret = -EINVAL;
error_propagate(errp, local_err);
goto exit;
}
}
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
@@ -1886,6 +1867,7 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
if (iscsilun->dpofua) {
bs->supported_write_flags = BDRV_REQ_FUA;
}
bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;
/* Check the write protect flag of the LUN if we want to write */
if (iscsilun->type == TYPE_DISK && (flags & BDRV_O_RDWR) &&
@@ -1962,17 +1944,13 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
* reasonable size */
if (iscsilun->bl.opt_unmap_gran * iscsilun->block_size >= 4 * 1024 &&
iscsilun->bl.opt_unmap_gran * iscsilun->block_size <= 16 * 1024 * 1024) {
iscsilun->cluster_size = iscsilun->bl.opt_unmap_gran *
iscsilun->block_size;
iscsilun->cluster_sectors = (iscsilun->bl.opt_unmap_gran *
iscsilun->block_size) >> BDRV_SECTOR_BITS;
if (iscsilun->lbprz) {
ret = iscsi_allocmap_init(iscsilun, bs->open_flags);
}
}
if (iscsilun->lbprz && iscsilun->lbp.lbpws) {
bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;
}
out:
qemu_opts_del(opts);
g_free(initiator_name);
@@ -1989,7 +1967,6 @@ out:
}
memset(iscsilun, 0, sizeof(IscsiLun));
}
exit:
return ret;
}
@@ -2082,31 +2059,22 @@ static void iscsi_reopen_commit(BDRVReopenState *reopen_state)
}
}
static int iscsi_truncate(BlockDriverState *bs, int64_t offset,
PreallocMode prealloc, Error **errp)
static int iscsi_truncate(BlockDriverState *bs, int64_t offset)
{
IscsiLun *iscsilun = bs->opaque;
Error *local_err = NULL;
if (prealloc != PREALLOC_MODE_OFF) {
error_setg(errp, "Unsupported preallocation mode '%s'",
PreallocMode_str(prealloc));
return -ENOTSUP;
}
if (iscsilun->type != TYPE_DISK) {
error_setg(errp, "Cannot resize non-disk iSCSI devices");
return -ENOTSUP;
}
iscsi_readcapacity_sync(iscsilun, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
error_free(local_err);
return -EIO;
}
if (offset > iscsi_getlength(bs)) {
error_setg(errp, "Cannot grow iSCSI devices");
return -EINVAL;
}
@@ -2117,8 +2085,7 @@ static int iscsi_truncate(BlockDriverState *bs, int64_t offset,
return 0;
}
static int coroutine_fn iscsi_co_create_opts(const char *filename, QemuOpts *opts,
Error **errp)
static int iscsi_create(const char *filename, QemuOpts *opts, Error **errp)
{
int ret = 0;
int64_t total_size = 0;
@@ -2173,12 +2140,13 @@ static int iscsi_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
{
IscsiLun *iscsilun = bs->opaque;
bdi->unallocated_blocks_are_zero = iscsilun->lbprz;
bdi->cluster_size = iscsilun->cluster_size;
bdi->can_write_zeroes_with_unmap = iscsilun->lbprz && iscsilun->lbp.lbpws;
bdi->cluster_size = iscsilun->cluster_sectors * BDRV_SECTOR_SIZE;
return 0;
}
static void coroutine_fn iscsi_co_invalidate_cache(BlockDriverState *bs,
Error **errp)
static void iscsi_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
IscsiLun *iscsilun = bs->opaque;
iscsi_allocmap_invalidate(iscsilun);
@@ -2205,18 +2173,18 @@ static BlockDriver bdrv_iscsi = {
.bdrv_parse_filename = iscsi_parse_filename,
.bdrv_file_open = iscsi_open,
.bdrv_close = iscsi_close,
.bdrv_co_create_opts = iscsi_co_create_opts,
.bdrv_create = iscsi_create,
.create_opts = &iscsi_create_opts,
.bdrv_reopen_prepare = iscsi_reopen_prepare,
.bdrv_reopen_commit = iscsi_reopen_commit,
.bdrv_co_invalidate_cache = iscsi_co_invalidate_cache,
.bdrv_invalidate_cache = iscsi_invalidate_cache,
.bdrv_getlength = iscsi_getlength,
.bdrv_get_info = iscsi_get_info,
.bdrv_truncate = iscsi_truncate,
.bdrv_refresh_limits = iscsi_refresh_limits,
.bdrv_co_block_status = iscsi_co_block_status,
.bdrv_co_get_block_status = iscsi_co_get_block_status,
.bdrv_co_pdiscard = iscsi_co_pdiscard,
.bdrv_co_pwrite_zeroes = iscsi_co_pwrite_zeroes,
.bdrv_co_readv = iscsi_co_readv,
@@ -2240,7 +2208,7 @@ static BlockDriver bdrv_iser = {
.bdrv_parse_filename = iscsi_parse_filename,
.bdrv_file_open = iscsi_open,
.bdrv_close = iscsi_close,
.bdrv_co_create_opts = iscsi_co_create_opts,
.bdrv_create = iscsi_create,
.create_opts = &iscsi_create_opts,
.bdrv_reopen_prepare = iscsi_reopen_prepare,
.bdrv_reopen_commit = iscsi_reopen_commit,
@@ -2251,7 +2219,7 @@ static BlockDriver bdrv_iser = {
.bdrv_truncate = iscsi_truncate,
.bdrv_refresh_limits = iscsi_refresh_limits,
.bdrv_co_block_status = iscsi_co_block_status,
.bdrv_co_get_block_status = iscsi_co_get_block_status,
.bdrv_co_pdiscard = iscsi_co_pdiscard,
.bdrv_co_pwrite_zeroes = iscsi_co_pwrite_zeroes,
.bdrv_co_readv = iscsi_co_readv,

View File

@@ -24,8 +24,9 @@
#define SLICE_TIME 100000000ULL /* ns */
#define MAX_IN_FLIGHT 16
#define MAX_IO_BYTES (1 << 20) /* 1 Mb */
#define DEFAULT_MIRROR_BUF_SIZE (MAX_IN_FLIGHT * MAX_IO_BYTES)
#define MAX_IO_SECTORS ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 Mb */
#define DEFAULT_MIRROR_BUF_SIZE \
(MAX_IN_FLIGHT * MAX_IO_SECTORS * BDRV_SECTOR_SIZE)
/* The mirroring buffer is a list of granularity-sized chunks.
* Free chunks are organized in a list.
@@ -66,11 +67,11 @@ typedef struct MirrorBlockJob {
uint64_t last_pause_ns;
unsigned long *in_flight_bitmap;
int in_flight;
int64_t bytes_in_flight;
int64_t sectors_in_flight;
int ret;
bool unmap;
bool waiting_for_io;
int target_cluster_size;
int target_cluster_sectors;
int max_iov;
bool initial_zeroing_ongoing;
} MirrorBlockJob;
@@ -78,8 +79,8 @@ typedef struct MirrorBlockJob {
typedef struct MirrorOp {
MirrorBlockJob *s;
QEMUIOVector qiov;
int64_t offset;
uint64_t bytes;
int64_t sector_num;
int nb_sectors;
} MirrorOp;
static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
@@ -100,12 +101,12 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
MirrorBlockJob *s = op->s;
struct iovec *iov;
int64_t chunk_num;
int i, nb_chunks;
int i, nb_chunks, sectors_per_chunk;
trace_mirror_iteration_done(s, op->offset, op->bytes, ret);
trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors, ret);
s->in_flight--;
s->bytes_in_flight -= op->bytes;
s->sectors_in_flight -= op->nb_sectors;
iov = op->qiov.iov;
for (i = 0; i < op->qiov.niov; i++) {
MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base;
@@ -113,15 +114,16 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
s->buf_free_count++;
}
chunk_num = op->offset / s->granularity;
nb_chunks = DIV_ROUND_UP(op->bytes, s->granularity);
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
chunk_num = op->sector_num / sectors_per_chunk;
nb_chunks = DIV_ROUND_UP(op->nb_sectors, sectors_per_chunk);
bitmap_clear(s->in_flight_bitmap, chunk_num, nb_chunks);
if (ret >= 0) {
if (s->cow_bitmap) {
bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
}
if (!s->initial_zeroing_ongoing) {
s->common.offset += op->bytes;
s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
}
}
qemu_iovec_destroy(&op->qiov);
@@ -141,7 +143,7 @@ static void mirror_write_complete(void *opaque, int ret)
if (ret < 0) {
BlockErrorAction action;
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->offset, op->bytes);
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, false, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
@@ -160,7 +162,7 @@ static void mirror_read_complete(void *opaque, int ret)
if (ret < 0) {
BlockErrorAction action;
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->offset, op->bytes);
bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
action = mirror_error_action(s, true, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
s->ret = ret;
@@ -168,52 +170,56 @@ static void mirror_read_complete(void *opaque, int ret)
mirror_iteration_done(op, ret);
} else {
blk_aio_pwritev(s->target, op->offset, &op->qiov,
blk_aio_pwritev(s->target, op->sector_num * BDRV_SECTOR_SIZE, &op->qiov,
0, mirror_write_complete, op);
}
aio_context_release(blk_get_aio_context(s->common.blk));
}
/* Clip bytes relative to offset to not exceed end-of-file */
static inline int64_t mirror_clip_bytes(MirrorBlockJob *s,
int64_t offset,
int64_t bytes)
static inline void mirror_clip_sectors(MirrorBlockJob *s,
int64_t sector_num,
int *nb_sectors)
{
return MIN(bytes, s->bdev_length - offset);
*nb_sectors = MIN(*nb_sectors,
s->bdev_length / BDRV_SECTOR_SIZE - sector_num);
}
/* Round offset and/or bytes to target cluster if COW is needed, and
* return the offset of the adjusted tail against original. */
static int mirror_cow_align(MirrorBlockJob *s, int64_t *offset,
uint64_t *bytes)
/* Round sector_num and/or nb_sectors to target cluster if COW is needed, and
* return the offset of the adjusted tail sector against original. */
static int mirror_cow_align(MirrorBlockJob *s,
int64_t *sector_num,
int *nb_sectors)
{
bool need_cow;
int ret = 0;
int64_t align_offset = *offset;
int64_t align_bytes = *bytes;
int max_bytes = s->granularity * s->max_iov;
int chunk_sectors = s->granularity >> BDRV_SECTOR_BITS;
int64_t align_sector_num = *sector_num;
int align_nb_sectors = *nb_sectors;
int max_sectors = chunk_sectors * s->max_iov;
need_cow = !test_bit(*offset / s->granularity, s->cow_bitmap);
need_cow |= !test_bit((*offset + *bytes - 1) / s->granularity,
need_cow = !test_bit(*sector_num / chunk_sectors, s->cow_bitmap);
need_cow |= !test_bit((*sector_num + *nb_sectors - 1) / chunk_sectors,
s->cow_bitmap);
if (need_cow) {
bdrv_round_to_clusters(blk_bs(s->target), *offset, *bytes,
&align_offset, &align_bytes);
bdrv_round_sectors_to_clusters(blk_bs(s->target), *sector_num,
*nb_sectors, &align_sector_num,
&align_nb_sectors);
}
if (align_bytes > max_bytes) {
align_bytes = max_bytes;
if (align_nb_sectors > max_sectors) {
align_nb_sectors = max_sectors;
if (need_cow) {
align_bytes = QEMU_ALIGN_DOWN(align_bytes, s->target_cluster_size);
align_nb_sectors = QEMU_ALIGN_DOWN(align_nb_sectors,
s->target_cluster_sectors);
}
}
/* Clipping may result in align_bytes unaligned to chunk boundary, but
/* Clipping may result in align_nb_sectors unaligned to chunk boundary, but
* that doesn't matter because it's already the end of source image. */
align_bytes = mirror_clip_bytes(s, align_offset, align_bytes);
mirror_clip_sectors(s, align_sector_num, &align_nb_sectors);
ret = align_offset + align_bytes - (*offset + *bytes);
*offset = align_offset;
*bytes = align_bytes;
ret = align_sector_num + align_nb_sectors - (*sector_num + *nb_sectors);
*sector_num = align_sector_num;
*nb_sectors = align_nb_sectors;
assert(ret >= 0);
return ret;
}
@@ -227,51 +233,50 @@ static inline void mirror_wait_for_io(MirrorBlockJob *s)
}
/* Submit async read while handling COW.
* Returns: The number of bytes copied after and including offset,
* excluding any bytes copied prior to offset due to alignment.
* This will be @bytes if no alignment is necessary, or
* (new_end - offset) if tail is rounded up or down due to
* Returns: The number of sectors copied after and including sector_num,
* excluding any sectors copied prior to sector_num due to alignment.
* This will be nb_sectors if no alignment is necessary, or
* (new_end - sector_num) if tail is rounded up or down due to
* alignment or buffer limit.
*/
static uint64_t mirror_do_read(MirrorBlockJob *s, int64_t offset,
uint64_t bytes)
static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
int nb_sectors)
{
BlockBackend *source = s->common.blk;
int nb_chunks;
uint64_t ret;
int sectors_per_chunk, nb_chunks;
int ret;
MirrorOp *op;
uint64_t max_bytes;
int max_sectors;
max_bytes = s->granularity * s->max_iov;
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
max_sectors = sectors_per_chunk * s->max_iov;
/* We can only handle as much as buf_size at a time. */
bytes = MIN(s->buf_size, MIN(max_bytes, bytes));
assert(bytes);
assert(bytes < BDRV_REQUEST_MAX_BYTES);
ret = bytes;
nb_sectors = MIN(s->buf_size >> BDRV_SECTOR_BITS, nb_sectors);
nb_sectors = MIN(max_sectors, nb_sectors);
assert(nb_sectors);
ret = nb_sectors;
if (s->cow_bitmap) {
ret += mirror_cow_align(s, &offset, &bytes);
ret += mirror_cow_align(s, &sector_num, &nb_sectors);
}
assert(bytes <= s->buf_size);
/* The offset is granularity-aligned because:
assert(nb_sectors << BDRV_SECTOR_BITS <= s->buf_size);
/* The sector range must meet granularity because:
* 1) Caller passes in aligned values;
* 2) mirror_cow_align is used only when target cluster is larger. */
assert(QEMU_IS_ALIGNED(offset, s->granularity));
/* The range is sector-aligned, since bdrv_getlength() rounds up. */
assert(QEMU_IS_ALIGNED(bytes, BDRV_SECTOR_SIZE));
nb_chunks = DIV_ROUND_UP(bytes, s->granularity);
assert(!(sector_num % sectors_per_chunk));
nb_chunks = DIV_ROUND_UP(nb_sectors, sectors_per_chunk);
while (s->buf_free_count < nb_chunks) {
trace_mirror_yield_in_flight(s, offset, s->in_flight);
trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
mirror_wait_for_io(s);
}
/* Allocate a MirrorOp that is used as an AIO callback. */
op = g_new(MirrorOp, 1);
op->s = s;
op->offset = offset;
op->bytes = bytes;
op->sector_num = sector_num;
op->nb_sectors = nb_sectors;
/* Now make a QEMUIOVector taking enough granularity-sized chunks
* from s->buf_free.
@@ -279,7 +284,7 @@ static uint64_t mirror_do_read(MirrorBlockJob *s, int64_t offset,
qemu_iovec_init(&op->qiov, nb_chunks);
while (nb_chunks-- > 0) {
MirrorBuffer *buf = QSIMPLEQ_FIRST(&s->buf_free);
size_t remaining = bytes - op->qiov.size;
size_t remaining = nb_sectors * BDRV_SECTOR_SIZE - op->qiov.size;
QSIMPLEQ_REMOVE_HEAD(&s->buf_free, next);
s->buf_free_count--;
@@ -288,16 +293,17 @@ static uint64_t mirror_do_read(MirrorBlockJob *s, int64_t offset,
/* Copy the dirty cluster. */
s->in_flight++;
s->bytes_in_flight += bytes;
trace_mirror_one_iteration(s, offset, bytes);
s->sectors_in_flight += nb_sectors;
trace_mirror_one_iteration(s, sector_num, nb_sectors);
blk_aio_preadv(source, offset, &op->qiov, 0, mirror_read_complete, op);
blk_aio_preadv(source, sector_num * BDRV_SECTOR_SIZE, &op->qiov, 0,
mirror_read_complete, op);
return ret;
}
static void mirror_do_zero_or_discard(MirrorBlockJob *s,
int64_t offset,
uint64_t bytes,
int64_t sector_num,
int nb_sectors,
bool is_discard)
{
MirrorOp *op;
@@ -306,17 +312,19 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
* so the freeing in mirror_iteration_done is nop. */
op = g_new0(MirrorOp, 1);
op->s = s;
op->offset = offset;
op->bytes = bytes;
op->sector_num = sector_num;
op->nb_sectors = nb_sectors;
s->in_flight++;
s->bytes_in_flight += bytes;
s->sectors_in_flight += nb_sectors;
if (is_discard) {
blk_aio_pdiscard(s->target, offset,
op->bytes, mirror_write_complete, op);
blk_aio_pdiscard(s->target, sector_num << BDRV_SECTOR_BITS,
op->nb_sectors << BDRV_SECTOR_BITS,
mirror_write_complete, op);
} else {
blk_aio_pwrite_zeroes(s->target, offset,
op->bytes, s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
blk_aio_pwrite_zeroes(s->target, sector_num * BDRV_SECTOR_SIZE,
op->nb_sectors * BDRV_SECTOR_SIZE,
s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
mirror_write_complete, op);
}
}
@@ -324,26 +332,27 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
{
BlockDriverState *source = s->source;
int64_t offset, first_chunk;
int64_t sector_num, first_chunk;
uint64_t delay_ns = 0;
/* At least the first dirty chunk is mirrored in one iteration. */
int nb_chunks = 1;
int64_t end = s->bdev_length / BDRV_SECTOR_SIZE;
int sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
bool write_zeroes_ok = bdrv_can_write_zeroes_with_unmap(blk_bs(s->target));
int max_io_bytes = MAX(s->buf_size / MAX_IN_FLIGHT, MAX_IO_BYTES);
int max_io_sectors = MAX((s->buf_size >> BDRV_SECTOR_BITS) / MAX_IN_FLIGHT,
MAX_IO_SECTORS);
bdrv_dirty_bitmap_lock(s->dirty_bitmap);
offset = bdrv_dirty_iter_next(s->dbi);
if (offset < 0) {
sector_num = bdrv_dirty_iter_next(s->dbi);
if (sector_num < 0) {
bdrv_set_dirty_iter(s->dbi, 0);
offset = bdrv_dirty_iter_next(s->dbi);
sector_num = bdrv_dirty_iter_next(s->dbi);
trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
assert(offset >= 0);
assert(sector_num >= 0);
}
bdrv_dirty_bitmap_unlock(s->dirty_bitmap);
first_chunk = offset / s->granularity;
first_chunk = sector_num / sectors_per_chunk;
while (test_bit(first_chunk, s->in_flight_bitmap)) {
trace_mirror_yield_in_flight(s, offset, s->in_flight);
trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
mirror_wait_for_io(s);
}
@@ -351,13 +360,12 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
/* Find the number of consective dirty chunks following the first dirty
* one, and wait for in flight requests in them. */
bdrv_dirty_bitmap_lock(s->dirty_bitmap);
while (nb_chunks * s->granularity < s->buf_size) {
while (nb_chunks * sectors_per_chunk < (s->buf_size >> BDRV_SECTOR_BITS)) {
int64_t next_dirty;
int64_t next_offset = offset + nb_chunks * s->granularity;
int64_t next_chunk = next_offset / s->granularity;
if (next_offset >= s->bdev_length ||
!bdrv_get_dirty_locked(source, s->dirty_bitmap, next_offset)) {
int64_t next_sector = sector_num + nb_chunks * sectors_per_chunk;
int64_t next_chunk = next_sector / sectors_per_chunk;
if (next_sector >= end ||
!bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
break;
}
if (test_bit(next_chunk, s->in_flight_bitmap)) {
@@ -365,54 +373,53 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
}
next_dirty = bdrv_dirty_iter_next(s->dbi);
if (next_dirty > next_offset || next_dirty < 0) {
if (next_dirty > next_sector || next_dirty < 0) {
/* The bitmap iterator's cache is stale, refresh it */
bdrv_set_dirty_iter(s->dbi, next_offset);
bdrv_set_dirty_iter(s->dbi, next_sector);
next_dirty = bdrv_dirty_iter_next(s->dbi);
}
assert(next_dirty == next_offset);
assert(next_dirty == next_sector);
nb_chunks++;
}
/* Clear dirty bits before querying the block status, because
* calling bdrv_block_status_above could yield - if some blocks are
* calling bdrv_get_block_status_above could yield - if some blocks are
* marked dirty in this window, we need to know.
*/
bdrv_reset_dirty_bitmap_locked(s->dirty_bitmap, offset,
nb_chunks * s->granularity);
bdrv_dirty_bitmap_unlock(s->dirty_bitmap);
bitmap_set(s->in_flight_bitmap, offset / s->granularity, nb_chunks);
while (nb_chunks > 0 && offset < s->bdev_length) {
int ret;
int64_t io_bytes;
int64_t io_bytes_acct;
bdrv_reset_dirty_bitmap(s->dirty_bitmap, sector_num,
nb_chunks * sectors_per_chunk);
bitmap_set(s->in_flight_bitmap, sector_num / sectors_per_chunk, nb_chunks);
while (nb_chunks > 0 && sector_num < end) {
int64_t ret;
int io_sectors, io_sectors_acct;
BlockDriverState *file;
enum MirrorMethod {
MIRROR_METHOD_COPY,
MIRROR_METHOD_ZERO,
MIRROR_METHOD_DISCARD
} mirror_method = MIRROR_METHOD_COPY;
assert(!(offset % s->granularity));
ret = bdrv_block_status_above(source, NULL, offset,
nb_chunks * s->granularity,
&io_bytes, NULL, NULL);
assert(!(sector_num % sectors_per_chunk));
ret = bdrv_get_block_status_above(source, NULL, sector_num,
nb_chunks * sectors_per_chunk,
&io_sectors, &file);
if (ret < 0) {
io_bytes = MIN(nb_chunks * s->granularity, max_io_bytes);
io_sectors = MIN(nb_chunks * sectors_per_chunk, max_io_sectors);
} else if (ret & BDRV_BLOCK_DATA) {
io_bytes = MIN(io_bytes, max_io_bytes);
io_sectors = MIN(io_sectors, max_io_sectors);
}
io_bytes -= io_bytes % s->granularity;
if (io_bytes < s->granularity) {
io_bytes = s->granularity;
io_sectors -= io_sectors % sectors_per_chunk;
if (io_sectors < sectors_per_chunk) {
io_sectors = sectors_per_chunk;
} else if (ret >= 0 && !(ret & BDRV_BLOCK_DATA)) {
int64_t target_offset;
int64_t target_bytes;
bdrv_round_to_clusters(blk_bs(s->target), offset, io_bytes,
&target_offset, &target_bytes);
if (target_offset == offset &&
target_bytes == io_bytes) {
int64_t target_sector_num;
int target_nb_sectors;
bdrv_round_sectors_to_clusters(blk_bs(s->target), sector_num,
io_sectors, &target_sector_num,
&target_nb_sectors);
if (target_sector_num == sector_num &&
target_nb_sectors == io_sectors) {
mirror_method = ret & BDRV_BLOCK_ZERO ?
MIRROR_METHOD_ZERO :
MIRROR_METHOD_DISCARD;
@@ -420,7 +427,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
}
while (s->in_flight >= MAX_IN_FLIGHT) {
trace_mirror_yield_in_flight(s, offset, s->in_flight);
trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
mirror_wait_for_io(s);
}
@@ -428,29 +435,30 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
return 0;
}
io_bytes = mirror_clip_bytes(s, offset, io_bytes);
mirror_clip_sectors(s, sector_num, &io_sectors);
switch (mirror_method) {
case MIRROR_METHOD_COPY:
io_bytes = io_bytes_acct = mirror_do_read(s, offset, io_bytes);
io_sectors = mirror_do_read(s, sector_num, io_sectors);
io_sectors_acct = io_sectors;
break;
case MIRROR_METHOD_ZERO:
case MIRROR_METHOD_DISCARD:
mirror_do_zero_or_discard(s, offset, io_bytes,
mirror_do_zero_or_discard(s, sector_num, io_sectors,
mirror_method == MIRROR_METHOD_DISCARD);
if (write_zeroes_ok) {
io_bytes_acct = 0;
io_sectors_acct = 0;
} else {
io_bytes_acct = io_bytes;
io_sectors_acct = io_sectors;
}
break;
default:
abort();
}
assert(io_bytes);
offset += io_bytes;
nb_chunks -= DIV_ROUND_UP(io_bytes, s->granularity);
assert(io_sectors);
sector_num += io_sectors;
nb_chunks -= DIV_ROUND_UP(io_sectors, sectors_per_chunk);
if (s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, io_bytes_acct);
delay_ns = ratelimit_calculate_delay(&s->limit, io_sectors_acct);
}
}
return delay_ns;
@@ -498,8 +506,6 @@ static void mirror_exit(BlockJob *job, void *opaque)
BlockDriverState *mirror_top_bs = s->mirror_top_bs;
Error *local_err = NULL;
bdrv_release_dirty_bitmap(src, s->dirty_bitmap);
/* Make sure that the source BDS doesn't go away before we called
* block_job_completed(). */
bdrv_ref(src);
@@ -598,7 +604,7 @@ static void mirror_throttle(MirrorBlockJob *s)
if (now - s->last_pause_ns > SLICE_TIME) {
s->last_pause_ns = now;
block_job_sleep_ns(&s->common, 0);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, 0);
} else {
block_job_pause_point(&s->common);
}
@@ -606,23 +612,24 @@ static void mirror_throttle(MirrorBlockJob *s)
static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
{
int64_t offset;
int64_t sector_num, end;
BlockDriverState *base = s->base;
BlockDriverState *bs = s->source;
BlockDriverState *target_bs = blk_bs(s->target);
int ret;
int64_t count;
int ret, n;
end = s->bdev_length / BDRV_SECTOR_SIZE;
if (base == NULL && !bdrv_has_zero_init(target_bs)) {
if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, end);
return 0;
}
s->initial_zeroing_ongoing = true;
for (offset = 0; offset < s->bdev_length; ) {
int bytes = MIN(s->bdev_length - offset,
QEMU_ALIGN_DOWN(INT_MAX, s->granularity));
for (sector_num = 0; sector_num < end; ) {
int nb_sectors = MIN(end - sector_num,
QEMU_ALIGN_DOWN(INT_MAX, s->granularity) >> BDRV_SECTOR_BITS);
mirror_throttle(s);
@@ -638,8 +645,8 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
continue;
}
mirror_do_zero_or_discard(s, offset, bytes, false);
offset += bytes;
mirror_do_zero_or_discard(s, sector_num, nb_sectors, false);
sector_num += nb_sectors;
}
mirror_wait_for_all_io(s);
@@ -647,10 +654,10 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
}
/* First part, loop on the sectors and initialize the dirty bitmap. */
for (offset = 0; offset < s->bdev_length; ) {
for (sector_num = 0; sector_num < end; ) {
/* Just to make sure we are not exceeding int limit. */
int bytes = MIN(s->bdev_length - offset,
QEMU_ALIGN_DOWN(INT_MAX, s->granularity));
int nb_sectors = MIN(INT_MAX >> BDRV_SECTOR_BITS,
end - sector_num);
mirror_throttle(s);
@@ -658,16 +665,16 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
return 0;
}
ret = bdrv_is_allocated_above(bs, base, offset, bytes, &count);
ret = bdrv_is_allocated_above(bs, base, sector_num, nb_sectors, &n);
if (ret < 0) {
return ret;
}
assert(count);
assert(n > 0);
if (ret == 1) {
bdrv_set_dirty_bitmap(s->dirty_bitmap, offset, count);
bdrv_set_dirty_bitmap(s->dirty_bitmap, sector_num, n);
}
offset += count;
sector_num += n;
}
return 0;
}
@@ -698,6 +705,7 @@ static void coroutine_fn mirror_run(void *opaque)
char backing_filename[2]; /* we only need 2 characters because we are only
checking for a NULL string */
int ret = 0;
int target_cluster_size = BDRV_SECTOR_SIZE;
if (block_job_is_cancelled(&s->common)) {
goto immediate_exit;
@@ -721,8 +729,7 @@ static void coroutine_fn mirror_run(void *opaque)
}
if (s->bdev_length > base_length) {
ret = blk_truncate(s->target, s->bdev_length, PREALLOC_MODE_OFF,
NULL);
ret = blk_truncate(s->target, s->bdev_length, NULL);
if (ret < 0) {
goto immediate_exit;
}
@@ -750,15 +757,14 @@ static void coroutine_fn mirror_run(void *opaque)
bdrv_get_backing_filename(target_bs, backing_filename,
sizeof(backing_filename));
if (!bdrv_get_info(target_bs, &bdi) && bdi.cluster_size) {
s->target_cluster_size = bdi.cluster_size;
} else {
s->target_cluster_size = BDRV_SECTOR_SIZE;
target_cluster_size = bdi.cluster_size;
}
if (backing_filename[0] && !target_bs->backing &&
s->granularity < s->target_cluster_size) {
s->buf_size = MAX(s->buf_size, s->target_cluster_size);
if (backing_filename[0] && !target_bs->backing
&& s->granularity < target_cluster_size) {
s->buf_size = MAX(s->buf_size, target_cluster_size);
s->cow_bitmap = bitmap_new(length);
}
s->target_cluster_sectors = target_cluster_size >> BDRV_SECTOR_BITS;
s->max_iov = MIN(bs->bl.max_iov, target_bs->bl.max_iov);
s->buf = qemu_try_blockalign(bs, s->buf_size);
@@ -778,7 +784,7 @@ static void coroutine_fn mirror_run(void *opaque)
}
assert(!s->dbi);
s->dbi = bdrv_dirty_iter_new(s->dirty_bitmap);
s->dbi = bdrv_dirty_iter_new(s->dirty_bitmap, 0);
for (;;) {
uint64_t delay_ns = 0;
int64_t cnt, delta;
@@ -793,10 +799,11 @@ static void coroutine_fn mirror_run(void *opaque)
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
/* s->common.offset contains the number of bytes already processed so
* far, cnt is the number of dirty bytes remaining and
* s->bytes_in_flight is the number of bytes currently being
* far, cnt is the number of dirty sectors remaining and
* s->sectors_in_flight is the number of sectors currently being
* processed; together those are the current total operation length */
s->common.len = s->common.offset + s->bytes_in_flight + cnt;
s->common.len = s->common.offset +
(cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
/* Note that even when no rate limit is applied we need to yield
* periodically with no pending I/O so that bdrv_drain_all() returns.
@@ -870,13 +877,13 @@ static void coroutine_fn mirror_run(void *opaque)
ret = 0;
trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
if (!s->synced) {
block_job_sleep_ns(&s->common, delay_ns);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
} else if (!should_complete) {
delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
block_job_sleep_ns(&s->common, delay_ns);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
}
s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
}
@@ -897,6 +904,7 @@ immediate_exit:
g_free(s->cow_bitmap);
g_free(s->in_flight_bitmap);
bdrv_dirty_iter_free(s->dbi);
bdrv_release_dirty_bitmap(bs, s->dirty_bitmap);
data = g_malloc(sizeof(*data));
data->ret = ret;
@@ -915,7 +923,7 @@ static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
error_setg(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static void mirror_complete(BlockJob *job, Error **errp)
@@ -1035,32 +1043,33 @@ static int coroutine_fn bdrv_mirror_top_pwritev(BlockDriverState *bs,
static int coroutine_fn bdrv_mirror_top_flush(BlockDriverState *bs)
{
if (bs->backing == NULL) {
/* we can be here after failed bdrv_append in mirror_start_job */
return 0;
}
return bdrv_co_flush(bs->backing->bs);
}
static int coroutine_fn bdrv_mirror_top_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int bytes, BdrvRequestFlags flags)
static int64_t coroutine_fn bdrv_mirror_top_get_block_status(
BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
BlockDriverState **file)
{
return bdrv_co_pwrite_zeroes(bs->backing, offset, bytes, flags);
*pnum = nb_sectors;
*file = bs->backing->bs;
return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
}
static int coroutine_fn bdrv_mirror_top_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count, BdrvRequestFlags flags)
{
return bdrv_co_pwrite_zeroes(bs->backing, offset, count, flags);
}
static int coroutine_fn bdrv_mirror_top_pdiscard(BlockDriverState *bs,
int64_t offset, int bytes)
int64_t offset, int count)
{
return bdrv_co_pdiscard(bs->backing->bs, offset, bytes);
return bdrv_co_pdiscard(bs->backing->bs, offset, count);
}
static void bdrv_mirror_top_refresh_filename(BlockDriverState *bs, QDict *opts)
{
if (bs->backing == NULL) {
/* we can be here after failed bdrv_attach_child in
* bdrv_set_backing_hd */
return;
}
bdrv_refresh_filename(bs->backing->bs);
pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
bs->backing->bs->filename);
@@ -1072,7 +1081,6 @@ static void bdrv_mirror_top_close(BlockDriverState *bs)
static void bdrv_mirror_top_child_perm(BlockDriverState *bs, BdrvChild *c,
const BdrvChildRole *role,
BlockReopenQueue *reopen_queue,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
{
@@ -1094,7 +1102,7 @@ static BlockDriver bdrv_mirror_top = {
.bdrv_co_pwrite_zeroes = bdrv_mirror_top_pwrite_zeroes,
.bdrv_co_pdiscard = bdrv_mirror_top_pdiscard,
.bdrv_co_flush = bdrv_mirror_top_flush,
.bdrv_co_block_status = bdrv_co_block_status_from_backing,
.bdrv_co_get_block_status = bdrv_mirror_top_get_block_status,
.bdrv_refresh_filename = bdrv_mirror_top_refresh_filename,
.bdrv_close = bdrv_mirror_top_close,
.bdrv_child_perm = bdrv_mirror_top_child_perm,
@@ -1109,12 +1117,10 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
BlockdevOnError on_target_error,
bool unmap,
BlockCompletionFunc *cb,
void *opaque,
void *opaque, Error **errp,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base,
bool auto_complete, const char *filter_node_name,
bool is_mirror,
Error **errp)
bool auto_complete, const char *filter_node_name)
{
MirrorBlockJob *s;
BlockDriverState *mirror_top_bs;
@@ -1127,7 +1133,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
granularity = bdrv_get_default_bitmap_granularity(target);
}
assert(is_power_of_2(granularity));
assert ((granularity & (granularity - 1)) == 0);
if (buf_size < 0) {
error_setg(errp, "Invalid parameter 'buf-size'");
@@ -1200,15 +1206,6 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
if (ret < 0) {
goto fail;
}
if (is_mirror) {
/* XXX: Mirror target could be a NBD server of target QEMU in the case
* of non-shared block migration. To allow migration completion, we
* have to allow "inactivate" of the target BB. When that happens, we
* know the job is drained, and the vcpus are stopped, so no write
* operation will be performed. Block layer already has assertions to
* ensure that. */
blk_set_force_allow_inactivate(s->target);
}
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
@@ -1262,7 +1259,7 @@ fail:
g_free(s->replaces);
blk_unref(s->target);
block_job_early_fail(&s->common);
block_job_unref(&s->common);
}
bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
@@ -1291,17 +1288,17 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
mirror_start_job(job_id, bs, BLOCK_JOB_DEFAULT, target, replaces,
speed, granularity, buf_size, backing_mode,
on_source_error, on_target_error, unmap, NULL, NULL,
on_source_error, on_target_error, unmap, NULL, NULL, errp,
&mirror_job_driver, is_none_mode, base, false,
filter_node_name, true, errp);
filter_node_name);
}
void commit_active_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *base, int creation_flags,
int64_t speed, BlockdevOnError on_error,
const char *filter_node_name,
BlockCompletionFunc *cb, void *opaque,
bool auto_complete, Error **errp)
BlockCompletionFunc *cb, void *opaque, Error **errp,
bool auto_complete)
{
int orig_base_flags;
Error *local_err = NULL;
@@ -1314,9 +1311,9 @@ void commit_active_start(const char *job_id, BlockDriverState *bs,
mirror_start_job(job_id, bs, creation_flags, base, NULL, speed, 0, 0,
MIRROR_LEAVE_BACKING_CHAIN,
on_error, on_error, true, cb, opaque,
on_error, on_error, true, cb, opaque, &local_err,
&commit_active_job_driver, false, base, auto_complete,
filter_node_name, false, &local_err);
filter_node_name);
if (local_err) {
error_propagate(errp, local_err);
goto error_restore_flags;

View File

@@ -28,21 +28,18 @@
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "nbd-client.h"
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ (uint64_t)(intptr_t)(bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ (uint64_t)(intptr_t)(bs))
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs))
static void nbd_recv_coroutines_wake_all(NBDClientSession *s)
static void nbd_recv_coroutines_enter_all(NBDClientSession *s)
{
int i;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
NBDClientRequest *req = &s->requests[i];
if (req->coroutine && req->receiving) {
aio_co_wake(req->coroutine);
if (s->recv_coroutine[i]) {
aio_co_wake(s->recv_coroutine[i]);
}
}
}
@@ -72,15 +69,11 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
{
NBDClientSession *s = opaque;
uint64_t i;
int ret = 0;
Error *local_err = NULL;
int ret;
while (!s->quit) {
for (;;) {
assert(s->reply.handle == 0);
ret = nbd_receive_reply(s->ioc, &s->reply, &local_err);
if (local_err) {
error_report_err(local_err);
}
ret = nbd_receive_reply(s->ioc, &s->reply);
if (ret <= 0) {
break;
}
@@ -90,31 +83,26 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
* one coroutine is called until the reply finishes.
*/
i = HANDLE_TO_INDEX(s, s->reply.handle);
if (i >= MAX_NBD_REQUESTS ||
!s->requests[i].coroutine ||
!s->requests[i].receiving ||
(nbd_reply_is_structured(&s->reply) && !s->info.structured_reply))
{
if (i >= MAX_NBD_REQUESTS || !s->recv_coroutine[i]) {
break;
}
/* We're woken up again by the request itself. Note that there
/* We're woken up by the recv_coroutine itself. Note that there
* is no race between yielding and reentering read_reply_co. This
* is because:
*
* - if the request runs on the same AioContext, it is only
* - if recv_coroutine[i] runs on the same AioContext, it is only
* entered after we yield
*
* - if the request runs on a different AioContext, reentering
* - if recv_coroutine[i] runs on a different AioContext, reentering
* read_reply_co happens through a bottom half, which can only
* run after we yield.
*/
aio_co_wake(s->requests[i].coroutine);
aio_co_wake(s->recv_coroutine[i]);
qemu_coroutine_yield();
}
s->quit = true;
nbd_recv_coroutines_wake_all(s);
nbd_recv_coroutines_enter_all(s);
s->read_reply_co = NULL;
}
@@ -123,578 +111,125 @@ static int nbd_co_send_request(BlockDriverState *bs,
QEMUIOVector *qiov)
{
NBDClientSession *s = nbd_get_client_session(bs);
int rc, i;
int rc, ret, i;
qemu_co_mutex_lock(&s->send_mutex);
while (s->in_flight == MAX_NBD_REQUESTS) {
qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
}
s->in_flight++;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->requests[i].coroutine == NULL) {
if (s->recv_coroutine[i] == NULL) {
s->recv_coroutine[i] = qemu_coroutine_self();
break;
}
}
g_assert(qemu_in_coroutine());
assert(i < MAX_NBD_REQUESTS);
s->requests[i].coroutine = qemu_coroutine_self();
s->requests[i].offset = request->from;
s->requests[i].receiving = false;
request->handle = INDEX_TO_HANDLE(s, i);
if (s->quit) {
rc = -EIO;
goto err;
}
if (!s->ioc) {
rc = -EPIPE;
goto err;
qemu_co_mutex_unlock(&s->send_mutex);
return -EPIPE;
}
if (qiov) {
qio_channel_set_cork(s->ioc, true);
rc = nbd_send_request(s->ioc, request);
if (rc >= 0 && !s->quit) {
if (qio_channel_writev_all(s->ioc, qiov->iov, qiov->niov,
NULL) < 0) {
if (rc >= 0) {
ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
false);
if (ret != request->len) {
rc = -EIO;
}
} else if (rc >= 0) {
rc = -EIO;
}
qio_channel_set_cork(s->ioc, false);
} else {
rc = nbd_send_request(s->ioc, request);
}
err:
if (rc < 0) {
s->quit = true;
s->requests[i].coroutine = NULL;
s->in_flight--;
qemu_co_queue_next(&s->free_sema);
}
qemu_co_mutex_unlock(&s->send_mutex);
return rc;
}
static inline uint16_t payload_advance16(uint8_t **payload)
{
*payload += 2;
return lduw_be_p(*payload - 2);
}
static inline uint32_t payload_advance32(uint8_t **payload)
{
*payload += 4;
return ldl_be_p(*payload - 4);
}
static inline uint64_t payload_advance64(uint8_t **payload)
{
*payload += 8;
return ldq_be_p(*payload - 8);
}
static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk,
uint8_t *payload, uint64_t orig_offset,
QEMUIOVector *qiov, Error **errp)
{
uint64_t offset;
uint32_t hole_size;
if (chunk->length != sizeof(offset) + sizeof(hole_size)) {
error_setg(errp, "Protocol error: invalid payload for "
"NBD_REPLY_TYPE_OFFSET_HOLE");
return -EINVAL;
}
offset = payload_advance64(&payload);
hole_size = payload_advance32(&payload);
if (!hole_size || offset < orig_offset || hole_size > qiov->size ||
offset > orig_offset + qiov->size - hole_size) {
error_setg(errp, "Protocol error: server sent chunk exceeding requested"
" region");
return -EINVAL;
}
qemu_iovec_memset(qiov, offset - orig_offset, 0, hole_size);
return 0;
}
/* nbd_parse_error_payload
* on success @errp contains message describing nbd error reply
*/
static int nbd_parse_error_payload(NBDStructuredReplyChunk *chunk,
uint8_t *payload, int *request_ret,
Error **errp)
{
uint32_t error;
uint16_t message_size;
assert(chunk->type & (1 << 15));
if (chunk->length < sizeof(error) + sizeof(message_size)) {
error_setg(errp,
"Protocol error: invalid payload for structured error");
return -EINVAL;
}
error = nbd_errno_to_system_errno(payload_advance32(&payload));
if (error == 0) {
error_setg(errp, "Protocol error: server sent structured error chunk "
"with error = 0");
return -EINVAL;
}
*request_ret = -error;
message_size = payload_advance16(&payload);
if (message_size > chunk->length - sizeof(error) - sizeof(message_size)) {
error_setg(errp, "Protocol error: server sent structured error chunk "
"with incorrect message size");
return -EINVAL;
}
/* TODO: Add a trace point to mention the server complaint */
/* TODO handle ERROR_OFFSET */
return 0;
}
static int nbd_co_receive_offset_data_payload(NBDClientSession *s,
uint64_t orig_offset,
QEMUIOVector *qiov, Error **errp)
{
QEMUIOVector sub_qiov;
uint64_t offset;
size_t data_size;
int ret;
NBDStructuredReplyChunk *chunk = &s->reply.structured;
assert(nbd_reply_is_structured(&s->reply));
/* The NBD spec requires at least one byte of payload */
if (chunk->length <= sizeof(offset)) {
error_setg(errp, "Protocol error: invalid payload for "
"NBD_REPLY_TYPE_OFFSET_DATA");
return -EINVAL;
}
if (nbd_read(s->ioc, &offset, sizeof(offset), errp) < 0) {
return -EIO;
}
be64_to_cpus(&offset);
data_size = chunk->length - sizeof(offset);
assert(data_size);
if (offset < orig_offset || data_size > qiov->size ||
offset > orig_offset + qiov->size - data_size) {
error_setg(errp, "Protocol error: server sent chunk exceeding requested"
" region");
return -EINVAL;
}
qemu_iovec_init(&sub_qiov, qiov->niov);
qemu_iovec_concat(&sub_qiov, qiov, offset - orig_offset, data_size);
ret = qio_channel_readv_all(s->ioc, sub_qiov.iov, sub_qiov.niov, errp);
qemu_iovec_destroy(&sub_qiov);
return ret < 0 ? -EIO : 0;
}
#define NBD_MAX_MALLOC_PAYLOAD 1000
/* nbd_co_receive_structured_payload
*/
static coroutine_fn int nbd_co_receive_structured_payload(
NBDClientSession *s, void **payload, Error **errp)
static void nbd_co_receive_reply(NBDClientSession *s,
NBDRequest *request,
NBDReply *reply,
QEMUIOVector *qiov)
{
int ret;
uint32_t len;
assert(nbd_reply_is_structured(&s->reply));
len = s->reply.structured.length;
if (len == 0) {
return 0;
}
if (payload == NULL) {
error_setg(errp, "Unexpected structured payload");
return -EINVAL;
}
if (len > NBD_MAX_MALLOC_PAYLOAD) {
error_setg(errp, "Payload too large");
return -EINVAL;
}
*payload = g_new(char, len);
ret = nbd_read(s->ioc, *payload, len, errp);
if (ret < 0) {
g_free(*payload);
*payload = NULL;
return ret;
}
return 0;
}
/* nbd_co_do_receive_one_chunk
* for simple reply:
* set request_ret to received reply error
* if qiov is not NULL: read payload to @qiov
* for structured reply chunk:
* if error chunk: read payload, set @request_ret, do not set @payload
* else if offset_data chunk: read payload data to @qiov, do not set @payload
* else: read payload to @payload
*
* If function fails, @errp contains corresponding error message, and the
* connection with the server is suspect. If it returns 0, then the
* transaction succeeded (although @request_ret may be a negative errno
* corresponding to the server's error reply), and errp is unchanged.
*/
static coroutine_fn int nbd_co_do_receive_one_chunk(
NBDClientSession *s, uint64_t handle, bool only_structured,
int *request_ret, QEMUIOVector *qiov, void **payload, Error **errp)
{
int ret;
int i = HANDLE_TO_INDEX(s, handle);
void *local_payload = NULL;
NBDStructuredReplyChunk *chunk;
if (payload) {
*payload = NULL;
}
*request_ret = 0;
/* Wait until we're woken up by nbd_read_reply_entry. */
s->requests[i].receiving = true;
qemu_coroutine_yield();
s->requests[i].receiving = false;
if (!s->ioc || s->quit) {
error_setg(errp, "Connection closed");
return -EIO;
}
assert(s->reply.handle == handle);
if (nbd_reply_is_simple(&s->reply)) {
if (only_structured) {
error_setg(errp, "Protocol error: simple reply when structured "
"reply chunk was expected");
return -EINVAL;
*reply = s->reply;
if (reply->handle != request->handle ||
!s->ioc) {
reply->error = EIO;
} else {
if (qiov && reply->error == 0) {
ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov, request->len,
true);
if (ret != request->len) {
reply->error = EIO;
}
}
*request_ret = -nbd_errno_to_system_errno(s->reply.simple.error);
if (*request_ret < 0 || !qiov) {
return 0;
}
return qio_channel_readv_all(s->ioc, qiov->iov, qiov->niov,
errp) < 0 ? -EIO : 0;
/* Tell the read handler to read another header. */
s->reply.handle = 0;
}
/* handle structured reply chunk */
assert(s->info.structured_reply);
chunk = &s->reply.structured;
if (chunk->type == NBD_REPLY_TYPE_NONE) {
if (!(chunk->flags & NBD_REPLY_FLAG_DONE)) {
error_setg(errp, "Protocol error: NBD_REPLY_TYPE_NONE chunk without"
" NBD_REPLY_FLAG_DONE flag set");
return -EINVAL;
}
if (chunk->length) {
error_setg(errp, "Protocol error: NBD_REPLY_TYPE_NONE chunk with"
" nonzero length");
return -EINVAL;
}
return 0;
}
if (chunk->type == NBD_REPLY_TYPE_OFFSET_DATA) {
if (!qiov) {
error_setg(errp, "Unexpected NBD_REPLY_TYPE_OFFSET_DATA chunk");
return -EINVAL;
}
return nbd_co_receive_offset_data_payload(s, s->requests[i].offset,
qiov, errp);
}
if (nbd_reply_type_is_error(chunk->type)) {
payload = &local_payload;
}
ret = nbd_co_receive_structured_payload(s, payload, errp);
if (ret < 0) {
return ret;
}
if (nbd_reply_type_is_error(chunk->type)) {
ret = nbd_parse_error_payload(chunk, local_payload, request_ret, errp);
g_free(local_payload);
return ret;
}
return 0;
}
/* nbd_co_receive_one_chunk
* Read reply, wake up read_reply_co and set s->quit if needed.
* Return value is a fatal error code or normal nbd reply error code
*/
static coroutine_fn int nbd_co_receive_one_chunk(
NBDClientSession *s, uint64_t handle, bool only_structured,
QEMUIOVector *qiov, NBDReply *reply, void **payload, Error **errp)
static void nbd_coroutine_start(NBDClientSession *s,
NBDRequest *request)
{
int request_ret;
int ret = nbd_co_do_receive_one_chunk(s, handle, only_structured,
&request_ret, qiov, payload, errp);
if (ret < 0) {
s->quit = true;
} else {
/* For assert at loop start in nbd_read_reply_entry */
if (reply) {
*reply = s->reply;
}
s->reply.handle = 0;
ret = request_ret;
/* Poor man semaphore. The free_sema is locked when no other request
* can be accepted, and unlocked after receiving one reply. */
if (s->in_flight == MAX_NBD_REQUESTS) {
qemu_co_queue_wait(&s->free_sema, NULL);
assert(s->in_flight < MAX_NBD_REQUESTS);
}
s->in_flight++;
/* s->recv_coroutine[i] is set as soon as we get the send_lock. */
}
static void nbd_coroutine_end(BlockDriverState *bs,
NBDRequest *request)
{
NBDClientSession *s = nbd_get_client_session(bs);
int i = HANDLE_TO_INDEX(s, request->handle);
s->recv_coroutine[i] = NULL;
s->in_flight--;
qemu_co_queue_next(&s->free_sema);
/* Kick the read_reply_co to get the next reply. */
if (s->read_reply_co) {
aio_co_wake(s->read_reply_co);
}
return ret;
}
typedef struct NBDReplyChunkIter {
int ret;
Error *err;
bool done, only_structured;
} NBDReplyChunkIter;
static void nbd_iter_error(NBDReplyChunkIter *iter, bool fatal,
int ret, Error **local_err)
{
assert(ret < 0);
if (fatal || iter->ret == 0) {
if (iter->ret != 0) {
error_free(iter->err);
iter->err = NULL;
}
iter->ret = ret;
error_propagate(&iter->err, *local_err);
} else {
error_free(*local_err);
}
*local_err = NULL;
}
/* NBD_FOREACH_REPLY_CHUNK
*/
#define NBD_FOREACH_REPLY_CHUNK(s, iter, handle, structured, \
qiov, reply, payload) \
for (iter = (NBDReplyChunkIter) { .only_structured = structured }; \
nbd_reply_chunk_iter_receive(s, &iter, handle, qiov, reply, payload);)
/* nbd_reply_chunk_iter_receive
*/
static bool nbd_reply_chunk_iter_receive(NBDClientSession *s,
NBDReplyChunkIter *iter,
uint64_t handle,
QEMUIOVector *qiov, NBDReply *reply,
void **payload)
{
int ret;
NBDReply local_reply;
NBDStructuredReplyChunk *chunk;
Error *local_err = NULL;
if (s->quit) {
error_setg(&local_err, "Connection closed");
nbd_iter_error(iter, true, -EIO, &local_err);
goto break_loop;
}
if (iter->done) {
/* Previous iteration was last. */
goto break_loop;
}
if (reply == NULL) {
reply = &local_reply;
}
ret = nbd_co_receive_one_chunk(s, handle, iter->only_structured,
qiov, reply, payload, &local_err);
if (ret < 0) {
/* If it is a fatal error s->quit is set by nbd_co_receive_one_chunk */
nbd_iter_error(iter, s->quit, ret, &local_err);
}
/* Do not execute the body of NBD_FOREACH_REPLY_CHUNK for simple reply. */
if (nbd_reply_is_simple(&s->reply) || s->quit) {
goto break_loop;
}
chunk = &reply->structured;
iter->only_structured = true;
if (chunk->type == NBD_REPLY_TYPE_NONE) {
/* NBD_REPLY_FLAG_DONE is already checked in nbd_co_receive_one_chunk */
assert(chunk->flags & NBD_REPLY_FLAG_DONE);
goto break_loop;
}
if (chunk->flags & NBD_REPLY_FLAG_DONE) {
/* This iteration is last. */
iter->done = true;
}
/* Execute the loop body */
return true;
break_loop:
s->requests[HANDLE_TO_INDEX(s, handle)].coroutine = NULL;
qemu_co_mutex_lock(&s->send_mutex);
s->in_flight--;
qemu_co_queue_next(&s->free_sema);
qemu_co_mutex_unlock(&s->send_mutex);
return false;
}
static int nbd_co_receive_return_code(NBDClientSession *s, uint64_t handle,
Error **errp)
{
NBDReplyChunkIter iter;
NBD_FOREACH_REPLY_CHUNK(s, iter, handle, false, NULL, NULL, NULL) {
/* nbd_reply_chunk_iter_receive does all the work */
}
error_propagate(errp, iter.err);
return iter.ret;
}
static int nbd_co_receive_cmdread_reply(NBDClientSession *s, uint64_t handle,
uint64_t offset, QEMUIOVector *qiov,
Error **errp)
{
NBDReplyChunkIter iter;
NBDReply reply;
void *payload = NULL;
Error *local_err = NULL;
NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply,
qiov, &reply, &payload)
{
int ret;
NBDStructuredReplyChunk *chunk = &reply.structured;
assert(nbd_reply_is_structured(&reply));
switch (chunk->type) {
case NBD_REPLY_TYPE_OFFSET_DATA:
/* special cased in nbd_co_receive_one_chunk, data is already
* in qiov */
break;
case NBD_REPLY_TYPE_OFFSET_HOLE:
ret = nbd_parse_offset_hole_payload(&reply.structured, payload,
offset, qiov, &local_err);
if (ret < 0) {
s->quit = true;
nbd_iter_error(&iter, true, ret, &local_err);
}
break;
default:
if (!nbd_reply_type_is_error(chunk->type)) {
/* not allowed reply type */
s->quit = true;
error_setg(&local_err,
"Unexpected reply type: %d (%s) for CMD_READ",
chunk->type, nbd_reply_type_lookup(chunk->type));
nbd_iter_error(&iter, true, -EINVAL, &local_err);
}
}
g_free(payload);
payload = NULL;
}
error_propagate(errp, iter.err);
return iter.ret;
}
static int nbd_co_request(BlockDriverState *bs, NBDRequest *request,
QEMUIOVector *write_qiov)
{
int ret;
Error *local_err = NULL;
NBDClientSession *client = nbd_get_client_session(bs);
assert(request->type != NBD_CMD_READ);
if (write_qiov) {
assert(request->type == NBD_CMD_WRITE);
assert(request->len == iov_size(write_qiov->iov, write_qiov->niov));
} else {
assert(request->type != NBD_CMD_WRITE);
}
ret = nbd_co_send_request(bs, request, write_qiov);
if (ret < 0) {
return ret;
}
ret = nbd_co_receive_return_code(client, request->handle, &local_err);
if (local_err) {
error_report_err(local_err);
}
return ret;
}
int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov, int flags)
{
int ret;
Error *local_err = NULL;
NBDClientSession *client = nbd_get_client_session(bs);
NBDRequest request = {
.type = NBD_CMD_READ,
.from = offset,
.len = bytes,
};
NBDReply reply;
ssize_t ret;
assert(bytes <= NBD_MAX_BUFFER_SIZE);
assert(!flags);
if (!bytes) {
return 0;
}
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL);
if (ret < 0) {
return ret;
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, qiov);
}
ret = nbd_co_receive_cmdread_reply(client, request.handle, offset, qiov,
&local_err);
if (local_err) {
error_report_err(local_err);
}
return ret;
nbd_coroutine_end(bs, &request);
return -reply.error;
}
int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
@@ -706,80 +241,112 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
.from = offset,
.len = bytes,
};
NBDReply reply;
ssize_t ret;
assert(!(client->info.flags & NBD_FLAG_READ_ONLY));
if (flags & BDRV_REQ_FUA) {
assert(client->info.flags & NBD_FLAG_SEND_FUA);
assert(client->nbdflags & NBD_FLAG_SEND_FUA);
request.flags |= NBD_CMD_FLAG_FUA;
}
assert(bytes <= NBD_MAX_BUFFER_SIZE);
if (!bytes) {
return 0;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, qiov);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL);
}
return nbd_co_request(bs, &request, qiov);
nbd_coroutine_end(bs, &request);
return -reply.error;
}
int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
int bytes, BdrvRequestFlags flags)
int count, BdrvRequestFlags flags)
{
ssize_t ret;
NBDClientSession *client = nbd_get_client_session(bs);
NBDRequest request = {
.type = NBD_CMD_WRITE_ZEROES,
.from = offset,
.len = bytes,
.len = count,
};
NBDReply reply;
assert(!(client->info.flags & NBD_FLAG_READ_ONLY));
if (!(client->info.flags & NBD_FLAG_SEND_WRITE_ZEROES)) {
if (!(client->nbdflags & NBD_FLAG_SEND_WRITE_ZEROES)) {
return -ENOTSUP;
}
if (flags & BDRV_REQ_FUA) {
assert(client->info.flags & NBD_FLAG_SEND_FUA);
assert(client->nbdflags & NBD_FLAG_SEND_FUA);
request.flags |= NBD_CMD_FLAG_FUA;
}
if (!(flags & BDRV_REQ_MAY_UNMAP)) {
request.flags |= NBD_CMD_FLAG_NO_HOLE;
}
if (!bytes) {
return 0;
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL);
}
return nbd_co_request(bs, &request, NULL);
nbd_coroutine_end(bs, &request);
return -reply.error;
}
int nbd_client_co_flush(BlockDriverState *bs)
{
NBDClientSession *client = nbd_get_client_session(bs);
NBDRequest request = { .type = NBD_CMD_FLUSH };
NBDReply reply;
ssize_t ret;
if (!(client->info.flags & NBD_FLAG_SEND_FLUSH)) {
if (!(client->nbdflags & NBD_FLAG_SEND_FLUSH)) {
return 0;
}
request.from = 0;
request.len = 0;
return nbd_co_request(bs, &request, NULL);
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL);
}
nbd_coroutine_end(bs, &request);
return -reply.error;
}
int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
{
NBDClientSession *client = nbd_get_client_session(bs);
NBDRequest request = {
.type = NBD_CMD_TRIM,
.from = offset,
.len = bytes,
.len = count,
};
NBDReply reply;
ssize_t ret;
assert(!(client->info.flags & NBD_FLAG_READ_ONLY));
if (!(client->info.flags & NBD_FLAG_SEND_TRIM) || !bytes) {
if (!(client->nbdflags & NBD_FLAG_SEND_TRIM)) {
return 0;
}
return nbd_co_request(bs, &request, NULL);
nbd_coroutine_start(client, &request);
ret = nbd_co_send_request(bs, &request, NULL);
if (ret < 0) {
reply.error = -ret;
} else {
nbd_co_receive_reply(client, &request, &reply, NULL);
}
nbd_coroutine_end(bs, &request);
return -reply.error;
}
void nbd_client_detach_aio_context(BlockDriverState *bs)
@@ -824,26 +391,20 @@ int nbd_client_init(BlockDriverState *bs,
logout("session init %s\n", export);
qio_channel_set_blocking(QIO_CHANNEL(sioc), true, NULL);
client->info.request_sizes = true;
client->info.structured_reply = true;
ret = nbd_receive_negotiate(QIO_CHANNEL(sioc), export,
&client->nbdflags,
tlscreds, hostname,
&client->ioc, &client->info, errp);
&client->ioc,
&client->size, errp);
if (ret < 0) {
logout("Failed to negotiate with the NBD server\n");
return ret;
}
if (client->info.flags & NBD_FLAG_READ_ONLY &&
!bdrv_is_read_only(bs)) {
error_setg(errp,
"request for write access conflicts with read-only export");
return -EACCES;
}
if (client->info.flags & NBD_FLAG_SEND_FUA) {
if (client->nbdflags & NBD_FLAG_SEND_FUA) {
bs->supported_write_flags = BDRV_REQ_FUA;
bs->supported_zero_flags |= BDRV_REQ_FUA;
}
if (client->info.flags & NBD_FLAG_SEND_WRITE_ZEROES) {
if (client->nbdflags & NBD_FLAG_SEND_WRITE_ZEROES) {
bs->supported_zero_flags |= BDRV_REQ_MAY_UNMAP;
}

View File

@@ -17,25 +17,19 @@
#define MAX_NBD_REQUESTS 16
typedef struct {
Coroutine *coroutine;
uint64_t offset; /* original offset of the request */
bool receiving; /* waiting for read_reply_co? */
} NBDClientRequest;
typedef struct NBDClientSession {
QIOChannelSocket *sioc; /* The master data channel */
QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
NBDExportInfo info;
uint16_t nbdflags;
off_t size;
CoMutex send_mutex;
CoQueue free_sema;
Coroutine *read_reply_co;
int in_flight;
NBDClientRequest requests[MAX_NBD_REQUESTS];
Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
NBDReply reply;
bool quit;
} NBDClientSession;
NBDClientSession *nbd_get_client_session(BlockDriverState *bs);
@@ -48,12 +42,12 @@ int nbd_client_init(BlockDriverState *bs,
Error **errp);
void nbd_client_close(BlockDriverState *bs);
int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes);
int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count);
int nbd_client_co_flush(BlockDriverState *bs);
int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov, int flags);
int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
int bytes, BdrvRequestFlags flags);
int count, BdrvRequestFlags flags);
int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov, int flags);

View File

@@ -32,11 +32,12 @@
#include "qemu/uri.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/option.h"
#include "qapi/qapi-visit-sockets.h"
#include "qapi-visit.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qobject-output-visitor.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include "qemu/cutils.h"
@@ -46,7 +47,7 @@ typedef struct BDRVNBDState {
NBDClientSession client;
/* For nbd_refresh_filename() */
SocketAddress *saddr;
SocketAddressFlat *saddr;
char *export, *tlscredsid;
} BDRVNBDState;
@@ -197,16 +198,16 @@ static void nbd_parse_filename(const char *filename, QDict *options,
qdict_put_str(options, "server.type", "unix");
qdict_put_str(options, "server.path", unixpath);
} else {
InetSocketAddress *addr = g_new(InetSocketAddress, 1);
InetSocketAddress *addr = NULL;
if (inet_parse(addr, host_spec, errp)) {
goto out_inet;
addr = inet_parse(host_spec, errp);
if (!addr) {
goto out;
}
qdict_put_str(options, "server.type", "inet");
qdict_put_str(options, "server.host", addr->host);
qdict_put_str(options, "server.port", addr->port);
out_inet:
qapi_free_InetSocketAddress(addr);
}
@@ -257,10 +258,10 @@ static bool nbd_process_legacy_socket_options(QDict *output_options,
return true;
}
static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options,
Error **errp)
static SocketAddressFlat *nbd_config(BDRVNBDState *s, QDict *options,
Error **errp)
{
SocketAddress *saddr = NULL;
SocketAddressFlat *saddr = NULL;
QDict *addr = NULL;
QObject *crumpled_addr = NULL;
Visitor *iv = NULL;
@@ -286,7 +287,7 @@ static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options,
* visitor expects the former.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_SocketAddress(iv, NULL, &saddr, &local_err);
visit_type_SocketAddressFlat(iv, NULL, &saddr, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto done;
@@ -305,9 +306,10 @@ NBDClientSession *nbd_get_client_session(BlockDriverState *bs)
return &s->client;
}
static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
static QIOChannelSocket *nbd_establish_connection(SocketAddressFlat *saddr_flat,
Error **errp)
{
SocketAddress *saddr = socket_address_crumple(saddr_flat);
QIOChannelSocket *sioc;
Error *local_err = NULL;
@@ -317,6 +319,7 @@ static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
qio_channel_socket_connect_sync(sioc,
saddr,
&local_err);
qapi_free_SocketAddress(saddr);
if (local_err) {
object_unref(OBJECT(sioc));
error_propagate(errp, local_err);
@@ -388,7 +391,6 @@ static QemuOptsList nbd_runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "ID of the TLS credentials to use",
},
{ /* end of list */ }
},
};
@@ -410,7 +412,7 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
goto error;
}
/* Translate @host, @port, and @path to a SocketAddress */
/* Translate @host, @port, and @path to a SocketAddressFlat */
if (!nbd_process_legacy_socket_options(options, opts, errp)) {
goto error;
}
@@ -431,7 +433,7 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
}
/* TODO SOCKET_ADDRESS_KIND_FD where fd has AF_INET or AF_INET6 */
if (s->saddr->type != SOCKET_ADDRESS_TYPE_INET) {
if (s->saddr->type != SOCKET_ADDRESS_FLAT_TYPE_INET) {
error_setg(errp, "TLS only supported over IP sockets");
goto error;
}
@@ -458,7 +460,7 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
object_unref(OBJECT(tlscreds));
}
if (ret < 0) {
qapi_free_SocketAddress(s->saddr);
qapi_free_SocketAddressFlat(s->saddr);
g_free(s->export);
g_free(s->tlscredsid);
}
@@ -473,19 +475,9 @@ static int nbd_co_flush(BlockDriverState *bs)
static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
{
NBDClientSession *s = nbd_get_client_session(bs);
uint32_t min = s->info.min_block;
uint32_t max = MIN_NON_ZERO(NBD_MAX_BUFFER_SIZE, s->info.max_block);
bs->bl.request_alignment = min ? min : BDRV_SECTOR_SIZE;
bs->bl.max_pdiscard = max;
bs->bl.max_pwrite_zeroes = max;
bs->bl.max_transfer = max;
if (s->info.opt_block &&
s->info.opt_block > bs->bl.opt_transfer) {
bs->bl.opt_transfer = s->info.opt_block;
}
bs->bl.max_pdiscard = NBD_MAX_BUFFER_SIZE;
bs->bl.max_pwrite_zeroes = NBD_MAX_BUFFER_SIZE;
bs->bl.max_transfer = NBD_MAX_BUFFER_SIZE;
}
static void nbd_close(BlockDriverState *bs)
@@ -494,7 +486,7 @@ static void nbd_close(BlockDriverState *bs)
nbd_client_close(bs);
qapi_free_SocketAddress(s->saddr);
qapi_free_SocketAddressFlat(s->saddr);
g_free(s->export);
g_free(s->tlscredsid);
}
@@ -503,7 +495,7 @@ static int64_t nbd_getlength(BlockDriverState *bs)
{
BDRVNBDState *s = bs->opaque;
return s->client.info.size;
return s->client.size;
}
static void nbd_detach_aio_context(BlockDriverState *bs)
@@ -525,13 +517,13 @@ static void nbd_refresh_filename(BlockDriverState *bs, QDict *options)
Visitor *ov;
const char *host = NULL, *port = NULL, *path = NULL;
if (s->saddr->type == SOCKET_ADDRESS_TYPE_INET) {
if (s->saddr->type == SOCKET_ADDRESS_FLAT_TYPE_INET) {
const InetSocketAddress *inet = &s->saddr->u.inet;
if (!inet->has_ipv4 && !inet->has_ipv6 && !inet->has_to) {
host = inet->host;
port = inet->port;
}
} else if (s->saddr->type == SOCKET_ADDRESS_TYPE_UNIX) {
} else if (s->saddr->type == SOCKET_ADDRESS_FLAT_TYPE_UNIX) {
path = s->saddr->u.q_unix.path;
} /* else can't represent as pseudo-filename */
@@ -552,7 +544,7 @@ static void nbd_refresh_filename(BlockDriverState *bs, QDict *options)
}
ov = qobject_output_visitor_new(&saddr_qdict);
visit_type_SocketAddress(ov, NULL, &s->saddr, &error_abort);
visit_type_SocketAddressFlat(ov, NULL, &s->saddr, &error_abort);
visit_complete(ov, &saddr_qdict);
visit_free(ov);
qdict_put_obj(opts, "server", saddr_qdict);

View File

@@ -1,7 +1,7 @@
/*
* QEMU Block driver for native access to files on NFS shares
*
* Copyright (c) 2014-2017 Peter Lieven <pl@kamp.de>
* Copyright (c) 2014-2016 Peter Lieven <pl@kamp.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -25,19 +25,20 @@
#include "qemu/osdep.h"
#include <poll.h>
#include "qemu-common.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
#include "qapi/error.h"
#include "block/block_int.h"
#include "trace.h"
#include "qemu/iov.h"
#include "qemu/option.h"
#include "qemu/uri.h"
#include "qemu/cutils.h"
#include "sysemu/sysemu.h"
#include "qapi/qapi-visit-block-core.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include "qapi-visit.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi/qobject-output-visitor.h"
#include <nfsc/libnfs.h>
@@ -367,6 +368,49 @@ static int coroutine_fn nfs_co_flush(BlockDriverState *bs)
return task.ret;
}
static QemuOptsList runtime_opts = {
.name = "nfs",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "path",
.type = QEMU_OPT_STRING,
.help = "Path of the image on the host",
},
{
.name = "user",
.type = QEMU_OPT_NUMBER,
.help = "UID value to use when talking to the server",
},
{
.name = "group",
.type = QEMU_OPT_NUMBER,
.help = "GID value to use when talking to the server",
},
{
.name = "tcp-syn-count",
.type = QEMU_OPT_NUMBER,
.help = "Number of SYNs to send during the session establish",
},
{
.name = "readahead-size",
.type = QEMU_OPT_NUMBER,
.help = "Set the readahead size in bytes",
},
{
.name = "page-cache-size",
.type = QEMU_OPT_NUMBER,
.help = "Set the pagecache size in bytes",
},
{
.name = "debug",
.type = QEMU_OPT_NUMBER,
.help = "Set the NFS debug level (max 2)",
},
{ /* end of list */ }
},
};
static void nfs_detach_aio_context(BlockDriverState *bs)
{
NFSClient *client = bs->opaque;
@@ -409,16 +453,71 @@ static void nfs_file_close(BlockDriverState *bs)
nfs_client_close(client);
}
static int64_t nfs_client_open(NFSClient *client, BlockdevOptionsNfs *opts,
int flags, int open_flags, Error **errp)
static NFSServer *nfs_config(QDict *options, Error **errp)
{
int64_t ret = -EINVAL;
NFSServer *server = NULL;
QDict *addr = NULL;
QObject *crumpled_addr = NULL;
Visitor *iv = NULL;
Error *local_error = NULL;
qdict_extract_subqdict(options, &addr, "server.");
if (!qdict_size(addr)) {
error_setg(errp, "NFS server address missing");
goto out;
}
crumpled_addr = qdict_crumple(addr, errp);
if (!crumpled_addr) {
goto out;
}
/*
* Caution: this works only because all scalar members of
* NFSServer are QString in @crumpled_addr. The visitor expects
* @crumpled_addr to be typed according to the QAPI schema. It
* is when @options come from -blockdev or blockdev_add. But when
* they come from -drive, they're all QString.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_NFSServer(iv, NULL, &server, &local_error);
if (local_error) {
error_propagate(errp, local_error);
goto out;
}
out:
QDECREF(addr);
qobject_decref(crumpled_addr);
visit_free(iv);
return server;
}
static int64_t nfs_client_open(NFSClient *client, QDict *options,
int flags, Error **errp, int open_flags)
{
int ret = -EINVAL;
QemuOpts *opts = NULL;
Error *local_err = NULL;
struct stat st;
char *file = NULL, *strp = NULL;
qemu_mutex_init(&client->mutex);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
client->path = g_strdup(opts->path);
client->path = g_strdup(qemu_opt_get(opts, "path"));
if (!client->path) {
ret = -EINVAL;
error_setg(errp, "No path was specified");
goto fail;
}
strp = strrchr(client->path, '/');
if (strp == NULL) {
@@ -428,10 +527,12 @@ static int64_t nfs_client_open(NFSClient *client, BlockdevOptionsNfs *opts,
file = g_strdup(strp);
*strp = 0;
/* Steal the NFSServer object from opts; set the original pointer to NULL
* to avoid use after free and double free. */
client->server = opts->server;
opts->server = NULL;
/* Pop the config into our state object, Exit if invalid */
client->server = nfs_config(options, errp);
if (!client->server) {
ret = -EINVAL;
goto fail;
}
client->context = nfs_init_context();
if (client->context == NULL) {
@@ -439,32 +540,32 @@ static int64_t nfs_client_open(NFSClient *client, BlockdevOptionsNfs *opts,
goto fail;
}
if (opts->has_user) {
client->uid = opts->user;
if (qemu_opt_get(opts, "user")) {
client->uid = qemu_opt_get_number(opts, "user", 0);
nfs_set_uid(client->context, client->uid);
}
if (opts->has_group) {
client->gid = opts->group;
if (qemu_opt_get(opts, "group")) {
client->gid = qemu_opt_get_number(opts, "group", 0);
nfs_set_gid(client->context, client->gid);
}
if (opts->has_tcp_syn_count) {
client->tcp_syncnt = opts->tcp_syn_count;
if (qemu_opt_get(opts, "tcp-syn-count")) {
client->tcp_syncnt = qemu_opt_get_number(opts, "tcp-syn-count", 0);
nfs_set_tcp_syncnt(client->context, client->tcp_syncnt);
}
#ifdef LIBNFS_FEATURE_READAHEAD
if (opts->has_readahead_size) {
if (qemu_opt_get(opts, "readahead-size")) {
if (open_flags & BDRV_O_NOCACHE) {
error_setg(errp, "Cannot enable NFS readahead "
"if cache.direct = on");
goto fail;
}
client->readahead = opts->readahead_size;
client->readahead = qemu_opt_get_number(opts, "readahead-size", 0);
if (client->readahead > QEMU_NFS_MAX_READAHEAD_SIZE) {
warn_report("Truncating NFS readahead size to %d",
QEMU_NFS_MAX_READAHEAD_SIZE);
error_report("NFS Warning: Truncating NFS readahead "
"size to %d", QEMU_NFS_MAX_READAHEAD_SIZE);
client->readahead = QEMU_NFS_MAX_READAHEAD_SIZE;
}
nfs_set_readahead(client->context, client->readahead);
@@ -476,16 +577,16 @@ static int64_t nfs_client_open(NFSClient *client, BlockdevOptionsNfs *opts,
#endif
#ifdef LIBNFS_FEATURE_PAGECACHE
if (opts->has_page_cache_size) {
if (qemu_opt_get(opts, "page-cache-size")) {
if (open_flags & BDRV_O_NOCACHE) {
error_setg(errp, "Cannot enable NFS pagecache "
"if cache.direct = on");
goto fail;
}
client->pagecache = opts->page_cache_size;
client->pagecache = qemu_opt_get_number(opts, "page-cache-size", 0);
if (client->pagecache > QEMU_NFS_MAX_PAGECACHE_SIZE) {
warn_report("Truncating NFS pagecache size to %d pages",
QEMU_NFS_MAX_PAGECACHE_SIZE);
error_report("NFS Warning: Truncating NFS pagecache "
"size to %d pages", QEMU_NFS_MAX_PAGECACHE_SIZE);
client->pagecache = QEMU_NFS_MAX_PAGECACHE_SIZE;
}
nfs_set_pagecache(client->context, client->pagecache);
@@ -495,13 +596,13 @@ static int64_t nfs_client_open(NFSClient *client, BlockdevOptionsNfs *opts,
#endif
#ifdef LIBNFS_FEATURE_DEBUG
if (opts->has_debug) {
client->debug = opts->debug;
if (qemu_opt_get(opts, "debug")) {
client->debug = qemu_opt_get_number(opts, "debug", 0);
/* limit the maximum debug level to avoid potential flooding
* of our log files. */
if (client->debug > QEMU_NFS_MAX_DEBUG_LEVEL) {
warn_report("Limiting NFS debug level to %d",
QEMU_NFS_MAX_DEBUG_LEVEL);
error_report("NFS Warning: Limiting NFS debug level "
"to %d", QEMU_NFS_MAX_DEBUG_LEVEL);
client->debug = QEMU_NFS_MAX_DEBUG_LEVEL;
}
nfs_set_debug(client->context, client->debug);
@@ -547,53 +648,11 @@ static int64_t nfs_client_open(NFSClient *client, BlockdevOptionsNfs *opts,
fail:
nfs_client_close(client);
out:
qemu_opts_del(opts);
g_free(file);
return ret;
}
static BlockdevOptionsNfs *nfs_options_qdict_to_qapi(QDict *options,
Error **errp)
{
BlockdevOptionsNfs *opts = NULL;
QObject *crumpled = NULL;
Visitor *v;
Error *local_err = NULL;
crumpled = qdict_crumple(options, errp);
if (crumpled == NULL) {
return NULL;
}
v = qobject_input_visitor_new_keyval(crumpled);
visit_type_BlockdevOptionsNfs(v, NULL, &opts, &local_err);
visit_free(v);
qobject_decref(crumpled);
if (local_err) {
return NULL;
}
return opts;
}
static int64_t nfs_client_open_qdict(NFSClient *client, QDict *options,
int flags, int open_flags, Error **errp)
{
BlockdevOptionsNfs *opts;
int ret;
opts = nfs_options_qdict_to_qapi(options, errp);
if (opts == NULL) {
ret = -EINVAL;
goto fail;
}
ret = nfs_client_open(client, opts, flags, open_flags, errp);
fail:
qapi_free_BlockdevOptionsNfs(opts);
return ret;
}
static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp) {
NFSClient *client = bs->opaque;
@@ -601,9 +660,9 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
client->aio_context = bdrv_get_aio_context(bs);
ret = nfs_client_open_qdict(client, options,
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
bs->open_flags, errp);
ret = nfs_client_open(client, options,
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
errp, bs->open_flags);
if (ret < 0) {
return ret;
}
@@ -626,43 +685,18 @@ static QemuOptsList nfs_create_opts = {
}
};
static int nfs_file_co_create(BlockdevCreateOptions *options, Error **errp)
static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
{
BlockdevCreateOptionsNfs *opts = &options->u.nfs;
int ret = 0;
int64_t total_size = 0;
NFSClient *client = g_new0(NFSClient, 1);
int ret;
assert(options->driver == BLOCKDEV_DRIVER_NFS);
QDict *options = NULL;
client->aio_context = qemu_get_aio_context();
ret = nfs_client_open(client, opts->location, O_CREAT, 0, errp);
if (ret < 0) {
goto out;
}
ret = nfs_ftruncate(client->context, client->fh, opts->size);
nfs_client_close(client);
out:
g_free(client);
return ret;
}
static int coroutine_fn nfs_file_co_create_opts(const char *url, QemuOpts *opts,
Error **errp)
{
BlockdevCreateOptions *create_options;
BlockdevCreateOptionsNfs *nfs_opts;
QDict *options;
int ret;
create_options = g_new0(BlockdevCreateOptions, 1);
create_options->driver = BLOCKDEV_DRIVER_NFS;
nfs_opts = &create_options->u.nfs;
/* Read out options */
nfs_opts->size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
options = qdict_new();
ret = nfs_parse_uri(url, options, errp);
@@ -670,21 +704,15 @@ static int coroutine_fn nfs_file_co_create_opts(const char *url, QemuOpts *opts,
goto out;
}
nfs_opts->location = nfs_options_qdict_to_qapi(options, errp);
if (nfs_opts->location == NULL) {
ret = -EINVAL;
goto out;
}
ret = nfs_file_co_create(create_options, errp);
ret = nfs_client_open(client, options, O_CREAT, errp, 0);
if (ret < 0) {
goto out;
}
ret = 0;
ret = nfs_ftruncate(client->context, client->fh, total_size);
nfs_client_close(client);
out:
QDECREF(options);
qapi_free_BlockdevCreateOptions(create_options);
g_free(client);
return ret;
}
@@ -707,9 +735,7 @@ nfs_get_allocated_file_size_cb(int ret, struct nfs_context *nfs, void *data,
if (task->ret < 0) {
error_report("NFS Error: %s", nfs_get_error(nfs));
}
/* Set task->complete before reading bs->wakeup. */
atomic_mb_set(&task->complete, 1);
task->complete = 1;
bdrv_wakeup(task->bs);
}
@@ -737,25 +763,10 @@ static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
return (task.ret < 0 ? task.ret : st.st_blocks * 512);
}
static int nfs_file_truncate(BlockDriverState *bs, int64_t offset,
PreallocMode prealloc, Error **errp)
static int nfs_file_truncate(BlockDriverState *bs, int64_t offset)
{
NFSClient *client = bs->opaque;
int ret;
if (prealloc != PREALLOC_MODE_OFF) {
error_setg(errp, "Unsupported preallocation mode '%s'",
PreallocMode_str(prealloc));
return -ENOTSUP;
}
ret = nfs_ftruncate(client->context, client->fh, offset);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to truncate file");
return ret;
}
return 0;
return nfs_ftruncate(client->context, client->fh, offset);
}
/* Note that this will not re-establish a connection with the NFS server
@@ -849,8 +860,8 @@ static void nfs_refresh_filename(BlockDriverState *bs, QDict *options)
}
#ifdef LIBNFS_FEATURE_PAGECACHE
static void coroutine_fn nfs_co_invalidate_cache(BlockDriverState *bs,
Error **errp)
static void nfs_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
NFSClient *client = bs->opaque;
nfs_pagecache_invalidate(client->context, client->fh);
@@ -871,8 +882,7 @@ static BlockDriver bdrv_nfs = {
.bdrv_file_open = nfs_file_open,
.bdrv_close = nfs_file_close,
.bdrv_co_create = nfs_file_co_create,
.bdrv_co_create_opts = nfs_file_co_create_opts,
.bdrv_create = nfs_file_create,
.bdrv_reopen_prepare = nfs_reopen_prepare,
.bdrv_co_preadv = nfs_co_preadv,
@@ -884,7 +894,7 @@ static BlockDriver bdrv_nfs = {
.bdrv_refresh_filename = nfs_refresh_filename,
#ifdef LIBNFS_FEATURE_PAGECACHE
.bdrv_co_invalidate_cache = nfs_co_invalidate_cache,
.bdrv_invalidate_cache = nfs_invalidate_cache,
#endif
};

View File

@@ -14,7 +14,6 @@
#include "qapi/error.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "qemu/option.h"
#include "block/block_int.h"
#define NULL_OPT_LATENCY "latency-ns"
@@ -30,6 +29,11 @@ static QemuOptsList runtime_opts = {
.name = "null",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "",
},
{
.name = BLOCK_OPT_SIZE,
.type = QEMU_OPT_SIZE,
@@ -50,30 +54,6 @@ static QemuOptsList runtime_opts = {
},
};
static void null_co_parse_filename(const char *filename, QDict *options,
Error **errp)
{
/* This functions only exists so that a null-co:// filename is accepted
* with the null-co driver. */
if (strcmp(filename, "null-co://")) {
error_setg(errp, "The only allowed filename for this driver is "
"'null-co://'");
return;
}
}
static void null_aio_parse_filename(const char *filename, QDict *options,
Error **errp)
{
/* This functions only exists so that a null-aio:// filename is accepted
* with the null-aio driver. */
if (strcmp(filename, "null-aio://")) {
error_setg(errp, "The only allowed filename for this driver is "
"'null-aio://'");
return;
}
}
static int null_file_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -111,7 +91,8 @@ static coroutine_fn int null_co_common(BlockDriverState *bs)
BDRVNullState *s = bs->opaque;
if (s->latency_ns) {
qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, s->latency_ns);
co_aio_sleep_ns(bdrv_get_aio_context(bs), QEMU_CLOCK_REALTIME,
s->latency_ns);
}
return 0;
}
@@ -223,23 +204,22 @@ static int null_reopen_prepare(BDRVReopenState *reopen_state,
return 0;
}
static int coroutine_fn null_co_block_status(BlockDriverState *bs,
bool want_zero, int64_t offset,
int64_t bytes, int64_t *pnum,
int64_t *map,
BlockDriverState **file)
static int64_t coroutine_fn null_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum,
BlockDriverState **file)
{
BDRVNullState *s = bs->opaque;
int ret = BDRV_BLOCK_OFFSET_VALID;
off_t start = sector_num * BDRV_SECTOR_SIZE;
*pnum = bytes;
*map = offset;
*pnum = nb_sectors;
*file = bs;
if (s->read_zeroes) {
ret |= BDRV_BLOCK_ZERO;
return BDRV_BLOCK_OFFSET_VALID | start | BDRV_BLOCK_ZERO;
} else {
return BDRV_BLOCK_OFFSET_VALID | start;
}
return ret;
}
static void null_refresh_filename(BlockDriverState *bs, QDict *opts)
@@ -262,7 +242,6 @@ static BlockDriver bdrv_null_co = {
.instance_size = sizeof(BDRVNullState),
.bdrv_file_open = null_file_open,
.bdrv_parse_filename = null_co_parse_filename,
.bdrv_close = null_close,
.bdrv_getlength = null_getlength,
@@ -271,7 +250,7 @@ static BlockDriver bdrv_null_co = {
.bdrv_co_flush_to_disk = null_co_flush,
.bdrv_reopen_prepare = null_reopen_prepare,
.bdrv_co_block_status = null_co_block_status,
.bdrv_co_get_block_status = null_co_get_block_status,
.bdrv_refresh_filename = null_refresh_filename,
};
@@ -282,7 +261,6 @@ static BlockDriver bdrv_null_aio = {
.instance_size = sizeof(BDRVNullState),
.bdrv_file_open = null_file_open,
.bdrv_parse_filename = null_aio_parse_filename,
.bdrv_close = null_close,
.bdrv_getlength = null_getlength,
@@ -291,7 +269,7 @@ static BlockDriver bdrv_null_aio = {
.bdrv_aio_flush = null_aio_flush,
.bdrv_reopen_prepare = null_reopen_prepare,
.bdrv_co_block_status = null_co_block_status,
.bdrv_co_get_block_status = null_co_get_block_status,
.bdrv_refresh_filename = null_refresh_filename,
};

File diff suppressed because it is too large Load Diff

Some files were not shown because too many files have changed in this diff Show More