References: bsc#1190011 (CVE-2021-3750)
40p and g3beige will hang forever with these messages:
qemu-system-ppc: warning: Blocked re-entrant IO on MemoryRegion: pci-conf-idx at addr: 0x0
qemu-system-ppc64: warning: Blocked re-entrant IO on MemoryRegion: lpc-hc at addr: 0x34
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Git-commit: f6b0de53fb
References: bsc#1212968 CVE-2023-2861
The 9p protocol does not specifically define how server shall behave when
client tries to open a special file, however from security POV it does
make sense for 9p server to prohibit opening any special file on host side
in general. A sane Linux 9p client for instance would never attempt to
open a special file on host side, it would always handle those exclusively
on its guest side. A malicious client however could potentially escape
from the exported 9p tree by creating and opening a device file on host
side.
With QEMU this could only be exploited in the following unsafe setups:
- Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
security model.
or
- Using 9p 'proxy' fs driver (which is running its helper daemon as
root).
These setups were already discouraged for safety reasons before,
however for obvious reasons we are now tightening behaviour on this.
Fixes: CVE-2023-2861
Reported-by: Yanwu Shen <ywsPlz@gmail.com>
Reported-by: Jietao Xiao <shawtao1125@gmail.com>
Reported-by: Jinku Li <jkli@xidian.edu.cn>
Reported-by: Wenbo Shen <shenwenbo@zju.edu.cn>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <E1q6w7r-0000Q0-NM@lizzy.crudebyte.com>
Signed-off-by: Li Zhang <lizhang@suse.de>
Guest driver might execute HW commands when shared buffers are not yet
allocated.
This could happen on purpose (malicious guest) or because of some other
guest/host address mapping error.
We need to protect againts such case.
Fixes: CVE-2022-1050
Reported-by: Raven <wxhusst@gmail.com>
Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
Message-Id: <20220403095234.2210-1-yuval.shaia.ml@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Li Zhang <lizhang@suse.de>
Git-commit: 3ab6fdc91b
Resolves: bsc#1190011 (CVE-2021-3750)
Add the 'memory' bit to the memory attributes to restrict bus
controller accesses to memories.
Introduce flatview_access_allowed() to check bus permission
before running any bus transaction.
Have read/write accessors return MEMTX_ACCESS_ERROR if an access is
restricted.
There is no change for the default case where 'memory' is not set.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211215182421.418374-4-philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[thuth: Replaced MEMTX_BUS_ERROR with MEMTX_ACCESS_ERROR, remove "inline"]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 205ccfd7a5
References: bsc#1188609 (CVE-2021-3638)
When building QEMU with DEBUG_ATI defined then running with
'-device ati-vga,romfile="" -d unimp,guest_errors -trace ati\*'
we get:
ati_mm_write 4 0x16c0 DP_CNTL <- 0x1
ati_mm_write 4 0x146c DP_GUI_MASTER_CNTL <- 0x2
ati_mm_write 4 0x16c8 DP_MIX <- 0xff0000
ati_mm_write 4 0x16c4 DP_DATATYPE <- 0x2
ati_mm_write 4 0x224 CRTC_OFFSET <- 0x0
ati_mm_write 4 0x142c DST_PITCH_OFFSET <- 0xfe00000
ati_mm_write 4 0x1420 DST_Y <- 0x3fff
ati_mm_write 4 0x1410 DST_HEIGHT <- 0x3fff
ati_mm_write 4 0x1588 DST_WIDTH_X <- 0x3fff3fff
ati_2d_blt: vram:0x7fff5fa00000 addr:0 ds:0x7fff61273800 stride:2560 bpp:32 rop:0xff
ati_2d_blt: 0 0 0, 0 127 0, (0,0) -> (16383,16383) 16383x16383 > ^
ati_2d_blt: pixman_fill(dst:0x7fff5fa00000, stride:254, bpp:8, x:16383, y:16383, w:16383, h:16383, xor:0xff000000)
Thread 3 "qemu-system-i38" received signal SIGSEGV, Segmentation fault.
(gdb) bt
#0 0x00007ffff7f62ce0 in sse2_fill.lto_priv () at /lib64/libpixman-1.so.0
#1 0x00007ffff7f09278 in pixman_fill () at /lib64/libpixman-1.so.0
#2 0x0000555557b5a9af in ati_2d_blt (s=0x631000028800) at hw/display/ati_2d.c:196
#3 0x0000555557b4b5a2 in ati_mm_write (opaque=0x631000028800, addr=5512, data=1073692671, size=4) at hw/display/ati.c:843
#4 0x0000555558b90ec4 in memory_region_write_accessor (mr=0x631000039cc0, addr=5512, ..., size=4, ...) at softmmu/memory.c:492
Commit 584acf34cb ("ati-vga: Fix reverse bit blts") introduced
the local dst_x and dst_y which adjust the (x, y) coordinates
depending on the direction in the SRCCOPY ROP3 operation, but
forgot to address the same issue for the PATCOPY, BLACKNESS and
WHITENESS operations, which also call pixman_fill().
Fix that now by using the adjusted coordinates in the pixman_fill
call, and update the related debug printf().
Reported-by: Qiang Liu <qiangliu@zju.edu.cn>
Fixes: 584acf34cb ("ati-vga: Fix reverse bit blts")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Mauro Matteo Cascella <mcascell@redhat.com>
Message-Id: <20210906153103.1661195-1-philmd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 10be627d2b
References: bsc#1212850 (CVE-2023-3354)
The TLS handshake make take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.
CVE-2023-3354
Reported-by: jiangyegen <jiangyegen@huawei.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com
DF: Open-code g_clear_handle_id(), as it's not available in glib
until version 2.56.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 9d38a84347
References: bsc#1213925 (CVE-2023-3180)
For symmetric algorithms, the length of ciphertext must be as same
as the plaintext.
The missing verification of the src_len and the dst_len in
virtio_crypto_sym_op_helper() may lead buffer overflow/divulged.
This patch is originally written by Yiming Tao for QEMU-SECURITY,
resend it(a few changes of error message) in qemu-devel.
Fixes: CVE-2023-3180
Fixes: 04b9b37edda("virtio-crypto: add data queue processing handler")
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Cc: Yiming Tao <taoym@zju.edu.cn>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230803024314.29962-2-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 3059344f01
References: bsc#1172382 (CVE-2020-13754)
The "BCM2835 ARM Peripherals" datasheet [*] chapter 2
("Auxiliaries: UART1 & SPI1, SPI2"), list the register
sizes as 3/8/16/32 bits. We assume this means this
peripheral allows 8-bit accesses.
This was not an issue until commit 5d971f9e67 which reverted
("memory: accept mismatching sizes in memory_region_access_valid").
The model is implemented as 32-bit accesses (see commit 97398d900c,
all registers are 32-bit) so replace MemoryRegionOps.valid as
MemoryRegionOps.impl, and re-introduce MemoryRegionOps.valid
with a 8/32-bit range.
[*] https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
Fixes: 97398d900c ("bcm2835_aux: add emulation of BCM2835 AUX (aka UART1) block")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201002181032.1899463-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 8e67fda2dd
References: bsc#1172382 (CVE-2020-13754)
QEMU XHCI advertises AC64 (64-bit addressing) but doesn't allow
64-bit mode access in "runtime" and "operational" MemoryRegionOps.
Set the max_access_size based on sizeof(dma_addr_t) as AC64 is set.
XHCI specs:
"If the xHC supports 64-bit addressing (AC64 = ‘1’), then software
should write 64-bit registers using only Qword accesses. If a
system is incapable of issuing Qword accesses, then writes to the
64-bit address fields shall be performed using 2 Dword accesses;
low Dword-first, high-Dword second. If the xHC supports 32-bit
addressing (AC64 = ‘0’), then the high Dword of registers containing
64-bit address fields are unused and software should write addresses
using only Dword accesses"
The problem has been detected with SLOF, as linux kernel always accesses
registers using 32-bit access even if AC64 is set and revealed by
5d971f9e67 ("memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"")
Suggested-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 20200721083322.90651-1-lvivier@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 89ed83d8b2
References: bsc#1172382 (CVE-2020-13754)
The memory region ops have min_access_size == 4 so obey it.
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 4b7c06837a
References: bsc#1172382 (CVE-2020-13754)
The memory region ops have min_access_size == 4 so obey it.
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 5d971f9e67
References: bsc#1172382 (CVE-2020-13754)
Memory API documentation documents valid .min_access_size and .max_access_size
fields and explains that any access outside these boundaries is blocked.
This is what devices seem to assume.
However this is not what the implementation does: it simply
ignores the boundaries unless there's an "accepts" callback.
Naturally, this breaks a bunch of devices.
Revert to the documented behaviour.
Devices that want to allow any access can just drop the valid field,
or add the impl field to have accesses converted to appropriate
length.
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Fixes: CVE-2020-13754
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1842363
Fixes: a014ed07bd ("memory: accept mismatching sizes in memory_region_access_valid")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200610134731.1514409-1-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 736b01642d
References: bsc#1193880,CVE-2021-3929
This fixes CVE-2021-3929 "locally" by denying DMA to the iomem of the
device itself. This still allows DMA to MMIO regions of other devices
(e.g. doing P2P DMA to the controller memory buffer of another NVMe
device).
Fixes: CVE-2021-3929
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: b987718bbb
Referecens: bsc#1207205 (CVE-2023-0330)
We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function has already a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here that it also takes into account if the function
has been called too often in a reentrant way.
The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 6dbbf05514
References: bsc#1205808, CVE-2022-4144
Have qxl_get_check_slot_offset() return false if the requested
buffer size does not fit within the slot memory region.
Similarly qxl_phys2virt() now returns NULL in such case, and
qxl_dirty_one_surface() aborts.
This avoids buffer overrun in the host pointer returned by
memory_region_get_ram_ptr().
Fixes: CVE-2022-4144 (out-of-bounds read)
Reported-by: Wenxu Yin (@awxylitol)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1336
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-5-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 8efec0ef8b
References: bsc#1205808, CVE-2022-4144
Currently qxl_phys2virt() doesn't check for buffer overrun.
In order to do so in the next commit, pass the buffer size
as argument.
For QXLCursor in qxl_render_cursor() -> qxl_cursor() we
verify the size of the chunked data ahead, checking we can
access 'sizeof(QXLCursor) + chunk->data_size' bytes.
Since in the SPICE_CURSOR_TYPE_MONO case the cursor is
assumed to fit in one chunk, no change are required.
In SPICE_CURSOR_TYPE_ALPHA the ahead read is handled in
qxl_unpack_chunks().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-4-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 61c34fc194
References: bac#1205808, CVE-2022-4144
Only 3 command types are logged: no need to call qxl_phys2virt()
for the other types. Using different cases will help to pass
different structure sizes to qxl_phys2virt() in a pair of commits.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-2-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: defac5e2fb
References: bsc#1185000, CVE-2021-3507
Per the 82078 datasheet, if the end-of-track (EOT byte in
the FIFO) is more than the number of sectors per side, the
command is terminated unsuccessfully:
* 5.2.5 DATA TRANSFER TERMINATION
The 82078 supports terminal count explicitly through
the TC pin and implicitly through the underrun/over-
run and end-of-track (EOT) functions. For full sector
transfers, the EOT parameter can define the last
sector to be transferred in a single or multisector
transfer. If the last sector to be transferred is a par-
tial sector, the host can stop transferring the data in
mid-sector, and the 82078 will continue to complete
the sector as if a hardware TC was received. The
only difference between these implicit functions and
TC is that they return "abnormal termination" result
status. Such status indications can be ignored if they
were expected.
* 6.1.3 READ TRACK
This command terminates when the EOT specified
number of sectors have been read. If the 82078
does not find an I D Address Mark on the diskette
after the second· occurrence of a pulse on the
INDX# pin, then it sets the IC code in Status Regis-
ter 0 to "01" (Abnormal termination), sets the MA bit
in Status Register 1 to "1", and terminates the com-
mand.
* 6.1.6 VERIFY
Refer to Table 6-6 and Table 6-7 for information
concerning the values of MT and EC versus SC and
EOT value.
* Table 6·6. Result Phase Table
* Table 6-7. Verify Command Result Phase Table
Fix by aborting the transfer when EOT > # Sectors Per Side.
Cc: qemu-stable@nongnu.org
Cc: Hervé Poussineau <hpoussin@reactos.org>
Fixes: baca51faff ("floppy driver: disk geometry auto detect")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/339
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211118115733.4038610-2-philmd@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Move the check for SG_IO errors after the VPD block limits emulation,
otherwise the emulation will never the triggered.
References: bsc#1202364
Signed-off-by: Hannes Reinecke <hare@suse.de>
Git-commit: 51e15194b0
References: bsc#1202364
Commits 01ef8185b8 amd 24b36e9813 updated the way that the maximum
transfer length is calculated for patching block limits VPD page in an
INQUIRY response.
The same updates also need to be made for the case where the host device
does not support the block limits VPD page at all and we emulate the
whole page.
Without this fix, on host block devices a maximum transfer length of
(INT_MAX - sector_size) bytes is advertised to the guest, resulting in
I/O errors when a request that exceeds the host limits is made by the
guest. (Prior to commit 24b36e9813, this code path would use the
max_transfer value from the host instead of INT_MAX, but still miss the
fix from 01ef8185b8 where max_transfer is also capped to max_iov
host pages, so it would be less wrong, but still wrong.)
Cc: qemu-stable@nongnu.org
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2096251
Fixes: 01ef8185b8
Fixes: 24b36e9813
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220822125320.48257-1-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 51e15194b0)
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 06b80795ee
References: bsc#1202364
When the device doesn't support the VPD block limits page, we emulate it even
for SCSI passthrough.
As a part of the emulation we need to add it to the 'Supported VPD Pages'
The code that does this adds it to the page, but it doesn't increase the length
of the data to be copied to the guest, thus the guest never sees the VPD block
limits page as supported.
Bump the transfer size by 1 in this case.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20201217165612.942849-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 06b80795ee)
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: cc07162953
References: bsc#1190425
Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
sources, IOV_MAX in POSIX). Because of this, on some host adapters
requests with many iovecs are rejected with -EINVAL by the
io_submit() or readv()/writev() system calls.
In fact, the same limit applies to SG_IO as well. To fix both the
EINVAL and the possible performance issues from using fewer iovecs
than allowed by Linux (some HBAs have max_segments as low as 128),
introduce a separate entry in BlockLimits to hold the max_segments
value from sysfs. This new limit is used only for SG_IO and clamped
to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
bs->bl.max_transfer.
Reported-by: Halil Pasic <pasic@linux.ibm.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org
Cc: qemu-stable@nongnu.org
Fixes: 18473467d5 ("file-posix: try BLKSECTGET on block devices too, do not round to power of 2", 2021-06-25)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210923130436.1187591-1-pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cc07162953)
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: b263d8f928
References: bsc#1182282, CVE-2021-3409
At the end of sdhci_send_command(), it starts a data transfer if the
command register indicates data is associated. But the data transfer
should only be initiated when the command execution has succeeded.
With this fix, the following reproducer:
outl 0xcf8 0x80001810
outl 0xcfc 0xe1068000
outl 0xcf8 0x80001804
outw 0xcfc 0x7
write 0xe106802c 0x1 0x0f
write 0xe1068004 0xc 0x2801d10101fffffbff28a384
write 0xe106800c 0x1f 0x9dacbbcad9e8f7061524334251606f7e8d9cabbac9d8e7f60514233241505f
write 0xe1068003 0x28 0x80d000251480d000252280d000253080d000253e80d000254c80d000255a80d000256880d0002576
write 0xe1068003 0x1 0xfe
cannot be reproduced with the following QEMU command line:
$ qemu-system-x86_64 -nographic -M pc-q35-5.0 \
-device sdhci-pci,sd-spec-version=3 \
-drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
-device sd-card,drive=mydrive \
-monitor none -serial none -qtest stdio
Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-2-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 418ade7849
References: bsc#1201367, CVE-2022-35414
The bug is an uninitialized memory read, along the translate_fail
path, which results in garbage being read from iotlb_to_section,
which can lead to a crash in io_readx/io_writex.
The bug may be fixed by writing any value with zero
in ~TARGET_PAGE_MASK, so that the call to iotlb_to_section using
the xlat'ed address returns io_mem_unassigned, as desired by the
translate_fail path.
It is most useful to record the original physical page address,
which will eventually be logged by memory_region_access_valid
when the access is rejected by unassigned_mem_accepts.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1065
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220621153829.366423-1-richard.henderson@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit f471e8b060
References: bsc#1192115
The 'active' bit passes control over a qTD between the guest and the
controller: set to 1 by guest to enable execution by the controller,
and the controller sets it to '0' to hand back control to the guest.
ehci_state_writeback write two dwords to main memory using DMA:
the third dword of the qTD (containing dt, total bytes to transfer,
cpage, cerr and status) and the fourth dword of the qTD (containing
the offset).
This commit makes sure the fourth dword is written before the third,
avoiding a race condition where a new offset written into the qTD
by the guest after it observed the status going to go to '0' gets
overwritten by a 'late' DMA writeback of the previous offset.
This race condition could lead to 'cpage out of range (5)' errors,
and reproduced by:
./qemu-system-x86_64 -enable-kvm -bios $SEABIOS/bios.bin -m 4096 -device usb-ehci -blockdev driver=file,read-only=on,filename=/home/aengelen/Downloads/openSUSE-Tumbleweed-DVD-i586-Snapshot20220428-Media.iso,node-name=iso -device usb-storage,drive=iso,bootindex=0 -chardev pipe,id=shell,path=/tmp/pipe -device virtio-serial -device virtconsole,chardev=shell -device virtio-rng-pci -serial mon:stdio -nographic
(press a key, select 'Installation' (2), and accept the default
values. On my machine the 'cpage out of range' is reproduced while
loading the Linux Kernel about once per 7 attempts. With the fix in
this commit it no longer fails)
This problem was previously reported as a seabios problem in
https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/OUTHT5ISSQJGXPNTUPY3O5E5EPZJCHM3/
and as a nixos CI build failure in
https://github.com/NixOS/nixpkgs/issues/170803
Signed-off-by: Arnout Engelen <arnout@bzzt.net>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 790762e548
References: bsc#1172033 CVE-2020-13253
Only move the state machine to ReceivingData if there is no
pending error. This avoids later OOB access while processing
commands queued.
"SD Specifications Part 1 Physical Layer Simplified Spec. v3.01"
4.3.3 Data Read
Read command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
occurred and no data transfer is performed.
4.3.4 Data Write
Write command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
occurred and no data transfer is performed.
WP_VIOLATION errors are not modified: the error bit is set, we
stay in receive-data state, wait for a stop command. All further
data transfer is ignored. See the check on sd->card_status at the
beginning of sd_read_data() and sd_write_data().
Fixes: CVE-2020-13253
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Buglink: https://bugs.launchpad.net/qemu/+bug/1880822
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20200630133912.9428-6-f4bug@amsat.org>
Git-commit: bedd7e93d0
References: bsc#1189938 CVE-2021-3748
When mergeable buffer is enabled, we try to set the num_buffers after
the virtqueue elem has been unmapped. This will lead several issues,
E.g a use after free when the descriptor has an address which belongs
to the non direct access region. In this case we use bounce buffer
that is allocated during address_space_map() and freed during
address_space_unmap().
Fixing this by storing the elems temporarily in an array and delay the
unmap after we set the the num_buffers.
This addresses CVE-2021-3748.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: fbe78f4f55 ("virtio-net support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 18473467d5
References: bsc#1190425
bs->sg is only true for character devices, but block devices can also
be used with scsi-block and scsi-generic. Unfortunately BLKSECTGET
returns bytes in an int for /dev/sgN devices, and sectors in a short
for block devices, so account for that in the code.
The maximum transfer also need not be a power of 2 (for example I have
seen disks with 1280 KiB maximum transfer) so there's no need to pass
the result through pow2floor.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 24b36e9813
References: bsc#1190425
For block host devices, I/O can happen through either the kernel file
descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
or the SCSI passthrough ioctl SG_IO.
In the latter case, the size of each transfer can be limited by the
HBA, while for file descriptor I/O the kernel is able to split and
merge I/O in smaller pieces as needed. Applying the HBA limits to
file descriptor I/O results in more system calls and suboptimal
performance, so this patch splits the max_transfer limit in two:
max_transfer remains valid and is used in general, while max_hw_transfer
is limited to the maximum hardware size. max_hw_transfer can then be
included by the scsi-generic driver in the block limits page, to ensure
that the stricter hardware limit is used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: b99f7fa08a
References: bsc#1190425
Block device requests must be aligned to bs->bl.request_alignment.
It makes sense for drivers to align bs->bl.max_transfer the same
way; however when there is no specified limit, blk_get_max_transfer
just returns INT_MAX. Since the contract of the function does not
specify that INT_MAX means "no maximum", just align the outcome
of the function (whether INT_MAX or bs->bl.max_transfer) before
returning it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: c9797456f6
References: bsc#1190425
osdep.h provides a ROUND_UP macro to hide bitwise operations for the
purpose of rounding a number up to a power of two; add a ROUND_DOWN
macro that does the same with truncation towards zero.
While at it, change the formatting of some comments.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 01ef8185b8
References: bsc#1190425
I/O to a disk via read/write is not limited by the number of segments allowed
by the host adapter; the kernel can split requests if needed, and the limit
imposed by the host adapter can be very low (256k or so) to avoid that SG_IO
returns EINVAL if memory is heavily fragmented.
Since this value is only interesting for SG_IO-based I/O, do not include
it in the max_transfer and only take it into account when patching the
block limits VPD page in the scsi-generic device.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 8ad5ab6148
References: bsc#1190425
Even though it was only called for devices that have bs->sg set (which
must be character devices), sg_get_max_segments looked at /sys/dev/block
which only works for block devices.
On Linux the sg driver has its own way to provide the maximum number of
iovecs in a scatter/gather list, so add support for it. The block device
path is kept because it will be reinstated in the next patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 05a40b172e
References: bsc#1186012, CVE-2021-3527
usb-host and usb-redirect try to batch bulk transfers by combining many
small usb packets into a single, large transfer request, to reduce the
overhead and improve performance.
This patch adds a size limit of 1 MiB for those combined packets to
restrict the host resources the guest can bind that way.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210503132915.2335822-6-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 000000000000000000000000000000000000000000000
References: bsc#1182651, CVE-2021-20255
Patch based on discussion:
https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg06098.html
While processing controller commands, eepro100 emulator gets
command unit(CU) base address OR receive unit (RU) base address
OR command block (CB) address from guest. If these values are not
checked, it may lead to an infinite loop kind of issues. Add checks
to avoid it.
Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-By: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 00000000000000000000000000000000000000000000
References: bsc#1180432, CVE-2020-35503
Ensure that 'cmd->frame' is not NULL before accessing the 'header' field.
This check prevents a potential NULL pointer dereference issue.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1910346
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Acked-By: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 607206948c
References: bsc#1180433, CVE-2020-35504
bsc#1180434, CVE-2020-35505
bsc#1180435, CVE-2020-35506
When a CDB has been received and is about to be submitted to the SCSI layer
via one of the ESP select commands, ensure that do_cmd is set to zero before
executing the command.
Otherwise a guest executing 2 valid CDBs in quick sequence can invoke the SCSI
.transfer_data callback again before do_cmd is set to zero by the callback
function triggering an assert at the start of esp_transfer_data().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210407195801.685-12-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 1bf8b88f14
References: bsc#1187529 CVE-2021-3611
Object property insertion code iterates over an integer to get an unused
index that can be used as an unique name for an object property. This loop
increments the integer value indefinitely. Although very unlikely, this can
still cause an integer overflow.
In this change, we fix the above code by checking against INT16_MAX and making
sure that the interger index does not overflow beyond that value. If no
available index is found, the code would cause an assertion failure. This
assertion failure is necessary because the callers of the function do not check
the return value for NULL.
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200921093325.25617-1-ani@anisinha.ca>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Git-commit: d8a18da56d
References: bsc#1184574
Test 067 from qemu-iotests is executing QMP commands to hotplug
and hot-unplug disks, devices and blockdevs. Because the power
of the text-based test harness is limited, it is actually limiting
the checks that it does, for example by skipping DEVICE_DELETED
events.
tests/qtest already has a similar test, drive_del-test.c.
We can merge them, and even reuse some of the existing code in
drive_del-test.c. This will improve the quality of the test by
covering DEVICE_DELETED events and testing multiple architectures
(therefore covering multiple PCI hotplug mechanisms as well as s390x
virtio-ccw).
The only difference is that the new test will always use null-co:// for
the medium rather than qcow2 or raw, but this should be irrelevant for
what the test is covering. For example there are no "qemu-img check"
runs in 067 that would check that the file is properly closed.
The new tests requires PCI hot-plug support, so drive_del-test
is moved from qemu-system-ppc to qemu-system-ppc64.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 9a613ddccc
References: bsc#1184574
Do not just trust the HMP commands to create and delete the drive, use
query-block to check that this is actually the case.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: bb1a5b97f7
References: bsc#1184574
Let test use the new functionality for buffering events.
The only remaining users of qtest_qmp_receive_dict are tests
that fuzz the QMP protocol.
Tested with 'make check-qtest'.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20201006123904.610658-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: c45a70d8c2
References: bsc#1184574
Simplify the code now that events are buffered. There is no need
anymore to separate sending the command and retrieving the response.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 5e34005571
References: bsc#1184574
The purpose of qtest_qmp_receive_success was mostly to process events
that arrived between the issueing of a command and the "return"
line from QMP. This is now handled by the buffering of events
that libqtest performs automatically.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: c22045bfe6
References: bsc#1184574
The new qtest_qmp_receive buffers all the received qmp events, allowing
qtest_qmp_eventwait_ref to return them.
This is intended to solve the race in regard to ordering of qmp events
vs qmp responses, as soon as the callers start using the new interface.
In addition to that, define qtest_qmp_event_ref a function which only scans
the buffer that qtest_qmp_receive stores the events to. This is intended
for callers that are only interested in events that were received during
the last call to the qtest_qmp_receive.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20201006123904.610658-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 1c3e2a38de
References: bsc#1184574
In the next patch a new version of qtest_qmp_receive will be
reintroduced that will buffer received qmp events for later
consumption in qtest_qmp_eventwait_ref
No functional change intended.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: d77799ccda
References: bsc#1184574
Move a few helper functions from migration-test.c to migration-helpers.c
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 076d467aac
References: bsc#1187013
Currently, if guest has workloads, IO thread will acquire aio_context
lock before do io_submit, it leads to segmentfault when do block commit
after snapshot. Just like below:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f7c7d91f700 (LWP 99907)]
0x00005576d0f65aab in bdrv_mirror_top_pwritev at ../block/mirror.c:1437
1437 ../block/mirror.c: No such file or directory.
(gdb) p s->job
$17 = (MirrorBlockJob *) 0x0
(gdb) p s->stop
$18 = false
Call trace of IO thread:
0 0x00005576d0f65aab in bdrv_mirror_top_pwritev at ../block/mirror.c:1437
1 0x00005576d0f7f3ab in bdrv_driver_pwritev at ../block/io.c:1174
2 0x00005576d0f8139d in bdrv_aligned_pwritev at ../block/io.c:1988
3 0x00005576d0f81b65 in bdrv_co_pwritev_part at ../block/io.c:2156
4 0x00005576d0f8e6b7 in blk_do_pwritev_part at ../block/block-backend.c:1260
5 0x00005576d0f8e84d in blk_aio_write_entry at ../block/block-backend.c:1476
...
Switch to qemu main thread:
0 0x00007f903be704ed in __lll_lock_wait at
/lib/../lib64/libpthread.so.0
1 0x00007f903be6bde6 in _L_lock_941 at /lib/../lib64/libpthread.so.0
2 0x00007f903be6bcdf in pthread_mutex_lock at
/lib/../lib64/libpthread.so.0
3 0x0000564b21456889 in qemu_mutex_lock_impl at
../util/qemu-thread-posix.c:79
4 0x0000564b213af8a5 in block_job_add_bdrv at ../blockjob.c:224
5 0x0000564b213b00ad in block_job_create at ../blockjob.c:440
6 0x0000564b21357c0a in mirror_start_job at ../block/mirror.c:1622
7 0x0000564b2135a9af in commit_active_start at ../block/mirror.c:1867
8 0x0000564b2133d132 in qmp_block_commit at ../blockdev.c:2768
9 0x0000564b2141fef3 in qmp_marshal_block_commit at
qapi/qapi-commands-block-core.c:346
10 0x0000564b214503c9 in do_qmp_dispatch_bh at
../qapi/qmp-dispatch.c:110
11 0x0000564b21451996 in aio_bh_poll at ../util/async.c:164
12 0x0000564b2146018e in aio_dispatch at ../util/aio-posix.c:381
13 0x0000564b2145187e in aio_ctx_dispatch at ../util/async.c:306
14 0x00007f9040239049 in g_main_context_dispatch at
/lib/../lib64/libglib-2.0.so.0
15 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:232
16 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:255
17 0x0000564b21447368 in main_loop_wait at ../util/main-loop.c:531
18 0x0000564b212304e1 in qemu_main_loop at ../softmmu/runstate.c:721
19 0x0000564b20f7975e in main at ../softmmu/main.c:50
In IO thread when do bdrv_mirror_top_pwritev, the job is NULL, and stop field
is false, this means the MirrorBDSOpaque "s" object has not been initialized
yet, and this object is initialized by block_job_create(), but the initialize
process is stuck in acquiring the lock.
In this situation, IO thread come to bdrv_mirror_top_pwritev(),which means that
mirror-top node is already inserted into block graph, but its bs->opaque->job
is not initialized.
The root cause is that qemu main thread do release/acquire when hold the lock,
at the same time, IO thread get the lock after release stage, and the crash
occured.
Actually, in this situation, job->job.aio_context will not equal to
qemu_get_aio_context(), and will be the same as bs->aio_context,
thus, no need to release the lock, becasue bdrv_root_attach_child()
will not change the context.
This patch fix this issue.
Fixes: 132ada80 "block: Adjust AioContexts when attaching nodes"
Signed-off-by: Michael Qiu <qiudayu@huayun.com>
Message-Id: <20210203024059.52683-1-08005325@163.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 3ea32d1355
References: CVE-2021-3546 bsc#1185981
CVE-2021-3545 bsc#1185990
CVE-2021-3544
Currently in vhost-user-gpu, we free resource directly in
the cleanup case of resource. If we change the cleanup logic
we need to change several places, also abstruct a
'vg_create_mapping_iov' can be symmetry with the
'vg_create_mapping_iov'. This is like what virtio-gpu does,
no function changed.
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210516030403.107723-9-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
Git-commit: 9f22893adc
References: CVE-2021-3546 bsc#1185981
If 'virgl_cmd_get_capset' set 'max_size' to 0,
the 'virgl_renderer_fill_caps' will write the data after the 'resp'.
This patch avoid this by checking the returned 'max_size'.
virtio-gpu fix: abd7f08b23 ("display: virtio-gpu-3d: check
virgl capabilities max_size")
Fixes: CVE-2021-3546
Reported-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210516030403.107723-8-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
Git-comit: f6091d86ba
References: CVE-2021-3544
The 'res->iov' will be leaked if the guest trigger following sequences:
virgl_cmd_create_resource_2d
virgl_resource_attach_backing
virgl_cmd_resource_unref
This patch fixes this.
Fixes: CVE-2021-3544
Reported-by: Li Qiang <liq3ea@163.com>
virtio-gpu fix: 5e8e3c4c75 ("virtio-gpu: fix resource leak
in virgl_cmd_resource_unref"
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210516030403.107723-6-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
[jrz: tweaked title to not break spec file]
Git-commit: b7afebcf9e
References: CVE-2021-3544
If the guest trigger following sequences, the attach_backing will be leaked:
vg_resource_create_2d
vg_resource_attach_backing
vg_resource_unref
This patch fix this by freeing 'res->iov' in vg_resource_destroy.
Fixes: CVE-2021-3544
Reported-by: Li Qiang <liq3ea@163.com>
virtio-gpu fix: 5e8e3c4c75 ("virtio-gpu: fix resource leak
in virgl_cmd_resource_unref")
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210516030403.107723-5-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jose R. Ziviani <jziviani@suse.de>
Git-commit: b807ca3fa0
References: bsc#1184574
Whenever a Xen block device is detach via xenstore, the image
associated with it remained open by the backend QEMU and an error is
logged:
qemu-system-i386: failed to destroy drive: Node xvdz-qcow2 is in use
This happened since object_unparent() doesn't immediately frees the
object and thus keep a reference to the node we are trying to free.
The reference is hold by the "drive" property and the call
xen_block_drive_destroy() fails.
In order to fix that, we call drain_call_rcu() to run the callback
setup by bus_remove_child() via object_unparent().
Fixes: 2d24a64661 ("device-core: use RCU for list of children of a bus")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20210308143232.83388-1-anthony.perard@citrix.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: a23151e8cc
References: bsc#1184574
Some code might race with placement of new devices on a bus.
We currently first place a (unrealized) device on the bus
and then realize it.
As a workaround, users that scan the child device list, can
check the realized property to see if it is safe to access such a device.
Use an atomic write here too to aid with this.
A separate discussion is what to do with devices that are unrealized:
It looks like for this case we only call the hotplug handler's unplug
callback and its up to it to unrealize the device.
An atomic operation doesn't cause harm for this code path though.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20201006123904.610658-10-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 2d24a64661
References: bsc#1184574
This fixes the race between device emulation code that tries to find
a child device to dispatch the request to (e.g a scsi disk),
and hotplug of a new device to that bus.
Note that this doesn't convert all the readers of the list
but only these that might go over that list without BQL held.
This is a very small first step to make this code thread safe.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-5-mlevitsk@redhat.com>
[Use RCU_READ_LOCK_GUARD in more places, adjust testcase now that
the delay in DEVICE_DELETED due to RCU is more consistent. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20201006123904.610658-9-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: 7bed89958b
References: bsc#1184574
Soon, a device removal might only happen on RCU callback execution.
This is okay for device-del which provides a DEVICE_DELETED event,
but not for the failure case of device-add. To avoid changing
monitor semantics, just drain all pending RCU callbacks on error.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-4-mlevitsk@redhat.com>
[Don't use it in qmp_device_del. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: bb755ba47f
References: bsc#1184574
Check if an address is free on the bus before plugging in the
device. This makes it possible to do the check without any
side effects, and to detect the problem early without having
to do it in the realize callback.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20201006123904.610658-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
Git-commit: c5a61e5a3c
References: bsc#1184574
The object_ref/unref methods are intended for use with any subclass of
the base Object. Using "Object *" in the signature is not adding any
meaningful level of type safety, since callers simply use "OBJECT(ptr)"
and this expands to an unchecked cast "(Object *)".
By using "void *" we enable the object_unref() method to be used to
provide support for g_autoptr() with any subclass.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200723181410.3145233-2-berrange@redhat.com>
Message-Id: <20200831210740.126168-2-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Lin Ma <lma@suse.com>
References: bsc#1178049
When SG_IO returns with a non-zero 'host_status' or 'status' we
should be folding these values into the request status to allow
any drivers to signal them back to the guest.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
SG_IO may return additional status in the 'status', 'driver_status',
and 'host_status' fields. When either of these fields are set the
command has not been executed normally, so we should not continue
processing this command but rather return an error.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
when running with an SG_IO backend we might be getting a SCSI host
status back, which should be translated into a virtio scsi status
to avoid having a silent data corruption if the status isn't
translated properly.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
To align with standard linux settings we should be setting the
default I/O timeout to 30 seconds, and add a lower bound of
5 seconds to avoid spurious I/O failures.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
Add tracepoints for SG_IO commands to get a grip on the timeout
settings.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
Add an 'io_timeout' parameter for SCSIDevice to allow
SG_IO ioctls to pass in a timeout, avoiding infinite
guest stalls if the host needs to abort a command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: boo#1169728
gcc 10 needs some help to understand that indeed cpu_irqs[0] does get
initialized in all cases. In this case an assert is sufficient.
Reported-by: Martin Liška <mliska@suse.cz>
Signed-off-by: Bruce Rogers <brogers@suse.com>
As part of the effort to close the gap with Leap I think we are fine
removing the $pkgversion component to creating a unique CONFIG_STAMP.
This stamp is only used in creating a unique symbol used in ensuring the
dynamically loaded modules correspond correctly to the loading qemu.
The default inputs to producing this unique symbol are somewhat reasonable
as a generic mechanism, but specific packaging and maintenance practices
might require the default to be modified for best use. This is an example
of that.
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
linux/kvm.h is not available on all platforms. Let us move
s390_machine_inject_pv_error into pv.c as it uses KVM structures.
Also rename the function to s390_pv_inject_reset_error.
While at it, ipl.h needs an include for "exec/address-spaces.h"
as it uses address_space_memory.
Fixes: 49fc3220175e ("s390x: protvirt: Support unpack facility")
Reported-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
The unpack facility is an indication that diagnose 308 subcodes 8-10
are available to the guest. That means, that the guest can put itself
into protected mode.
Once it is in protected mode, the hardware stops any attempt of VM
introspection by the hypervisor.
Some features are currently not supported in protected mode:
* vfio devices
* Migration
* Huge page backings
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
(cherry picked from commit 3034eaac3b2970ba85a1d77814ceef1352d05357)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
For protected guests, we need to put the IO emulation results into the
SIDA, so SIE will write them into the guest at the next entry.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 4989e18cbe5621df39020ef812316f479d8f5246)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
IO instruction data is routed through SIDAD for protected guests, so
adresses do not need to be checked, as this is kernel memory which is
always available.
Also the instruction data always starts at offset 0 of the SIDAD.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit f658bf14295ad49caf8d1b21033982ce69423fb7)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
For protected guests the IPIB is written/read to/from the SIDA, so we
need those accesses to go through s390_cpu_pv_mem_read/write().
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 258da1c7736d3aa4604ceea6cce00995c6f30058)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
Handling of CPU reset and setting of the IPL psw from guest storage at
offset 0 is done by a Ultravisor call. Let's only fetch it if
necessary.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit e8686d9849f1625f4f4b28403f0555181b72d1b6)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
SCLP for a protected guest is done over the SIDAD, so we need to use
the s390_cpu_pv_mem_* functions to access the SIDAD instead of guest
memory when reading/writing SCBs.
To not confuse the sclp emulation, we set 0x4000 as the SCCB address,
since the function that injects the sclp external interrupt would
reject a zero sccb address.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 32633cf4539341180dbc7a92c2655c711b4a6996)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
For protected guests, we need to put the STSI emulation results into
the SIDA, so SIE will write them into the guest at the next entry.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit ccce7a654911ae507c962aff5f41004a7a88fad6)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
Protected guests save the instruction control blocks in the SIDA
instead of QEMU/KVM directly accessing the guest's memory.
Let's introduce new functions to access the SIDA.
The memops for doing so are available with KVM_CAP_S390_PROTECTED, so
let's check for that.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit a9f21cec3bc9c86062c7c24bb2143d22cb3c2950)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
Protected VMs no longer intercept with code 4 for an instruction
interception. Instead they have codes 104 and 108 for protected
instruction interception and protected instruction notification
respectively.
The 104 mirrors the 4 interception.
The 108 is a notification interception to let KVM and QEMU know that
something changed and we need to update tracking information or
perform specific tasks. It's currently taken for the following
instructions:
* spx (To inform about the changed prefix location)
* sclp (On incorrect SCCB values, so we can inject a IRQ)
* sigp (All but "stop and store status")
* diag308 (Subcodes 0/1)
Of these exits only sclp errors, state changing sigps and diag308 will
reach QEMU. QEMU will do its parts of the job, while the ultravisor
has done the instruction part of the job.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit fd70eb764f176c200d6723c2ad88362f23536bfa)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
Ballooning in protected VMs can only be done when the guest shares the
pages it gives to the host. If pages are not shared, the integrity
checks will fail once those pages have been altered and are given back
to the guest.
As we currently do not yet have a solution for this we will continue
like this:
1. We block ballooning now in QEMU (with this patch).
2. Later we will provide a change to virtio that removes the blocker
and adds VIRTIO_F_IOMMU_PLATFORM automatically by QEMU when doing the
protvirt switch. This is OK, as the balloon driver in Linux (the only
supported guest) will refuse to work with the IOMMU_PLATFORM feature
bit set.
3. Later, we can fix the guest balloon driver to accept the IOMMU
feature bit and correctly exercise sharing and unsharing of balloon
pages.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 59dc32a3494d6afdd420f3e401f1f324a1179256)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
The unpack facility provides the means to setup a protected guest. A
protected guest cannot be introspected by the hypervisor or any
user/administrator of the machine it is running on.
Protected guests are encrypted at rest and need a special boot
mechanism via diag308 subcode 8 and 10.
Code 8 sets the PV specific IPLB which is retained separately from
those set via code 5.
Code 10 is used to unpack the VM into protected memory, verify its
integrity and start it.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Co-developed-by: Christian Borntraeger <borntraeger@de.ibm.com> [Changes
to machine]
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 2150c92b9b7d12b5fbdd2c59e5b17197d28f53db)
[BR: Needed to fix a compiler warning on i586 in hw/s390x/ipl.c]
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
(cherry picked from commit 6807f464961cfee1dd81c95e22ddd91fa352fcc4)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075, bsc#1167445
We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on
platform without IOMMU support. This can lead unnecessary IOTLB
transactions which will damage the performance.
Fixing this by check whether the device is backed by IOMMU and disable
device IOTLB.
Reported-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20200302042454.24814-1-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f7ef7e6e3b)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
They are part of the IPL process, so let's put them into the ipl
header.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
(cherry picked from commit 284bc3dd6e9a978e6e34b00777ce72007a88d6d9)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
Up to now we only had an ioctl to reset vcpu data QEMU couldn't reach
for the initial reset, which was also called for the clear reset. To
be architecture compliant, we also need to clear local interrupts on a
normal reset.
Because of this and the upcoming protvirt support we need to add
ioctls for the missing clear and normal resets.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200214151636.8764-3-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit b91a03946e)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
It defaults to returning 0 anyway and that return value is not
necessary, as 0 is also the default rc that the caller would return.
While doing that we can simplify the logic a bit and return early if
we inject a PGM exception.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20191129091713.4582-1-frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 15b6c0370c)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
Let's start moving the cpu reset functions into a single function with
a switch/case, so we can later use fallthroughs and share more code
between resets.
This patch introduces the reset function by renaming cpu_reset().
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20191127175046.4911-3-frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit eac4f82791)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1167075
The initiating cpu needs to be reset with an initial reset. While
doing a normal reset followed by a initial reset is not wrong per se,
the Ultravisor will only allow the correct reset to be performed.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20191127175046.4911-2-frankja@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit ec9227339f)
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1159755
With commit 7fccf2a068 a new member
smbus_no_migration_support was added, and enabled in two places.
With commit 4ab2f2a8aa the vmstate_acpi
got new elements, which are conditionally filled. As a result, an
incoming migration expected smbus related data unless smbus migration
was disabled for a given MachineClass.
Since commit 7fccf2a068 forgot to handle
xenfv, live migration to receiving hosts using qemu-4.0 and later is broken.
Adjust 'xenfv' to stay compatible with with 'pc-i440fx-3.1':
- the toolstack can not use '-M pc-i440fx-3.1,accel=xen -device xen-platform'
because this would move the PCI device from 00:02.0 to 00:04.0
- disable pvh.
Running PVH may require dedicated device_model_args= options which select
'pc-i440fx-4.x'
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
[BR: Adjust implementation to simply call pc_i440fx_3_1_machine_options]
While we don't specifically set QEMU_PROG, the code which detects the
host architecture needs a little help mapping the output of uname -m to
what the qemu project uses to reference that architecture.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Most tests previously disabled for qemu-testsuite to be able to complete
successfully are no longer (as of v4.1) listed as auto, and therefore
do not get run anymore.
27NOV2019 - added 161 since it is failing on s390x and ppc consistently
Signed-off-by: Bruce Rogers <brogers@suse.com>
This is hopefully temporary. Simply disable the warning about taking
the address of packed structure members which is new in gcc9.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Currently roms are mistakenly getting built in a linux-user only
configuration. Add check for softmmu in all places where our list of
roms is being added to.
Signed-off-by: Bruce Rogers <brogers@suse.com>
sprintf related parameter validation complains about the size of the
buffer being written to in exynos4210_gic_realize(). Provide a bit more
space to avoid the following warning:
/home/abuild/rpmbuild/BUILD/qemu-4.0.0/hw/intc/exynos4210_gic.c: In function 'exynos4210_gic_realize':
/home/abuild/rpmbuild/BUILD/qemu-4.0.0/hw/intc/exynos4210_gic.c:316:36: error: '%x' directive writing between 1 and 7 bytes into a region of size between 4 and 28 [-Werror=format-overflow=]
316 | sprintf(cpu_alias_name, "%s%x", cpu_prefix, i);
| ^~
/home/abuild/rpmbuild/BUILD/qemu-4.0.0/hw/intc/exynos4210_gic.c:316:33: note: directive argument in the range [0, 29020050]
316 | sprintf(cpu_alias_name, "%s%x", cpu_prefix, i);
| ^~~~~~
In file included from /usr/include/stdio.h:867,
from /home/abuild/rpmbuild/BUILD/qemu-4.0.0/include/qemu/osdep.h:99,
from /home/abuild/rpmbuild/BUILD/qemu-4.0.0/hw/intc/exynos4210_gic.c:23:
/usr/include/bits/stdio2.h:36:10: note: '__builtin___sprintf_chk' output between 2 and 32 bytes into a destination of size 28
36 | return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
37 | __bos (__s), __fmt, __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/abuild/rpmbuild/BUILD/qemu-4.0.0/hw/intc/exynos4210_gic.c:326:37: error: '%x' directive writing between 1 and 7 bytes into a region of size between 3 and 28 [-Werror=format-overflow=]
326 | sprintf(dist_alias_name, "%s%x", dist_prefix, i);
| ^~
/home/abuild/rpmbuild/BUILD/qemu-4.0.0/hw/intc/exynos4210_gic.c:326:34: note: directive argument in the range [0, 29020050]
326 | sprintf(dist_alias_name, "%s%x", dist_prefix, i);
| ^~~~~~
In file included from /usr/include/stdio.h:867,
from /home/abuild/rpmbuild/BUILD/qemu-4.0.0/include/qemu/osdep.h:99,
from /home/abuild/rpmbuild/BUILD/qemu-4.0.0/hw/intc/exynos4210_gic.c:23:
/usr/include/bits/stdio2.h:36:10: note: '__builtin___sprintf_chk' output between 2 and 33 bytes into a destination of size 28
36 | return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
37 | __bos (__s), __fmt, __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Bruce Rogers <brogers@suse.com>
Fix this warning with GCC 9 on Fedora 30:
hw/usb/dev-mtp.c:1715:36: error: taking address of packed member of struct <anonymous> may result in an unaligned pointer value [-Werror=address-of-packed-member]
1715 | dataset->filename);
| ~~~~~~~^~~~~~~~~~
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Fix this build warning with GCC 9 on Fedora 30:
hw/usb/hcd-xhci.c:3339:66: error: %d directive output may be truncated writing between 1 and 10 bytes into a region of size 5 [-Werror=format-truncation=]
3339 | snprintf(port->name, sizeof(port->name), "usb2 port #%d", i+1);
| ^~
hw/usb/hcd-xhci.c:3339:54: note: directive argument in the range [1, 2147483647]
3339 | snprintf(port->name, sizeof(port->name), "usb2 port #%d", i+1);
| ^~~~~~~~~~~~~~~
In file included from /usr/include/stdio.h:867,
from /home/alistair/qemu/include/qemu/osdep.h:99,
from hw/usb/hcd-xhci.c:21:
/usr/include/bits/stdio2.h:67:10: note: __builtin___snprintf_chk output between 13 and 22 bytes into a destination of size 16
67 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
68 | __bos (__s), __fmt, __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Since we have a quite restricted execution environment, as far as
networking is concerned, we need to change the error message we expect
in test 162. There is actually no routing set up so the error we get is
"Network is unreachable". Change the expected output accordingly.
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1079730, bsc#1101982, bsc#1063993
The final step of xl migrate|save for an HVM domU is saving the state of
qemu. This also involves releasing all block devices. While releasing
backends ought to be a separate step, such functionality is not
implemented.
Unfortunately, releasing the block devices depends on the optional
'live' option. This breaks offline migration with 'virsh migrate domU
dom0' because the sending side does not release the disks, as a result
the receiving side can not properly claim write access to the disks.
As a minimal fix, remove the dependency on the 'live' option. Upstream
may fix this in a different way, like removing the newly added 'live'
parameter entirely.
Fixes: 5d6c599fe1 ("migration, xen: Fix block image lock issue on live migration")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
The use of membarriers collides with the block test's practice of
SIGKILLing test vm's. Have them quit politely. Tests: 130, 153 - and
though test 161 seems to have the same issue, it is not yet fixed, but
just marked here as possibly needing a fix.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Executing tests in obs is very fickle, since you aren't guaranteed
reliable cpu time. Triple the timeout for each test to help ensure
we don't fail a test because the stars align against us.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Provide monitor naming of xen disks, and plumb guest driver
notification through xenstore of resizing instigated via the
monitor.
[BR: minor edits to pass qemu's checkpatch script]
[BR: significant rework needed due to upstream xen disk qdevification]
[BR: At this point, monitor_add_blk call is all we need to add!]
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#994082, bsc#1084316, boo#1131894
It's easy enough to handle either per-spec or legacy smbios structures
in the smbios file input without regard to the machine type used, by
simply applying the basic smbios formatting rules. then depending on
what is detected. terminal numm bytes are added or removed for machine
type specific processing.
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bnc#812836
qemu-kvm 0.15 uses the same GPE format as qemu 1.4, but as version 2
rather than 3.
Signed-off-by: Andreas Färber <afaerber@suse.de>
References: bnc#812836
qemu-kvm 0.15 had a VMSTATE_UINT32(flags, PITState) field that
qemu 1.4 does not have.
Signed-off-by: Andreas Färber <afaerber@suse.de>
References: bnc#812836
qemu-kvm.git commit a7fe0297840908a4fd65a1cf742481ccd45960eb
(Extend vram size to 16MB) deviated from qemu.git since kvm-61, and only
in commit 9e56edcf8d (vga: raise default
vgamem size) did qemu.git adjust the VRAM size for v1.2.
Add compatibility properties so that up to and including pc-0.15 we
maintain migration compatibility with qemu-kvm rather than QEMU and
from pc-1.0 on with QEMU (last qemu-kvm release was 1.2).
Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: adjust comma position in list in macro for v2.5.0 compat]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Allow for guests with higher amounts of ram. The current thought
is that 2TB specified on qemu commandline would be an appropriate
limit. Note that this requires the next higher bit value since
the highest address is actually more than 2TB due to the pci
memory hole.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.
Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: add include for sys/resource.h]
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1011213
Certain rom subpackages build from qemu git-submodules call the date
program to include date information in the packaged binaries. This
causes repeated builds of the package to be different, wkere the only
real difference is due to the fact that time build timestamp has
changed. To promote reproducible builds and avoid customers being
prompted to update packages needlessly, we'll use the timestamp of the
VERSION file as the packaging timestamp for all packages that build in a
timestamp for whatever reason.
Signed-off-by: Bruce Rogers <brogers@suse.com>
After "linux-user: use target_ulong" the poll syscall was no longer
handling infinite timeout.
/home/abuild/rpmbuild/BUILD/qemu-2.7.0-rc5/linux-user/syscall.c:9773:26: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
if (arg3 >= 0) {
^~
Signed-off-by: Andreas Schwab <schwab@suse.de>
References: boo#988279
Change from using glib alloc and free routines to those
from libc. Also perform safety measure of dropping privs
to user if configured no-caps.
Signed-off-by: Bruce Rogers <brogers@suse.com>
[AF: Rebased for v2.7.0-rc2]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Add code to read the suse specific suse-diskcache-disable-flush flag out
of xenstore, and set the equivalent flag within QEMU.
Patch taken from Xen's patch queue, Olaf Hering being the original author.
[bsc#879425]
[BR: minor edits to pass qemu's checkpatch script]
[BR: With qdevification of xen-block, code has changed significantly]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Olaf Hering <olaf@aepfle.de>
On hosts with limited virtual address space (32bit pointers), we can very
easily run out of virtual memory with big thread pools.
Instead, we should limit ourselves to small pools to keep memory footprint
low on those systems.
This patch fixes random VM stalls like
(process:25114): GLib-ERROR **: gmem.c:103: failed to allocate 1048576 bytes
on 32bit ARM systems for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
When doing lseek, SEEK_SET indicates that the offset is an unsigned variable.
Other seek types have parameters that can be negative.
When converting from 32bit to 64bit parameters, we need to take this into
account and enable SEEK_END and SEEK_CUR to be negative, while SEEK_SET stays
absolute positioned which we need to maintain as unsigned.
Signed-off-by: Alexander Graf <agraf@suse.de>
Virtio-Console can only process one character at a time. Using it on S390
gave me strage "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.
While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.
To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.
This patch fixes input when using -nographic on s390 for me.
[AF: Rebased for v2.7.0-rc2]
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Linux syscalls pass pointers or data length or other information of that sort
to the kernel. This is all stuff you don't want to have sign extended.
Otherwise a host 64bit variable parameter with a size parameter will extend
it to a negative number, breaking lseek for example.
Pass syscall arguments as ulong always.
Signed-off-by: Alexander Graf <agraf@suse.de>
Fedora 17 for ARM reads /proc/cpuinfo and fails if it doesn't contain
ARM related contents. This patch implements a quick hack to expose real
/proc/cpuinfo data taken from a real world machine.
The real fix would be to generate at least the flags automatically based
on the selected CPU. Please do not submit this patch upstream until this
has happened.
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased for v1.6 and v1.7]
Signed-off-by: Andreas Färber <afaerber@suse.de>
When we have a working host binary equivalent for the guest binary we're
trying to run, let's just use that instead as it will be a lot faster.
Signed-off-by: Alexander Graf <agraf@suse.de>
When using hugetlbfs (which is required for HV mode KVM on 970), we
check for MMU notifiers that on 970 can not be implemented properly.
So disable the check for mmu notifiers on PowerPC guests, making
KVM guests work there, even if possibly racy in some odd circumstances.
Signed-off-by: Bruce Rogers <brogers@suse.com>
When using qemu's linux-user binaries through binfmt, argv[0] gets lost
along the execution because qemu only gets passed in the full file name
to the executable while argv[0] can be something completely different.
This breaks in some subtile situations, such as the grep and make test
suites.
This patch adds a wrapper binary called qemu-$TARGET-binfmt that can be
used with binfmt's P flag which passes the full path _and_ argv[0] to
the binfmt handler.
The binary would be smart enough to be versatile and only exist in the
system once, creating the qemu binary path names from its own argv[0].
However, this seemed like it didn't fit the make system too well, so
we're currently creating a new binary for each target archictecture.
CC: Reinhard Max <max@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto new Makefile infrastructure, twice]
[AF: Updated for aarch64 for v2.0.0-rc1]
[AF: Rebased onto Makefile changes for v2.1.0-rc0]
[AF: Rebased onto script rewrite for v2.7.0-rc2 - to be fixed]
Signed-off-by: Andreas Färber <afaerber@suse.de>
the direction given in the ioctl should be correct so we can assume the
communication is uni-directional. The alsa developers did not like this
concept though and declared ioctls IOC_R and IOC_W even though they were
IOC_RW.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 0000000000000000000000000000000000000000
References: bsc#1181639
While activating device in vmxnet3_acticate_device(), it does not
validate guest supplied configuration values against predefined
minimum - maximum limits. This may lead to integer overflow or
OOB access issues. Add checks to avoid it.
Fixes: CVE-2021-20203
Buglink: https://bugs.launchpad.net/qemu/+bug/1913873
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 62271205bc
When adding the Reset register in commit 5790b757cf we
forgot to migrate it.
While it is possible a VM using the PIIX4 is migrated just
after requesting a system shutdown, it is very unlikely.
However when restoring a migrated VM, we might have the
RCR bit #4 set on the stack and when the VM resume it
directly shutdowns.
Add a post_load() migration handler and set the default
RCR value to 0 for earlier versions, assuming the VM was
not going to shutdown before migration.
Fixes: 5790b757cf ("piix4: Add the Reset Control Register")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210324200334.729899-1-f4bug@amsat.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 705df5466c
References: bsc#1182968, CVE-2021-3416
Some NIC supports loopback mode and this is done by calling
nc->info->receive() directly which in fact suppresses the effort of
reentrancy check that is done in qemu_net_queue_send().
Unfortunately we can't use qemu_net_queue_send() here since for
loopback there's no sender as peer, so this patch introduce a
qemu_receive_packet() which is used for implementing loopback mode
for a NIC with this check.
NIC that supports loopback mode will be converted to this helper.
This is intended to address CVE-2021-3416.
Cc: Prasad J Pandit <ppandit@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 3de46e6fc4
References: bsc#1182577, CVE-2021-20257
During procss_tx_desc(), driver can try to chain data descriptor with
legacy descriptor, when will lead underflow for the following
calculation in process_tx_desc() for bytes:
if (tp->size + bytes > msh)
bytes = msh - tp->size;
This will lead a infinite loop. So check and fail early if tp->size if
greater or equal to msh.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: edfe2eb436
References: bsc#1181933
Per the ARM Generic Interrupt Controller Architecture specification
(document "ARM IHI 0048B.b (ID072613)"), the SGIINTID field is 4 bit,
not 10:
- 4.3 Distributor register descriptions
- 4.3.15 Software Generated Interrupt Register, GICD_SG
- Table 4-21 GICD_SGIR bit assignments
The Interrupt ID of the SGI to forward to the specified CPU
interfaces. The value of this field is the Interrupt ID, in
the range 0-15, for example a value of 0b0011 specifies
Interrupt ID 3.
Correct the irq mask to fix an undefined behavior (which eventually
lead to a heap-buffer-overflow, see [Buglink]):
$ echo 'writel 0x8000f00 0xff4affb0' | qemu-system-aarch64 -M virt,accel=qtest -qtest stdio
[I 1612088147.116987] OPENED
[R +0.278293] writel 0x8000f00 0xff4affb0
../hw/intc/arm_gic.c:1498:13: runtime error: index 944 out of bounds for type 'uint8_t [16][8]'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../hw/intc/arm_gic.c:1498:13
This fixes a security issue when running with KVM on Arm with
kernel-irqchip=off. (The default is kernel-irqchip=on, which is
unaffected, and which is also the correct choice for performance.)
Cc: qemu-stable@nongnu.org
Fixes: CVE-2021-20221
Fixes: 9ee6e8bb85 ("ARMv7 support.")
Buglink: https://bugs.launchpad.net/qemu/+bug/1913916
Buglink: https://bugs.launchpad.net/qemu/+bug/1913917
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210131103401.217160-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 89fbea8737
References: bsc#1182137
Depending on the client activity, the server can be asked to open a huge
number of file descriptors and eventually hit RLIMIT_NOFILE. This is
currently mitigated using a reclaim logic : the server closes the file
descriptors of idle fids, based on the assumption that it will be able
to re-open them later. This assumption doesn't hold of course if the
client requests the file to be unlinked. In this case, we loop on the
entire fid list and mark all related fids as unreclaimable (the reclaim
logic will just ignore them) and, of course, we open or re-open their
file descriptors if needed since we're about to unlink the file.
This is the purpose of v9fs_mark_fids_unreclaim(). Since the actual
opening of a file can cause the coroutine to yield, another client
request could possibly add a new fid that we may want to mark as
non-reclaimable as well. The loop is thus restarted if the re-open
request was actually transmitted to the backend. This is achieved
by keeping a reference on the first fid (head) before traversing
the list.
This is wrong in several ways:
- a potential clunk request from the client could tear the first
fid down and cause the reference to be stale. This leads to a
use-after-free error that can be detected with ASAN, using a
custom 9p client
- fids are added at the head of the list : restarting from the
previous head will always miss fids added by a some other
potential request
All these problems could be avoided if fids were being added at the
end of the list. This can be achieved with a QSIMPLEQ, but this is
probably too much change for a bug fix. For now let's keep it
simple and just restart the loop from the current head.
Fixes: CVE-2021-20181
Buglink: https://bugs.launchpad.net/qemu/+bug/1911666
Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <161064025265.1838153.15185571283519390907.stgit@bahia.lan>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: eb8cb3d9dc
References: bsc#1178049
Add trace events for SCSI and TMF command tracing.
Signed-off-by: Hannes Reinecke <hare@suse.de>
BR: Includes minor tweaks that came from the PTF patch as opposed to the
one upstreamed.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 4bfb024bc7
References: bsc#1179686, CVE-2020-27821
In using the address_space_translate_internal API, address_space_cache_init
forgot one piece of advice that can be found in the code for
address_space_translate_internal:
/* MMIO registers can be expected to perform full-width accesses based only
* on their address, without considering adjacent registers that could
* decode to completely different MemoryRegions. When such registers
* exist (e.g. I/O ports 0xcf8 and 0xcf9 on most PC chipsets), MMIO
* regions overlap wildly. For this reason we cannot clamp the accesses
* here.
*
* If the length is small (as is the case for address_space_ldl/stl),
* everything works fine. If the incoming length is large, however,
* the caller really has to do the clamping through memory_access_size.
*/
address_space_cache_init is exactly one such case where "the incoming length
is large", therefore we need to clamp the resulting length---not to
memory_access_size though, since we are not doing an access yet, but to
the size of the resulting section. This ensures that subsequent accesses
to the cached MemoryRegionSection will be in range.
With this patch, the enclosed testcase notices that the used ring does
not fit into the MSI-X table and prints a "qemu-system-x86_64: Cannot map used"
error.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 8132122889
References: bsc#1181108, CVE-2020-29443
A case was reported where s->io_buffer_index can be out of range.
The report skimped on the details but it seems to be triggered
by s->lba == -1 on the READ/READ CD paths (e.g. by sending an
ATAPI command with LBA = 0xFFFFFFFF). For now paper over it
with assertions. The first one ensures that there is no overflow
when incrementing s->io_buffer_index, the second checks for the
buffer overrun.
Note that the buffer overrun is only a read, so I am not sure
if the assertion failure is actually less harmful than the overrun.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20201201120926.56559-1-pbonzini@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: c2cb511634
References: bsc#1179468, CVE-2020-28916
While receiving packets via e1000e_write_packet_to_guest() routine,
'desc_offset' is advanced only when RX descriptor is processed. And
RX descriptor is not processed if it has NULL buffer address.
This may lead to an infinite loop condition. Increament 'desc_offset'
to process next descriptor in the ring to avoid infinite loop.
Reported-by: Cheol-woo Myung <330cjfdn@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 7564bf7701
References: bsc#1178174, CVE-2020-27617
eth_get_gso_type() routine returns segmentation offload type based on
L3 protocol type. It calls g_assert_not_reached if L3 protocol is
unknown, making the following return statement unreachable. Remove the
g_assert call, it maybe triggered by a guest user.
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: ca1f9cbfdc
References: bsc#1178400, CVE-2020-27616
The source and destination x,y display parameters in ati_2d_blt()
may run off the vga limits if either of s->regs.[src|dst]_[xy] is
zero. Check the parameter values to avoid potential crash.
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20201021103818.1704030-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 37fa32de70
References: bsc#1179719
When an s390 guest is using lazy unmapping, it can result in a very
large number of oustanding DMA requests, far beyond the default
limit configured for vfio. Let's track DMA usage similar to vfio
in the host, and trigger the guest to flush their DMA mappings
before vfio runs out.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: non-Linux build fixes]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
Git-commit: cd7498d07f
References: bsc#1179719
Create new files for separating out vfio-specific work for s390
pci. Add the first such routine, which issues VFIO_IOMMU_GET_INFO
ioctl to collect the current dma available count.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: Fix non-Linux build with CONFIG_LINUX]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
Git-commit: 7486a62845
References: bsc#1179719
The underlying host may be limiting the number of outstanding DMA
requests for type 1 IOMMU. Add helper functions to check for the
DMA available capability and retrieve the current number of DMA
mappings allowed.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: vfio_get_info_dma_avail moved inside CONFIG_LINUX]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
Git-commit: 53ba2eee52
References: bsc#1179719
commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
[aw: drop pvrdma_ring.h changes to avoid revert of d73415a315 ("qemu/atomic.h: rename atomic_ to qatomic_")]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
Git-commit: 5f97ba0c74
References: bsc#1183979
This error takes effect when the magic value "zIPL" is located at the
end of a block. For example if s2_cur_blk = 0x7fe18000 and the magic
value "zIPL" is located at 0x7fe18ffc - 0x7fe18fff.
Fixes: ba831b2526 ("s390-ccw: read stage2 boot loader data to find menu")
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Message-Id: <20200924085926.21709-2-mhartmay@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Use "<= ... - 4" instead of "< ... - 3"]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 1be90ebecc
References: bsc#1176684, CVE-2020-25625
While servicing OHCI transfer descriptors(TD), ohci_service_iso_td
retires a TD if it has passed its time frame. It does not check if
the TD was already processed once and holds an error code in TD_CC.
It may happen if the TD list has a loop. Add check to avoid an
infinite loop condition.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200915182259.68522-3-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: b946434f26
References: bsc#1175441, CVE-2020-14364
Store calculated setup_len in a local variable, verify it, and only
write it to the struct (USBDevice->setup_len) in case it passed the
sanity checks.
This prevents other code (do_token_{in,out} functions specifically)
from working with invalid USBDevice->setup_len values and overrunning
the USBDevice->setup_buf[] buffer.
Fixes: CVE-2020-14364
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 035e69b063
References: bsc#1174641, CVE-2020-16092
An assertion failure issue was found in the code that processes network packets
while adding data fragments into the packet context. It could be abused by a
malicious guest to abort the QEMU process on the host. This patch replaces the
affected assert() with a conditional statement, returning false if the current
data fragment exceeds max_raw_frags.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: d1bb69db4c
References: bsc#1174863
Right now, -no-reboot prevents secure guests from running. This is
correct from an implementation point of view, as we have modeled the
transition from non-secure to secure as a program directed IPL. From
a user perspective, this is not the behavior of least surprise.
We should implement the IPL into protected mode similar to the
functions that we use for kdump/kexec. In other words, we do not stop
here when -no-reboot is specified on the command line. Like function 0
or function 1, function 10 is not a classic reboot. For example, it
can only be called once. Before calling it a second time, a real
reboot/reset must happen in-between. So function code 10 is more or
less a state transition reset, but not a "standard" reset or reboot.
Fixes: 4d226deafc44 ("s390x: protvirt: Support unpack facility")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Message-Id: <20200721103202.30610-1-borntraeger@de.ibm.com>
[CH: tweaked description]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
Git-commit: 5519724a13
References: bsc#1174386 CVE-2020-15863
A buffer overflow issue was reported by Mr. Ziming Zhang, CC'd here. It
occurs while sending an Ethernet frame due to missing break statements
and improper checking of the buffer size.
Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: f50ab86a26
References: bsc#1172383, CVE-2020-13362
A guest user may set 'reply_queue_head' field of MegasasState to
a negative value. Later in 'megasas_lookup_frame' it is used to
index into s->frames[] array. Use unsigned type to avoid OOB
access issue.
Also check that 'index' value stays within s->frames[] bounds
through the while() loop in 'megasas_lookup_frame' to avoid OOB
access.
Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200513192540.1583887-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: cbaf25d1f5
References: boo#1171712
Commit 571a8c522e caused the HMP wavcapture command to segfault when
processing audio data in audio_pcm_sw_write(), where a NULL
sw->hw->pcm_ops is dereferenced. This fix checks that the pointer is
valid before dereferincing it. A similar fix is also made in the
parallel function audio_pcm_sw_read().
Fixes: 571a8c522e (audio: split ctl_* functions into enable_* and
volume_*)
Signed-off-by: Bruce Rogers <brogers@suse.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200521172931.121903-1-brogers@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 9904adfaca
References: bsc#1179719
virtio_net_rsc_ext_num_{packets,dupacks} needs to be available
independently of the presence of VIRTIO_NET_HDR_F_RSC_INFO.
Fixes: 2974e916df ("virtio-net: support RSC v4/v6 tcp traffic for Windows HCK")
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200427102415.10915-2-cohuck@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
Git-commit: ff0507c239
References: bsc#1180523, CVE-2020-11947
There is an overflow, the source 'datain.data[2]' is 100 bytes,
but the 'ss' is 252 bytes.This may cause a security issue because
we can access a lot of unrelated memory data.
The len for sbp copy data should take the minimum of mx_sb_len and
sb_len_wr, not the maximum.
If we use iscsi device for VM backend storage, ASAN show stack:
READ of size 252 at 0xfffd149dcfc4 thread T0
#0 0xaaad433d0d34 in __asan_memcpy (aarch64-softmmu/qemu-system-aarch64+0x2cb0d34)
#1 0xaaad45f9d6d0 in iscsi_aio_ioctl_cb /qemu/block/iscsi.c:996:9
#2 0xfffd1af0e2dc (/usr/lib64/iscsi/libiscsi.so.8+0xe2dc)
#3 0xfffd1af0d174 (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
#4 0xfffd1af19fac (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
#5 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
#6 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
#7 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
#8 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
#9 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
#10 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
#11 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
#12 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
#13 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
#14 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
#15 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
#16 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
#17 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)
0xfffd149dcfc4 is located 0 bytes to the right of 100-byte region [0xfffd149dcf60,0xfffd149dcfc4)
allocated by thread T0 here:
#0 0xaaad433d1e70 in __interceptor_malloc (aarch64-softmmu/qemu-system-aarch64+0x2cb1e70)
#1 0xfffd1af0e254 (/usr/lib64/iscsi/libiscsi.so.8+0xe254)
#2 0xfffd1af0d174 (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
#3 0xfffd1af19fac (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
#4 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
#5 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
#6 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
#7 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
#8 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
#9 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
#10 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
#11 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
#12 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
#13 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
#14 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
#15 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
#16 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200418062602.10776-1-kuhn.chenqun@huawei.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 5710a3e09f
When using C11 atomics, non-seqcst reads and writes do not participate
in the total order of seqcst operations. In util/async.c and util/aio-posix.c,
in particular, the pattern that we use
write ctx->notify_me write bh->scheduled
read bh->scheduled read ctx->notify_me
if !bh->scheduled, sleep if ctx->notify_me, notify
needs to use seqcst operations for both the write and the read. In
general this is something that we do not want, because there can be
many sources that are polled in addition to bottom halves. The
alternative is to place a seqcst memory barrier between the write
and the read. This also comes with a disadvantage, in that the
memory barrier is implicit on strongly-ordered architectures and
it wastes a few dozen clock cycles.
Fortunately, ctx->notify_me is never written concurrently by two
threads, so we can assert that and relax the writes to ctx->notify_me.
The resulting solution works and performs well on both aarch64 and x86.
Note that the atomic_set/atomic_read combination is not an atomic
read-modify-write, and therefore it is even weaker than C11 ATOMIC_RELAXED;
on x86, ATOMIC_RELAXED compiles to a locked operation.
Analyzed-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Ying Fang <fangying1@huawei.com>
Message-Id: <20200407140746.8041-6-pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 3c18a92dc4
Any thread that is not a iothread returns NULL for qemu_get_current_aio_context().
As a result, it would also return true for
in_aio_context_home_thread(qemu_get_aio_context()), causing
AIO_WAIT_WHILE to invoke aio_poll() directly. This is incorrect
if the BQL is not held, because aio_poll() does not expect to
run concurrently from multiple threads, and it can actually
happen when savevm writes to the vmstate file from the
migration thread.
Therefore, restrict in_aio_context_home_thread to return true
for the main AioContext only if the BQL is held.
The function is moved to aio-wait.h because it is mostly used
there and to avoid a circular reference between main-loop.h
and block/aio.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200407140746.8041-5-pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: be02cda3af
References: jsc#SLE-17785
These two features were incorrectly tied to host_cpuid_required rather than
cpu->max_features. As a result, -cpu max was not enabling either MONITOR
features or ucode revision.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 6702514814
References: jsc#SLE-17785
Even though MSR_IA32_UCODE_REV has been available long before Linux 5.6,
which added it to the emulated MSR list, a bug caused the microcode
version to revert to 0x100000000 on INIT. As a result, processors other
than the bootstrap processor would not see the host microcode revision;
some Windows version complain loudly about this and crash with a
fairly explicit MICROCODE REVISION MISMATCH error.
[If running 5.6 prereleases, the kernel fix "KVM: x86: do not reset
microcode version on INIT or RESET" should also be applied.]
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Message-id: <20200211175516.10716-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Dependant kernel patch is bsc#1183412, commit 16ce873]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 9028c75c9d
References: jsc#SLE-17785
This was a very interesting semantic conflict that caused git to move
the MSR_IA32_UCODE_REV read to helper_wrmsr. Not a big deal, but
still should be fixed...
Fixes: 4e45aff398 ("target/i386: add a ucode-rev property", 2020-01-24)
Message-id: <20200206171022.9289-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: a9c2b841af
References: jsc#SLE-8897
This structure describes memory side cache information for memory
proximity domains if the memory side cache is present and the
physical device forms the memory side cache.
The software could use this information to effectively place
the data in memory to maximize the performance of the system
memory that use the memory side cache.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Daniel Black <daniel@linux.ibm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-7-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 4586a2cb83
References: jsc#SLE-8897
This structure describes the memory access latency and bandwidth
information from various memory access initiator proximity domains.
The latency and bandwidth numbers represented in this structure
correspond to rated latency and bandwidth for the platform.
The software could use this information as hint for optimization.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-6-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: e6f123c3b8
References: jsc#SLE-8897
HMAT is defined in ACPI 6.3: 5.2.27 Heterogeneous Memory Attribute Table
(HMAT). The specification references below link:
http://www.uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf
It describes the memory attributes, such as memory side cache
attributes and bandwidth and latency details, related to the
Memory Proximity Domain. The software is
expected to use this information as hint for optimization.
This structure describes Memory Proximity Domain Attributes by memory
subsystem and its associativity with processor proximity domain as well as
hint for memory usage.
In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report
the platform's HMAT tables.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Daniel Black <daniel@linux.ibm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-5-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: c412a48d4d
References: jsc#SLE-8897
Add -numa hmat-cache option to provide Memory Side Cache Information.
These memory attributes help to build Memory Side Cache Information
Structure(s) in ACPI Heterogeneous Memory Attribute Table (HMAT).
Before using hmat-cache option, enable HMAT with -machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-4-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bruce Rogers brogers@suse.com>
Git-commit: 9b12dfa03a
References: jsc#SLE-8897
Add -numa hmat-lb option to provide System Locality Latency and
Bandwidth Information. These memory attributes help to build
System Locality Latency and Bandwidth Information Structure(s)
in ACPI Heterogeneous Memory Attribute Table (HMAT). Before using
hmat-lb option, enable HMAT with -machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-3-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 244b3f4485
References: jsc#SLE-8897
In ACPI 6.3 chapter 5.2.27 Heterogeneous Memory Attribute Table (HMAT),
The initiator represents processor which access to memory. And in 5.2.27.3
Memory Proximity Domain Attributes Structure, the attached initiator is
defined as where the memory controller responsible for a memory proximity
domain. With attached initiator information, the topology of heterogeneous
memory can be described. Add new machine property 'hmat' to enable all
HMAT specific options.
Extend CLI of "-numa node" option to indicate the initiator numa node-id.
In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report
the platform's HMAT tables. Before using initiator option, enable HMAT with
-machine hmat=on.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jingqi Liu <jingqi.liu@intel.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191213011929.2520-2-tao3.xu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: d0435bc513
Virtqueue notifications are not necessary during polling, so we disable
them. This allows the guest driver to avoid MMIO vmexits.
Unfortunately the virtio-blk and virtio-scsi handler functions re-enable
notifications, defeating this optimization.
Fix virtio-blk and virtio-scsi emulation so they leave notifications
disabled. The key thing to remember for correctness is that polling
always checks one last time after ending its loop, therefore it's safe
to lose the race when re-enabling notifications at the end of polling.
There is a measurable performance improvement of 5-10% with the null-co
block driver. Real-life storage configurations will see a smaller
improvement because the MMIO vmexit overhead contributes less to
latency.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20191209210957.65087-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 30729ae93b
Some tests create huge (but sparse) files, and to be able to run those
tests in certain limited environments (like CI containers), we have to
check for the possibility to create such files first. Thus let's introduce
a common function to check for large files, and replace the already
existing checks in the iotests 005 and 220 with this function.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Tested-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191204154618.23560-2-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: e28582fdb2
Test 079 fails in the arm64, s390x and ppc64le LXD containers on Travis
(which we will hopefully enable in our CI soon). These containers
apparently do not allow large files to be created. Test 079 tries to
create a 4G sparse file, which is apparently already too big for these
containers, so check first whether we can really create such files before
executing the test.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: efd0e5a121
Test 060 fails in the arm64, s390x and ppc64le LXD containers on Travis
(which we will hopefully enable in our CI soon). These containers
apparently do not allow large files to be created. The repair process
in test 060 creates a file of 64 GiB, so test first whether such large
files are possible and skip the test if that's not the case.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
The tulip network driver in a qemu-system-hppa emulation is broken in
the sense that bigger network packages aren't received any longer and
thus even running e.g. "apt update" inside the VM fails.
The breakage was introduced by commit 8ffb7265af ("check frame size and
r/w data length") which added checks to prevent accesses outside of the
rx/tx buffers.
But the new checks were implemented wrong. The variable rx_frame_len
counts backwards, from rx_frame_size down to zero, and the variable len
is never bigger than rx_frame_len, so accesses just can't happen and the
checks are unnecessary.
On the contrary the checks now prevented bigger packages to be moved
into the rx buffers.
This patch reverts the wrong checks and were sucessfully tested with a
qemu-system-hppa emulation.
Fixes: 8ffb7265af ("check frame size and r/w data length")
Buglink: https://bugs.launchpad.net/bugs/1874539
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit d9b6964039)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
As reported on Launchpad, Azure apparently doesn't accept images for
upload that are not both aligned to 1 MB blocks and have a BAT size that
matches the image size exactly.
As far as I can tell, there is no real reason why we create a BAT that
is one entry longer than necessary for aligned image sizes, so change
that.
(Even though the condition is only mentioned as "should" in the spec and
previous products accepted larger BATs - but we'll try to maintain
compatibility with as many of Microsoft's ever-changing interpretations
of the VHD spec as possible.)
Fixes: https://bugs.launchpad.net/bugs/1870098
Reported-by: Tobias Witek
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200402093603.2369-1-kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 3f6de653b9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
For various technical reasons we can't currently allow unplug a PCI to PCI
bridge on the pseries machine. spapr_pci_unplug_request() correctly
generates an error message if that's attempted.
But.. if the given errp is not error_abort or error_fatal, it doesn't
actually stop trying to unplug the bridge anyway.
Fixes: 14e714900f "spapr: Allow hot plug/unplug of PCI bridges and devices under PCI bridges"
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 7aab589976)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tulip network driver while copying tx/rx buffers does not check
frame size against r/w data length. This may lead to OOB buffer
access. Add check to avoid it.
Limit iterations over descriptors to avoid potential infinite
loop issue in tulip_xmit_list_update.
Reported-by: Li Qiang <pangpei.lq@antfin.com>
Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Reported-by: Jason Wang <jasowang@redhat.com>
Tested-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 8ffb7265af)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
block_int.h claims that .bdrv_has_zero_init must return 0 if
.bdrv_has_zero_init_truncate does likewise; but this is violated if
only the former callback is provided if .bdrv_co_truncate also exists.
When adding the latter callback, it was mistakenly added to only one
of the three possible sheepdog instantiations.
Fixes: 1dcaf527
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200324174233.1622067-5-eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit ed04991063)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The feature table is supposed to advertise the name of all feature
bits that we support; however, we forgot to update the table for
autoclear bits. While at it, move the table to read-only memory in
code, and tweak the qcow2 spec to name the second autoclear bit.
Update iotests that are affected by the longer header length.
Fixes: 88ddffae
Fixes: 93c24936
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200324174233.1622067-3-eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit bb40ebce2c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
There is a use-after-free possible: bdrv_unref_child() leaves
bs->backing freed but not NULL. bdrv_attach_child may produce nested
polling loop due to drain, than access of freed pointer is possible.
I've produced the following crash on 30 iotest with modified code. It
does not reproduce on master, but still seems possible:
#0 __strcmp_avx2 () at /lib64/libc.so.6
#1 bdrv_backing_overridden (bs=0x55c9d3cc2060) at block.c:6350
#2 bdrv_refresh_filename (bs=0x55c9d3cc2060) at block.c:6404
#3 bdrv_backing_attach (c=0x55c9d48e5520) at block.c:1063
#4 bdrv_replace_child_noperm
(child=child@entry=0x55c9d48e5520,
new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2290
#5 bdrv_replace_child
(child=child@entry=0x55c9d48e5520,
new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2320
#6 bdrv_root_attach_child
(child_bs=child_bs@entry=0x55c9d3cc2060,
child_name=child_name@entry=0x55c9d241d478 "backing",
child_role=child_role@entry=0x55c9d26ecee0 <child_backing>,
ctx=<optimized out>, perm=<optimized out>, shared_perm=21,
opaque=0x55c9d3c5a3d0, errp=0x7ffd117108e0) at block.c:2424
#7 bdrv_attach_child
(parent_bs=parent_bs@entry=0x55c9d3c5a3d0,
child_bs=child_bs@entry=0x55c9d3cc2060,
child_name=child_name@entry=0x55c9d241d478 "backing",
child_role=child_role@entry=0x55c9d26ecee0 <child_backing>,
errp=errp@entry=0x7ffd117108e0) at block.c:5876
#8 in bdrv_set_backing_hd
(bs=bs@entry=0x55c9d3c5a3d0,
backing_hd=backing_hd@entry=0x55c9d3cc2060,
errp=errp@entry=0x7ffd117108e0)
at block.c:2576
#9 stream_prepare (job=0x55c9d49d84a0) at block/stream.c:150
#10 job_prepare (job=0x55c9d49d84a0) at job.c:761
#11 job_txn_apply (txn=<optimized out>, fn=<optimized out>) at
job.c:145
#12 job_do_finalize (job=0x55c9d49d84a0) at job.c:778
#13 job_completed_txn_success (job=0x55c9d49d84a0) at job.c:832
#14 job_completed (job=0x55c9d49d84a0) at job.c:845
#15 job_completed (job=0x55c9d49d84a0) at job.c:836
#16 job_exit (opaque=0x55c9d49d84a0) at job.c:864
#17 aio_bh_call (bh=0x55c9d471a160) at util/async.c:117
#18 aio_bh_poll (ctx=ctx@entry=0x55c9d3c46720) at util/async.c:117
#19 aio_poll (ctx=ctx@entry=0x55c9d3c46720,
blocking=blocking@entry=true)
at util/aio-posix.c:728
#20 bdrv_parent_drained_begin_single (poll=true, c=0x55c9d3d558f0)
at block/io.c:121
#21 bdrv_parent_drained_begin_single (c=c@entry=0x55c9d3d558f0,
poll=poll@entry=true)
at block/io.c:114
#22 bdrv_replace_child_noperm
(child=child@entry=0x55c9d3d558f0,
new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2258
#23 bdrv_replace_child
(child=child@entry=0x55c9d3d558f0,
new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2320
#24 bdrv_root_attach_child
(child_bs=child_bs@entry=0x55c9d3d27300,
child_name=child_name@entry=0x55c9d241d478 "backing",
child_role=child_role@entry=0x55c9d26ecee0 <child_backing>,
ctx=<optimized out>, perm=<optimized out>, shared_perm=21,
opaque=0x55c9d3cc2060, errp=0x7ffd11710c60) at block.c:2424
#25 bdrv_attach_child
(parent_bs=parent_bs@entry=0x55c9d3cc2060,
child_bs=child_bs@entry=0x55c9d3d27300,
child_name=child_name@entry=0x55c9d241d478 "backing",
child_role=child_role@entry=0x55c9d26ecee0 <child_backing>,
errp=errp@entry=0x7ffd11710c60) at block.c:5876
#26 bdrv_set_backing_hd
(bs=bs@entry=0x55c9d3cc2060,
backing_hd=backing_hd@entry=0x55c9d3d27300,
errp=errp@entry=0x7ffd11710c60)
at block.c:2576
#27 stream_prepare (job=0x55c9d495ead0) at block/stream.c:150
...
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200316060631.30052-2-vsementsov@virtuozzo.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 6e57963a77)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The POP states that for a list directed IPL the IPLB is stored into
memory by the machine loader and its address is stored at offset 0x14
of the lowcore.
ZIPL currently uses the address in offset 0x14 to access the IPLB and
acquire flags about secure boot. If the IPLB address points into
memory which has an unsupported mix of flags set, ZIPL will panic
instead of booting the OS.
As the lowcore can have quite a high entropy for a guest that did drop
out of protected mode (i.e. rebooted) we encountered the ZIPL panic
quite often.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Message-Id: <20200304114231.23493-19-frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 9bfc04f9ef)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
virtio queues forgot to delete in unrealize, and aslo error path in
realize, this patch fix these memleaks, the leak stack is as follow:
Direct leak of 114688 byte(s) in 16 object(s) allocated from:
#0 0x7f24024fdbf0 in calloc (/lib64/libasan.so.3+0xcabf0)
#1 0x7f2401642015 in g_malloc0 (/lib64/libglib-2.0.so.0+0x50015)
#2 0x55ad175a6447 in virtio_add_queue /mnt/sdb/qemu/hw/virtio/virtio.c:2327
#3 0x55ad17570cf9 in vhost_user_blk_device_realize /mnt/sdb/qemu/hw/block/vhost-user-blk.c:419
#4 0x55ad175a3707 in virtio_device_realize /mnt/sdb/qemu/hw/virtio/virtio.c:3509
#5 0x55ad176ad0d1 in device_set_realized /mnt/sdb/qemu/hw/core/qdev.c:876
#6 0x55ad1781ff9d in property_set_bool /mnt/sdb/qemu/qom/object.c:2080
#7 0x55ad178245ae in object_property_set_qobject /mnt/sdb/qemu/qom/qom-qobject.c:26
#8 0x55ad17821eb4 in object_property_set_bool /mnt/sdb/qemu/qom/object.c:1338
#9 0x55ad177aeed7 in virtio_pci_realize /mnt/sdb/qemu/hw/virtio/virtio-pci.c:1801
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200224041336.30790-2-pannengyuan@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 13e5468127)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Similar to other virtio-deivces, ctrl_vq forgot to delete in virtio_crypto_device_unrealize, this patch fix it.
This device has aleardy maintained vq pointers. Thus, we use the new virtio_delete_queue function directly to do the cleanup.
The leak stack:
Direct leak of 10752 byte(s) in 3 object(s) allocated from:
#0 0x7f4c024b1970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7f4c018be49d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x55a2f8017279 in virtio_add_queue /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio.c:2333
#3 0x55a2f8057035 in virtio_crypto_device_realize /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio-crypto.c:814
#4 0x55a2f8005d80 in virtio_device_realize /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio.c:3531
#5 0x55a2f8497d1b in device_set_realized /mnt/sdb/qemu-new/qemu_test/qemu/hw/core/qdev.c:891
#6 0x55a2f8b48595 in property_set_bool /mnt/sdb/qemu-new/qemu_test/qemu/qom/object.c:2238
#7 0x55a2f8b54fad in object_property_set_qobject /mnt/sdb/qemu-new/qemu_test/qemu/qom/qom-qobject.c:26
#8 0x55a2f8b4de2c in object_property_set_bool /mnt/sdb/qemu-new/qemu_test/qemu/qom/object.c:1390
#9 0x55a2f80609c9 in virtio_crypto_pci_realize /mnt/sdb/qemu-new/qemu_test/qemu/hw/virtio/virtio-crypto-pci.c:58
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Cc: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Message-Id: <20200225075554.10835-5-pannengyuan@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d56e1c8256)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When printing the snapshot list (e.g. with qemu-img snapshot -l), the VM
size field is only seven characters wide. As of de38b5005e, this is
not necessarily sufficient: We generally print three digits, and this
may require a decimal point. Also, the unit field grew from something
as plain as "M" to " MiB". This means that number and unit may take up
eight characters in total; but we also want spaces in front.
Considering previously the maximum width was four characters and the
field width was chosen to be three characters wider, let us adjust the
field width to be eleven now.
Fixes: de38b5005e
Buglink: https://bugs.launchpad.net/qemu/+bug/1859989
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200117105859.241818-2-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 804359b8b9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 7a3f542fbd "block/io: refactor padding" occasionally dropped
aligning for zero-length request: bdrv_init_padding() blindly return
false if bytes == 0, like there is nothing to align.
This leads the following command to crash:
./qemu-io --image-opts -c 'write 1 0' \
driver=blkdebug,align=512,image.driver=null-co,image.size=512
>> qemu-io: block/io.c:1955: bdrv_aligned_pwritev: Assertion
`(offset & (align - 1)) == 0' failed.
>> Aborted (core dumped)
Prior to 7a3f542fbd we does aligning of such zero requests. Instead of
recovering this behavior let's just do nothing on such requests as it
is useless.
Note that driver may have special meaning of zero-length reqeusts, like
qcow2_co_pwritev_compressed_part, so we can't skip any zero-length
operation. But for unaligned ones, we can't pass it to driver anyway.
This commit also fixes crash in iotest 80 running with -nocache:
./check -nocache -qcow2 80
which crashes on same assertion due to trying to read empty extra data
in qcow2_do_read_snapshots().
Cc: qemu-stable@nongnu.org # v4.2
Fixes: 7a3f542fbd
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20200206164245.17781-1-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ac9d00bf7b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit e19afd5667 mentioned that target-arm only supports queryable
cpu models 'max', 'host', and the current type when KVM is in use.
The logic works well until using machine type none.
For machine type none, cpu_type will be null if cpu option is not
set by command line, strlen(cpu_type) will terminate process.
So We add a check above it.
This won't affect i386 and s390x since they do not use current_cpu.
Signed-off-by: Liang Yan <lyan@suse.com>
Message-id: 20200203134251.12986-1-lyan@suse.com
Reviewed-by: Andrew Jones <drjones@redhat.com>
Tested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 0999a4ba87)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We can't access top after call bdrv_backup_top_drop, as it is already
freed at this time.
Also, no needs to unref target child by hand, it will be unrefed on
bdrv_close() automatically.
So, just do bdrv_backup_top_drop if append succeed and one bdrv_unref
otherwise.
Note, that in !appended case bdrv_unref(top) moved into drained section
on source. It doesn't really matter, but just for code simplicity.
Fixes: 7df7868b96
Cc: qemu-stable@nongnu.org # v4.2.0
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20200121142802.21467-2-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 0df62f45c1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If we call the qmp 'query-block' while qemu is working on
'block-commit', it will cause memleaks, the memory leak stack is as
follow:
Indirect leak of 12360 byte(s) in 3 object(s) allocated from:
#0 0x7f80f0b6d970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7f80ee86049d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x55ea95b5bb67 in qdict_new /mnt/sdb/qemu-4.2.0-rc0/qobject/qdict.c:29
#3 0x55ea956cd043 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6427
#4 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399
#5 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399
#6 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399
#7 0x55ea958818ea in bdrv_block_device_info /mnt/sdb/qemu-4.2.0-rc0/block/qapi.c:56
#8 0x55ea958879de in bdrv_query_info /mnt/sdb/qemu-4.2.0-rc0/block/qapi.c:392
#9 0x55ea9588b58f in qmp_query_block /mnt/sdb/qemu-4.2.0-rc0/block/qapi.c:578
#10 0x55ea95567392 in qmp_marshal_query_block qapi/qapi-commands-block-core.c:95
Indirect leak of 4120 byte(s) in 1 object(s) allocated from:
#0 0x7f80f0b6d970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7f80ee86049d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x55ea95b5bb67 in qdict_new /mnt/sdb/qemu-4.2.0-rc0/qobject/qdict.c:29
#3 0x55ea956cd043 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6427
#4 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399
#5 0x55ea956cc950 in bdrv_refresh_filename /mnt/sdb/qemu-4.2.0-rc0/block.c:6399
#6 0x55ea9569f301 in bdrv_backing_attach /mnt/sdb/qemu-4.2.0-rc0/block.c:1064
#7 0x55ea956a99dd in bdrv_replace_child_noperm /mnt/sdb/qemu-4.2.0-rc0/block.c:2283
#8 0x55ea956b9b53 in bdrv_replace_node /mnt/sdb/qemu-4.2.0-rc0/block.c:4196
#9 0x55ea956b9e49 in bdrv_append /mnt/sdb/qemu-4.2.0-rc0/block.c:4236
#10 0x55ea958c3472 in commit_start /mnt/sdb/qemu-4.2.0-rc0/block/commit.c:306
#11 0x55ea94b68ab0 in qmp_block_commit /mnt/sdb/qemu-4.2.0-rc0/blockdev.c:3459
#12 0x55ea9556a7a7 in qmp_marshal_block_commit qapi/qapi-commands-block-core.c:407
Fixes: bb808d5f5c
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-id: 20200116085600.24056-1-pannengyuan@huawei.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit cb8956144c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If LPIs are disabled, KVM will just ignore the GICR_PENDBASER.PTZ bit when
restoring GICR_CTLR. Setting PTZ here makes littlt sense in "reduce GIC
initialization time".
And what's worse, PTZ is generally programmed by guest to indicate to the
Redistributor whether the LPI Pending table is zero when enabling LPIs.
If migration is triggered when the PTZ has just been cleared by guest (and
before enabling LPIs), we will see PTZ==1 on the destination side, which
is not as expected. Let's just drop this hackish userspace behavior.
Also take this chance to refine the comment a bit.
Fixes: 367b9f527b ("hw/intc/arm_gicv3_kvm: Implement get/put functions")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Message-id: 20200119133051.642-1-yuzenghui@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 618bacabd3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
bdrv_open_driver() allocates bs->opaque according to drv->instance_size.
There is no need to allocate it and overwrite opaque in
bdrv_backup_top_append().
Reproducer:
$ QTEST_QEMU_BINARY=./x86_64-softmmu/qemu-system-x86_64 valgrind -q --leak-check=full tests/test-replication -p /replication/secondary/start
==29792== 24 bytes in 1 blocks are definitely lost in loss record 52 of 226
==29792== at 0x483AB1A: calloc (vg_replace_malloc.c:762)
==29792== by 0x4B07CE0: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6000.7)
==29792== by 0x12BAB9: bdrv_open_driver (block.c:1289)
==29792== by 0x12BEA9: bdrv_new_open_driver (block.c:1359)
==29792== by 0x1D15CB: bdrv_backup_top_append (backup-top.c:190)
==29792== by 0x1CC11A: backup_job_create (backup.c:439)
==29792== by 0x1CD542: replication_start (replication.c:544)
==29792== by 0x1401B9: replication_start_all (replication.c:52)
==29792== by 0x128B50: test_secondary_start (test-replication.c:427)
...
Fixes: 7df7868b96 ("block: introduce backup-top filter driver")
Signed-off-by: Eiichi Tsukata <devel@etsukata.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit fb574de81b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If the kernel irqchip has been disabled, we don't want the
{add,release}_adapter_routes routines to call any kvm_irqchip_*
interfaces, as they may rely on an irqchip actually having been
created. Just take a quick exit in that case instead. If you are
trying to use irqfd without a kernel irqchip, we will fail with
an error.
Also initialize routes->gsi[] with -1 in the virtio-ccw handling,
to make sure we don't trip over other errors, either. (Nobody
else uses the gsi array in that structure.)
Fixes: d426d9fba8 ("s390x/virtio-ccw: wire up irq routing and irqfds")
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20200117111147.5006-1-cohuck@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 3c5fd80743)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This reverts commit de3f7de7f4.
Remove VNC optimization to reencode framebuffer update as raw if it's
smaller than the default encoding.
QEMU's implementation was naive and didn't account for the ZLIB z_stream
mutating with each compression. Because of the mutation, simply
resetting the output buffer's offset wasn't sufficient to "rewind" the
operation. The mutated z_stream would generate future zlib blocks which
referred to symbols in past blocks which weren't sent. This would lead
to artifacting.
Considering that ZRLE is never larger than raw and even though ZLIB can
occasionally be fractionally larger than raw, the overhead of
implementing this optimization correctly isn't worth it.
Signed-off-by: Cameron Esfahani <dirty@apple.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0780ec7be8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When using hugepages, rate limiting is necessary within each huge
page, since a 1G huge page can take a significant time to send, so
you end up with bursty behaviour.
Fixes: 4c011c37ec ("postcopy: Send whole huge pages")
Reported-by: Lin Ma <LMa@suse.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 97e1e06780)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 1bd71dce4b tries to prevent a finishmigrate -> prelaunch
transition by exiting at the beginning of the main_loop_should_exit()
function if the state is already finishmigrate.
As the finishmigrate state is set in the migration thread it can
happen concurrently to the function. The migration thread and the
function are normally protected by the iothread mutex and thus the
state should no evolve between the start of the function and its end.
Unfortunately during the function life the lock is released by
pause_all_vcpus() just before the point we need to be sure we are
not in finishmigrate state and if the migration thread is waiting
for the lock it will take the opportunity to change the state
to finishmigrate.
The only way to be sure we are not in the finishmigrate state when
we need is to check the state after the pause_all_vcpus() function.
Fixes: 1bd71dce4b ("runstate: ignore exit request in finish migrate state")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ddad81bd28)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit e51e711b1b has moved the initialization of start_address and
end_address after the definition of the command line argument,
where the nvramrc is initialized, and thus the loop is between 0 and 0
rather than 1 MiB and 100 MiB.
It doesn't affect the result of the test if all the tests are run in
sequence because the two first tests don't run the loop, so the
values are correctly initialized when we actually need them.
But it hangs when we ask to run only one test, for instance:
QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64 \
tests/migration-test -m=quick -p /ppc64/migration/validate_uuid_error
Fixes: e51e711b1b ("tests/migration: Add migration-test header file")
Cc: wei@redhat.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20200107163437.52139-1-lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 16c5c6928f)
Conflicts:
tests/migration-test.c
*drop context dep. on 68d95609
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Sometimes it is useful to be able to add a node to the block graph that
takes or unshare a certain set of permissions for debugging purposes.
This patch adds this capability to blkdebug.
(Note that you cannot make blkdebug release or share permissions that it
needs to take or cannot share, because this might result in assertion
failures in the block layer. But if the blkdebug node has no parents,
it will not take any permissions and share everything by default, so you
can then freely choose what permissions to take and share.)
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191108123455.39445-4-mreitz@redhat.com
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 69c6449ff1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We need some way to correlate QAPI BlockPermission values with
BLK_PERM_* flags. We could:
(1) have the same order in the QAPI definition as the the BLK_PERM_*
flags are in LSb-first order. However, then there is no guarantee
that they actually match (e.g. when someone modifies the QAPI schema
without thinking of the BLK_PERM_* definitions).
We could add static assertions, but these would break what’s good
about this solution, namely its simplicity.
(2) define the BLK_PERM_* flags based on the BlockPermission values.
But this way whenever someone were to modify the QAPI order
(perfectly sensible in theory), the BLK_PERM_* values would change.
Because these values are used for file locking, this might break
file locking between different qemu versions.
Therefore, go the slightly more cumbersome way: Add a function to
translate from the QAPI constants to the BLK_PERM_* flags.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191108123455.39445-2-mreitz@redhat.com
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 7b1d9c4df0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The bit offsets in the EVT_SET_ADDR2 macro do not match those specified
in the ARM SMMUv3 Architecture Specification. In all events that use
this macro, e.g. F_WALK_EABT, the faulting fetch address or IPA actually
occupies the 32-bit words 6 and 7 in the event record contiguously, with
the upper and lower unused bits clear due to alignment or maximum
supported address bits. How many bits are clear depends on the
individual event type.
Update the macro to write to the correct words in the event record so
that guest drivers can obtain accurate address information on events.
ref. ARM IHI 0070C, sections 7.3.12 through 7.3.16.
Signed-off-by: Simon Veith <sveith@amazon.de>
Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1576509312-13083-6-git-send-email-sveith@amazon.de
Cc: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Acked-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a7f65ceb85)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When checking whether a stream ID is in range of the stream table, we
have so far been only checking it against our implementation limit
(SMMU_IDR1_SIDSIZE). However, the guest can program the
STRTAB_BASE_CFG.LOG2SIZE field to a size that is smaller than this
limit.
Check the stream ID against this limit as well to match the hardware
behavior of raising C_BAD_STREAMID events in case the limit is exceeded.
Also, ensure that we do not go one entry beyond the end of the table by
checking that its index is strictly smaller than the table size.
ref. ARM IHI 0070C, section 6.3.24.
Signed-off-by: Simon Veith <sveith@amazon.de>
Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1576509312-13083-4-git-send-email-sveith@amazon.de
Cc: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 05ff2fb80c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In the SMMU_STRTAB_BASE register, the stream table base address only
occupies bits [51:6]. Other bits, such as RA (bit [62]), must be masked
out to obtain the base address.
The branch for 2-level stream tables correctly applies this mask by way
of SMMU_BASE_ADDR_MASK, but the one for linear stream tables does not.
Apply the missing mask in that case as well so that the correct stream
base address is used by guests which configure a linear stream table.
Linux guests are unaffected by this change because they choose a 2-level
stream table layout for the QEMU SMMUv3, based on the size of its stream
ID space.
ref. ARM IHI 0070C, section 6.3.23.
Signed-off-by: Simon Veith <sveith@amazon.de>
Acked-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1576509312-13083-2-git-send-email-sveith@amazon.de
Cc: Eric Auger <eric.auger@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Acked-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 3d44c60500)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This is an update on the stable-4.2 branch of libslirp.git:
git shortlog 55ab21c9a3..2faae0f778f81
Marc-André Lureau (1):
Fix use-afte-free in ip_reass() (CVE-2020-1983)
CVE-2020-1983 is actually a follow up fix for commit
126c04acbabd7ad32c2b018fe10dfac2a3bc1210 ("Fix heap overflow in
ip_reass on big packet input") which was was included in qemu
v4.1 (commit e1a4a24d26 "slirp: update
with CVE-2019-14378 fix").
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20200421170227.843555-1-marcandre.lureau@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 7769c23774)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
kvm_set_phys_mem can be called to reallocate a slot by something the
guest does (e.g. writing to PAM and other chipset registers).
This can happen in the middle of a migration, and if we're unlucky
it can now happen between the split 'sync' and 'clear'; the clear
asserts if there's no bmap to clear. Recreate the bmap whenever
we change the slot, keeping the clear path happy.
Typically this is triggered by the guest rebooting during a migrate.
Corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1772774https://bugzilla.redhat.com/show_bug.cgi?id=1771032
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 9b3a31c745)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
A guest user may set channel frame count via es1370_write()
such that, in es1370_transfer_audio(), total frame count
'size' is lesser than the number of frames that are processed
'cnt'.
int cnt = d->frame_cnt >> 16;
int size = d->frame_cnt & 0xffff;
if (size < cnt), it results in incorrect calculations leading
to OOB access issue(s). Add check to avoid it.
Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200514200608.1744203-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 369ff955a8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In some corner cases (that never happen during normal operation but a
malicious guest could program wrong values) pixman functions were
called with parameters that result in a crash. Fix this and add more
checks to disallow such cases.
Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-id: 20200406204029.19559747D5D@zero.eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ac2071c379)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When querying an iSCSI server for the provisioning status of blocks (via
GET LBA STATUS), Qemu only validates that the response descriptor zero's
LBA matches the one requested. Given the SCSI spec allows servers to
respond with the status of blocks beyond the end of the LUN, Qemu may
have its heap corrupted by clearing/setting too many bits at the end of
its allocmap for the LUN.
A malicious guest in control of the iSCSI server could carefully program
Qemu's heap (by selectively setting the bitmap) and then smash it.
This limits the number of bits that iscsi_co_block_status() will try to
update in the allocmap so it can't overflow the bitmap.
Fixes: CVE-2020-1711
Cc: qemu-stable@nongnu.org
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 693fd2acdf)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 048c95163b ("target/i386: work around KVM_GET_MSRS bug for
secondary execution controls") added a workaround for KVM pre-dating
commit 6defc591846d ("KVM: nVMX: include conditional controls in /dev/kvm
KVM_GET_MSRS") which wasn't setting certain available controls. The
workaround uses generic CPUID feature bits to set missing VMX controls.
It was found that in some cases it is possible to observe hosts which
have certain CPUID features but lack the corresponding VMX control.
In particular, it was reported that Azure VMs have RDSEED but lack
VMX_SECONDARY_EXEC_RDSEED_EXITING; attempts to enable this feature
bit result in QEMU abort.
Resolve the issue but not applying the workaround when we don't have
to. As there is no good way to find out if KVM has the fix itself, use
95c5c7c77c ("KVM: nVMX: list VMX MSRs in KVM_GET_MSR_INDEX_LIST") instead
as these [are supposed to] come together.
Fixes: 048c95163b ("target/i386: work around KVM_GET_MSRS bug for secondary execution controls")
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200331162752.1209928-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4a910e1f6a)
Conflicts:
target/i386/kvm.c
*drop context dep. on 6702514814
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
It was found that running libquantum on riscv-linux qemu produced an
incorrect result. After investigation, FP registers are not saved
during context switch due to incorrect mstatus.FS.
In current implementation tb->flags merges all non-disabled state to
dirty. This means the code in mark_fs_dirty in translate.c that
handles initial and clean states is unreachable.
This patch fixes it and is successfully tested with:
libquantum
Thanks to Richard for pointing out the actual bug.
v3: remove the redundant condition
v2: root cause FS problem
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
(cherry picked from commit 613fa160e1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
./configure --enable-sdl --audio-drv-list=sdl --enable-modules
Will generate two identical test names: /$arch/module/load/sdl
Which generates an error like:
(tests/modules-test:23814): GLib-ERROR **: 18:23:06.359: duplicate test case path: /aarch64//module/load/sdl
Add the subsystem prefix in the name as well, so instead we get:
/$arch/module/load/audio-sdl
/$arch/module/load/ui-sdl
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-Id: <d64c9aa098cc6e5c0b638438c4959eddfa7e24e2.1573679311.git.crobinso@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit eca3a94523)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Instead of truncating replies, which is problematic, wait until the
client reads more data and frees bytes on the reply ring.
Do that by calling qemu_coroutine_yield(). The corresponding
qemu_coroutine_enter_if_inactive() is called from xen_9pfs_bh upon
receiving the next notification from the client.
We need to be careful to avoid races in case xen_9pfs_bh and the
coroutine are both active at the same time. In xen_9pfs_bh, wait until
either the critical section is over (ring->co == NULL) or until the
coroutine becomes inactive (qemu_coroutine_yield() was called) before
continuing. Then, simply wake up the coroutine if it is inactive.
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <20200521192627.15259-2-sstabellini@kernel.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit a4c4d46272)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
QEMU's local 9pfs server passes through O_NOATIME from the client. If
the QEMU process doesn't have permissions to use O_NOATIME (namely, it
does not own the file nor have the CAP_FOWNER capability), the open will
fail. This causes issues when from the client's point of view, it
believes it has permissions to use O_NOATIME (e.g., a process running as
root in the virtual machine). Additionally, overlayfs on Linux opens
files on the lower layer using O_NOATIME, so in this case a 9pfs mount
can't be used as a lower layer for overlayfs (cf.
dabfe19719/vmtest/onoatimehack.c
and https://github.com/NixOS/nixpkgs/issues/54509).
Luckily, O_NOATIME is effectively a hint, and is often ignored by, e.g.,
network filesystems. open(2) notes that O_NOATIME "may not be effective
on all filesystems. One example is NFS, where the server maintains the
access time." This means that we can honor it when possible but fall
back to ignoring it.
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Message-Id: <e9bee604e8df528584693a4ec474ded6295ce8ad.1587149256.git.osandov@fb.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit a5804fcf7b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
v->vq forgot to cleanup in virtio_9p_device_unrealize, the memory leak
stack is as follow:
Direct leak of 14336 byte(s) in 2 object(s) allocated from:
#0 0x7f819ae43970 (/lib64/libasan.so.5+0xef970) ??:?
#1 0x7f819872f49d (/lib64/libglib-2.0.so.0+0x5249d) ??:?
#2 0x55a3a58da624 (./x86_64-softmmu/qemu-system-x86_64+0x2c14624) /mnt/sdb/qemu/hw/virtio/virtio.c:2327
#3 0x55a3a571bac7 (./x86_64-softmmu/qemu-system-x86_64+0x2a55ac7) /mnt/sdb/qemu/hw/9pfs/virtio-9p-device.c:209
#4 0x55a3a58e7bc6 (./x86_64-softmmu/qemu-system-x86_64+0x2c21bc6) /mnt/sdb/qemu/hw/virtio/virtio.c:3504
#5 0x55a3a5ebfb37 (./x86_64-softmmu/qemu-system-x86_64+0x31f9b37) /mnt/sdb/qemu/hw/core/qdev.c:876
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200117060927.51996-2-pannengyuan@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 9580d60e66)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
local_unlinkat_common() is supposed to always return -1 on error.
This is being done by jumps to the 'err_out' label, which is
a 'return ret' call, and 'ret' is initialized with -1.
Unfortunately there is a condition in which the function will
return 0 on error: in a case where flags == AT_REMOVEDIR, 'ret'
will be 0 when reaching
map_dirfd = openat_dir(...)
And, if map_dirfd == -1 and errno != ENOENT, the existing 'err_out'
jump will execute 'return ret', when ret is still set to zero
at that point.
This patch fixes it by changing all 'err_out' labels by
'return -1' calls, ensuring that the function will always
return -1 on error conditions. 'ret' can be left unintialized
since it's now being used just to store the result of 'unlinkat'
calls.
CC: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
[groug: changed prefix in title to be "9p: local:"]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 846cf408a4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 93676c88 relaxed our NBD client code to request export names up
to the NBD protocol maximum of 4096 bytes without NUL terminator, even
though the block layer can't store anything longer than 4096 bytes
including NUL terminator for display to the user. Since this means
there are some export names where we have to truncate things, we can
at least try to make the truncation a bit more obvious for the user.
Note that in spite of the truncated display name, we can still
communicate with an NBD server using such a long export name; this was
deemed nicer than refusing to even connect to such a server (since the
server may not be under our control, and since determining our actual
length limits gets tricky when nbd://host:port/export and
nbd+unix:///export?socket=/path are themselves variable-length
expansions beyond the export name but count towards the block layer
name length).
Reported-by: Xueqiang Wei <xuwei@redhat.com>
Fixes: https://bugzilla.redhat.com/1843684
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200610163741.3745251-3-eblake@redhat.com>
(cherry picked from commit 5c86bdf120)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In case we don't have an iothread, we mark the feature as abscent but
still add the queue. 'free_page_bh' remains set to NULL.
qemu-system-i386 \
-M microvm \
-nographic \
-device virtio-balloon-device,free-page-hint=true \
-nographic \
-display none \
-monitor none \
-serial none \
-qtest stdio
Doing a "write 0xc0000e30 0x24
0x030000000300000003000000030000000300000003000000030000000300000003000000"
We will trigger a SEGFAULT. Let's move the check and bail out.
While at it, move the static initializations to instance_init().
free_page_report_status and block_iothread are implicitly set to the
right values (0/false) already, so drop the initialization.
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Fixes: c13c4153f7 ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200520100439.19872-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 12fc8903a8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Ever since commit 36683283 (v2.8), the server code asserts that error
strings sent to the client are well-formed per the protocol by not
exceeding the maximum string length of 4096. At the time the server
first started sending error messages, the assertion could not be
triggered, because messages were completely under our control.
However, over the years, we have added latent scenarios where a client
could trigger the server to attempt an error message that would
include the client's information if it passed other checks first:
- requesting NBD_OPT_INFO/GO on an export name that is not present
(commit 0cfae925 in v2.12 echoes the name)
- requesting NBD_OPT_LIST/SET_META_CONTEXT on an export name that is
not present (commit e7b1948d in v2.12 echoes the name)
At the time, those were still safe because we flagged names larger
than 256 bytes with a different message; but that changed in commit
93676c88 (v4.2) when we raised the name limit to 4096 to match the NBD
string limit. (That commit also failed to change the magic number
4096 in nbd_negotiate_send_rep_err to the just-introduced named
constant.) So with that commit, long client names appended to server
text can now trigger the assertion, and thus be used as a denial of
service attack against a server. As a mitigating factor, if the
server requires TLS, the client cannot trigger the problematic paths
unless it first supplies TLS credentials, and such trusted clients are
less likely to try to intentionally crash the server.
We may later want to further sanitize the user-supplied strings we
place into our error messages, such as scrubbing out control
characters, but that is less important to the CVE fix, so it can be a
later patch to the new nbd_sanitize_name.
Consideration was given to changing the assertion in
nbd_negotiate_send_rep_verr to instead merely log a server error and
truncate the message, to avoid leaving a latent path that could
trigger a future CVE DoS on any new error message. However, this
merely complicates the code for something that is already (correctly)
flagging coding errors, and now that we are aware of the long message
pitfall, we are less likely to introduce such errors in the future,
which would make such error handling dead code.
Reported-by: Xueqiang Wei <xuwei@redhat.com>
CC: qemu-stable@nongnu.org
Fixes: https://bugzilla.redhat.com/1843684 CVE-2020-10761
Fixes: 93676c88d7
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200610163741.3745251-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit 5c4fe018c0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Locking was introduced in QEMU 2.7 to address the deprecation of
readdir_r(3) in glibc 2.24. It turns out that the frontend code is
the worst place to handle a critical section with a pthread mutex:
the code runs in a coroutine on behalf of the QEMU mainloop and then
yields control, waiting for the fsdev backend to process the request
in a worker thread. If the client resends another readdir request for
the same fid before the previous one finally unlocked the mutex, we're
deadlocked.
This never bit us because the linux client serializes readdir requests
for the same fid, but it is quite easy to demonstrate with a custom
client.
A good solution could be to narrow the critical section in the worker
thread code and to return a copy of the dirent to the frontend, but
this causes quite some changes in both 9p.c and codir.c. So, instead
of that, in order for people to easily backport the fix to older QEMU
versions, let's simply use a CoMutex since all the users for this
sit in coroutines.
Fixes: 7cde47d4a8 ("9p: add locking to V9fsDir")
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <158981894794.109297.3530035833368944254.stgit@bahia.lan>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit ed463454ef)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
The backend however calls mbind() which is typically NOP
in case of default policy/absent host-nodes bitmap.
However when runing in container with black-listed mbind()
syscall, QEMU fails to start with error
"cannot bind memory to host NUMA nodes: Operation not permitted"
even when user hasn't provided host-nodes to pin to explictly
(which is the case with -m option)
To fix issue, call mbind() only in case when user has provided
host-nodes explicitly (i.e. host_nodes bitmap is not empty).
That should allow to run QEMU in containers with black-listed
mbind() without memory pinning. If QEMU provided memory-pinning
is required user still has to white-list mbind() in container
configuration.
Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200430154606.6421-1-imammedo@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 70b6d525df)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If mtmsr L=1 sets MSR[EE] while there is a maskable exception pending,
it does not cause an interrupt. This causes the test case to hang:
https://lists.gnu.org/archive/html/qemu-ppc/2019-10/msg00826.html
More recently, Linux reduced the occurance of operations (e.g., rfi)
which stop translation and allow pending interrupts to be processed.
This started causing hangs in Linux boot in long-running kernel tests,
running with '-d int' shows the decrementer stops firing despite DEC
wrapping and MSR[EE]=1.
https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-April/208301.html
The cause is the broken mtmsr L=1 behaviour, which is contrary to the
architecture. From Power ISA v3.0B, p.977, Move To Machine State Register,
Programming Note states:
If MSR[EE]=0 and an External, Decrementer, or Performance Monitor
exception is pending, executing an mtmsrd instruction that sets
MSR[EE] to 1 will cause the interrupt to occur before the next
instruction is executed, if no higher priority exception exists
Fix this by handling L=1 exactly the same way as L=0, modulo the MSR
bits altered.
The confusion arises from L=0 being "context synchronizing" whereas L=1
is "execution synchronizing", which is a weaker semantic. However this
is not a relaxation of the requirement that these exceptions cause
interrupts when MSR[EE]=1 (e.g., when mtmsr executes to completion as
TCG is doing here), rather it specifies how a pipelined processor can
have multiple instructions in flight where one may influence how another
behaves.
Cc: qemu-stable@nongnu.org
Reported-by: Anton Blanchard <anton@ozlabs.org>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200414111131.465560-1-npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 5ed195065c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit a31ca6801c ("qemu/queue.h: clear linked list pointers on
remove") revealed that a request was removed twice from a list, once
in xen_block_finish_request() and a second time in
xen_block_release_request() when both function are called from
xen_block_complete_aio(). But also, the `requests_inflight' counter is
decreased twice, and thus became negative.
This is a bug that was introduced in bfd0d63660 ("xen-block: improve
response latency"), where a `finished' list was removed.
That commit also introduced a leak of request in xen_block_do_aio().
That function calls xen_block_finish_request() but the request is
never released after that.
To fix both issue, we do two changes:
- we squash finish_request() and release_request() together as we want
to remove a request from 'inflight' list to add it to 'freelist'.
- before releasing a request, we need to let the other end know the
result, thus we should call xen_block_send_response() before
releasing a request.
The first change fixes the double QLIST_REMOVE() as we remove the extra
call. The second change makes the leak go away because if we want to
call finish_request(), we need to call a function that does all of
finish, send response, and release.
Fixes: bfd0d63660 ("xen-block: improve response latency")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20200406140217.1441858-1-anthony.perard@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
[mreitz: Amended commit message as per Paul's suggestions]
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 36d883ba0d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In write_elf_section() we set the 'shdr' pointer to point to local
structures shdr32 or shdr64, which we fill in to be written out to
the ELF dump. Unfortunately the address we pass to fd_write_vmcore()
has a spurious '&' operator, so instead of writing out the section
header we write out the literal pointer value followed by whatever is
on the stack after the 'shdr' local variable.
Pass the correct address into fd_write_vmcore().
Spotted by Coverity: CID 1421970.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200324173630.12221-1-peter.maydell@linaro.org
(cherry picked from commit 174d2d6856)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In the function amdvi_log_event(), we write an event log buffer
entry into guest ram, whose contents are passed to the function
via the "uint64_t *evt" argument. Unfortunately, a spurious
'&' in the call to dma_memory_write() meant that instead of
writing the event to the guest we would write the literal value
of the pointer, plus whatever was in the following 8 bytes
on the stack. This error was spotted by Coverity.
Fix the bug by removing the '&'.
Fixes: CID 1421945
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200326105349.24588-1-peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 32a2d6b1f6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The QAPI struct GuestFileWhence has a comment about how we are
exploiting equivalent values between two different integer types
shared in a union. But C says behavior is undefined on assignments to
overlapping storage when the two types are not the same width, and
indeed, 'int64_t value' and 'enum QGASeek name' are very likely to be
different in width. Utilize a temporary variable to fix things.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: 0b4b49387
Fixes: Coverity CID 1421990
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit a23f38a729)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
guest-file-read command is currently implemented to read from a
file handle count number of bytes. when executed with a very large count number
qemu-ga crashes.
after some digging turns out that qemu-ga crashes after trying to allocate
a buffer large enough to save the data read in it, the buffer was allocated using
g_malloc0 which is not fail safe, and results a crash in case of failure.
g_malloc0 was replaced with g_try_malloc0() which returns NULL on failure,
A check was added for that case in order to prevent qemu-ga from crashing
and to send a response to the qemu-ga client accordingly.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1594054
Signed-off-by: Basil Salman <basil@daynix.com>
Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 807e2b6fce)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This patch handles the case where VSS Provider is already registered,
where in such case qga uninstalls the provider and registers it again.
Signed-off-by: Sameeh Jubran <sjubran@redhat.com>
Signed-off-by: Basil Salman <basil@daynix.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit b2413df833)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
rlwinm cannot just AND with Mask if shift value is zero on ppc64 when
Mask Begin is greater than Mask End and high bits are set to 1.
Note that PowerISA 3.0B says that for `rlwinm' ROTL32 is used, and
ROTL32 is defined (in 3.3.14) so that rotated value should have two
copies of lower word of the source value.
This seems to be another incarnation of the fix from 820724d170
("target-ppc: Fix rlwimi, rlwinm, rlwnm again"), except I leave
optimization when Mask value is less than 32 bits.
Fixes: 7b4d326f47 ("target-ppc: Use the new deposit and extract ops")
Cc: qemu-stable@nongnu.org
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Message-Id: <20200309204557.14836-1-vt@altlinux.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 94f040aaec)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Assume we have two regions, A and B, and region B is in-flight now,
region A is not yet touched, but it is unallocated and should be
skipped.
Correspondingly, as progress we have
total = A + B
current = 0
If we reset unallocated region A and call progress_reset_callback,
it will calculate 0 bytes dirty in the bitmap and call
job_progress_set_remaining, which will set
total = current + 0 = 0 + 0 = 0
So, B bytes are actually removed from total accounting. When job
finishes we'll have
total = 0
current = B
, which doesn't sound good.
This is because we didn't considered in-flight bytes, actually when
calculating remaining, we should have set (in_flight + dirty_bytes)
as remaining, not only dirty_bytes.
To fix it, let's refactor progress calculation, moving it to block-copy
itself instead of fixing callback. And, of course, track in_flight
bytes count.
We still have to keep one callback, to maintain backup job bytes_read
calculation, but it will go on soon, when we turn the whole backup
process into one block_copy call.
Cc: qemu-stable@nongnu.org
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <20200311103004.7649-3-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit d0ebeca14a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
On success path we return what inflate() returns instead of 0. And it
most probably works for Z_STREAM_END as it is positive, but is
definitely broken for Z_BUF_ERROR.
While being here, switch to errno return code, to be closer to
qcow2_compress API (and usual expectations).
Revert condition in if to be more positive. Drop dead initialization of
ret.
Cc: qemu-stable@nongnu.org # v4.0
Fixes: 341926ab83
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200302150930.16218-1-vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit e7266570f2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Compile error reported by gcc 10.0.1:
scsi/qemu-pr-helper.c: In function ‘multipath_pr_out’:
scsi/qemu-pr-helper.c:523:32: error: array subscript <unknown> is outside array bounds of ‘struct transportid *[0]’ [-Werror=array-bounds]
523 | paramp.trnptid_list[paramp.num_transportid++] = id;
| ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from scsi/qemu-pr-helper.c:36:
/usr/include/mpath_persist.h:168:22: note: while referencing ‘trnptid_list’
168 | struct transportid *trnptid_list[];
| ^~~~~~~~~~~~
scsi/qemu-pr-helper.c:424:35: note: defined here ‘paramp’
424 | struct prout_param_descriptor paramp;
| ^~~~~~
This highlights an actual implementation issue in function multipath_pr_out.
The variable paramp is declared with type `struct prout_param_descriptor`,
which is a struct terminated by an empty array in mpath_persist.h:
struct transportid *trnptid_list[];
That empty array was filled with code that looked like that:
trnptid_list[paramp.descr.num_transportid++] = id;
This is an actual out-of-bounds access.
The fix is to malloc `paramp`.
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4ce1e15fbc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The virtqueue code sets up MemoryRegionCaches to access the virtqueue
guest RAM data structures. The code currently assumes that
VRingMemoryRegionCaches is initialized before device emulation code
accesses the virtqueue. An assertion will fail in
vring_get_region_caches() when this is not true. Device fuzzing found a
case where this assumption is false (see below).
Virtqueue guest RAM addresses can also be changed from a vCPU thread
while an IOThread is accessing the virtqueue. This breaks the same
assumption but this time the caches could become invalid partway through
the virtqueue code. The code fetches the caches RCU pointer multiple
times so we will need to validate the pointer every time it is fetched.
Add checks each time we call vring_get_region_caches() and treat invalid
caches as a nop: memory stores are ignored and memory reads return 0.
The fuzz test failure is as follows:
$ qemu -M pc -device virtio-blk-pci,id=drv0,drive=drive0,addr=4.0 \
-drive if=none,id=drive0,file=null-co://,format=raw,auto-read-only=off \
-drive if=none,id=drive1,file=null-co://,file.read-zeroes=on,format=raw \
-display none \
-qtest stdio
endianness
outl 0xcf8 0x80002020
outl 0xcfc 0xe0000000
outl 0xcf8 0x80002004
outw 0xcfc 0x7
write 0xe0000000 0x24 0x00ffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab5cffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab0000000001
inb 0x4
writew 0xe000001c 0x1
write 0xe0000014 0x1 0x0d
The following error message is produced:
qemu-system-x86_64: /home/stefanha/qemu/hw/virtio/virtio.c:286: vring_get_region_caches: Assertion `caches != NULL' failed.
The backtrace looks like this:
#0 0x00007ffff5520625 in raise () at /lib64/libc.so.6
#1 0x00007ffff55098d9 in abort () at /lib64/libc.so.6
#2 0x00007ffff55097a9 in _nl_load_domain.cold () at /lib64/libc.so.6
#3 0x00007ffff5518a66 in annobin_assert.c_end () at /lib64/libc.so.6
#4 0x00005555559073da in vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:286
#5 vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:283
#6 0x000055555590818d in vring_used_flags_set_bit (mask=1, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398
#7 virtio_queue_split_set_notification (enable=0, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398
#8 virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:451
#9 0x0000555555908512 in virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:444
#10 0x00005555558c697a in virtio_blk_handle_vq (s=0x5555575c57e0, vq=0x5555575ceea0) at qemu/hw/block/virtio-blk.c:775
#11 0x0000555555907836 in virtio_queue_notify_aio_vq (vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:2244
#12 0x0000555555cb5dd7 in aio_dispatch_handlers (ctx=ctx@entry=0x55555671a420) at util/aio-posix.c:429
#13 0x0000555555cb67a8 in aio_dispatch (ctx=0x55555671a420) at util/aio-posix.c:460
#14 0x0000555555cb307e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
#15 0x00007ffff7bbc510 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#16 0x0000555555cb5848 in glib_pollfds_poll () at util/main-loop.c:219
#17 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
#18 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
#19 0x00005555559b20c9 in main_loop () at vl.c:1683
#20 0x0000555555838115 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4441
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Cc: Michael Tsirkin <mst@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200207104619.164892-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit abdd16f468)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This adds a test for 'qemu-img convert' with copy offloading where the
target image has an external data file. If the test hosts supports it,
it tests both the case where copy offloading is supported and the case
where it isn't (otherwise we just test unsupported twice).
More specifically, the case with unsupported copy offloading tests
qcow2_alloc_cluster_abort() with external data files.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200211094900.17315-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a0cf8daf77)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
For external data file, cluster allocations return an offset in the data
file and are not refcounted. In this case, there is nothing to do for
qcow2_alloc_cluster_abort(). Freeing the same offset in the qcow2 file
is wrong and causes crashes in the better case or image corruption in
the worse case.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200211094900.17315-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c3b6658c1a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In the case that update_refcount() frees a refcount block, it evicts it
from the metadata cache. Before doing so, however, it returns the
currently used refcount block to the cache because it might be the same.
Returning the refcount block early means that we need to reset
old_table_index so that we reload the refcount block in the next
iteration if it is actually still in use.
Fixes: f71c08ea8e
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200211094900.17315-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit dea9052ef1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Section 3.4.7 of the datasheet explains that,
The RBE bit in the Interrupt Status register is set when the
SONIC finishes using the second to last receive buffer and reads
the last RRA descriptor. Actually, the SONIC is not truly out of
resources, but gives the system an early warning of an impending
out of resources condition.
RBE does not mean actual receive buffer exhaustion, and reception should
not be stopped. This is important because Linux will not check and clear
the RBE interrupt until it receives another packet. But that won't
happen if can_receive returns false. This bug causes the SONIC to become
deaf (until reset).
Fix this with a new flag to indicate actual receive buffer exhaustion.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit c2279bd0a1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The jazzsonic driver in Linux uses the Silicon Revision register value
to probe the chip. The driver fails unless the SR register contains 4.
Unfortunately, reading this register in QEMU usually returns 0 because
the s->regs[] array gets wiped after a software reset.
Fixes: bd8f1ebce4 ("net/dp8393x: fix hardware reset")
Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 083e21bbdd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
These operations need to take place regardless of whether or not
rx descriptors have been used up (that is, EOL flag was observed).
The algorithm is now the same for a packet that was withheld as for
a packet that was not.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 80b60673ea)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When the SONIC receives a packet into the last available descriptor, it
retains ownership of that descriptor for as long as necessary.
Section 3.4.7 of the datasheet says,
When the system appends more descriptors, the SONIC releases ownership
of the descriptor after writing 0000h to the RXpkt.in_use field.
The packet can now be processed by the host, so raise a PKTRX interrupt,
just like the normal case.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit d9fae13196)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The existing code has a bug where the Remaining Buffer Word Count (RBWC)
is calculated with a truncating division, which gives the wrong result
for odd-sized packets.
Section 1.4.1 of the datasheet says,
Once the end of the packet has been reached, the serializer will
fill out the last word (16-bit mode) or long word (32-bit mode)
if the last byte did not end on a word or long word boundary
respectively. The fill byte will be 0FFh.
Implement buffer padding so that buffer limits are correctly enforced.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 350e7d9a77)
*drop context dependencies from b7cbebf2b9, 1ccda935d4, and
19f7034773
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Section 3.4.1 of the datasheet says,
The alignment of the RRA is confined to either word or long word
boundaries, depending upon the data width mode. In 16-bit mode,
the RRA must be aligned to a word boundary (A0 is always zero)
and in 32-bit mode, the RRA is aligned to a long word boundary
(A0 and A1 are always zero).
This constraint has been implemented for 16-bit mode; implement it
for 32-bit mode too.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit ea2270279b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
A received packet consumes pkt_size bytes in the buffer and the frame
checksum that's appended to it consumes another 4 bytes. The Receive
Buffer Address register takes the former quantity into account but
not the latter. So the next packet written to the buffer overwrites
the frame checksum. Fix this.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit bae112b80c)
*drop context dep. on 19f7034773
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Add a bounds check to prevent a large packet from causing a buffer
overflow. This is defensive programming -- I haven't actually tried
sending an oversized packet or a jumbo ethernet frame.
The SONIC handles packets that are too big for the buffer by raising
the RBAE interrupt and dropping them. Linux uses that interrupt to
count dropped packets.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit ada7431527)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Follow the algorithm given in the National Semiconductor DP83932C
datasheet in section 3.4.7:
At the next reception, the SONIC re-reads the last RXpkt.link field,
and updates its CRDA register to point to the next descriptor.
The chip is designed to allow the host to provide a new list of
descriptors in this way.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 5b0c98fcb7)
*drop context dep on 19f7034773
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This function re-uses its 'size' argument as a scratch variable.
Instead, declare a local 'size' variable for that purpose so that the
function result doesn't get messed up.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 9e3cd456d8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
According to the datasheet, section 3.4.4, "in 32-bit mode ... the SONIC
always writes long words".
Therefore, use the same technique for the 'in_use' field that is used
everywhere else, and write the full long word.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 46ffee9ad4)
Conflicts:
hw/net/dp8393x.c
*roll in local dependencies on b7cbebf2b9
*drop functional dep. on 19f7034773
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The DP83932 and DP83934 have 32 data lines. The datasheet says,
Data Bus: These bidirectional lines are used to transfer data on the
system bus. When the SONIC is a bus master, 16-bit data is transferred
on D15-D0 and 32-bit data is transferred on D31-D0. When the SONIC is
accessed as a slave, register data is driven onto lines D15-D0.
D31-D16 are held TRI-STATE if SONIC is in 16-bit mode. If SONIC is in
32-bit mode, they are driven, but invalid.
Always use 32-bit accesses both as bus master and bus slave.
Force the MSW to zero in bus master mode.
This gets the Linux 'jazzsonic' driver working, and avoids the need for
prior hacks to make the NetBSD 'sn' driver work.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 3fe9a838ec)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The Least Significant bit of a descriptor address register is used as
an EOL flag. It has to be masked when the register value is to be used
as an actual address for copying memory around. But when the registers
are to be updated the EOL bit should not be masked.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 88f632fbb1)
Conflicts:
hw/net/dp8393x.c
*drop context dep. on 19f7034773
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
qcow2_can_store_new_dirty_bitmap works wrong, as it considers only
bitmaps already stored in the qcow2 image and ignores persistent
BdrvDirtyBitmap objects.
So, let's instead count persistent BdrvDirtyBitmaps. We load all qcow2
bitmaps on open, so there should not be any bitmap in the image for
which we don't have BdrvDirtyBitmaps version. If it is - it's a kind of
corruption, and no reason to check for corruptions here (open() and
close() are better places for it).
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20191014115126.15360-2-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit a1db8733d2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The commit a718978ed5 from July 2015 introduced the assertion which
implies that the size of successful DMA transfers handled in ide_dma_cb()
should be multiple of 512 (the size of a sector). But guest systems can
initiate DMA transfers that don't fit this requirement.
For fixing that let's check the number of bytes prepared for the transfer
by the prepare_buf() handler. The code in ide_dma_cb() must behave
according to the Programming Interface for Bus Master IDE Controller
(Revision 1.0 5/16/94):
1. If PRDs specified a smaller size than the IDE transfer
size, then the Interrupt and Active bits in the Controller
status register are not set (Error Condition).
2. If the size of the physical memory regions was equal to
the IDE device transfer size, the Interrupt bit in the
Controller status register is set to 1, Active bit is set to 0.
3. If PRDs specified a larger size than the IDE transfer size,
the Interrupt and Active bits in the Controller status register
are both set to 1.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20191223175117.508990-2-alex.popov@linux.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit ed78352a59)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Fuzzing the Linux kernel with syzkaller allowed to find how to crash qemu
using a special SCSI_IOCTL_SEND_COMMAND. It hits the assertion in
ide_dma_cb() introduced in the commit a718978ed5 in July 2015.
Currently this bug is not reproduced by the unit tests.
Let's improve the ide-test to cover more PRDT cases including one
that causes this particular qemu crash.
The test is developed according to the Programming Interface for
Bus Master IDE Controller (Revision 1.0 5/16/94).
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Message-id: 20191223175117.508990-3-alex.popov@linux.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 59805ae92d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When the 'vga=' parameter is succeeded by another parameter, QEMU 4.2.0
would refuse to start with a rather cryptic message:
$ qemu-system-x86_64 -kernel /boot/vmlinuz-linux -append 'vga=792 quiet'
qemu: can't parse 'vga' parameter: Invalid argument
It was not clear whether this applied to the '-vga std' parameter or the
'-append' one. Fix the parsing regression and clarify the error.
Fixes: 133ef074bd ("hw/i386/pc: replace use of strtol with qemu_strtoui in x86_load_linux()")
Cc: Sergio Lopez <slp@redhat.com>
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Message-Id: <20191221162124.1159291-1-peter@lekensteyn.nl>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a88c40f02a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
After setting CP15 bits in arm_set_cpu_on() the cached hflags must
be rebuild to reflect the changed processor state. Without rebuilding,
the cached hflags would be inconsistent until the next call to
arm_rebuild_hflags(). When QEMU is compiled with debugging enabled
(--enable-debug), this problem is captured shortly after the first
call to arm_set_cpu_on() for CPUs running in ARM 32-bit non-secure mode:
qemu-system-arm: target/arm/helper.c:11359: cpu_get_tb_cpu_state:
Assertion `flags == rebuild_hflags_internal(env)' failed.
Aborted (core dumped)
Fixes: 0c7f8c43da
Cc: qemu-stable@nongnu.org
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c8fa6079eb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This change ensures that the FPU can be accessed in Non-Secure mode
when the CPU core is reset using the arm_set_cpu_on() function call.
The NSACR.{CP11,CP10} bits define the exception level required to
access the FPU in Non-Secure mode. Without these bits set, the CPU
will give an undefined exception trap on the first FPU access for the
secondary cores under Linux.
This is necessary because in this power-control codepath QEMU
is effectively emulating a bit of EL3 firmware, and has to set
the CPU up as the EL3 firmware would.
Fixes: fc1120a7f5
Cc: qemu-stable@nongnu.org
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
[PMM: added clarifying para to commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 0c7f8c43da)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit aa57020774, by mistake used MachineClass::numa_mem_supported
to check if NUMA is supported by machine and also as unrelated change
set it to true for sbsa-ref board.
Luckily change didn't break machines that support NUMA, as the field
is set to true for them.
But the field is not intended for checking if NUMA is supported and
will be flipped to false within this release for new machine types.
Fix it:
- by using previously used condition
!mc->cpu_index_to_instance_props || !mc->get_default_cpu_node_id
the first time and then use MachineState::numa_state down the road
to check if NUMA is supported
- dropping stray sbsa-ref chunk
Fixes: aa57020774
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1576154936-178362-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit fcd3f2cc12)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
bdrv_invalidate_cache_all() assumes that all nodes in a given subtree
are either active or inactive when it starts. Therefore, as soon as it
arrives at an already active node, it stops.
However, this assumption is wrong. For example, it's possible to take a
snapshot of an inactive node, which results in an active overlay over an
inactive backing file. The active overlay is probably also the root node
of an inactive BlockBackend (blk->disable_perm == true).
In this case, bdrv_invalidate_cache_all() does not need to do anything
to activate the overlay node, but it still needs to recurse into the
children and the parents to make sure that after returning success,
really everything is activated.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 7bb4941ace)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Mention that this is a PCI device address & give the format it is
expected in. Also mention that it must be first unbound from any
host kernel driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit ecaf647f30)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When using `query-cpu-definitions` using `-machine none`,
QEMU is resolving all CPU models to their latest versions. The
actual CPU model version being used by another machine type (e.g.
`pc-q35-4.0`) might be different.
In theory, this was OK because the correct CPU model
version is returned when using the correct `-machine` argument.
Except that in practice, this breaks libvirt expectations:
libvirt always use `-machine none` when checking if a CPU model
is runnable, because runnability is not expected to be affected
when the machine type is changed.
For example, when running on a Haswell host without TSX,
Haswell-v4 is runnable, but Haswell-v1 is not. On those hosts,
`query-cpu-definitions` says Haswell is runnable if using
`-machine none`, but Haswell is actually not runnable using any
of the `pc-*` machine types (because they resolve Haswell to
Haswell-v1). In other words, we're breaking the "runnability
guarantee" we promised to not break for a few releases (see
qemu-deprecated.texi).
To address this issue, change the default CPU model version to v1
on all machine types, so we make `query-cpu-definitions` output
when using `-machine none` match the results when using `pc-*`.
This will change in the future (the plan is to always return the
latest CPU model version if using `-machine none`), but only
after giving libvirt the opportunity to adapt.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1779078
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20191205223339.764534-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit ad18392892)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In currently implementation there will be a memory leak when
nbd_client_connect() returns error status. Here is an easy way to
reproduce:
1. run qemu-iotests as follow and check the result with asan:
./check -raw 143
Following is the asan output backtrack:
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f629688a560 in calloc (/usr/lib64/libasan.so.3+0xc7560)
#1 0x7f6295e7e015 in g_malloc0 (/usr/lib64/libglib-2.0.so.0+0x50015)
#2 0x56281dab4642 in qobject_input_start_struct /mnt/sdb/qemu-4.2.0-rc0/qapi/qobject-input-visitor.c:295
#3 0x56281dab1a04 in visit_start_struct /mnt/sdb/qemu-4.2.0-rc0/qapi/qapi-visit-core.c:49
#4 0x56281dad1827 in visit_type_SocketAddress qapi/qapi-visit-sockets.c:386
#5 0x56281da8062f in nbd_config /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1716
#6 0x56281da8062f in nbd_process_options /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1829
#7 0x56281da8062f in nbd_open /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1873
Direct leak of 15 byte(s) in 1 object(s) allocated from:
#0 0x7f629688a3a0 in malloc (/usr/lib64/libasan.so.3+0xc73a0)
#1 0x7f6295e7dfbd in g_malloc (/usr/lib64/libglib-2.0.so.0+0x4ffbd)
#2 0x7f6295e96ace in g_strdup (/usr/lib64/libglib-2.0.so.0+0x68ace)
#3 0x56281da804ac in nbd_process_options /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1834
#4 0x56281da804ac in nbd_open /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1873
Indirect leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x7f629688a3a0 in malloc (/usr/lib64/libasan.so.3+0xc73a0)
#1 0x7f6295e7dfbd in g_malloc (/usr/lib64/libglib-2.0.so.0+0x4ffbd)
#2 0x7f6295e96ace in g_strdup (/usr/lib64/libglib-2.0.so.0+0x68ace)
#3 0x56281dab41a3 in qobject_input_type_str_keyval /mnt/sdb/qemu-4.2.0-rc0/qapi/qobject-input-visitor.c:536
#4 0x56281dab2ee9 in visit_type_str /mnt/sdb/qemu-4.2.0-rc0/qapi/qapi-visit-core.c:297
#5 0x56281dad0fa1 in visit_type_UnixSocketAddress_members qapi/qapi-visit-sockets.c:141
#6 0x56281dad17b6 in visit_type_SocketAddress_members qapi/qapi-visit-sockets.c:366
#7 0x56281dad186a in visit_type_SocketAddress qapi/qapi-visit-sockets.c:393
#8 0x56281da8062f in nbd_config /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1716
#9 0x56281da8062f in nbd_process_options /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1829
#10 0x56281da8062f in nbd_open /mnt/sdb/qemu-4.2.0-rc0/block/nbd.c:1873
Fixes: 8f071c9db5
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <1575517528-44312-3-git-send-email-pannengyuan@huawei.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 8198cf5ef0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* There are cases where we never get to finalise a translation - for
* example a page fault during translation. As a result we shouldn't
* do any clean-up here and make sure things are reset in
* plugin_gen_tb_start.
*/
voidplugin_gen_tb_end(CPUState*cpu)
{
structqemu_plugin_tb*ptb=tcg_ctx->plugin_tb;
inti;
/* collect instrumentation requests */
qemu_plugin_tb_trans_cb(cpu,ptb);
/* inject the instrumentation at the appropriate places */
plugin_gen_inject(ptb);
/* clean up */
for(i=0;i<PLUGIN_N_CB_SUBTYPES;i++){
if(ptb->cbs[i]){
g_array_set_size(ptb->cbs[i],0);
}
}
ptb->n=0;
tcg_ctx->plugin_insn=NULL;
}
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.