Git-commit: 2220e8189f
References: bsc#1218484
When HASH_REPORT is negotiated, the guest_hdr_len might be larger than
the size of the mergeable rx buffer header. Using
virtio_net_hdr_mrg_rxbuf during the header swap might lead a stack
overflow in this case. Fixing this by using virtio_net_hdr_v1_hash
instead.
Reported-by: Xiao Lei <leixiao.nop@zju.edu.cn>
Cc: Yuri Benditovich <yuri.benditovich@daynix.com>
Cc: qemu-stable@nongnu.org
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Fixes: CVE-2023-6693
Fixes: e22f0603fb ("virtio-net: reference implementation of hash report")
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 9e4b27ca6b
References: bsc#1222845
Per "SD Host Controller Standard Specification Version 3.00":
* 2.2.5 Transfer Mode Register (Offset 00Ch)
Writes to this register shall be ignored when the Command
Inhibit (DAT) in the Present State register is 1.
Do not update the TRNMOD register when Command Inhibit (DAT)
bit is set to avoid the present-status register going out of
sync, leading to malicious guest using DMA mode and overflowing
the FIFO buffer:
$ cat << EOF | qemu-system-i386 \
-display none -nographic -nodefaults \
-machine accel=qtest -m 512M \
-device sdhci-pci,sd-spec-version=3 \
-device sd-card,drive=mydrive \
-drive if=none,index=0,file=null-co://,format=raw,id=mydrive \
-qtest stdio
outl 0xcf8 0x80001013
outl 0xcfc 0x91
outl 0xcf8 0x80001001
outl 0xcfc 0x06000000
write 0x9100002c 0x1 0x05
write 0x91000058 0x1 0x16
write 0x91000005 0x1 0x04
write 0x91000028 0x1 0x08
write 0x16 0x1 0x21
write 0x19 0x1 0x20
write 0x9100000c 0x1 0x01
write 0x9100000e 0x1 0x20
write 0x9100000f 0x1 0x00
write 0x9100000c 0x1 0x00
write 0x91000020 0x1 0x00
EOF
Stack trace (part):
=================================================================
==89993==ERROR: AddressSanitizer: heap-buffer-overflow on address
0x615000029900 at pc 0x55d5f885700d bp 0x7ffc1e1e9470 sp 0x7ffc1e1e9468
WRITE of size 1 at 0x615000029900 thread T0
#0 0x55d5f885700c in sdhci_write_dataport hw/sd/sdhci.c:564:39
#1 0x55d5f8849150 in sdhci_write hw/sd/sdhci.c:1223:13
#2 0x55d5fa01db63 in memory_region_write_accessor system/memory.c:497:5
#3 0x55d5fa01d245 in access_with_adjusted_size system/memory.c:573:18
#4 0x55d5fa01b1a9 in memory_region_dispatch_write system/memory.c:1521:16
#5 0x55d5fa09f5c9 in flatview_write_continue system/physmem.c:2711:23
#6 0x55d5fa08f78b in flatview_write system/physmem.c:2753:12
#7 0x55d5fa08f258 in address_space_write system/physmem.c:2860:18
...
0x615000029900 is located 0 bytes to the right of 512-byte region
[0x615000029700,0x615000029900) allocated by thread T0 here:
#0 0x55d5f7237b27 in __interceptor_calloc
#1 0x7f9e36dd4c50 in g_malloc0
#2 0x55d5f88672f7 in sdhci_pci_realize hw/sd/sdhci-pci.c:36:5
#3 0x55d5f844b582 in pci_qdev_realize hw/pci/pci.c:2092:9
#4 0x55d5fa2ee74b in device_set_realized hw/core/qdev.c:510:13
#5 0x55d5fa325bfb in property_set_bool qom/object.c:2358:5
#6 0x55d5fa31ea45 in object_property_set qom/object.c:1472:5
#7 0x55d5fa332509 in object_property_set_qobject om/qom-qobject.c:28:10
#8 0x55d5fa31f6ed in object_property_set_bool qom/object.c:1541:15
#9 0x55d5fa2e2948 in qdev_realize hw/core/qdev.c:292:12
#10 0x55d5f8eed3f1 in qdev_device_add_from_qdict system/qdev-monitor.c:719:10
#11 0x55d5f8eef7ff in qdev_device_add system/qdev-monitor.c:738:11
#12 0x55d5f8f211f0 in device_init_func system/vl.c:1200:11
#13 0x55d5fad0877d in qemu_opts_foreach util/qemu-option.c:1135:14
#14 0x55d5f8f0df9c in qemu_create_cli_devices system/vl.c:2638:5
#15 0x55d5f8f0db24 in qmp_x_exit_preconfig system/vl.c:2706:5
#16 0x55d5f8f14dc0 in qemu_init system/vl.c:3737:9
...
SUMMARY: AddressSanitizer: heap-buffer-overflow hw/sd/sdhci.c:564:39
in sdhci_write_dataport
Add assertions to ensure the fifo_buffer[] is not overflowed by
malicious accesses to the Buffer Data Port register.
Fixes: CVE-2024-3447
Cc: qemu-stable@nongnu.org
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58813
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <CAFEAcA9iLiv1XGTGKeopgMa8Y9+8kvptvsb8z2OBeuy+5=NUfg@mail.gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240409145524.27913-1-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 8b09b7fe47
References: bsc#1223955
When the maximum count of SCRIPTS instructions is reached, the code
stops execution and returns, but fails to decrement the reentrancy
counter. This effectively renders the SCSI controller unusable
because on next entry the reentrancy counter is still above the limit.
This bug was seen on HP-UX 10.20 which seems to trigger SCRIPTS
loops.
Fixes: b987718bbb ("hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)")
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-ID: <20240128202214.2644768-1-svens@stackframe.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: d139fe9ad8
References: bsc#1223955
While trying to use a SCSI disk on the LSI controller with an
older version of Fedora (25), I'm getting:
qemu: warning: Blocked re-entrant IO on MemoryRegion: lsi-mmio at addr: 0x34
and the SCSI controller is not usable. Seems like we have to
disable the reentrancy checker for the MMIO region, too, to
get this working again.
The problem could be reproduced it like this:
./qemu-system-x86_64 -accel kvm -m 2G -machine q35 \
-device lsi53c810,id=lsi1 -device scsi-hd,drive=d0 \
-drive if=none,id=d0,file=.../somedisk.qcow2 \
-cdrom Fedora-Everything-netinst-i386-25-1.3.iso
Where somedisk.qcow2 is an image that contains already some partitions
and file systems.
In the boot menu of Fedora, go to
"Troubleshooting" -> "Rescue a Fedora system" -> "3) Skip to shell"
Then check "dmesg | grep -i 53c" for failure messages, and try to mount
a partition from somedisk.qcow2.
Message-Id: <20230516090556.553813-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
References: bsc#1190011 (CVE-2021-3750)
40p and g3beige will hang forever with these messages:
qemu-system-ppc: warning: Blocked re-entrant IO on MemoryRegion: pci-conf-idx at addr: 0x0
qemu-system-ppc64: warning: Blocked re-entrant IO on MemoryRegion: lpc-hc at addr: 0x34
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Git-commit: 3ab6fdc91b
References: bsc#1190011 (CVE-2021-3750)
Add the 'memory' bit to the memory attributes to restrict bus
controller accesses to memories.
Introduce flatview_access_allowed() to check bus permission
before running any bus transaction.
Have read/write accessors return MEMTX_ACCESS_ERROR if an access is
restricted.
There is no change for the default case where 'memory' is not set.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211215182421.418374-4-philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[thuth: Replaced MEMTX_BUS_ERROR with MEMTX_ACCESS_ERROR, remove "inline"]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 9d38a84347
References: bsc#1213925 (CVE-2023-3180)
For symmetric algorithms, the length of ciphertext must be as same
as the plaintext.
The missing verification of the src_len and the dst_len in
virtio_crypto_sym_op_helper() may lead buffer overflow/divulged.
This patch is originally written by Yiming Tao for QEMU-SECURITY,
resend it(a few changes of error message) in qemu-devel.
Fixes: CVE-2023-3180
Fixes: 04b9b37edda("virtio-crypto: add data queue processing handler")
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Cc: Yiming Tao <taoym@zju.edu.cn>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230803024314.29962-2-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 10be627d2b
References: bsc#1212850 (CVE-2023-3354)
The TLS handshake make take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.
CVE-2023-3354
Reported-by: jiangyegen <jiangyegen@huawei.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
DF: Open-code g_clear_handle_id(), as it's not available in glib
until version 2.56.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: b987718bbb
References: bsc#1207205, CVE-2023-0330
We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function has already a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here that it also takes into account if the function
has been called too often in a reentrant way.
The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Removed the test changes. Not only the specific test, but the
entire infra is not there yet!
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: b5989326f5
References: bsc#1212968, CVE-2023-2861
The 9p protocol does not specifically define how server shall behave when
client tries to open a special file, however from security POV it does
make sense for 9p server to prohibit opening any special file on host side
in general. A sane Linux 9p client for instance would never attempt to
open a special file on host side, it would always handle those exclusively
on its guest side. A malicious client however could potentially escape
from the exported 9p tree by creating and opening a device file on host
side.
With QEMU this could only be exploited in the following unsafe setups:
- Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
security model.
or
- Using 9p 'proxy' fs driver (which is running its helper daemon as
root).
These setups were already discouraged for safety reasons before,
however for obvious reasons we are now tightening behaviour on this.
Fixes: CVE-2023-2861
Reported-by: Yanwu Shen <ywsPlz@gmail.com>
Reported-by: Jietao Xiao <shawtao1125@gmail.com>
Reported-by: Jinku Li <jkli@xidian.edu.cn>
Reported-by: Wenbo Shen <shenwenbo@zju.edu.cn>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <E1q6w7r-0000Q0-NM@lizzy.crudebyte.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 790762e548
References: bsc#1172033, CVE-2020-13253
Only move the state machine to ReceivingData if there is no
pending error. This avoids later OOB access while processing
commands queued.
"SD Specifications Part 1 Physical Layer Simplified Spec. v3.01"
4.3.3 Data Read
Read command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
occurred and no data transfer is performed.
4.3.4 Data Write
Write command is rejected if BLOCK_LEN_ERROR or ADDRESS_ERROR
occurred and no data transfer is performed.
WP_VIOLATION errors are not modified: the error bit is set, we
stay in receive-data state, wait for a stop command. All further
data transfer is ignored. See the check on sd->card_status at the
beginning of sd_read_data() and sd_write_data().
Fixes: CVE-2020-13253
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Buglink: https://bugs.launchpad.net/qemu/+bug/1880822
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20200630133912.9428-6-f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: effaf5a240
References: bsc#1180207, CVE-2020-14394
The loop condition in xhci_ring_chain_length() is under control of
the guest, and additionally the code does not check for failed DMA
transfers (e.g. if reaching the end of the RAM), so the loop there
could run for a very long time or even forever. Fix it by checking
the return value of dma_memory_read() and by introducing a maximum
loop length.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/646
Message-Id: <20220804131300.96368-1-thuth@redhat.com>
Reviewed-by: Mauro Matteo Cascella <mcascell@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 3059344f01
References: bsc#1172382, CVE-2020-13754
The "BCM2835 ARM Peripherals" datasheet [*] chapter 2
("Auxiliaries: UART1 & SPI1, SPI2"), list the register
sizes as 3/8/16/32 bits. We assume this means this
peripheral allows 8-bit accesses.
This was not an issue until commit 5d971f9e67 which reverted
("memory: accept mismatching sizes in memory_region_access_valid").
The model is implemented as 32-bit accesses (see commit 97398d900c,
all registers are 32-bit) so replace MemoryRegionOps.valid as
MemoryRegionOps.impl, and re-introduce MemoryRegionOps.valid
with a 8/32-bit range.
[*] https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
Fixes: 97398d900c ("bcm2835_aux: add emulation of BCM2835 AUX (aka UART1) block")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201002181032.1899463-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 8e67fda2dd
References: bsc#1172382, CVE-2020-13754
QEMU XHCI advertises AC64 (64-bit addressing) but doesn't allow
64-bit mode access in "runtime" and "operational" MemoryRegionOps.
Set the max_access_size based on sizeof(dma_addr_t) as AC64 is set.
XHCI specs:
"If the xHC supports 64-bit addressing (AC64 = ‘1’), then software
should write 64-bit registers using only Qword accesses. If a
system is incapable of issuing Qword accesses, then writes to the
64-bit address fields shall be performed using 2 Dword accesses;
low Dword-first, high-Dword second. If the xHC supports 32-bit
addressing (AC64 = ‘0’), then the high Dword of registers containing
64-bit address fields are unused and software should write addresses
using only Dword accesses"
The problem has been detected with SLOF, as linux kernel always accesses
registers using 32-bit access even if AC64 is set and revealed by
5d971f9e67 ("memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"")
Suggested-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 20200721083322.90651-1-lvivier@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 5d971f9e67
References: bsc#1172382, CVE-2020-13754
Memory API documentation documents valid .min_access_size and .max_access_size
fields and explains that any access outside these boundaries is blocked.
This is what devices seem to assume.
However this is not what the implementation does: it simply
ignores the boundaries unless there's an "accepts" callback.
Naturally, this breaks a bunch of devices.
Revert to the documented behaviour.
Devices that want to allow any access can just drop the valid field,
or add the impl field to have accesses converted to appropriate
length.
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Fixes: CVE-2020-13754
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1842363
Fixes: a014ed07bd ("memory: accept mismatching sizes in memory_region_access_valid")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200610134731.1514409-1-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 4b7c06837a
References: bsc#1172382, CVE-2020-13754
The memory region ops have min_access_size == 4 so obey it.
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 89ed83d8b2
References: bsc#1172382, CVE-2020-13754
The memory region ops have min_access_size == 4 so obey it.
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 4367a20cc4
References: bsc#1198038, CVE-2022-0216
Set current_req to NULL, not current_req->req, to prevent reusing a free'd
buffer in case of repeated SCSI cancel requests. Also apply the fix to
CLEAR QUEUE and BUS DEVICE RESET messages as well, since they also cancel
the request.
Thanks to Alexander Bulekov for providing a reproducer.
Fixes: CVE-2022-0216
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/972
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20220711123316.421279-1-mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
[DF: Killed the hunk which patches the test as it's not there]
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 736b01642d
References: bsc#1193880, CVE-2021-3929
This fixes CVE-2021-3929 "locally" by denying DMA to the iomem of the
device itself. This still allows DMA to MMIO regions of other devices
(e.g. doing P2P DMA to the controller memory buffer of another NVMe
device).
Fixes: CVE-2021-3929
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>4
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 31c4b6fb02
References: bsc#1197653, CVE-2022-1050
Guest driver might execute HW commands when shared buffers are not yet
allocated.
This could happen on purpose (malicious guest) or because of some other
guest/host address mapping error.
We need to protect againts such case.
Fixes: CVE-2022-1050
Reported-by: Raven <wxhusst@gmail.com>
Signed-off-by: Yuval Shaia <yuval.shaia.ml@gmail.com>
Message-Id: <20220403095234.2210-1-yuval.shaia.ml@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 6dbbf05514
References: bsc#1205808, CVE-2022-4144
Have qxl_get_check_slot_offset() return false if the requested
buffer size does not fit within the slot memory region.
Similarly qxl_phys2virt() now returns NULL in such case, and
qxl_dirty_one_surface() aborts.
This avoids buffer overrun in the host pointer returned by
memory_region_get_ram_ptr().
Fixes: CVE-2022-4144 (out-of-bounds read)
Reported-by: Wenxu Yin (@awxylitol)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1336
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-5-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 8efec0ef8b
References: bsc#1205808, CVE-2022-4144
Currently qxl_phys2virt() doesn't check for buffer overrun.
In order to do so in the next commit, pass the buffer size
as argument.
For QXLCursor in qxl_render_cursor() -> qxl_cursor() we
verify the size of the chunked data ahead, checking we can
access 'sizeof(QXLCursor) + chunk->data_size' bytes.
Since in the SPICE_CURSOR_TYPE_MONO case the cursor is
assumed to fit in one chunk, no change are required.
In SPICE_CURSOR_TYPE_ALPHA the ahead read is handled in
qxl_unpack_chunks().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-4-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 61c34fc194
References: bsc#1205808, CVE-2022-4144
Only 3 command types are logged: no need to call qxl_phys2virt()
for the other types. Using different cases will help to pass
different structure sizes to qxl_phys2virt() in a pair of commits.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221128202741.4945-2-philmd@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: 418ade7849
References: bsc#1201367, CVE-2022-35414
The bug is an uninitialized memory read, along the translate_fail
path, which results in garbage being read from iotlb_to_section,
which can lead to a crash in io_readx/io_writex.
The bug may be fixed by writing any value with zero
in ~TARGET_PAGE_MASK, so that the call to iotlb_to_section using
the xlat'ed address returns io_mem_unassigned, as desired by the
translate_fail path.
It is most useful to record the original physical page address,
which will eventually be logged by memory_region_access_valid
when the access is rejected by unassigned_mem_accepts.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1065
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220621153829.366423-1-richard.henderson@linaro.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: defac5e2fb
References: bsc#1185000, CVE-2021-3507
Per the 82078 datasheet, if the end-of-track (EOT byte in
the FIFO) is more than the number of sectors per side, the
command is terminated unsuccessfully:
* 5.2.5 DATA TRANSFER TERMINATION
The 82078 supports terminal count explicitly through
the TC pin and implicitly through the underrun/over-
run and end-of-track (EOT) functions. For full sector
transfers, the EOT parameter can define the last
sector to be transferred in a single or multisector
transfer. If the last sector to be transferred is a par-
tial sector, the host can stop transferring the data in
mid-sector, and the 82078 will continue to complete
the sector as if a hardware TC was received. The
only difference between these implicit functions and
TC is that they return "abnormal termination" result
status. Such status indications can be ignored if they
were expected.
* 6.1.3 READ TRACK
This command terminates when the EOT specified
number of sectors have been read. If the 82078
does not find an I D Address Mark on the diskette
after the second· occurrence of a pulse on the
INDX# pin, then it sets the IC code in Status Regis-
ter 0 to "01" (Abnormal termination), sets the MA bit
in Status Register 1 to "1", and terminates the com-
mand.
* 6.1.6 VERIFY
Refer to Table 6-6 and Table 6-7 for information
concerning the values of MT and EC versus SC and
EOT value.
* Table 6·6. Result Phase Table
* Table 6-7. Verify Command Result Phase Table
Fix by aborting the transfer when EOT > # Sectors Per Side.
Cc: qemu-stable@nongnu.org
Cc: Hervé Poussineau <hpoussin@reactos.org>
Fixes: baca51faff ("floppy driver: disk geometry auto detect")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/339
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211118115733.4038610-2-philmd@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit defac5e2fb)
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: b263d8f928
References: bsc#1175144
At the end of sdhci_send_command(), it starts a data transfer if the
command register indicates data is associated. But the data transfer
should only be initiated when the command execution has succeeded.
With this fix, the following reproducer:
outl 0xcf8 0x80001810
outl 0xcfc 0xe1068000
outl 0xcf8 0x80001804
outw 0xcfc 0x7
write 0xe106802c 0x1 0x0f
write 0xe1068004 0xc 0x2801d10101fffffbff28a384
write 0xe106800c 0x1f 0x9dacbbcad9e8f7061524334251606f7e8d9cabbac9d8e7f60514233241505f
write 0xe1068003 0x28 0x80d000251480d000252280d000253080d000253e80d000254c80d000255a80d000256880d0002576
write 0xe1068003 0x1 0xfe
cannot be reproduced with the following QEMU command line:
$ qemu-system-x86_64 -nographic -M pc-q35-5.0 \
-device sdhci-pci,sd-spec-version=3 \
-drive if=sd,index=0,file=null-co://,format=raw,id=mydrive \
-device sd-card,drive=mydrive \
-monitor none -serial none -qtest stdio
Cc: qemu-stable@nongnu.org
Fixes: CVE-2020-17380
Fixes: CVE-2020-25085
Fixes: CVE-2021-3409
Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cornelius Aschermann (Ruhr-Universität Bochum)
Reported-by: Sergej Schumilo (Ruhr-Universität Bochum)
Reported-by: Simon Wörner (Ruhr-Universität Bochum)
Buglink: https://bugs.launchpad.net/qemu/+bug/1892960
Buglink: https://bugs.launchpad.net/qemu/+bug/1909418
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1928146
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210303122639.20004-2-bmeng.cn@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Git-commit: ffd205ef29
References: bsc#1192463
This partially reverts commit bbd2d5a812.
This commit was misguided and broke using --disable-pie on any distro
that enables PIE by default in their compiler driver, including Debian
and its derivatives. Whilst -no-pie is not a linker flag, it is a
compiler driver flag that ensures -pie is not automatically passed by it
to the linker. Without it, all compile_prog checks will fail as any code
built with the explicit -fno-pie will fail to link with the implicit
default -pie due to trying to use position-dependent relocations. The
only bug that needed fixing was LDFLAGS_NOPIE being used as a flag for
the linker itself in pc-bios/optionrom/Makefile.
Note this does not reinstate exporting LDFLAGS_NOPIE, as it is unused,
since the only previous use was the one that should not have existed. I
have also updated the comment for the -fno-pie and -no-pie checks to
reflect what they're actually needed for.
Fixes: bbd2d5a812
Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Message-Id: <20210805192545.38279-1-jrtc27@jrtc27.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
Git-commit: bbd2d5a812
References: bsc#1192463
Recent binutils changes dropping unsupported options [1] caused a build
issue in regard to the optionroms.
ld -m elf_i386 -T /<<PKGBUILDDIR>>/pc-bios/optionrom//flat.lds -no-pie \
-s -o multiboot.img multiboot.o
ld.bfd: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)
This isn't really a regression in ld.bfd, filing the bug upstream
revealed that this never worked as a ld flag [2] - in fact it seems we
were by accident setting --nmagic).
Since it never had the wanted effect this usage of LDFLAGS_NOPIE, should be
droppable without any effect. This also is the only use-case of LDFLAGS_NOPIE
in .mak, therefore we can also remove it from being added there.
[1]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=983d925d
[2]: https://sourceware.org/bugzilla/show_bug.cgi?id=27050#c5
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Message-Id: <20201214150938.1297512-1-christian.ehrhardt@canonical.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
Git-commit: bedd7e93d0
References: bsc#1189938 CVE-2021-3748
When mergeable buffer is enabled, we try to set the num_buffers after
the virtqueue elem has been unmapped. This will lead several issues,
E.g a use after free when the descriptor has an address which belongs
to the non direct access region. In this case we use bounce buffer
that is allocated during address_space_map() and freed during
address_space_unmap().
Fixing this by storing the elems temporarily in an array and delay the
unmap after we set the the num_buffers.
This addresses CVE-2021-3748.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: fbe78f4f55 ("virtio-net support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 05a40b172e
References: bsc#1186012, CVE-2021-3527
usb-host and usb-redirect try to batch bulk transfers by combining many
small usb packets into a single, large transfer request, to reduce the
overhead and improve performance.
This patch adds a size limit of 1 MiB for those combined packets to
restrict the host resources the guest can bind that way.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210503132915.2335822-6-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 000000000000000000000000000000000000000000000
References: bsc#1182651, CVE-2021-20255
Patch based on discussion:
https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg06098.html
While processing controller commands, eepro100 emulator gets
command unit(CU) base address OR receive unit (RU) base address
OR command block (CB) address from guest. If these values are not
checked, it may lead to an infinite loop kind of issues. Add checks
to avoid it.
Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-By: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 00000000000000000000000000000000000000000000
References: bsc#1180432, CVE-2020-35503
Ensure that 'cmd->frame' is not NULL before accessing the 'header' field.
This check prevents a potential NULL pointer dereference issue.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1910346
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Acked-By: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: d7fb54218424c3b2517aee5b391ced0f75386a5d
References: bsc#1187364, CVE-2021-3592
RFC2131 suggests that the options field may be at least 312 bytes.
Some DHCP clients seem to assume that it has to be at least 312 bytes.
Fixes#51
Fixes: f13cad45b25d92760bb0ad67bec0300a4d7d5275 ("bootp: limit
vendor-specific area to input packet memory buffer")
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: f13cad45b25d92760bb0ad67bec0300a4d7d5275
References: bsc#1187364, CVE-2021-3592
sizeof(bootp_t) currently holds DHCP_OPT_LEN. Remove this optional field
from the structure, to help with the following patch checking for
minimal header size. Modify the bootp_reply() function to take the
buffer boundaries and avoiding potential buffer overflow.
Related to CVE-2021-3592.
https://gitlab.freedesktop.org/slirp/libslirp/-/issues/44
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 93e645e72a056ec0b2c16e0299fc5c6b94e4ca17
References: bsc#1187364, CVE-2021-3592
bsc#1187367, CVE-2021-3594
Recent security issues demonstrate the lack of safety care when casting
a mbuf to a particular structure type. At least, it should check that
the buffer is large enough. The following patches will make use of this
function.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
Git-commit: 1bf8b88f14
References: bsc#1187529 CVE-2021-3611
Object property insertion code iterates over an integer to get an unused
index that can be used as an unique name for an object property. This loop
increments the integer value indefinitely. Although very unlikely, this can
still cause an integer overflow.
In this change, we fix the above code by checking against INT16_MAX and making
sure that the interger index does not overflow beyond that value. If no
available index is found, the code would cause an assertion failure. This
assertion failure is necessary because the callers of the function do not check
the return value for NULL.
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200921093325.25617-1-ani@anisinha.ca>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Git-commit: c7ede54cbd2e2b25385325600958ba0124e31cc0
References: bsc#1172380 CVE-2020-10756
Drop IPv6 message shorter than what's mentioned in the payload
length header (+ the size of the IPv6 header). They're invalid an could
lead to data leakage in icmp6_send_echoreply().
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
Include-If: %if 0%{?suse_version} == 1315
References: bsc#1179725
revert 0157-s390x-fix-build-for-without-default.patch
This broke QEMU builds where the --without-default-devices option is
specified during configuration.
Backport patch 0157-s390x-fix-build-for-without-default.patch
subsequently changes this dependency from CONFIG_LINUX to CONFIG_VFIO,
which made sense upstream however it looks to me like CONFIG_VFIO does not
exist at this level of QEMU, meaning it's always OFF and
this is why we never build in s390-pci-vfio.o
Fixes: 77280d33bc ("s390x: fix build for --without-default-devices
(v5.2.0-rc1)")
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Git-commit: 0000000000000000000000000000000000000000
References: bsc#1181639
While activating device in vmxnet3_acticate_device(), it does not
validate guest supplied configuration values against predefined
minimum - maximum limits. This may lead to integer overflow or
OOB access issues. Add checks to avoid it.
Fixes: CVE-2021-20203
Buglink: https://bugs.launchpad.net/qemu/+bug/1913873
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: ff0507c239
References: bsc#1180523, CVE-2020-11947
There is an overflow, the source 'datain.data[2]' is 100 bytes,
but the 'ss' is 252 bytes.This may cause a security issue because
we can access a lot of unrelated memory data.
The len for sbp copy data should take the minimum of mx_sb_len and
sb_len_wr, not the maximum.
If we use iscsi device for VM backend storage, ASAN show stack:
READ of size 252 at 0xfffd149dcfc4 thread T0
#0 0xaaad433d0d34 in __asan_memcpy (aarch64-softmmu/qemu-system-aarch64+0x2cb0d34)
#1 0xaaad45f9d6d0 in iscsi_aio_ioctl_cb /qemu/block/iscsi.c:996:9
#2 0xfffd1af0e2dc (/usr/lib64/iscsi/libiscsi.so.8+0xe2dc)
#3 0xfffd1af0d174 (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
#4 0xfffd1af19fac (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
#5 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
#6 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
#7 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
#8 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
#9 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
#10 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
#11 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
#12 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
#13 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
#14 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
#15 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
#16 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
#17 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)
0xfffd149dcfc4 is located 0 bytes to the right of 100-byte region [0xfffd149dcf60,0xfffd149dcfc4)
allocated by thread T0 here:
#0 0xaaad433d1e70 in __interceptor_malloc (aarch64-softmmu/qemu-system-aarch64+0x2cb1e70)
#1 0xfffd1af0e254 (/usr/lib64/iscsi/libiscsi.so.8+0xe254)
#2 0xfffd1af0d174 (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
#3 0xfffd1af19fac (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
#4 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
#5 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
#6 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
#7 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
#8 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
#9 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
#10 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
#11 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
#12 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
#13 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
#14 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
#15 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
#16 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200418062602.10776-1-kuhn.chenqun@huawei.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 5f97ba0c74
References: bsc#1183979
This error takes effect when the magic value "zIPL" is located at the
end of a block. For example if s2_cur_blk = 0x7fe18000 and the magic
value "zIPL" is located at 0x7fe18ffc - 0x7fe18fff.
Fixes: ba831b2526 ("s390-ccw: read stage2 boot loader data to find menu")
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Message-Id: <20200924085926.21709-2-mhartmay@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Use "<= ... - 4" instead of "< ... - 3"]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 705df5466c
References: bsc#1182968, CVE-2021-3416
Some NIC supports loopback mode and this is done by calling
nc->info->receive() directly which in fact suppresses the effort of
reentrancy check that is done in qemu_net_queue_send().
Unfortunately we can't use qemu_net_queue_send() here since for
loopback there's no sender as peer, so this patch introduce a
qemu_receive_packet() which is used for implementing loopback mode
for a NIC with this check.
NIC that supports loopback mode will be converted to this helper.
This is intended to address CVE-2021-3416.
Cc: Prasad J Pandit <ppandit@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 3de46e6fc4
References: bsc#1182577, CVE-2021-20257
During procss_tx_desc(), driver can try to chain data descriptor with
legacy descriptor, when will lead underflow for the following
calculation in process_tx_desc() for bytes:
if (tp->size + bytes > msh)
bytes = msh - tp->size;
This will lead a infinite loop. So check and fail early if tp->size if
greater or equal to msh.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 37e571f1e0
References: bsc#1173612, CVE-2020-15469
The Peripheral Protection Controller's handling of unused ports
is that if there is nothing connected to the port's downstream
then it does not create the sysbus MMIO region for the upstream
end of the port. This results in odd behaviour when there is
an unused port in the middle of the range: since sysbus MMIO
regions are implicitly consecutively allocated, any used ports
above the unused ones end up with sysbus MMIO region numbers
that don't match the port number.
Avoid this numbering mismatch by creating dummy MMIO regions
for the unused ports. This doesn't change anything for our
existing boards, which don't have any gaps in the middle of
the port ranges they use; but it will be needed for the Musca
board.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Git-commit: 4bfb024bc7
References: bsc#1179686, CVE-2020-27821
In using the address_space_translate_internal API, address_space_cache_init
forgot one piece of advice that can be found in the code for
address_space_translate_internal:
/* MMIO registers can be expected to perform full-width accesses based only
* on their address, without considering adjacent registers that could
* decode to completely different MemoryRegions. When such registers
* exist (e.g. I/O ports 0xcf8 and 0xcf9 on most PC chipsets), MMIO
* regions overlap wildly. For this reason we cannot clamp the accesses
* here.
*
* If the length is small (as is the case for address_space_ldl/stl),
* everything works fine. If the incoming length is large, however,
* the caller really has to do the clamping through memory_access_size.
*/
address_space_cache_init is exactly one such case where "the incoming length
is large", therefore we need to clamp the resulting length---not to
memory_access_size though, since we are not doing an access yet, but to
the size of the resulting section. This ensures that subsequent accesses
to the cached MemoryRegionSection will be in range.
With this patch, the enclosed testcase notices that the used ring does
not fit into the MSI-X table and prints a "qemu-system-x86_64: Cannot map used"
error.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 8132122889
References: bsc#1181108, CVE-2020-29443
A case was reported where s->io_buffer_index can be out of range.
The report skimped on the details but it seems to be triggered
by s->lba == -1 on the READ/READ CD paths (e.g. by sending an
ATAPI command with LBA = 0xFFFFFFFF). For now paper over it
with assertions. The first one ensures that there is no overflow
when incrementing s->io_buffer_index, the second checks for the
buffer overrun.
Note that the buffer overrun is only a read, so I am not sure
if the assertion failure is actually less harmful than the overrun.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20201201120926.56559-1-pbonzini@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: c2cb511634
References: bsc#1179468, CVE-2020-28916
While receiving packets via e1000e_write_packet_to_guest() routine,
'desc_offset' is advanced only when RX descriptor is processed. And
RX descriptor is not processed if it has NULL buffer address.
This may lead to an infinite loop condition. Increament 'desc_offset'
to process next descriptor in the ring to avoid infinite loop.
Reported-by: Cheol-woo Myung <330cjfdn@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 7564bf7701
References: bsc#1178174, CVE-2020-27617
eth_get_gso_type() routine returns segmentation offload type based on
L3 protocol type. It calls g_assert_not_reached if L3 protocol is
unknown, making the following return statement unreachable. Remove the
g_assert call, it maybe triggered by a guest user.
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 1be90ebecc
References: bsc#1176684, CVE-2020-25625
While servicing OHCI transfer descriptors(TD), ohci_service_iso_td
retires a TD if it has passed its time frame. It does not check if
the TD was already processed once and holds an error code in TD_CC.
It may happen if the TD list has a loop. Add check to avoid an
infinite loop condition.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200915182259.68522-3-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: f50ab86a26
References: bsc#1172383, CVE-2020-13362
A guest user may set 'reply_queue_head' field of MegasasState to
a negative value. Later in 'megasas_lookup_frame' it is used to
index into s->frames[] array. Use unsigned type to avoid OOB
access issue.
Also check that 'index' value stays within s->frames[] bounds
through the while() loop in 'megasas_lookup_frame' to avoid OOB
access.
Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200513192540.1583887-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: edfe2eb436
References: bsc#1181933
Per the ARM Generic Interrupt Controller Architecture specification
(document "ARM IHI 0048B.b (ID072613)"), the SGIINTID field is 4 bit,
not 10:
- 4.3 Distributor register descriptions
- 4.3.15 Software Generated Interrupt Register, GICD_SG
- Table 4-21 GICD_SGIR bit assignments
The Interrupt ID of the SGI to forward to the specified CPU
interfaces. The value of this field is the Interrupt ID, in
the range 0-15, for example a value of 0b0011 specifies
Interrupt ID 3.
Correct the irq mask to fix an undefined behavior (which eventually
lead to a heap-buffer-overflow, see [Buglink]):
$ echo 'writel 0x8000f00 0xff4affb0' | qemu-system-aarch64 -M virt,accel=qtest -qtest stdio
[I 1612088147.116987] OPENED
[R +0.278293] writel 0x8000f00 0xff4affb0
../hw/intc/arm_gic.c:1498:13: runtime error: index 944 out of bounds for type 'uint8_t [16][8]'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../hw/intc/arm_gic.c:1498:13
This fixes a security issue when running with KVM on Arm with
kernel-irqchip=off. (The default is kernel-irqchip=on, which is
unaffected, and which is also the correct choice for performance.)
Cc: qemu-stable@nongnu.org
Fixes: CVE-2021-20221
Fixes: 9ee6e8bb85 ("ARMv7 support.")
Buglink: https://bugs.launchpad.net/qemu/+bug/1913916
Buglink: https://bugs.launchpad.net/qemu/+bug/1913917
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210131103401.217160-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
When SG_IO returns with a non-zero 'host_status' or 'status' we
should be folding these values into the request status to allow
any drivers to signal them back to the guest.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
SG_IO may return additional status in the 'status', 'driver_status',
and 'host_status' fields. When either of these fields are set the
command has not been executed normally, so we should not continue
processing this command but rather return an error.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
when running with an SG_IO backend we might be getting a SCSI host
status back, which should be translated into a virtio scsi status
to avoid having a silent data corruption if the status isn't
translated properly.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
To align with standard linux settings we should be setting the
default I/O timeout to 30 seconds, and add a lower bound of
5 seconds to avoid spurious I/O failures.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
Add tracepoints for SG_IO commands to get a grip on the timeout
settings.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1178049
Add an 'io_timeout' parameter for SCSIDevice to allow
SG_IO ioctls to pass in a timeout, avoiding infinite
guest stalls if the host needs to abort a command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: eb8cb3d9dc
References: bsc#1178049
Add trace events for SCSI and TMF command tracing.
Signed-off-by: Hannes Reinecke <hare@suse.de>
BR: Includes minor tweaks that came from the PTF patch as opposed to the
one upstreamed.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: db08244a3a
References: bsc#1179726
Currently, a subsystem reset event leaves PCI devices enabled, causing
issues post-reset in the guest (an example would be after a kexec). These
devices need to be reset during a subsystem reset, allowing them to be
properly re-enabled afterwards. Add the S390 PCI host bridge to the list
of qdevs to be reset during subsystem reset.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: qemu-stable@nongnu.org
Message-Id: <1602767767-32713-1-git-send-email-mjrosato@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: 37fa32de70
References: bsc#1179725
When an s390 guest is using lazy unmapping, it can result in a very
large number of oustanding DMA requests, far beyond the default
limit configured for vfio. Let's track DMA usage similar to vfio
in the host, and trigger the guest to flush their DMA mappings
before vfio runs out.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: non-Linux build fixes]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: cd7498d07f
References: bsc#1179725
Create new files for separating out vfio-specific work for s390
pci. Add the first such routine, which issues VFIO_IOMMU_GET_INFO
ioctl to collect the current dma available count.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: Fix non-Linux build with CONFIG_LINUX]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: 7486a62845
References: bsc#1179725
The underlying host may be limiting the number of outstanding DMA
requests for type 1 IOMMU. Add helper functions to check for the
DMA available capability and retrieve the current number of DMA
mappings allowed.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[aw: vfio_get_info_dma_avail moved inside CONFIG_LINUX]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: 3ab7a0b40d
References: bsc#1179725
Rather than duplicating the same loop in multiple locations,
create a static function to do the work.
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: e0998fe891
References: bsc#1179725
PCI on s390x is really weird and how it was modeled in QEMU might not have
been the right choice. Anyhow, right now it is the case that:
- Hotplugging a PCI device will silently create a zPCI device
(if none is provided)
- Hotunplugging a zPCI device will unplug the PCI device (if any)
- Hotunplugging a PCI device will unplug also the zPCI device
As far as I can see, we can no longer change this behavior. But we
should fix it.
Both device types are handled via a single hotplug handler call. This
is problematic for various reasons:
1. Unplugging via the zPCI device allows to unplug devices that are not
hot removable. (check performed in qdev_unplug()) - bad.
2. Hotplug handler chains are not possible for the unplug case. In the
future, the machine might want to override hotplug handlers, to
process device specific stuff and to then branch off to the actual
hotplug handler. We need separate hotplug handler calls for both the
PCI and zPCI device to make this work reliably. All other PCI
implementations are already prepared to handle this correctly, only
s390x is missing.
Therefore, introduce the unplug_request handler and properly perform
unplug checks by redirecting to the separate unplug_request handlers.
When finally unplugging, perform two separate hotplug_handler_unplug()
calls, first for the PCI device, followed by the zPCI device. This now
nicely splits unplugging paths for both devices.
The redirect part is a little hairy, as the user is allowed to trigger
unplug either via the PCI or the zPCI device. So redirect always to the
PCI unplug request handler first and remember if that check has been
performed in the zPCI device. Redirect then to the zPCI device unplug
request handler to perform the magic. Remembering that we already
checked the PCI device breaks the redirect loop.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190130155733.32742-5-david@redhat.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: 3549f8c9e4
References: bsc#1179725
... otherwise two successive calls to qdev_unplug() (e.g. by an impatient
user) will effectively overwrite pbdev->release_timer, resulting in a
memory leak. We are already processing the unplug.
If there is already a release_timer, the unplug will be performed after
the timeout.
Can be easily triggered by
(hmp) device_add virtio-mouse-pci,id=test
(hmp) stop
(hmp) device_del test
(hmp) device_del test
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190114103110.10909-5-david@redhat.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: 6069bcdeac
References: bsc#1179725
Let's move most of the checks to the new pre_plug handler. As a PCI
bridge is just a PCI device, we can simplify the code.
Notes: We cannot yet move the MSIX check or device ID creation +
zPCI device creation to the pre_plug handler as both parts are not
fixed before actual device realization (and therefore after pre_plug and
before plug). Once that part is factored out, we can move these parts to
the pre_plug handler, too and therefore remove all possible errors from
the plug handler.
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190114103110.10909-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: 6e92c70c37
References: bsc#1179725
Common function measurement block is used to report zPCI internal
counters of successful pcilg/stg/stb and rpcit instructions to
a memory location provided by the program.
This patch introduces a new ZpciFmb structure and schedules a timer
callback to copy the zPCI measures to the FMB in the guest memory
at an interval time set to 4s.
An error while attemping to update the FMB, would generate an error
event to the guest.
The pcilg/stg/stb and rpcit interception handlers increase the
related counter on a successful call.
The guest shall pass a null FMBA (FMB address) in the FIB (Function
Information Block) when it issues a Modify PCI Function Control
instruction to switch off FMB and stop the corresponding timer.
Signed-off-by: Yi Min Zhao <zyimin@linux.ibm.com>
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <1546969050-8884-2-git-send-email-pmorel@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: d648a3e62d
References: bsc#1179725
We should always get rid of it. I don't see a reason to keep the timer
alive if the devices are going away. This looks like a memory leak.
(hmp) device_add virtio-mouse-pci,id=test
(hmp) device_del test
-> guest notified, timer pending.
-> guest does not react for some reason (e.g. crash)
-> s390_pcihost_timer_cb(). Timer not pending anymore. qmp_unplug().
-> Device deleted. Timer expired (not pending) but not freed.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190114103110.10909-4-david@redhat.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Git-commit: 6ed675c92a
References: bsc#1179725
When getting the 'pbdev', the if...else has no default branch.
From Coverity, the 'pbdev' maybe null when the 'dev' is not
the TYPE_PCI_BRIDGE/TYPE_PCI_DEVICE/TYPE_S390_PCI_DEVICE.
This patch adds a default branch for device plug and unplug.
Spotted by Coverity: CID 1398593
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20190108151114.33140-1-liq3ea@163.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
Git-commit: 5519724a13
References: bsc#1174386 CVE-2020-15863
A buffer overflow issue was reported by Mr. Ziming Zhang, CC'd here. It
occurs while sending an Ethernet frame due to missing break statements
and improper checking of the buffer size.
Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 035e69b063
References: bsc#1174641, CVE-2020-16092
An assertion failure issue was found in the code that processes network packets
while adding data fragments into the packet context. It could be abused by a
malicious guest to abort the QEMU process on the host. This patch replaces the
affected assert() with a conditional statement, returning false if the current
data fragment exceeds max_raw_frags.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: b946434f26
References: bsc#1175441, CVE-2020-14364
Store calculated setup_len in a local variable, verify it, and only
write it to the struct (USBDevice->setup_len) in case it passed the
sanity checks.
This prevents other code (do_token_{in,out} functions specifically)
from working with invalid USBDevice->setup_len values and overrunning
the USBDevice->setup_buf[] buffer.
Fixes: CVE-2020-14364
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200825053636.29648-1-kraxel@redhat.com
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: e423455c4f
References: bsc#1172478, CVE-2020-13765
Both, "rom->addr" and "addr" are derived from the binary image
that can be loaded with the "-kernel" paramer. The code in
rom_copy() then calculates:
d = dest + (rom->addr - addr);
and uses "d" as destination in a memcpy() some lines later. Now with
bad kernel images, it is possible that rom->addr is smaller than addr,
thus "rom->addr - addr" gets negative and the memcpy() then tries to
copy contents from the image to a bad memory location. This could
maybe be used to inject code from a kernel image into the QEMU binary,
so we better fix it with an additional sanity check here.
Cc: qemu-stable@nongnu.org
Reported-by: Guangming Liu
Buglink: https://bugs.launchpad.net/qemu/+bug/1844635
Message-Id: <20190925130331.27825-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 369ff955a8
References: bsc#1172384, CVE-2020-13361
A guest user may set channel frame count via es1370_write()
such that, in es1370_transfer_audio(), total frame count
'size' is lesser than the number of frames that are processed
'cnt'.
int cnt = d->frame_cnt >> 16;
int size = d->frame_cnt & 0xffff;
if (size < cnt), it results in incorrect calculations leading
to OOB access issue(s). Add check to avoid it.
Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200514200608.1744203-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 97e1e06780
References: bsc#1167816
When using hugepages, rate limiting is necessary within each huge
page, since a 1G huge page can take a significant time to send, so
you end up with bursty behaviour.
Fixes: 4c011c37ec ("postcopy: Send whole huge pages")
Reported-by: Lin Ma <LMa@suse.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 2faae0f778f818fadc873308f983289df697eb93
References: bsc#1170940, CVE-2020-1983
The q pointer is updated when the mbuf data is moved from m_dat to
m_ext.
m_ext buffer may also be realloc()'ed and moved during m_cat():
q should also be updated in this case.
Reported-by: Aviv Sasson <asasson@paloaltonetworks.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 9bd6c5913271eabcb7768a58197ed3301fe19f2d)
Signed-off-by: Bruce Rogers <brogers@suse.com>
Currently, the Cascadelake-Server, Icelake-Client, and
Icelake-Server are always generating the following warning:
qemu-system-x86_64: warning: \
host doesn't support requested feature: CPUID.07H:ECX [bit 4]
This happens because OSPKE was never returned by
GET_SUPPORTED_CPUID or x86_cpu_get_supported_feature_word().
OSPKE is a runtime flag automatically set by the KVM module or by
TCG code, was always cleared by x86_cpu_filter_features(), and
was not supposed to appear on the CPU model table.
Remove the OSPKE flag from the CPU model table entries, to avoid
the bogus warning and avoid returning invalid feature data on
query-cpu-* QMP commands. As OSPKE was always cleared by
x86_cpu_filter_features(), this won't have any guest-visible
impact.
Include a test case that should detect the problem if we introduce
a similar bug again.
Fixes: c7a88b52f6 ("i386: Add new model of Cascadelake-Server")
Fixes: 8a11c62da9 ("i386: Add new CPU model Icelake-{Server,Client}")
Cc: Tao Xu <tao3.xu@intel.com>
Cc: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190319200515.14999-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit bb4928c7ca)
[BR: BSC#1162729]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 9bfc04f9ef
References: bsc#1158880
The POP states that for a list directed IPL the IPLB is stored into
memory by the machine loader and its address is stored at offset 0x14
of the lowcore.
ZIPL currently uses the address in offset 0x14 to access the IPLB and
acquire flags about secure boot. If the IPLB address points into
memory which has an unsupported mix of flags set, ZIPL will panic
instead of booting the OS.
As the lowcore can have quite a high entropy for a guest that did drop
out of protected mode (i.e. rebooted) we encountered the ZIPL panic
quite often.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Message-Id: <20200304114231.23493-19-frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Liang Yan <lyan@suse.com>
A 4 component version (X.Y.Z.A) as opposed to the traditional 3
component version (X.Y.Z), common.filter needs to be adjusted
accordingly.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 693fd2acdf
References: bsc#1166240
When querying an iSCSI server for the provisioning status of blocks (via
GET LBA STATUS), Qemu only validates that the response descriptor zero's
LBA matches the one requested. Given the SCSI spec allows servers to
respond with the status of blocks beyond the end of the LUN, Qemu may
have its heap corrupted by clearing/setting too many bits at the end of
its allocmap for the LUN.
A malicious guest in control of the iSCSI server could carefully program
Qemu's heap (by selectively setting the bitmap) and then smash it.
This limits the number of bits that iscsi_co_block_status() will try to
update in the allocmap so it can't overflow the bitmap.
Fixes: CVE-2020-1711
Cc: qemu-stable@nongnu.org
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 6bf21f3d83
References: bsc#1165776, CVE-2019-20382
Currently when qemu receives a vnc connect, it creates a 'VncState' to
represent this connection. In 'vnc_worker_thread_loop' it creates a
local 'VncState'. The connection 'VcnState' and local 'VncState' exchange
data in 'vnc_async_encoding_start' and 'vnc_async_encoding_end'.
In 'zrle_compress_data' it calls 'deflateInit2' to allocate the libz library
opaque data. The 'VncState' used in 'zrle_compress_data' is the local
'VncState'. In 'vnc_zrle_clear' it calls 'deflateEnd' to free the libz
library opaque data. The 'VncState' used in 'vnc_zrle_clear' is the connection
'VncState'. In currently implementation there will be a memory leak when the
vnc disconnect. Following is the asan output backtrack:
Direct leak of 29760 byte(s) in 5 object(s) allocated from:
0 0xffffa67ef3c3 in __interceptor_calloc (/lib64/libasan.so.4+0xd33c3)
1 0xffffa65071cb in g_malloc0 (/lib64/libglib-2.0.so.0+0x571cb)
2 0xffffa5e968f7 in deflateInit2_ (/lib64/libz.so.1+0x78f7)
3 0xaaaacec58613 in zrle_compress_data ui/vnc-enc-zrle.c:87
4 0xaaaacec58613 in zrle_send_framebuffer_update ui/vnc-enc-zrle.c:344
5 0xaaaacec34e77 in vnc_send_framebuffer_update ui/vnc.c:919
6 0xaaaacec5e023 in vnc_worker_thread_loop ui/vnc-jobs.c:271
7 0xaaaacec5e5e7 in vnc_worker_thread ui/vnc-jobs.c:340
8 0xaaaacee4d3c3 in qemu_thread_start util/qemu-thread-posix.c:502
9 0xffffa544e8bb in start_thread (/lib64/libpthread.so.0+0x78bb)
10 0xffffa53965cb in thread_start (/lib64/libc.so.6+0xd55cb)
This is because the opaque allocated in 'deflateInit2' is not freed in
'deflateEnd'. The reason is that the 'deflateEnd' calls 'deflateStateCheck'
and in the latter will check whether 's->strm != strm'(libz's data structure).
This check will be true so in 'deflateEnd' it just return 'Z_STREAM_ERROR' and
not free the data allocated in 'deflateInit2'.
The reason this happens is that the 'VncState' contains the whole 'VncZrle',
so when calling 'deflateInit2', the 's->strm' will be the local address.
So 's->strm != strm' will be true.
To fix this issue, we need to make 'zrle' of 'VncState' to be a pointer.
Then the connection 'VncState' and local 'VncState' exchange mechanism will
work as expection. The 'tight' of 'VncState' has the same issue, let's also turn
it to a pointer.
Reported-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-id: 20190831153922.121308-1-liq3ea@163.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 5e7bcdcfe6
References: bsc#1166379, CVE-2019-15034
Set QEMU_PCI_CAP_EXPRESS unconditionally in init(), then clear it in
realize() in case the device is not connected to a PCIe bus.
This makes sure the pci config space allocation is big enough, so
accessing the PCIe extended config space doesn't overflow the pci
config space buffer.
PCI(e) config space is guest writable. Writes are limited by
write mask (which probably is also filled with random stuff),
so the guest can only flip enabled bits. But I suspect it
still might be exploitable, so rather serious because it might
be a host escape for the guest. On the other hand the device
is probably not yet in widespread use.
(For a QEMU version without this commit, a mitigation for the
bug is available: use "-device bochs-display" as a conventional pci
device only.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190812065221.20907-2-kraxel@redhat.com
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 68ccb8021a838066f0951d4b2817eb6b6f10a843
References: bsc#1163018, CVE-2020-8608
Various calls to snprintf() assume that snprintf() returns "only" the
number of bytes written (excluding terminating NUL).
https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04
"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."
Before patch ce131029, if there isn't enough room in "m_data" for the
"DCC ..." message, we overflow "m_data".
After the patch, if there isn't enough room for the same, we don't
overflow "m_data", but we set "m_len" out-of-bounds. The next time an
access is bounded by "m_len", we'll have a buffer overflow then.
Use slirp_fmt*() to fix potential OOB memory access.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-7-marcandre.lureau@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 82ebe9c370a0e2970fb5695aa19aa5214a6a1c80
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608
While emulating services in tcp_emu(), it uses 'mbuf' size
'm->m_size' to write commands via snprintf(3). Use M_FREEROOM(m)
size to avoid possible OOB access.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-3-ppandit@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: ce131029d6d4a405cb7d3ac6716d03e58fb4a5d9
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608
While emulating IRC DCC commands, tcp_emu() uses 'mbuf' size
'm->m_size' to write DCC commands via snprintf(3). This may
lead to OOB write access, because 'bptr' points somewhere in
the middle of 'mbuf' buffer, not at the start. Use M_FREEROOM(m)
size to avoid OOB access.
Reported-by: Vishnu Dev TJ <vishnudevtj@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-2-ppandit@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 2655fffed7a9e765bcb4701dd876e9dab975f289
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608
The main loop only checks for one available byte, while we sometimes
need two bytes.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 30648c03b27fb8d9611b723184216cd3174b6775
References: bsc#1163018, CVE-2020-8608
Various calls to snprintf() in libslirp assume that snprintf() returns
"only" the number of bytes written (excluding terminating NUL).
https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04
"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."
Introduce slirp_fmt() that handles several pathological cases the
way libslirp usually expect:
- treat error as fatal (instead of silently returning -1)
- fmt0() will always \0 end
- return the number of bytes actually written (instead of what would
have been written, which would usually result in OOB later), including
the ending \0 for fmt0()
- warn if truncation happened (instead of ignoring)
Other less common cases can still be handled with strcpy/snprintf() etc.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-2-marcandre.lureau@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
For some reason, EMU_IDENT is not like other "emulated" protocols and
tries to reconstitute the original buffer, if it came in multiple
packets. Unfortunately, it does so wrongly, as it doesn't respect the
sbuf circular buffer appending rules, nor does it maintain some of the
invariants (rptr is incremented without bounds, etc): this leads to
further memory corruption revealed by ASAN or various malloc
errors. Furthermore, the so_rcv buffer is regularly flushed, so there
is no guarantee that buffer reconstruction will do what is expected.
Instead, do what the function comment says: "XXX Assumes the whole
command came in one packet", and don't touch so_rcv.
Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1664205
Cc: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[BFR: In support of BSC#1123156]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Now that scsi-disk is not using scsi_sense_to_errno to separate guest-recoverable
sense codes, we can modify it to simplify iscsi's own sense handling.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8c460269aa)
[LM: BSC#1154790]
Signed-off-by: Lin Ma <lma@suse.com>
In this case, do_retry was set without calling aio_co_wake, thus never
waking up the coroutine.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 00e3cccdf4)
[LM: BSC#1154790]
Signed-off-by: Lin Ma <lma@suse.com>
When running basic operations on zoned storage from the guest via
scsi-block, the following ASCs are reported for write or read commands
due to unexpected zone status or write pointer status:
21h 04h: UNALIGNED WRITE COMMAND
21h 05h: WRITE BOUNDARY VIOLATION
21h 06h: ATTEMPT TO READ INVALID DATA
55h 0Eh: INSUFFICIENT ZONE RESOURCES
Reporting these ASCs to the guest, the user applications can handle
them to manage zone/write pointer status, or help the user application
developers to understand the failure reason and fix bugs.
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 396ce7b94e)
[LM: BSC#1154790]
Signed-off-by: Lin Ma <lma@suse.com>
It's not really possible to fit all sense codes into errno codes,
especially in such a way that sense codes can be properly categorized as
either guest-recoverable or host-handled. Create a new function that
checks for guest recoverable sense, then scsi_sense_buf_to_errno only
needs to be called for host handled sense codes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit bdf9613b7f)
[LM: BSC#1154790]
Signed-off-by: Lin Ma <lma@suse.com>
When an error was passed down to the guest because it was recoverable,
the sense length was not copied from the SG_IO data. As a result,
the guest saw the CHECK CONDITION status but not the sense data.
Signed-off-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d31347f5ff)
[LM: BSC#1154790]
Signed-off-by: Lin Ma <lma@suse.com>
It's problematic to include the recent linux header for the
virtio-balloon device because it extends the size of the
virtio_balloon_config structure, which our older qemu code relies on for
determining the virtio-ballon device config size. This change breaks
migration.
Revert this portion of the linux header update to maintain the previous
config size for this device. (bsc#1156642)
Signed-off-by: Bruce Rogers <brogers@suse.com>
Add cpu feature bit for machine check error avoidance on page size
change (aka IFU issue).
This is required to disable ITLB multihit mitigations in nested
hypervisors.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1155812 CVE-2018-12207]
Signed-off-by: Bruce Rogers <brogers@suse.com>
TSX Async Abort (TAA) is a side channel attack on internal buffers in
some Intel processors similar to Microachitectural Data Sampling (MDS).
In this issue certain loads may speculatively pass invalid data to
dependent operations when an asynchronous abort condition is pending in
a TSX transaction.
Some Intel processors use the ARCH_CAP_TAA_NO bit in the
IA32_ARCH_CAPABILITIES MSR to report that they are not vulnerable.
Make this available to guests.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
[BR: BSC#1152506 CVE-2019-11135]
Signed-off-by: Bruce Rogers <brogers@suse.com>
In current code, __NR_msgrcv and__NR_semtimedop are supposed to be
defined if __NR_msgsnd is defined.
But linux headers 5.2-rc1 for MIPS define __NR_msgsnd without defining
__NR_semtimedop and it breaks the QEMU build.
__NR_semtimedop is defined in asm-mips/unistd_n64.h and asm-mips/unistd_n32.h
but not in asm-mips/unistd_o32.h.
Commit d9cb433615 ("linux headers: update against Linux 5.2-rc1") has
updated asm-mips/unistd_o32.h and added __NR_msgsnd but not __NR_semtimedop.
It introduces __NR_semtimedop_time64 instead.
This patch fixes the problem by checking for each __NR_XXX symbol
before defining the corresponding syscall.
Fixes: d9cb433615 ("linux headers: update against Linux 5.2-rc1")
Reported-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190523175413.14448-1-laurent@vivier.eu>
(cherry picked from commit 86e636951d)
[BR: BSC#1145436 JIRA-SLE-6237]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When the user does not specify which device to boot from then we end
up guessing. Instead of simply grabbing the first available device let's
be a little bit smarter and only choose devices that might be bootable
like disk, and not console devices.
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Message-Id: <1554388475-18329-17-git-send-email-jjherne@linux.ibm.com>
[thuth: Added fix for virtio_is_supported() not being called anymore]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 2880469c95)
[LY: BSC#1145379 JIRA-SLE-6132]
Signed-off-by: Liang Yan <lyan@suse.com>
The boot method is different depending on which device type we are
booting from. Let's examine the control unit type to determine if we're
a virtio device. We'll eventually add a case to check for a real dasd device
here as well.
Since we have to call enable_subchannel() in main now, might as well
remove that call from virtio.c : run_ccw(). This requires adding some
additional enable_subchannel calls to not break calls to
virtio_is_supported().
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1554388475-18329-14-git-send-email-jjherne@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 3668cb7ce8)
[LY: BSC#1145379 JIRA-SLE-6132]
Signed-off-by: Liang Yan <lyan@suse.com>
Make a new routine find_boot_device to locate the boot device for all
cases, not just virtio.
The error message for the case where no boot device has been specified
and a suitable boot device cannot be auto detected was specific to
virtio devices. We update this message to remove virtio specific wording.
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1554388475-18329-12-git-send-email-jjherne@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 7b361db37b)
[LY: BSC#1145379 JIRA-SLE-6132]
Signed-off-by: Liang Yan <lyan@suse.com>
Add verbose error output for when unexpected i/o errors happen. This eases the
burden of debugging and reporting i/o errors. No error information is printed
in the success case, here is an example of what is output on error:
cio device error
ssid : 0x0000000000000000
cssid : 0x0000000000000000
sch_no: 0x0000000000000000
Interrupt Response Block Data:
Function Ctrl : [Start]
Activity Ctrl : [Start-Pending]
Status Ctrl : [Alert] [Primary] [Secondary] [Status-Pending]
Device Status : [Unit-Check]
Channel Status :
cpa=: 0x000000007f8d6038
prev_ccw=: 0x0000000000000000
this_ccw=: 0x0000000000000000
Eckd Dasd Sense Data (fmt 32-bytes):
Sense Condition Flags :
Residual Count =: 0x0000000000000000
Phys Drive ID =: 0x000000000000009e
low cyl address =: 0x0000000000000000
head addr & hi cyl =: 0x0000000000000000
format/message =: 0x0000000000000008
fmt-dependent[0-7] =: 0x0000000000000004
fmt-dependent[8-15]=: 0xe561282305082fff
prog action code =: 0x0000000000000016
Configuration info =: 0x00000000000040e0
mcode / hi-cyl =: 0x0000000000000000
cyl & head addr [0]=: 0x0000000000000000
cyl & head addr [1]=: 0x0000000000000000
cyl & head addr [2]=: 0x0000000000000000
The Sense Data section is currently only printed for ECKD DASD.
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1554388475-18329-10-git-send-email-jjherne@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 86c58705bb)
[LY: BSC#1145379 JIRA-SLE-6132]
Signed-off-by: Liang Yan <lyan@suse.com>
Introduce a library function for executing format-0 and format-1
channel programs and waiting for their completion before continuing
execution.
Add cu_type() to channel io library. This will be used to query control
unit type which is used to determine if we are booting a virtio device or a
real dasd device.
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <1554388475-18329-9-git-send-email-jjherne@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 3083a1bbb8)
[LY: BSC#1145379 JIRA-SLE-6132]
Signed-off-by: Liang Yan <lyan@suse.com>
Add bootindex property and iplb data for vfio-ccw devices. This allows us to
forward boot information into the bios for vfio-ccw devices.
Refactor s390_get_ccw_device() to return device type. This prevents us from
having to use messy casting logic in several places.
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1554388475-18329-2-git-send-email-jjherne@linux.ibm.com>
[thuth: fixed "typedef struct VFIOCCWDevice" build failure with clang]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 44445d8668)
[LY: BSC#1145379 JIRA-SLE-6132]
Signed-off-by: Liang Yan <lyan@suse.com>
When compiling the s390-ccw firmware with Clang 7.0.1, I get the
following errors:
pc-bios/s390-ccw/start.S:62:19: error: invalid use of length addressing
stctg 0,0,0(15)
^
pc-bios/s390-ccw/start.S:63:12: error: invalid use of length addressing
oi 6(15), 0x2
^
pc-bios/s390-ccw/start.S:64:19: error: invalid use of length addressing
lctlg 0,0,0(15)
^
pc-bios/s390-ccw/start.S:76:19: error: invalid use of length addressing
stctg 0,0,0(15)
^
pc-bios/s390-ccw/start.S:77:12: error: invalid use of length addressing
ni 6(15), 0xfd
^
pc-bios/s390-ccw/start.S:78:19: error: invalid use of length addressing
lctlg 0,0,0(15)
^
pc-bios/s390-ccw/start.S:79:12: error: invalid operand for instruction
br 14
^
Let's use proper register names like in the rest of this file to fix it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1547123559-30476-1-git-send-email-thuth@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 0d3a761398)
[LY: BSC#1145379 JIRA-SLE-6132]
Signed-off-by: Liang Yan <lyan@suse.com>
David suggested to keep everything in sync as 4.1 is not yet released.
This patch fixes the name "vxbeh" into "vxpdeh".
To simplify the backports this patch will not change VECTOR_BCD_ENH as
this is just an internal name. That will be done by an extra patch that
does not need to be backported.
Suggested-by: David Hildenbrand <david@redhat.com>
Fixes: d05be57ddc ("s390: cpumodel: fix description for the new vector facility")
Fixes: 54d65de0b5 ("s390x/cpumodel: vector enhancements")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20190715142304.215018-3-borntraeger@de.ibm.com>
[CH: vxp->vxpdeh, as discussed]
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 0d4cb295db)
[LY: BSC#1145436 JIRA-SLE-6237]
Signed-off-by: Liang Yan <lyan@suse.com>
The new facility is called "Vector-Packed-Decimal-Enhancement Facility"
and not "Vector BCD enhancements facility 1". As the shortname might
have already found its way into some backports, let's keep vxbeh.
Fixes: 54d65de0b5 ("s390x/cpumodel: vector enhancements")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20190708150931.93448-1-borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit d05be57ddc)
[LY: BSC#1145436 JIRA-SLE-6237]
Signed-off-by: Liang Yan <lyan@suse.com>
Let's define features at a single spot and make it less error prone to
define new features.
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
(cherry picked from commit 220ae9002f)
[LY: BSC#1145436 JIRA-SLE-6237]
Signed-off-by: Liang Yan <lyan@suse.com>
add several new features (msa9, sort, deflate, additional vector
instructions, new general purpose instructions) to generation 15.
Also disable csske and bpb from the default and base models >=15.
This will allow to migrate gen15 machines to future machines that
do not have these features.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20190429090250.7648-9-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit caef62430f)
[LY: BSC#1145436 JIRA-SLE-6237]
Signed-off-by: Liang Yan <lyan@suse.com>
Provide the MSA9 facility (stfle.155). This also contains pckmo
subfunctions for key wrapping. Keep them in a separate group to disable
those as a block if necessary. This is for example needed when disabling
key wrapping via the HMC.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20190429090250.7648-5-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 5dacbe23d2)
[LY: BSC#1145436 JIRA-SLE-6237]
Signed-off-by: Liang Yan <lyan@suse.com>
The extended PTFF features (qsie, qtoue, stoe, stoue) are dependent
on the multiple-epoch facility (mepoch). Let's print a warning if these
features are enabled without mepoch.
While we're at it, let's move the FEAT_GROUP_INIT for mepochptff down
the s390_feature_groups list so it can be properly indexed with its
generated S390FeatGroup enum.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Message-Id: <20190212011657.18324-1-walling@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit ddf5d18af3)
[LY: BSC#1145436 JIRA-SLE-6237]
Signed-off-by: Liang Yan <lyan@suse.com>
This is simply running the newly-updated script on Linux, in
order to obtain the new header files and all the other updates
from the recent Linux merge window.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit da054c646c)
[LY: BSC#1145436 JIRA-SLE-6237]
Signed-off-by: Liang Yan <lyan@suse.com>
Include-If: %if 0%{?suse_version} == 1315
Newer versions of zipl have the ability to write signature entries to the boot
script for secure boot. We don't yet support secure boot, but we need to skip
over signature entries while reading the boot script in order to maintain our
ability to boot guest operating systems that have a secure bootloader.
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <1556543381-12671-1-git-send-email-jjherne@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 2497b4a3c0)
[BR: JIRA-SLE-7004 BSC#1145445]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Since QEMU 2.10 (or qemu-xen-4.10), qemu locks disk images to avoid
them been re-opened in a different qemu process.
With Xen, there are two issues:
- For HVM guests, a disk image can be open twice! One by the
emulation driver, and one by the PV backend.
- During migration, the qemu process of the newly spawned domain may
attempt to access the disk image before the domain been migrated
and the qemu process are been completely destroyed.
Migration of HVM guest as been taken care of in libxl, but migration
of PV guest with qdisk and HVM guest attempting to access the PV disk
before unplugging the emulated disk are still an issue.
For these reasons, we don't want to have QEMU use a locking mechanism
with the PV backend.
This is already done by db9ff46eeb in QEMU upstream, or QEMU 4.0.
Affected version of QEMU are:
- qemu-xen of Xen 4.10 and 4.11
- QEMU 2.10, 2.11, 2.12, 3.0 and 3.1
[OLH: bsc#1079730, bsc#1098403, bsc#1111025, bsc#1145427]
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Update x86 CPU model guidance to recommend that the md-clear feature is
manually enabled with all Intel CPU models, when supported by the host
microcode.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190515141011.5315-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 2c7e82a307)
[BR: BSC#1111331 BSC#1138534 CVE-2018-12126 CVE-2018-12127
CVE-2018-12130 CVE-2019-11091]
Signed-off-by: Bruce Rogers <brogers@suse.com>
SEV enabled guests which used v2.12 or older machine types are now getting
the wrong xlevel set now that we've updated the QEMU version to v3.1.1.
I brought this up with upstream, and Eduardo H. pointed out that where
it specified xlevel, it should have been min-xlevel. Fix that.
[BR: BSC#1144087]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When executing script in lsi_execute_script(), the LSI scsi adapter
emulator advances 's->dsp' index to read next opcode. This can lead
to an infinite loop if the next opcode is empty. Move the existing
loop exit after 10k iterations so that it covers no-op opcodes as
well.
Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit de594e4765)
[BR: BSC#1146873 CVE-2019-12068]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Don't call xen_be_set_max_grant_refs() in usbback_alloc(), as the
gnttabdev pointer won't be initialised yet. The call can easily be
moved to usbback_connect().
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-id: 20181206133923.30105-1-jgross@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f8224fb0fa)
[BR: BSC#1128106]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Microarchitectural Data Sampling is a hardware vulnerability which allows
unprivileged speculative access to data which is available in various CPU
internal buffers.
Some Intel processors use the ARCH_CAP_MDS_NO bit in the
IA32_ARCH_CAPABILITIES
MSR to report that they are not vulnerable, make it available to guests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20190516185320.28340-1-pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 20140a82c6)
[BR: FATE#327795 BSC#1136777]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Now that kvm_arch_get_supported_cpuid() will only return
arch_capabilities if QEMU is able to initialize the MSR properly,
we know that the feature is safely migratable.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190125220606.4864-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 014018e19b)
[BR: BSC#1134880, FATE#327763]
Signed-off-by: Bruce Rogers <brogers@suse.com>
KVM has two bugs in the handling of MSR_IA32_ARCH_CAPABILITIES:
1) Linux commit commit 1eaafe91a0df ("kvm: x86: IA32_ARCH_CAPABILITIES
is always supported") makes GET_SUPPORTED_CPUID return
arch_capabilities even if running on SVM. This makes "-cpu
host,migratable=off" incorrectly expose arch_capabilities on CPUID on
AMD hosts (where the MSR is not emulated by KVM).
2) KVM_GET_MSR_INDEX_LIST does not return MSR_IA32_ARCH_CAPABILITIES if
the MSR is not supported by the host CPU. This makes QEMU not
initialize the MSR properly at kvm_put_msrs() on those hosts.
Work around both bugs on the QEMU side, by checking if the MSR
was returned by KVM_GET_MSR_INDEX_LIST before returning the
feature flag on kvm_arch_get_supported_cpuid().
This has the unfortunate side effect of making arch_capabilities
unavailable on hosts without hardware support for the MSR until bug #2
is fixed on KVM, but I can't see another way to work around bug #1
without that side effect.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190125220606.4864-2-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 485b1d256b)
[BR: BSC#1134880, FATE#327763]
Signed-off-by: Bruce Rogers <brogers@suse.com>
md-clear is a new CPUID bit which is set when microcode provides the
mechanism to invoke a flush of various exploitable CPU buffers by invoking
the VERW instruction.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20190515141011.5315-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit b2ae52101f)
[BR: BSC#1111331 CVE-2018-12126 CVE-2018-12127 CVE-2018-12130
CVE-2019-11091]
Signed-off-by: Bruce Rogers <brogers@suse.com>
27461d69a0 "ppc: add host-serial and host-model machine attributes
(CVE-2019-8934)" introduced 'host-serial' and 'host-model' machine
properties for spapr to explicitly control the values advertised to the
guest in device tree properties with the same names.
The previous behaviour on KVM was to unconditionally populate the device
tree with the real host serial number and model, which leaks possibly
sensitive information about the host to the guest.
To maintain compatibility for old machine types, we allowed those props
to be set to "passthrough" to take the value from the host as before. Or
they could be set to "none" to explicitly omit the device tree items.
Special casing specific values on what's otherwise a user supplied string
is very ugly. So, this patch simplifies things by implementing the
backwards compatibility in a different way: we have a machine class flag
set for the older machines, and we only load the host values into the
device tree if A) they're not set by the user and B) we have that flag set.
This does mean that the "passthrough" functionality is no longer available
with the current machine type. That's ok though: if a user or management
layer really wants the information passed through they can read it
themselves (OpenStack Nova already does something similar for x86).
It also means the user can't explicitly ask for the values to be omitted
on the old machine types. I think that's an acceptable trade-off: if you
care enough about not leaking the host information you can either move to
the new machine type, or use a dummy value for the properties.
For the new machine type, this also removes an odd inconsistency
between running on a POWER and non-POWER (or non-Linux) hosts: if the
host information couldn't be read from where we expect (in the host's
device tree as exposed by Linux), we'd fallback to omitting the guest
device tree items.
While we're there, improve some poorly worded comments, and the help text
for the properties.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 0a794529bd)
[BR: BSC#1126455 CVE-2019-03812]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The RAM device presents a memory region that should be handled
as an IO region and should not be pinned.
In the case of the vfio-pci, RAM device represents a MMIO BAR
and the memory region is not backed by pages hence
KVM_MEMORY_ENCRYPT_REG_REGION fails to lock the memory range.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1667249
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
[BSC#1123205]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This reverts commit d98f26073b.
Here is some text explaining the revert:
I've thought about this some more, and with upstream
discussions about it having stagnated, at this point I think
the best solution is to revert the patch which considers it
a migration blocker to have the vmx feature enabled. It's
worth noting that not only are migrations blocked, but
saving of the vm state via save/restore and snapshots.
Given that it is still widely known that Nested Virtualization
is not supported by SUSE and other vendors, but is still used
by quite a few people who understand that there are caveats
with it's usage, I believe this migration blocker is more
hurtful than helpful.
The fact that as of the v4.20 kernel, nested virtualization is
enabled by default (for vmx), was partly why the patch was
added in the first place. But my perspective is that perhaps
enabling nested was still a bit premature.
I will make sure our qemu changelog explains that despite
removing that migration blocker, the user is warned that
nested virtualization is still a "use at your own risk
feature".
[BR: BSC#1121604]
Signed-off-by: Bruce Rogers <brogers@suse.com>
rdma back-end has scatter/gather array ibv_sge[MAX_SGE=4] set
to have 4 elements. A guest could send a 'PvrdmaSqWqe' ring element
with 'num_sge' set to > MAX_SGE, which may lead to OOB access issue.
Add check to avoid it.
Reported-by: Saar Amar <saaramar5@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
(cherry picked from commit 0e68373cc2)
[BR: BSC#1119840 CVE-2018-20124, modified complete_work() calls to be
comp_handler()]
Signed-off-by: Bruce Rogers <brogers@suse.com>
From: Daniel P. Berrangé <berrange@redhat.com>
In previous commit:
commit 7dea29e4af
Author: Li Qiang <liq3ea@gmail.com>
Date: Fri Oct 19 03:50:36 2018 -0700
hw: ccid-card-emulated: cleanup resource when realize in error path
The emulated_realize method was changed so that it jumps to a cleanup
label to de-initialize state upon error. This change failed to ensure
the success path exited the method before this point though. So the
mutexes are always destroyed even in normal operation. The result is
as crashtastic as expected:
$ qemu-system-x86_64 -usb -device usb-ccid,id=ccid0 -device ccid-card-emulated,backend=nss-emulated,id=smartcard0,bus=ccid0.0
qemu-system-x86_64: util/qemu-thread-posix.c:64: qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.
Aborted (core dumped)
Fixes: 7dea29e4af
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20181221134115.27973-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3fd2092fd1)
Signed-off-by: Bruce Rogers <brogers@suse.com>
The final step of xl migrate|save for an HVM domU is saving the state of
qemu. This also involves releasing all block devices. While releasing
backends ought to be a separate step, such functionality is not
implemented.
Unfortunately, releasing the block devices depends on the optional
'live' option. This breaks offline migration with 'virsh migrate domU
dom0' because the sending side does not release the disks, as a result
the receiving side can not properly claim write access to the disks.
As a minimal fix, remove the dependency on the 'live' option. Upstream
may fix this in a different way, like removing the newly added 'live'
parameter entirely.
Fixes: 5d6c599fe1 ("migration, xen: Fix block image lock issue on live migration")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
[BSC#1079730, BSC#1101982, BSC#1063993]
xen_disk currently allocates memory to hold the data for each ioreq
as that ioreq is used, and frees it afterwards. Because it requires
page-aligned blocks, this interacts poorly with non-page-aligned
allocations and balloons the heap.
Instead, allocate the maximum possible requirement, which is
BLKIF_MAX_SEGMENTS_PER_REQUEST pages (currently 11 pages) when
the ioreq is created, and keep that allocation until it is destroyed.
Since the ioreqs themselves are re-used via a free list, this
should actually improve memory usage.
Signed-off-by: Tim Smith <tim.smith@citrix.com>
[BR: BSC#1100408]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Executing tests in obs is very fickle, since you aren't guaranteed
reliable cpu time. Triple the timeout for each test to help ensure
we don't fail a test because the stars align against us.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Provide monitor naming of xen disks, and plumb guest driver
notification through xenstore of resizing instigated via the
monitor.
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
I imagine there is more to be done to fix the memory consistency
races here, but these added barriers at least let it pass on ppc64le,
whereas before it would fail regularly there.
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The following issues are seen:
153 - error resulting from failed to get "write" lock
162 - sometimes we see different error code / message
229 - we get periodic failures on ARM and POWER, not sure why
235 - test explicitly requires KVM as part of the test
On 07Nov2019, due to random periodic failures in other tests, I have
also commented out these:
065, 068, 102, 129, 130, 145, 161, 169, 177, 182, 192, 194, 204, 205,
218, 223
Signed-off-by: Bruce Rogers <brogers@suse.com>
It's easy enough to handle either per-spec or legacy smbios structures
in the smbios file input without regard to the machine type used, by
simply applying the basic smbios formatting rules. then depending on
what is detected. terminal numm bytes are added or removed for machine
type specific processing.
[BR: BSC#994082 BSC#1084316 BOO#1131894]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Test scsi-{disk,hd,cd} wwn properties for correct 64-bit parsing.
For now piggyback on virtio-scsi.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
All integers would get parsed by strtoll(), not handling the case of
UINT64 properties with the most significient bit set.
Implement a .type_uint64 visitor callback, reusing the existing
parse_str() code through a new argument, using strtoull().
As this is a bug fix, it intentionally ignores checkpatch warnings to
prefer the use of qemu_strto[u]ll() over strto[u]ll().
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: define qemu_strtoll and qemu_strtoull and use to avoid checkpatch
error. Grr}
Signed-off-by: Bruce Rogers <brogers@suse.com>
Using xen tools 'xl vncviewer' with tigervnc (default on SLE-12),
found that: the display of the guest is unexpected while keep
pressing a key. We expect the same character multiple times, but
it prints only one time. This happens on a PV guest in text mode.
After debugging, found that tigervnc sends repeated key down events
in this case, to differentiate from user pressing the same key many
times. Vnc server only prints the character when it finally receives
key up event.
To solve this issue, this patch tries to add additional key up event
before the next repeated key down event (if the key is not a control
key).
[CYL: BSC#882405]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
qemu-kvm 0.15 uses the same GPE format as qemu 1.4, but as version 2
rather than 3.
Addresses part of BNC#812836.
Signed-off-by: Andreas Färber <afaerber@suse.de>
qemu-kvm 0.15 had a VMSTATE_UINT32(flags, PITState) field that
qemu 1.4 does not have.
Addresses part of BNC#812836.
Signed-off-by: Andreas Färber <afaerber@suse.de>
qemu-kvm.git commit a7fe0297840908a4fd65a1cf742481ccd45960eb
(Extend vram size to 16MB) deviated from qemu.git since kvm-61, and only
in commit 9e56edcf8d (vga: raise default
vgamem size) did qemu.git adjust the VRAM size for v1.2.
Add compatibility properties so that up to and including pc-0.15 we
maintain migration compatibility with qemu-kvm rather than QEMU and
from pc-1.0 on with QEMU (last qemu-kvm release was 1.2).
Addresses part of BNC#812836.
Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: adjust comma position in list in macro for v2.5.0 compat]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Allow for guests with higher amounts of ram. The current thought
is that 2TB specified on qemu commandline would be an appropriate
limit. Note that this requires the next higher bit value since
the highest address is actually more than 2TB due to the pci
memory hole.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.
Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: add include for sys/resource.h]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Certain rom subpackages build from qemu git-submodules call the date
program to include date information in the packaged binaries. This
causes repeated builds of the package to be different, wkere the only
real difference is due to the fact that time build timestamp has
changed. To promote reproducible builds and avoid customers being
prompted to update packages needlessly, we'll use the timestamp of the
VERSION file as the packaging timestamp for all packages that build in a
timestamp for whatever reason.
[BR: BSC#1011213]
Signed-off-by: Bruce Rogers <brogers@suse.com>
After "linux-user: use target_ulong" the poll syscall was no longer
handling infinite timeout.
/home/abuild/rpmbuild/BUILD/qemu-2.7.0-rc5/linux-user/syscall.c:9773:26: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
if (arg3 >= 0) {
^~
Signed-off-by: Andreas Schwab <schwab@suse.de>
Change from using glib alloc and free routines to those
from libc. Also perform safety measure of dropping privs
to user if configured no-caps.
[BR: BOO#988279]
[BR: for 3.1.1 stable, use strcmp here instead of g_str_equal]
Signed-off-by: Bruce Rogers <brogers@suse.com>
[AF: Rebased for v2.7.0-rc2]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Add code to read the suse specific suse-diskcache-disable-flush flag out
of xenstore, and set the equivalent flag within QEMU.
Patch taken from Xen's patch queue, Olaf Hering being the original author.
[bsc#879425]
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Olaf Hering <olaf@aepfle.de>
On hosts with limited virtual address space (32bit pointers), we can very
easily run out of virtual memory with big thread pools.
Instead, we should limit ourselves to small pools to keep memory footprint
low on those systems.
This patch fixes random VM stalls like
(process:25114): GLib-ERROR **: gmem.c:103: failed to allocate 1048576 bytes
on 32bit ARM systems for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
When doing lseek, SEEK_SET indicates that the offset is an unsigned variable.
Other seek types have parameters that can be negative.
When converting from 32bit to 64bit parameters, we need to take this into
account and enable SEEK_END and SEEK_CUR to be negative, while SEEK_SET stays
absolute positioned which we need to maintain as unsigned.
Signed-off-by: Alexander Graf <agraf@suse.de>
Virtio-Console can only process one character at a time. Using it on S390
gave me strage "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.
While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.
To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.
This patch fixes input when using -nographic on s390 for me.
[AF: Rebased for v2.7.0-rc2]
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Linux syscalls pass pointers or data length or other information of that sort
to the kernel. This is all stuff you don't want to have sign extended.
Otherwise a host 64bit variable parameter with a size parameter will extend
it to a negative number, breaking lseek for example.
Pass syscall arguments as ulong always.
Signed-off-by: Alexander Graf <agraf@suse.de>
This causes LP#1738283. Gerd will have to come up with a better
fix, but just hacking out the problematic key definition should
work for now.
[BR: We see this issue as well, eg via vnc]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Fedora 17 for ARM reads /proc/cpuinfo and fails if it doesn't contain
ARM related contents. This patch implements a quick hack to expose real
/proc/cpuinfo data taken from a real world machine.
The real fix would be to generate at least the flags automatically based
on the selected CPU. Please do not submit this patch upstream until this
has happened.
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased for v1.6 and v1.7]
Signed-off-by: Andreas Färber <afaerber@suse.de>
When we have a working host binary equivalent for the guest binary we're
trying to run, let's just use that instead as it will be a lot faster.
Signed-off-by: Alexander Graf <agraf@suse.de>
When using hugetlbfs (which is required for HV mode KVM on 970), we
check for MMU notifiers that on 970 can not be implemented properly.
So disable the check for mmu notifiers on PowerPC guests, making
KVM guests work there, even if possibly racy in some odd circumstances.
Signed-off-by: Bruce Rogers <brogers@suse.com>
When using qemu's linux-user binaries through binfmt, argv[0] gets lost
along the execution because qemu only gets passed in the full file name
to the executable while argv[0] can be something completely different.
This breaks in some subtile situations, such as the grep and make test
suites.
This patch adds a wrapper binary called qemu-$TARGET-binfmt that can be
used with binfmt's P flag which passes the full path _and_ argv[0] to
the binfmt handler.
The binary would be smart enough to be versatile and only exist in the
system once, creating the qemu binary path names from its own argv[0].
However, this seemed like it didn't fit the make system too well, so
we're currently creating a new binary for each target archictecture.
CC: Reinhard Max <max@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto new Makefile infrastructure, twice]
[AF: Updated for aarch64 for v2.0.0-rc1]
[AF: Rebased onto Makefile changes for v2.1.0-rc0]
[AF: Rebased onto script rewrite for v2.7.0-rc2 - to be fixed]
Signed-off-by: Andreas Färber <afaerber@suse.de>
the direction given in the ioctl should be correct so we can assume the
communication is uni-directional. The alsa developers did not like this
concept though and declared ioctls IOC_R and IOC_W even though they were
IOC_RW.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
[BR: minor edits to pass qemu's checkpatch script]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 89fbea8737
References: bsc#1182137
Depending on the client activity, the server can be asked to open a huge
number of file descriptors and eventually hit RLIMIT_NOFILE. This is
currently mitigated using a reclaim logic : the server closes the file
descriptors of idle fids, based on the assumption that it will be able
to re-open them later. This assumption doesn't hold of course if the
client requests the file to be unlinked. In this case, we loop on the
entire fid list and mark all related fids as unreclaimable (the reclaim
logic will just ignore them) and, of course, we open or re-open their
file descriptors if needed since we're about to unlink the file.
This is the purpose of v9fs_mark_fids_unreclaim(). Since the actual
opening of a file can cause the coroutine to yield, another client
request could possibly add a new fid that we may want to mark as
non-reclaimable as well. The loop is thus restarted if the re-open
request was actually transmitted to the backend. This is achieved
by keeping a reference on the first fid (head) before traversing
the list.
This is wrong in several ways:
- a potential clunk request from the client could tear the first
fid down and cause the reference to be stale. This leads to a
use-after-free error that can be detected with ASAN, using a
custom 9p client
- fids are added at the head of the list : restarting from the
previous head will always miss fids added by a some other
potential request
All these problems could be avoided if fids were being added at the
end of the list. This can be achieved with a QSIMPLEQ, but this is
probably too much change for a bug fix. For now let's keep it
simple and just restart the loop from the current head.
Fixes: CVE-2021-20181
Buglink: https://bugs.launchpad.net/qemu/+bug/1911666
Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <161064025265.1838153.15185571283519390907.stgit@bahia.lan>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Using ip_deq after m_free might read pointers from an allocation reuse.
This would be difficult to exploit, but that is still related with
CVE-2019-14378 which generates fragmented IP packets that would trigger this
issue and at least produce a DoS.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(from libslirp.git commit c59279437eda91841b9d26079c70b8a540d41204)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When the first fragment does not fit in the preallocated buffer, q will
already be pointing to the ext buffer, so we mustn't try to update it.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(from libslirp.git commit 126c04acbabd7ad32c2b018fe10dfac2a3bc1210)
(from libslirp.git commit e0be80430c390bce181ea04dfcdd6ea3dfa97de1)
*squash in e0be80 (clarifying comments)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In function ‘create_qp’:
hw/rdma/vmw/pvrdma_cmd.c:517:16: error: ‘rc’ undeclared
The backport of 509f57c98 in 41dd30ff6 mishandled the conflict
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The network interface name in Linux is defined to be of size
IFNAMSIZ(=16), including the terminating null('\0') byte.
The same is applied to interface names read from 'bridge.conf'
file to form ACL rules. If user supplied '--br=bridge' name
is not restricted to the same length, it could lead to ACL bypass
issue. Restrict interface name to IFNAMSIZ, including null byte.
Reported-by: Riccardo Schirone <rschiron@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 6f5d867122)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 4720cbeea1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
pvrdma_idx_ring_has_[data/space] routines also return invalid
index PVRDMA_INVALID_IDX[=-1], if ring has no data/space. Check
return value from these routines to avoid plausible infinite loops.
Reported-by: Li Qiang <liq3ea@163.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
(cherry picked from commit f1e2e38ee0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If the value of get_image_size() exceeds INT_MAX / 2 - 10000, the
computation of @dt_size overflows to a negative number, which then
gets converted to a very large size_t for g_malloc0() and
load_image_size(). In the (fortunately improbable) case g_malloc0()
succeeds and load_image_size() survives, we'd assign the negative
number to *sizep. What that would do to the callers I can't say, but
it's unlikely to be good.
Fix by rejecting images whose size would overflow.
Reported-by: Kurtis Miller <kurtis.miller@nccgroup.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190409174018.25798-1-armbru@redhat.com>
(cherry picked from commit 065e6298a7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The Mesa library tries to set process affinity on some of its threads in
order to optimize its performance. Currently this results in QEMU being
immediately terminated when seccomp is enabled.
Mesa doesn't consider failure of the process affinity settings to be
fatal to its operation, but our seccomp policy gives it no choice in
gracefully handling this denial.
It is reasonable to consider that malicious code using the resource
control syscalls to be a less serious attack than if they were trying
to spawn processes or change UIDs and other such things. Generally
speaking changing the resource control setting will "merely" affect
quality of service of processes on the host. With this in mind, rather
than kill the process, we can relax the policy for these syscalls to
return the EPERM errno value. This allows callers to detect that QEMU
does not want them to change resource allocations, and apply some
reasonable fallback logic.
The main downside to this is for code which uses these syscalls but does
not check the return value, blindly assuming they will always
succeeed. Returning an errno could result in sub-optimal behaviour.
Arguably though such code is already broken & needs fixing regardless.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 9a1565a03b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
While emulating identification protocol, tcp_emu() does not check
available space in the 'sc_rcv->sb_data' buffer. It could lead to
heap buffer overflow issue. Add check to avoid it.
Reported-by: Kira <864786842@qq.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit a7104eda7d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Whenever the allocation length of a SCSI request is shorter than the size of the
VPD page list, page_idx is used blindly to index into r->buf. Even though
the stores in the insertion sort are protected against overflows, the same is not
true of the reads and the final store of 0xb0.
This basically does the same thing as commit 57dbb58d80 ("scsi-generic: avoid
out-of-bounds access to VPD page list", 2018-11-06), except that here the
allocation length can be chosen by the guest. Note that according to the SCSI
standard, the contents of the PAGE LENGTH field are not altered based
on the allocation length.
The code was introduced by commit 6c219fc8a1 ("scsi-generic: keep VPD
page list sorted", 2018-11-06) but the overflow was already possible before.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Fixes: a71c775b24
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e909ff9369)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The glfs_*_async() functions do a callback once finished. This callback
has changed its arguments, pre- and post-stat structures have been
added. This makes it possible to improve caching, which is useful for
Samba and NFS-Ganesha, but not so much for QEMU. Gluster 6 is the first
release that includes these new arguments.
With an additional detection in ./configure, the new arguments can
conditionally get included in the glfs_io_cbk handler.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 0e3b891fef)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
New versions of Glusters libgfapi.so have an updated glfs_ftruncate()
function that returns additional 'struct stat' structures to enable
advanced caching of attributes. This is useful for file servers, not so
much for QEMU. Nevertheless, the API has changed and needs to be
adopted.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e014dbe74e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We have two open-coded copies of macro PFLASH_CFI01(). Move the macro
to the header, so we can ditch the copies. Move PFLASH_CFI02() to the
header for symmetry.
We define macros TYPE_PFLASH_CFI01 and TYPE_PFLASH_CFI02 for type name
strings, then mostly use the strings. If the macros are worth
defining, they are worth using. Replace the strings by the macros.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190308094610.21210-6-armbru@redhat.com>
(cherry picked from commit 81c7db723e)
*prereq for 3a283507
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When a guest tries to abort "write to buffer" (command 0xE8), we print
"PFLASH: Possible BUG - Write block confirm", then exit(1). Letting
the guest terminate QEMU is not a good idea. Instead, LOG_UNIMP we
screwed up, then reset the device.
Macro PFLASH_BUG() is now unused; delete it.
Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190308094610.21210-3-armbru@redhat.com>
(cherry picked from commit 2d93bebf81)
*prereq for 3a283507
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
flash.h's incomplete struct pflash_t is completed both in
pflash_cfi01.c and in pflash_cfi02.c. The complete types are
incompatible. This can hide type errors, such as passing a pflash_t
created with pflash_cfi02_register() to pflash_cfi01_get_memory().
Furthermore, POSIX reserves typedef names ending with _t.
Rename the two structs to PFlashCFI01 and PFlashCFI02.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190308094610.21210-2-armbru@redhat.com>
(cherry picked from commit 1643406520)
*prereq for 3a283507
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In the previous commit we fixed a crash when the guest read a
register that pop from an empty FIFO.
By auditing the repository, we found another similar use with
an easy way to reproduce:
$ qemu-system-aarch64 -M xlnx-zcu102 -monitor stdio -S
QEMU 4.0.50 monitor - type 'help' for more information
(qemu) xp/b 0xfd4a0134
Aborted (core dumped)
(gdb) bt
#0 0x00007f6936dea57f in raise () at /lib64/libc.so.6
#1 0x00007f6936dd4895 in abort () at /lib64/libc.so.6
#2 0x0000561ad32975ec in xlnx_dp_aux_pop_rx_fifo (s=0x7f692babee70) at hw/display/xlnx_dp.c:431
#3 0x0000561ad3297dc0 in xlnx_dp_read (opaque=0x7f692babee70, offset=77, size=4) at hw/display/xlnx_dp.c:667
#4 0x0000561ad321b896 in memory_region_read_accessor (mr=0x7f692babf620, addr=308, value=0x7ffe05c1db88, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439
#5 0x0000561ad321bd70 in access_with_adjusted_size (addr=308, value=0x7ffe05c1db88, size=1, access_size_min=4, access_size_max=4, access_fn=0x561ad321b858 <memory_region_read_accessor>, mr=0x7f692babf620, attrs=...) at memory.c:569
#6 0x0000561ad321e9d5 in memory_region_dispatch_read1 (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1420
#7 0x0000561ad321ea9d in memory_region_dispatch_read (mr=0x7f692babf620, addr=308, pval=0x7ffe05c1db88, size=1, attrs=...) at memory.c:1447
#8 0x0000561ad31bd742 in flatview_read_continue (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1, addr1=308, l=1, mr=0x7f692babf620) at exec.c:3385
#9 0x0000561ad31bd895 in flatview_read (fv=0x561ad69c04f0, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3423
#10 0x0000561ad31bd90b in address_space_read_full (as=0x561ad5bb3020, addr=4249485620, attrs=..., buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", len=1) at exec.c:3436
#11 0x0000561ad33b1c42 in address_space_read (len=1, buf=0x7ffe05c1dcf0 "\020\335\301\005\376\177", attrs=..., addr=4249485620, as=0x561ad5bb3020) at include/exec/memory.h:2131
#12 0x0000561ad33b1c42 in memory_dump (mon=0x561ad59c4530, count=1, format=120, wsize=1, addr=4249485620, is_physical=1) at monitor/misc.c:723
#13 0x0000561ad33b1fc1 in hmp_physical_memory_dump (mon=0x561ad59c4530, qdict=0x561ad6c6fd00) at monitor/misc.c:795
#14 0x0000561ad37b4a9f in handle_hmp_command (mon=0x561ad59c4530, cmdline=0x561ad59d0f22 "/b 0x00000000fd4a0134") at monitor/hmp.c:1082
Fix by checking the FIFO is not empty before popping from it.
The datasheet is not clear about the reset value of this register,
we choose to return '0'.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20190709113715.7761-4-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a09ef50404)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reading the RX_DATA register when the RX_FIFO is empty triggers
an abort. This can be easily reproduced:
$ qemu-system-arm -M emcraft-sf2 -monitor stdio -S
QEMU 4.0.50 monitor - type 'help' for more information
(qemu) x 0x40001010
Aborted (core dumped)
(gdb) bt
#1 0x00007f035874f895 in abort () at /lib64/libc.so.6
#2 0x00005628686591ff in fifo8_pop (fifo=0x56286a9a4c68) at util/fifo8.c:66
#3 0x00005628683e0b8e in fifo32_pop (fifo=0x56286a9a4c68) at include/qemu/fifo32.h:137
#4 0x00005628683e0efb in spi_read (opaque=0x56286a9a4850, addr=4, size=4) at hw/ssi/mss-spi.c:168
#5 0x0000562867f96801 in memory_region_read_accessor (mr=0x56286a9a4b60, addr=16, value=0x7ffeecb0c5c8, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:439
#6 0x0000562867f96cdb in access_with_adjusted_size (addr=16, value=0x7ffeecb0c5c8, size=4, access_size_min=1, access_size_max=4, access_fn=0x562867f967c3 <memory_region_read_accessor>, mr=0x56286a9a4b60, attrs=...) at memory.c:569
#7 0x0000562867f99940 in memory_region_dispatch_read1 (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1420
#8 0x0000562867f99a08 in memory_region_dispatch_read (mr=0x56286a9a4b60, addr=16, pval=0x7ffeecb0c5c8, size=4, attrs=...) at memory.c:1447
#9 0x0000562867f38721 in flatview_read_continue (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, addr1=16, l=4, mr=0x56286a9a4b60) at exec.c:3385
#10 0x0000562867f38874 in flatview_read (fv=0x56286aec6360, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3423
#11 0x0000562867f388ea in address_space_read_full (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4) at exec.c:3436
#12 0x0000562867f389c5 in address_space_rw (as=0x56286aa3e890, addr=1073745936, attrs=..., buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=false) at exec.c:3466
#13 0x0000562867f3bdd7 in cpu_memory_rw_debug (cpu=0x56286aa19d00, addr=1073745936, buf=0x7ffeecb0c7c0 "\340ǰ\354\376\177", len=4, is_write=0) at exec.c:3976
#14 0x000056286811ed51 in memory_dump (mon=0x56286a8c32d0, count=1, format=120, wsize=4, addr=1073745936, is_physical=0) at monitor/misc.c:730
#15 0x000056286811eff1 in hmp_memory_dump (mon=0x56286a8c32d0, qdict=0x56286b15c400) at monitor/misc.c:785
#16 0x00005628684740ee in handle_hmp_command (mon=0x56286a8c32d0, cmdline=0x56286a8caeb2 "0x40001010") at monitor/hmp.c:1082
From the datasheet "Actel SmartFusion Microcontroller Subsystem
User's Guide" Rev.1, Table 13-3 "SPI Register Summary", this
register has a reset value of 0.
Check the FIFO is not empty before accessing it, else log an
error message.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20190709113715.7761-3-philmd@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c0bccee9b4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Both lqspi_read() and lqspi_load_cache() expect a 32-bit
aligned address.
>From UG1085 datasheet [*] chapter on 'Quad-SPI Controller':
Transfer Size Limitations
Because of the 32-bit wide TX, RX, and generic FIFO, all
APB/AXI transfers must be an integer multiple of 4-bytes.
Shorter transfers are not possible.
Set MemoryRegionOps.impl values to force 32-bit accesses,
this way we are sure we do not access the lqspi_buf[] array
out of bound.
[*] https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 526668c734)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Previous patches switched to a temporary pbp but that does not go far
enough: after device uses a buffer, guest is free to reuse it, so
tracking the page and freeing it later is wrong.
Free and reset the pbp after we push each element.
Fixes: ed48c59875 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1b47b37c33)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We still have multiple issues in the current code
- The PBP is not freed during unrealize()
- The PBP is not reset on device resets: After a reset, the PBP is stale.
- We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
guests (esp. legacy guests) will reuse pages without deflating,
turning the PBP stale. Adding that would require compat handling.
Instead, let's use the PBP only temporarily, when processing one bulk of
inflation requests. This will keep guest_page_size > 4k working (with
Linux guests). There is nothing to do for deflation requests anymore.
The pbp is only used for a limited amount of time.
Fixes: ed48c59875 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit a8cd64d488)
*drop context dependency on qemu_4_0_config_size changes
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Using the address of a RAMBlock to test for a matching pbp is not really
safe. Instead, let's use the guest physical address of the base page
along with the page size (via the number of subpages).
Also, let's allocate the bitmap separately. This makes the code
easier to read and maintain - we can reuse bitmap_new().
Prepare the code to move the PBP out of the device.
Fixes: ed48c59875 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Fixes: b27b323914 ("virtio-balloon: Fix possible guest memory corruption with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-6-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1c5cfc2b71)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
"host_page_base" is really confusing, let's make this clearer, also
rename the other offsets to indicate to which base they apply.
offset -> mr_offset
ram_offset -> rb_offset
host_page_base -> rb_aligned_offset
While at it, use QEMU_ALIGN_DOWN() instead of a handcrafted computation
and move the computation to the place where it is needed.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit e6129b271b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We are using the wrong functions to set/clear bits, effectively touching
multiple bits, writing out of range of the bitmap, resulting in memory
corruptions. We have to use set_bit()/clear_bit() instead.
Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
inflating the balloon. QEMU crashes. This never could have worked
properly - especially, also pages would have been discarded when the
first sub-page would be inflated (the whole bitmap would be set).
While testing I realized, that on hugetlbfs it is pretty much impossible
to discard a page - the guest just frees the 4k sub-pages in random order
most of the time. I was only able to discard a hugepage a handful of
times - so I hope that now works correctly.
Fixes: ed48c59875 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Fixes: b27b323914 ("virtio-balloon: Fix possible guest memory corruption with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 483f13524b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If we directly cast from int to uint64_t, we will first sign-extend to
an int64_t, which is wrong. We actually want to treat the PFNs like
unsigned values.
As far as I can see, this dates back to the initial virtio-balloon
commit, but wasn't triggered as fairly big guests would be required.
Cc: qemu-stable@nongnu.org
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit ffa207d082)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Prior to f6deb6d9 "virtio-balloon: Remove unnecessary MADV_WILLNEED on
deflate", the balloon device issued an madvise() MADV_WILLNEED on
pages removed from the balloon. That would hint to the host kernel
that the pages were likely to be needed by the guest in the near
future.
It's unclear if this is actually valuable or not, and so f6deb6d9
removed this, essentially ignoring balloon deflate requests. However,
concerns have been raised that this might cause a performance
regression by causing extra latency for the guest in certain
configurations.
So, until we can get actual benchmark data to see if that's the case,
this restores the old behaviour, issuing a MADV_WILLNEED when a page is
removed from the balloon.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190306030601.21986-4-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 596546fe9e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This fixes a balloon bug with a nasty consequence - potentially
corrupting guest memory - but which is extremely unlikely to be
triggered in practice.
The balloon always works in 4kiB units, but the host could have a
larger page size on certain platforms. Since ed48c59 "virtio-balloon:
Safely handle BALLOON_PAGE_SIZE < host page size" we've handled this
by accumulating requests to balloon 4kiB subpages until they formed a
full host page. Since f6deb6d "virtio-balloon: Remove unnecessary
MADV_WILLNEED on deflate" we essentially ignore deflate requests.
Suppose we have a host with 8kiB pages, and one host page has subpages
A & B. If we get this sequence of events -
inflate A
deflate A
inflate B
- the current logic will discard the whole host page. That's
incorrect because the guest has deflated subpage A, and could have
written important data to it.
This patch fixes the problem by adjusting our state information about
partially ballooned host pages when deflate requests are received.
Fixes: ed48c59 "virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size"
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190306030601.21986-3-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
(cherry picked from commit b27b323914)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
ed48c59875 "virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host
page size" introduced a new temporary data structure which tracks 4kiB
chunks which have been inserted into the balloon by the guest but
don't yet form a full host page which we can discard.
Unfortunately, I had a thinko and allocated that structure with
g_malloc0() but freed it with a plain free() rather than g_free().
This corrects the problem.
Fixes: ed48c59875
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190306030601.21986-2-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
(cherry picked from commit 301cf2a8dd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The virtio-balloon always works in units of 4kiB (BALLOON_PAGE_SIZE), but
we can only actually discard memory in units of the host page size.
Now, we handle this very badly: we silently ignore balloon requests that
aren't host page aligned, and for requests that are host page aligned we
discard the entire host page. The latter can corrupt guest memory if its
page size is smaller than the host's.
The obvious choice would be to disable the balloon if the host page size is
not 4kiB. However, that would break the special case where host and guest
have the same page size, but that's larger than 4kiB. That case currently
works by accident[1] - and is used in practice on many production POWER
systems where 64kiB has long been the Linux default page size on both host
and guest.
To make the balloon safe, without breaking that useful special case, we
need to accumulate 4kiB balloon requests until we have a whole contiguous
host page to discard.
We could in principle do that across all guest memory, but it would require
a large bitmap to track. This patch represents a compromise: we track
ballooned subpages for a single contiguous host page at a time. This means
that if the guest discards all 4kiB chunks of a host page in succession,
we will discard it. This is the expected behaviour in the (host page) ==
(guest page) != 4kiB case we want to support.
If the guest scatters 4kiB requests across different host pages, we don't
discard anything, and issue a warning. Not ideal, but at least we don't
corrupt guest memory as the previous version could.
Warning reporting is kind of a compromise here. Determining whether we're
in a problematic state at realize() time is tricky, because we'd have to
look at the host pagesizes of all memory backends, but we can't really know
if some of those backends could be for special purpose memory that's not
subject to ballooning.
Reporting only when the guest tries to balloon a partial page also isn't
great because if the guest page size happens to line up it won't indicate
that we're in a non ideal situation. It could also cause alarming repeated
warnings whenever a migration is attempted.
So, what we do is warn the first time the guest attempts balloon a partial
host page, whether or not it will end up ballooning the rest of the page
immediately afterwards.
[1] Because when the guest attempts to balloon a page, it will submit
requests for each 4kiB subpage. Most will be ignored, but the one
which happens to be host page aligned will discard the whole lot.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190214043916.22128-6-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ed48c59875)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Currently, virtio-balloon uses madvise() with MADV_DONTNEED to actually
discard RAM pages inserted into the balloon. This is basically a Linux
only interface (MADV_DONTNEED exists on some other platforms, but doesn't
always have the same semantics). It also doesn't work on hugepages and has
some other limitations.
It turns out that postcopy also needs to discard chunks of memory, and uses
a better interface for it: ram_block_discard_range(). It doesn't cover
every case, but it covers more than going direct to madvise() and this
gives us a single place to update for more possibilities in future.
There are some subtleties here to maintain the current balloon behaviour:
* For now, we just ignore requests to balloon in a hugepage backed region.
That matches current behaviour, because MADV_DONTNEED on a hugepage would
simply fail, and we ignore the error.
* If host page size is > BALLOON_PAGE_SIZE we can frequently call this on
non-host-page-aligned addresses. These would also fail in madvise(),
which we then ignored. ram_block_discard_range() error_report()s calls
on unaligned addresses, so we explicitly check that case to avoid
spamming the logs.
* We now call ram_block_discard_range() with the *host* page size, whereas
we previously called madvise() with BALLOON_PAGE_SIZE. Surprisingly,
this also matches existing behaviour. Although the kernel fails madvise
on unaligned addresses, it will round unaligned sizes *up* to the host
page size. Yes, this means that if BALLOON_PAGE_SIZE < guest page size
we can incorrectly discard more memory than the guest asked us to. I'm
planning to address that soon.
Errors other than the ones discussed above, will now be reported by
ram_block_discard_range(), rather than silently ignored, which means we
have a much better chance of seeing when something is going wrong.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20190214043916.22128-5-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit dbe1a27745)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The virtio-balloon device's verification of the address given to it by the
guest has a number of faults:
* The addresses here are guest physical addresses, which should be
'hwaddr' rather than 'ram_addr_t' (the distinction is admittedly
pretty subtle and confusing)
* We don't check for section.mr being NULL, which is the main way that
memory_region_find() reports basic failures. We really need to check
that before looking at any other section fields, because
memory_region_find() doesn't initialize them on the failure path
* We're passing a length of '1' to memory_region_find(), but really the
guest is requesting that we put the entire page into the balloon,
so it makes more sense to call it with BALLOON_PAGE_SIZE
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20190214043916.22128-3-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b218a70e6a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When the balloon is inflated, we discard memory place in it using madvise()
with MADV_DONTNEED. And when we deflate it we use MADV_WILLNEED, which
sounds like it makes sense but is actually unnecessary.
The misleadingly named MADV_DONTNEED just discards the memory in question,
it doesn't set any persistent state on it in-kernel; all that's necessary
to bring the memory back is to touch it. MADV_WILLNEED in contrast
specifically says that the memory will be used soon and faults it in.
This patch simplify's the balloon operation by dropping the madvise()
on deflate. This might have an impact on performance - it will move a
delay at deflate time until that memory is actually touched, which
might be more latency sensitive. However:
* Memory that's being given back to the guest by deflating the
balloon *might* be used soon, but it equally could just sit around
in the guest's pools until needed (or even be faulted out again if
the host is under memory pressure).
* Usually, the timescale over which you'll be adjusting the balloon
is long enough that a few extra faults after deflation aren't
going to make a difference.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20190214043916.22128-2-david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f6deb6d95a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In virtio_balloon_get_config() we initialize a struct virtio_balloon_config
which we then copy to guest memory. However, the local variable is not
zero initialized. This works OK at the moment because we initialize
all the fields in it; however an upcoming kernel header change will
add some new fields. If we don't zero out the whole struct then we
will start leaking a small amount of the contents of QEMU's stack
to the guest as soon as we update linux-headers/ to a set of headers
that includes the new fields.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190118183603.24757-1-peter.maydell@linaro.org
(cherry picked from commit 5385a5988c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When very large regions (32GB sized in our case, PCI pass-through of GPUs)
are compared substraction result does not fit into gint.
As a result crs_replace_with_free_ranges does not get sorted ranges and
incorrectly computes PCI64 free space regions. Which then makes linux
guest complain about device and PCI64 hole intersection and device
becomes unusable.
Fix that by returning exactly fitting ranges.
Also fix indentation of an entire crs_replace_with_free_ranges to make
checkpatch happy.
Cc: qemu-stable@nongnu.org
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1563466463-26012-1-git-send-email-wrfsh@yandex-team.ru>
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
(cherry picked from commit 21e2acd583)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Implement a function to translate TPM error codes to strings so that
at least the most common error codes can be translated to human
readable strings.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit 7e095e84ba)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When a guest which doesn't support multiqueue is migrated with a multi queues
vhost-user-blk deivce, a crash will occur like:
0 qemu_memfd_alloc (name=<value optimized out>, size=562949953421312, seals=<value optimized out>, fd=0x7f87171fe8b4, errp=0x7f87171fe8a8) at util/memfd.c:153
1 0x00007f883559d7cf in vhost_log_alloc (size=70368744177664, share=true) at hw/virtio/vhost.c:186
2 0x00007f88355a0758 in vhost_log_get (listener=0x7f8838bd7940, enable=1) at qemu-2-12/hw/virtio/vhost.c:211
3 vhost_dev_log_resize (listener=0x7f8838bd7940, enable=1) at hw/virtio/vhost.c:263
4 vhost_migration_log (listener=0x7f8838bd7940, enable=1) at hw/virtio/vhost.c:787
5 0x00007f88355463d6 in memory_global_dirty_log_start () at memory.c:2503
6 0x00007f8835550577 in ram_init_bitmaps (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2173
7 ram_init_all (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2192
8 ram_save_setup (f=0x7f88384ce600, opaque=0x7f8836024098) at migration/ram.c:2219
9 0x00007f88357a419d in qemu_savevm_state_setup (f=0x7f88384ce600) at migration/savevm.c:1002
10 0x00007f883579fc3e in migration_thread (opaque=0x7f8837530400) at migration/migration.c:2382
11 0x00007f8832447893 in start_thread () from /lib64/libpthread.so.0
12 0x00007f8832178bfd in clone () from /lib64/libc.so.6
This is because vhost_get_log_size() returns a overflowed vhost-log size.
In this function, it uses the uninitialized variable vqs->used_phys and
vqs->used_size to get the vhost-log size.
Signed-off-by: Li Hangjing <lihangjing@baidu.com>
Reviewed-by: Xie Yongji <xieyongji@baidu.com>
Reviewed-by: Chai Wen <chaiwen@baidu.com>
Message-Id: <20190603061524.24076-1-lihangjing@baidu.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 240e647a14)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We already have 221 for accesses through the page cache, but it is
better to create a new file for O_DIRECT instead of integrating those
test cases into 221. This way, we can make use of
_supported_cache_modes (and _default_cache_mode) so the test is
automatically skipped on filesystems that do not support O_DIRECT.
As part of the split, add _supported_cache_modes to 221. With that, it
no longer fails when run with -c none or -c directsync.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2fab30c80b)
*remove context dependencies on iotests not in 3.1
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Currently, qemu crashes whenever someone queries the block status of an
unaligned image tail of an O_DIRECT image:
$ echo > foo
$ qemu-img map --image-opts driver=file,filename=foo,cache.direct=on
Offset Length Mapped to File
qemu-img: block/io.c:2093: bdrv_co_block_status: Assertion `*pnum &&
QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset'
failed.
This is because bdrv_co_block_status() checks that the result returned
by the driver's implementation is aligned to the request_alignment, but
file-posix can fail to do so, which is actually mentioned in a comment
there: "[...] possibly including a partial sector at EOF".
Fix this by rounding up those partial sectors.
There are two possible alternative fixes:
(1) We could refuse to open unaligned image files with O_DIRECT
altogether. That sounds reasonable until you realize that qcow2
does necessarily not fill up its metadata clusters, and that nobody
runs qemu-img create with O_DIRECT. Therefore, unpreallocated qcow2
files usually have an unaligned image tail.
(2) bdrv_co_block_status() could ignore unaligned tails. It actually
throws away everything past the EOF already, so that sounds
reasonable.
Unfortunately, the block layer knows file lengths only with a
granularity of BDRV_SECTOR_SIZE, so bdrv_co_block_status() usually
would have to guess whether its file length information is inexact
or whether the driver is broken.
Fixing what raw_co_block_status() returns is the safest thing to do.
There seems to be no other block driver that sets request_alignment and
does not make sure that it always returns aligned values.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9c3db310ff)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
the current value of 1024 bytes (16 * MFI_FRAME_SIZE) we map is not enough to hold
the maximum number of scatter gather elements we advertise. We actually need a
maximum of 2048 bytes. This is 128 max sg elements * 16 bytes (sizeof (union mfi_sgl)).
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <20190404121015.28634-1-pl@kamp.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2e56fbc87f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Buglink: https://launchpad.net/bugs/1823458
Currently, a user CHR_EVENT_CLOSED event will cause net_vhost_user_event()
to call vhost_user_cleanup(), which calls vhost_net_cleanup() for all
its queues. However, vhost_net_cleanup() must never be called like
this for fully-initialized nets; when other code later calls
vhost_net_stop() - such as from virtio_net_vhost_status() - it will try
to access the already-cleaned-up fields and fail with assertion errors
or segfaults.
The vhost_net_cleanup() will eventually be called from
qemu_cleanup_net_client().
Signed-off-by: Dan Streetman <ddstreet@canonical.com>
Message-Id: <20190416184624.15397-3-dan.streetman@canonical.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6ab79a20af)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Even for block nodes with bs->drv == NULL, we can't just ignore a
bdrv_set_aio_context() call. Leaving the node in its old context can
mean that it's still in an iothread context in bdrv_close_all() during
shutdown, resulting in an attempted unlock of the AioContext lock which
we don't hold.
This is an example stack trace of a related crash:
#0 0x00007ffff59da57f in raise () at /lib64/libc.so.6
#1 0x00007ffff59c4895 in abort () at /lib64/libc.so.6
#2 0x0000555555b97b1e in error_exit (err=<optimized out>, msg=msg@entry=0x555555d386d0 <__func__.19059> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
#3 0x0000555555b97f7f in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5555568002f0, file=file@entry=0x555555d378df "util/async.c", line=line@entry=507) at util/qemu-thread-posix.c:97
#4 0x0000555555b92f55 in aio_context_release (ctx=ctx@entry=0x555556800290) at util/async.c:507
#5 0x0000555555b05cf8 in bdrv_prwv_co (child=child@entry=0x7fffc80012f0, offset=offset@entry=131072, qiov=qiov@entry=0x7fffffffd4f0, is_write=is_write@entry=true, flags=flags@entry=0)
at block/io.c:833
#6 0x0000555555b060a9 in bdrv_pwritev (qiov=0x7fffffffd4f0, offset=131072, child=0x7fffc80012f0) at block/io.c:990
#7 0x0000555555b060a9 in bdrv_pwrite (child=0x7fffc80012f0, offset=131072, buf=<optimized out>, bytes=<optimized out>) at block/io.c:990
#8 0x0000555555ae172b in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568cc740, i=i@entry=0) at block/qcow2-cache.c:51
#9 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568cc740) at block/qcow2-cache.c:248
#10 0x0000555555ae15de in qcow2_cache_flush (bs=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
#11 0x0000555555ae16b1 in qcow2_cache_flush_dependency (c=0x5555568a1700, c=0x5555568a1700, bs=0x555556810680) at block/qcow2-cache.c:194
#12 0x0000555555ae16b1 in qcow2_cache_entry_flush (bs=bs@entry=0x555556810680, c=c@entry=0x5555568a1700, i=i@entry=0) at block/qcow2-cache.c:194
#13 0x0000555555ae18dd in qcow2_cache_write (bs=bs@entry=0x555556810680, c=0x5555568a1700) at block/qcow2-cache.c:248
#14 0x0000555555ae15de in qcow2_cache_flush (bs=bs@entry=0x555556810680, c=<optimized out>) at block/qcow2-cache.c:259
#15 0x0000555555ad242c in qcow2_inactivate (bs=bs@entry=0x555556810680) at block/qcow2.c:2124
#16 0x0000555555ad2590 in qcow2_close (bs=0x555556810680) at block/qcow2.c:2153
#17 0x0000555555ab0c62 in bdrv_close (bs=0x555556810680) at block.c:3358
#18 0x0000555555ab0c62 in bdrv_delete (bs=0x555556810680) at block.c:3542
#19 0x0000555555ab0c62 in bdrv_unref (bs=0x555556810680) at block.c:4598
#20 0x0000555555af4d72 in blk_remove_bs (blk=blk@entry=0x5555568103d0) at block/block-backend.c:785
#21 0x0000555555af4dbb in blk_remove_all_bs () at block/block-backend.c:483
#22 0x0000555555aae02f in bdrv_close_all () at block.c:3412
#23 0x00005555557f9796 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4776
The reproducer I used is a qcow2 image on gluster volume, where the
virtual disk size (4 GB) is larger than the gluster volume size (64M),
so we can easily trigger an ENOSPC. This backend is assigned to a
virtio-blk device using an iothread, and then from the guest a
'dd if=/dev/zero of=/dev/vda bs=1G count=1' causes the VM to stop
because of an I/O error. qemu_gluster_co_flush_to_disk() sets
bs->drv = NULL on error, so when virtio-blk stops the dataplane, the
block nodes stay in the iothread AioContext. A 'quit' monitor command
issued from this paused state crashes the process.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1631227
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
(cherry picked from commit 1bffe1ae7a)
*drop context dependency on e64f25f30b
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When extracting a human-readable size formatter, we changed 'uint64_t
div' pre-patch to 'unsigned long div' post-patch. Which breaks on
32-bit platforms, resulting in 'inf' instead of intended values larger
than 999GB.
Fixes: 22951aaa
CC: qemu-stable@nongnu.org
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 754da86714)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Limiting the allocation to INT_MAX bytes isn't particularly clever
because it means that the final cluster will be a partial cluster which
will be completed through a COW operation. This results in unnecessary
data read and write requests which lead to an unwanted non-sparse
filesystem block for metadata preallocation.
Align the maximum allocation size down to the cluster size to avoid this
situation.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit f29fbf7c6b)
*modified to avoid functional dependency on 93e32b3e
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Error reporting for user_creatable_add_opts_foreach was changed so that
it no longer called 'error_report_err' in:
commit 7e1e0c1112
Author: Markus Armbruster <armbru@redhat.com>
Date: Wed Oct 17 10:26:43 2018 +0200
qom: Clean up error reporting in user_creatable_add_opts_foreach()
Some callers were updated to pass in "&error_fatal" but all the ones in
qemu-img were left passing NULL. As a result all errors went to
/dev/null instead of being reported to the user.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 334c43e2c3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Open files and directories with O_NOFOLLOW to avoid symlinks attacks.
While being at it also add O_CLOEXEC.
usb-mtp only handles regular files and directories and ignores
everything else, so users should not see a difference.
Because qemu ignores symlinks, carrying out a successful symlink attack
requires swapping an existing file or directory below rootdir for a
symlink and winning the race against the inotify notification to qemu.
Fixes: CVE-2018-16872
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: Bandan Das <bsd@redhat.com>
Reported-by: Michael Hanselmann <public@hansmi.ch>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael Hanselmann <public@hansmi.ch>
Message-id: 20181213122511.13853-1-kraxel@redhat.com
(cherry picked from commit bab9df35ce)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The current check to test if usbfs support should be compiled or not
solely relies on the presence of <linux/usbdevice_fs.h>, without
actually checking that all definition used by Qemu are provided by
this header file.
With sufficiently old kernel headers, <linux/usbdevice_fs.h> may be
present, but some of the definitions needed by Qemu may not be
available.
This commit improves the check by building a small program that
actually tests whether the necessary definitions are available.
In addition, it fixes a bug where have_usbfs was set to "yes"
regardless of the result of the test.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190213211827.20300-1-thomas.petazzoni@bootlin.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 96566d09aa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 3ebee3b191 defined assert() as g_assert(), but when we build
the VSS DLL component of QGA (to handle fsfreeze) we do not include
glib, which results in breakage when building with VSS support enabled.
Fix this by including glib (along with the -lintl and -lws2_32
dependencies it brings).
Since the VSS DLL is built statically, this introduces an additional
dependency on static glib and supporting libs for the mingw environment
(possibly why we didn't include glib originally), but VSS support
already has very specific prerequisites so it shouldn't affect too many
build environments.
Since the VSS DLL code does use qemu/osdep.h, this should also help
avoid future breakages and possibly allow for some clean ups in current
VSS code.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 82a58d270c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 8bca4613 added support for %% in json strings when interpolating,
but in doing so broke handling of % when not interpolating.
When parse_string() is fed a string token containing '%', it skips the
'%' regardless of ctxt->ap, i.e. even it's not interpolating. If the
'%' is the string's last character, it fails an assertion. Else, it
"merely" swallows the '%'.
Fix parse_string() to handle '%' specially only when interpolating.
To gauge the bug's impact, let's review non-interpolating users of this
parser, i.e. code passing NULL context to json_message_parser_init():
* tests/check-qjson.c, tests/test-qobject-input-visitor.c,
tests/test-visitor-serialization.c
Plenty of tests, but we still failed to cover the buggy case.
* monitor.c: QMP input
* qga/main.c: QGA input
* qobject_from_json():
- qobject-input-visitor.c: JSON command line option arguments of
-display and -blockdev
Reproducer: -blockdev '{"%"}'
- block.c: JSON pseudo-filenames starting with "json:"
Reproducer: https://bugzilla.redhat.com/show_bug.cgi?id=1668244#c3
- block/rbd.c: JSON key pairs
Pseudo-filenames starting with "rbd:".
Command line, QMP and QGA input are trusted.
Filenames are trusted when they come from command line, QMP or HMP.
They are untrusted when they come from from image file headers.
Example: QCOW2 backing file name. Note that this is *not* the security
boundary between host and guest. It's the boundary between host and an
image file from an untrusted source.
Neither failing an assertion nor skipping a character in a filename of
your choice looks exploitable. Note that we don't support compiling
with NDEBUG.
Fixes: 8bca4613e6
Cc: qemu-stable@nongnu.org
Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
Message-Id: <20190102140535.11512-1-cfergeau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
[Commit message extended to discuss impact]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit bbc0586ced)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Processor tracing is not yet implemented for KVM and it will be an
opt in feature requiring a special module parameter.
Disable it, because it is wrong to enable it by default and
it is impossible that no one has ever used it.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4c257911dc)
*drop context dependency on ecb85fe48
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In tpm_tis_mmio_write() if the requesting locality is seizing
access, any seizure by a lower locality is cancelled. However the
loop doing the seizure had an off-by-one error and the locality
immediately preceding the requesting locality was not being cleared.
This is fixed by adjusting the test in the for loop to check the
localities up to the requesting locality.
Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
(cherry picked from commit 37b55d67c0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
bdrv_co_invalidate_cache() clears the BDRV_O_INACTIVE flag before
actually activating a node so that the correct permissions etc. are
taken. In case of errors, the flag must be restored so that the next
call to bdrv_co_invalidate_cache() retries activation.
Restoring the flag was missing in the error path for a failed
parent->role->activate() call. The consequence is that this attempt to
activate all images correctly fails because we still set errp, however
on the next attempt BDRV_O_INACTIVE is already clear, so we return
success without actually retrying the failed action.
An example where this is observable in practice is migration to a QEMU
instance that has a raw format block node attached to a guest device
with share-rw=off (the default) while another process holds
BLK_PERM_WRITE for the same image. In this case, all activation steps
before parent->role->activate() succeed because raw can tolerate other
writers to the image. Only the parent callback (in particular
blk_root_activate()) tries to implement the share-rw=on property and
requests exclusive write permissions. This fails when the migration
completes and correctly displays an error. However, a manual 'cont' will
incorrectly resume the VM without calling blk_root_activate() again.
This case is described in more detail in the following bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1531888
Fix this by correctly restoring the BDRV_O_INACTIVE flag in the error
path.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 78fc3b3a26)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Make sure that the new locality passed to tpm_tis_prep_abort()
is valid.
Add a comment to aborting_locty that it may be any locality, including
TPM_TIS_NO_LOCALITY.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit e92b63ea61)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The tcg_register_iommu_notifier() code has a GArray of
TCGIOMMUNotifier structs which it has registered by passing
memory_region_register_iommu_notifier() a pointer to the embedded
IOMMUNotifier field. Unfortunately, if we need to enlarge the
array via g_array_set_size() this can cause a realloc(), which
invalidates the pointer that memory_region_register_iommu_notifier()
put into the MemoryRegion's iommu_notify list. This can result
in segfaults.
Switch the GArray to holding pointers to the TCGIOMMUNotifier
structs, so that we can individually allocate and free them.
Cc: qemu-stable@nongnu.org
Fixes: 1f871c5e6b ("exec.c: Handle IOMMUs in address_space_translate_for_iotlb()")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20190128174241.5860-1-peter.maydell@linaro.org
(cherry picked from commit 5601be3b01)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The architecture specifies specification exceptions for all
unavailable subcodes.
The presence of subcodes is indicated by checking some query subcode.
For example 6 will indicate that 3-6 are available. So future systems
might call new subcodes to check for new features. This should not
trigger a hw error, instead we return the architectured specification
exception.
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20190111113657.66195-3-frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 37dbd1f4d4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When VM boots from the latest version of linux kernel, after
hot-unpluging virtio-blk disks which are hotplugged into
pcie-root-port, the VM's dmesg log shows:
[ 151.046242] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0001 from Slot Status
[ 151.046365] pciehp 0000:00:05.0:pcie004: Slot(0-3): Attention button pressed
[ 151.046369] pciehp 0000:00:05.0:pcie004: Slot(0-3): Powering off due to button press
[ 151.046420] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 151.046425] pciehp 0000:00:05.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 151.046464] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 151.046468] pciehp 0000:00:05.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[ 156.163421] pciehp 0000:00:05.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 2f1
[ 156.163427] pciehp 0000:00:05.0:pcie004: pciehp_unconfigure_device: domain:bus:dev = 0000:06:00
[ 156.198736] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 156.198772] pciehp 0000:00:05.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[ 157.224124] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0018 from Slot Status
[ 157.224194] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[ 157.224220] pciehp 0000:00:05.0:pcie004: pciehp_check_link_active: lnk_status = 2011
[ 157.224223] pciehp 0000:00:05.0:pcie004: Slot(0-3): Link Up
[ 157.224233] pciehp 0000:00:05.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 7f1
[ 157.224281] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 157.224285] pciehp 0000:00:05.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
[ 157.224300] pciehp 0000:00:05.0:pcie004: __pciehp_link_set: lnk_ctrl = 0
[ 157.224336] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 157.224339] pciehp 0000:00:05.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 159.739294] pci 0000:06:00.0 id reading try 50 times with interval 20 ms to get ffffffff
[ 159.739315] pciehp 0000:00:05.0:pcie004: pciehp_check_link_status: lnk_status = 2011
[ 159.739318] pciehp 0000:00:05.0:pcie004: Failed to check link status
[ 159.739371] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 159.739394] pciehp 0000:00:05.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
[ 160.771426] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 160.771452] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
[ 160.771495] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 160.771499] pciehp 0000:00:05.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
[ 160.771535] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status
[ 160.771539] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
After analyzing the log information, it seems that qemu doesn't
change the Link Status from active to inactive after hot-unplug.
This results in the abnormal log after the linux kernel commit
d331710ea78fea merged.
Furthermore, If I hotplug the same virtio-blk disk after hot-unplug,
the virtio-blk would turn on and then back off.
So this patch set the Link Status inactive after hot-unplug and
active after hot-plug.
Signed-off-by: Zheng Xiang <zhengxiang9@huawei.com>
Signed-off-by: Zheng Xiang <xiang.zheng@linaro.org>
Cc: Wang Haibin <wanghaibin.wang@huawei.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2f2b18f60b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
"-machine pc" will not work all architectures. Lets fall back to the
default machine by not specifying it.
In addition we also need to specify -no-shutdown on s390 as qemu will
exit otherwise.
Cc: qemu-stable@nongnu.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2c26e648e4)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Clang 3.4 considers duplicate typedef in ppc4xx_i2c.h and
bitbang_i2c.h an error even if they are identical. Move it to a common
place to allow building with this clang version.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 2b4c1125ac)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2019-03-27 20:03:43 -05:00
9860 changed files with 588287 additions and 1520767 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.